Slow query when applying @filter or order to predicates

Report a Dgraph Bug

What version of Dgraph are you using?

Docker dgraph/dgraph:v20.11.2
$ dgraph version
 
[Decoder]: Using assembly version of decoder
Page Size: 4096

Dgraph version   : v20.11.2
Dgraph codename  : tchalla-2
Dgraph SHA-256   : 0153cb8d3941ad5ad107e395b347e8d930a0b4ead6f4524521f7a525a9699167
Commit SHA-1     : 94f3a0430
Commit timestamp : 2021-02-23 13:07:17 +0530
Branch           : HEAD
Go version       : go1.15.5
jemalloc enabled : true

Have you tried reproducing the issue with the latest release?

Yes, the latest release is v20.11.2

What is the hardware spec (RAM, OS)?

Intel(R) Xeon(R) CPU E5-2407 v2 @ 2.40GHz
32GB Memory and 100GB SSD

Ubuntu 20.04.2 LTS
Docker version 20.10.5, build 55c4c88

Steps to reproduce the issue (command/config used to run Dgraph).

Relevant GraphQL schema:

type OriginChapter {
  uId: ID!
  # ... other fields
  medias: [Media!] @hasInverse(field: origin_chapter)
}

type Media {
  uId: ID!
  # ... other fields
  publish_status: Int @search
  seq_no: Int! @search
  origin_chapter: OriginChapter
}

Total nodes count:
OriginChapter: 723,158
Media: 17,651,494
The OriginChapter used in the test query has its medias predicate linked to 13 Media nodes

Queries for testing:

query {
  TEST_SLOW_QUERY(func: uid(0x21ce63))
  {
    uid
    medias: OriginChapter.medias # "total_ns": 2,628,078
    # medias: OriginChapter.medias (orderasc: Media.seq_no) # "total_ns": 81,951,741
    # medias: OriginChapter.medias @filter(eq(Media.publish_status, 1)) # "total_ns": 1,798,923,163
    # medias: OriginChapter.medias (orderasc: Media.seq_no) @filter(eq(Media.publish_status, 1)) #"total_ns": 1,941,999,605
    {
      uid
    }
  }
}

Expected behaviour and actual result.

  • No filter, no sort: 2ms
  • Sort without filter: 81ms => with only 13 linked nodes, this still seems too high to me
  • Filter without sort: 1799ms => too slow for such a simple query
  • Filter with sort: 1941ms => too slow for such a simple query
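
To make the gap between the four variants easier to compare, here is a small Python sketch that converts the total_ns values from the query comments into milliseconds and expresses each as a multiple of the unfiltered, unsorted run. The dictionary keys and the ns_to_ms helper are names chosen for illustration only; the numbers are the ones reported above.

```python
# total_ns values copied from the comments in the test query above.
TOTAL_NS = {
    "no filter, no sort": 2_628_078,
    "sort only": 81_951_741,
    "filter only": 1_798_923_163,
    "filter and sort": 1_941_999_605,
}


def ns_to_ms(ns: int) -> float:
    """Convert nanoseconds to milliseconds."""
    return ns / 1_000_000


baseline = TOTAL_NS["no filter, no sort"]
for name, ns in TOTAL_NS.items():
    # Print each timing in ms and as a multiple of the baseline run.
    print(f"{name:20s} {ns_to_ms(ns):9.1f} ms  {ns / baseline:7.1f}x baseline")
```

On these numbers the sort-only run is roughly 30 times the baseline and the filtered runs are several hundred times the baseline, even though only 13 Media nodes are linked.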

Jaeger tracing logs

no_filter_without_sort.json (7.6 KB) no_filter_with_sort.json (8.7 KB) filter_without_sort.json (12.6 KB) filter_with_sort.json (13.6 KB)

BTW, Jaeger tracing does not seem to show a sort span in the Trace Timeline, so it took me a while to figure out what was causing the slow query

After further testing, I’ve found that indexing seq_no and publish_status is what causes this poor performance.
It looks like the same problem as this other issue: Filtering is slow on large amount of data - #6 by diggy

Removing these indexes improves performance, but I need to query and filter on publish_status over GraphQL, and removing its index means removing the filter on that field.
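
As a possible stopgap while the indexes stay in place, the filter and sort could be applied on the client after fetching the linked Media nodes without @filter or ordering. The Python sketch below is only an illustration: the published_in_order helper, the publish_status and seq_no result keys (assumed to be aliases in the query), and the sample rows are all hypothetical, and this thread did not measure whether fetching those values unfiltered is actually faster.

```python
from typing import Any


def published_in_order(medias: list[dict[str, Any]], status: int = 1) -> list[dict[str, Any]]:
    """Keep medias whose publish_status equals `status`, ordered by seq_no ascending."""
    # Nodes without a publish_status value are dropped, as an eq() filter would do.
    kept = [m for m in medias if m.get("publish_status") == status]
    return sorted(kept, key=lambda m: m["seq_no"])


# Hypothetical rows standing in for the 13 linked Media nodes.
sample = [
    {"uid": "0xa", "publish_status": 1, "seq_no": 3},
    {"uid": "0xb", "publish_status": 0, "seq_no": 1},
    {"uid": "0xc", "publish_status": 1, "seq_no": 2},
    {"uid": "0xd", "seq_no": 4},  # publish_status not set
]

for media in published_in_order(sample):
    print(media["uid"], media["seq_no"])
```

This only makes sense when the number of linked nodes per parent is small, as with the 13 nodes here; it does not replace a server-side filter for large result sets.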

I think I am going to accept this as an issue.


@chewxy any plans or updates for this issue?

@hardik do you have any updates?

@taina we have started investigating this. From initial observations, it seems that the way our indexes are utilized might be the reason behind it, as you also pointed out. At this point we are trying to figure out how we can improve that process.
