Significant Performance Degradation with More Conditions

Andrew · April 9, 2018, 10:48pm

It seems that there’s performance degradation as more conditions/filters are added. For example:

{
  search(func: has(something)) @filter(eq(A, "valA")) {
    expand(_all_)
  }
}

The above returns one node after approximately 2.4ms. However, when an additional filter is applied:

{
  search(func: has(something)) @filter(eq(A, "valA") AND eq(B, "valB")) {
    expand(_all_)
  }
}

The above now returns the same single node after approximately 5.7s, which is more than 2,000x slower. Running the same query without eq(A, "valA"), filtering only on eq(B, "valB"), also returns after approximately 5.7s. This leads me to believe that the query is only as fast as the slowest of the applied filters.

How are conditions evaluated?
What can I do to improve these types of queries?
Is this a problem with Dgraph that needs addressing?

In my local testing, I’m actually filtering on five different predicate values (A, B, C, D, E), and the query takes approximately 38s to return. Details of my local data set:

26,000 nodes
Predicates are indexed as hash

pawan · April 9, 2018, 11:28pm

Can you try replacing the hash index with exact and see if that solves the problem? I suspect that with hash we are fetching a lot of keys, which can be avoided.
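
For reference, the index change suggested here would be a schema update along the following lines. This is only a sketch: A and B are the placeholder predicate names from the queries above, and the string type is an assumption, since the thread does not show the actual schema.

```
A: string @index(exact) .
B: string @index(exact) .
```

The same pattern would apply to any other predicate used in an eq filter, such as the C, D and E mentioned earlier.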

Andrew · April 9, 2018, 11:49pm

Wow, that made a world of difference. Queries with a single filter run in about the same time as before (~2.4ms), while queries with five filters complete in about 17ms.

In the docs, regarding indexing, there’s a note that states:

The most performant index for eq is hash. Only use term or fulltext if you also require term or full text search. If you’re already using term, there is no need to use hash or exact as well.

If exact is this much more performant than hash for eq, why is hash recommended?

pawan · April 10, 2018, 1:15am

That was recommended because in v1.0.4 we loaded all keys from the underlying store, badger, into memory (loadtoram), and keys with a hash index are smaller. However, given that in master we have changed loadtoram to memorymap, and that the hash index degrades performance because a second lookup is required to verify equality, we should update the docs.

spsneo · October 23, 2018, 6:31am

I also faced the same problem. Updating docs would be very helpful.
