Significant Performance Degradation with More Conditions

It seems that there’s performance degradation as more conditions/filters are added. For example:

{
  search(func: has(something)) @filter(eq(A, "valA")) {
    expand(_all_)
  }
}

The above returns one node after approximately 2.4ms. However, when an additional filter is applied:

{
  search(func: has(something)) @filter(eq(A, "valA") AND eq(B, "valB")) {
    expand(_all_)
  }
}

The above now returns the same single node after approximately 5.7s, which is more than 2,000x slower. Running the same query without eq(A, "valA"), filtering only on eq(B, "valB"), also takes approximately 5.7s. This leads me to believe that a query is only as fast as the slowest of the filters applied.

How are conditions evaluated?
What can I do to improve these types of queries?
Is this a problem with Dgraph that needs addressing?

In my local testing, I’m actually filtering on five different predicate values (A, B, C, D, E) and it’s taking approximately 38s to return. Details of my local data set:

26,000 nodes
Predicates are indexed as hash
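
For reference, the five-filter query has roughly the shape below. This is only a sketch: the values "valC", "valD", and "valE" are placeholders that follow the naming of the earlier examples, not real data.

```
{
  # Same root function as above, with all five eq filters combined.
  # valC, valD and valE are placeholder values.
  search(func: has(something)) @filter(eq(A, "valA") AND eq(B, "valB") AND eq(C, "valC") AND eq(D, "valD") AND eq(E, "valE")) {
    expand(_all_)
  }
}
```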

Can you try replacing the hash index with exact and see if that solves the problem? I suppose that with hash we are fetching a lot of keys, which can be avoided.
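
A minimal sketch of that schema change, assuming the filtered predicates are all of type string (the actual schema is not shown in this thread, so the types here are a guess):

```
# Assumed: A through E are string predicates that previously used @index(hash).
A: string @index(exact) .
B: string @index(exact) .
C: string @index(exact) .
D: string @index(exact) .
E: string @index(exact) .
```

After the schema is altered, the affected indexes should be rebuilt, so the first queries may need to wait until that finishes.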

Wow, that made a world of difference. Queries with a single filter run in about the same time as before (~2.4ms), while queries with five filters complete in about 17ms.

In the docs, regarding indexing, there’s a note that states:

The most performant index for eq is hash. Only use term or fulltext if you also require term or full text search. If you’re already using term, there is no need to use hash or exact as well.

If exact is this much more performant than hash for eq, why is hash recommended?


That was recommended because in v1.0.4 we loaded all keys from the underlying store, Badger, into memory (loadtoram), and keys with a hash index are smaller. However, in master we have changed loadtoram to memorymap, and the hash index degrades performance because a second lookup is required to verify equality, so we should update the docs.


I also faced the same problem. Updating docs would be very helpful.