How to improve query response time?

selmeci · November 8, 2019, 12:50pm

Hi,

I have a simple dataset: nodes with the element.hash predicate (765,637 of them).
I would like to find the nodes that are missing any of the predicates sort.by.created.at, sort.by.element.type, or sort.by.modified.at, and get the first 10 of them.

My query is:

{
  page(func: has(element.hash), first: 10) @filter(NOT (has(sort.by.created.at) AND has(sort.by.element.type) AND has(sort.by.modified.at))) {
    uid
  }
}

Response time is always about 12 to 14 seconds.
I’m running Dgraph v1.0.17 on a 12-core CPU with 48 GB RAM and a 960 SSD.

Thank you for any suggestion.

MichelDiz · November 8, 2019, 2:50pm

Upgrade to 1.1.0+

Change the root function has(element.hash) to “type(Element)”.

‘Has’ is a very broad function, and using it at the query root is asking for slowness: it does not take the ‘first’ parameter into consideration. There have been some recent improvements to the ‘has’ function, so upgrading is a good choice.
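For illustration, here is roughly how the query from the first post might look with that change. This is only a sketch: it assumes Dgraph 1.1.0 or later, and it assumes the nodes already carry a dgraph.type value of "Element" (the type name comes from the suggestion above; whether your data has it set is not something this thread confirms).

```
{
  # Assumption: every node of interest has <dgraph.type> "Element" (Dgraph 1.1+).
  page(func: type(Element), first: 10) @filter(NOT (has(sort.by.created.at) AND has(sort.by.element.type) AND has(sort.by.modified.at))) {
    uid
  }
}
```

The filter is unchanged from the original query; only the root function differs.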

Cheers.

selmeci · November 8, 2019, 4:54pm

Upgrading to 1.1.x is not a solution for now. We are in a production environment, and it is not an easy task to change the whole codebase and all the queries.

Can you please add this information about the has function and pagination to the documentation? The example there uses first: 5 without any warning about a possible performance issue (https://docs.dgraph.io/v1.0.17/query-language/#has)

Also, until now you have suggested using the has function as a way of giving nodes a type (https://docs.dgraph.io/v1.0.17/howto/#giving-nodes-a-type), again without any warning about this possible issue. @mrjn Are you considering improving the has function in v1.0.x as well?

MichelDiz · November 8, 2019, 5:34pm

Yeah, because that was the old way to give nodes a type (it was in the docs). Now that we have the Type System, we no longer recommend this approach; the Type System was created precisely to replace it.
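As a rough sketch of what moving from has-based typing to the Type System could involve: existing nodes need a dgraph.type value before type(Element) will match them. One possible way to tag them, assuming Dgraph 1.1.0 or later (where upsert blocks are available) and assuming "Element" as the type name, is something like:

```
upsert {
  query {
    # Assumption: every node with element.hash should become an "Element".
    v as var(func: has(element.hash))
  }
  mutation {
    set {
      uid(v) <dgraph.type> "Element" .
    }
  }
}
```

This still runs has once over all matching nodes, so with several hundred thousand of them it may be slow, and it may be better to do it in batches. Treat it as a one-time migration idea, not a tested recipe.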

Like I said, ‘has’ got some work recently. I can’t tell whether versions before 1.1.x received those changes.

I can’t add that to the docs because it is not official. What I’m describing is an impression from my personal experience. I don’t know for sure how ‘has’ works internally because I’m not a Go developer; I only have a high-level notion of what happens. So I can’t add anything without having the necessary knowledge.

What I imagine, logically, is that ‘has’ goes through the whole dataset until it has found every entity that has that predicate. Imagine if you have millions.

Since you can keep adding more entities with that predicate during usage, the cost of ‘has’ is never predictable: it must always go through the whole dataset, and only then does it apply pagination with the ‘first’ parameter. So it takes time.

This is all theoretical; I could be completely wrong.

It is worth noting that ‘has’ does not use indexes.

MichelDiz · November 8, 2019, 7:46pm

Check this example
https://github.com/dgraph-io/dgraph/pull/3970

selmeci · November 11, 2019, 7:55am

Good news

Is it going to be released in v1.0.18?

MichelDiz · November 11, 2019, 4:05pm

We have the green light, but we still need to organize a cherry-pick and a release for this.

https://github.com/dgraph-io/dgraph/pull/4264

Cheers.
