How to improve query response time?


#1

Hi,

I have a simple dataset: nodes with the element.hash predicate (765,637 of them).
I would like to find the nodes that are missing any of the predicates sort.by.created.at, sort.by.element.type or sort.by.modified.at, and get the first 10 of them.

My query is:

{
  page(func: has(element.hash), first: 10) @filter(NOT (has(sort.by.created.at) AND has(sort.by.element.type) AND has(sort.by.modified.at))) {
    uid
  }
}

The response time is always about 12 to 14 seconds.
I’m running Dgraph v1.0.17 on a 12-core CPU with 48 GB RAM and a 960 SSD.

Thank you for any suggestions.


(Michel Conrado) #2

Upgrade to 1.1.0+

Change the root function has(element.hash) to “type(Element)”.

‘has’ is a very broad function, and using it at the query root is asking for slowness: it does not take the ‘first’ parameter into consideration. There have been some recent improvements to the ‘has’ function, so upgrading is a good choice.
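
As a rough sketch of that change, the original query could be rooted on the type instead. This assumes Dgraph v1.1.0+ and that every node carrying element.hash has also been given the type Element through dgraph.type; that setup is an assumption, not something confirmed in this thread.

```
{
  # Assumes v1.1.0+ and that the nodes carry dgraph.type "Element".
  # Only the root function changes; the filter is the same as in the original query.
  page(func: type(Element), first: 10) @filter(NOT (has(sort.by.created.at) AND has(sort.by.element.type) AND has(sort.by.modified.at))) {
    uid
  }
}
```

Whether this is actually faster on a given dataset is worth measuring, since the filter still has to be evaluated.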

Cheers.


#3

Upgrading to 1.1.x is not a solution for now. We are in a production environment, and it is not an easy task to change the whole codebase and all the queries.

Can you please add this information about the has function and pagination to the documentation? In the example there, first: 5 is used without any warning about a possible performance issue (https://docs.dgraph.io/v1.0.17/query-language/#has).

Also, until now you have suggested using the has function as the way to give nodes a type (https://docs.dgraph.io/v1.0.17/howto/#giving-nodes-a-type), again without any warning about this possible issue. @mrjn, are you considering improving the has function in v1.0.x as well?


(Michel Conrado) #4

Yeah, cuz that was the old way to give nodes a type (it was in the docs). Now that we have the Type System, we no longer recommend this approach; the Type System was created precisely to eliminate it.
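
For illustration, with the Type System in v1.1.0+ a node gets its type through the reserved dgraph.type predicate rather than being recognized via has(). A minimal sketch follows; the blank-node name and the hash value are made up, and treating element.hash as a string is an assumption.

```
{
  set {
    # Hypothetical node: the blank-node name and hash value are placeholders.
    _:element <element.hash> "example-hash" .
    # The type assignment that type(Element) matches on (v1.1.0+).
    _:element <dgraph.type> "Element" .
  }
}
```

Existing nodes would need the same dgraph.type triple added against their uids before type(Element) returns them.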

Like I said, ‘has’ got some work on it recently. Right now I can’t tell whether versions before 1.1.x received those changes.

I cannot add that to the docs because it is not official. What I am describing is an impression from my personal experience. I don’t know for sure how ‘has’ works internally, because I’m not a Golang developer; I only have a high-level notion of what happens. So I cannot add anything without having the necessary knowledge.

What I imagine, logically, is that ‘has’ goes through the whole dataset until it has found every entity that has that predicate. Imagine if you have millions.

Since you can keep adding entities with that predicate during usage, ‘has’ can never be predictable: it must always go through the whole dataset, and only then does it apply pagination with the ‘first’ parameter. So it takes time.

This is all theoretical; I could be completely wrong.

It is worth noting that ‘has’ does not use an index.


(Michel Conrado) #5

Check this example


#6

Good news :)

Is it going to be released in v1.0.18?


(Michel Conrado) #7

We have the green light, but we still need to organize a cherry-pick and a release for this.

Cheers.