Optimize query variables based on requested data

This query hangs:

{
  v as var(func: has(name))
  result(func: uid(v), first: 2) {
    uid
  }
}

This one doesn’t (obviously):

{
  result(func: has(name), first: 2) {
    uid
  }
}

I guess the first query hangs because Dgraph first has to find all nodes that match the func (has(name)) in the var, and only then returns the first 2 results.

It would be great if Dgraph evaluated vars based on the blocks that return the value, so that in this case it would only have to find 2 nodes.
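To make the difference concrete, here is a toy Python model of the two evaluation strategies. It is only a sketch of the idea, not how Dgraph works internally: `has_name`, `NODE_COUNT` and the `visited` counter are made-up names, and the "nodes" are just integers.

```python
from itertools import islice

# Hypothetical toy data: 100,000 fake nodes, all of which have a "name".
NODE_COUNT = 100_000
visited = 0  # how many nodes the scan actually touched


def has_name():
    """Lazily yield uids of nodes that have a name, counting each visit."""
    global visited
    for uid in range(1, NODE_COUNT + 1):
        visited += 1
        yield uid


# Eager (what the var block seems to do today): build the whole set,
# then apply `first: 2` to it.
visited = 0
v = list(has_name())
eager_result = v[:2]
eager_visited = visited

# Lazy (the suggestion): let the consuming block's `first: 2` decide
# how far the scan has to go.
visited = 0
lazy_result = list(islice(has_name(), 2))
lazy_visited = visited

print(eager_result, eager_visited)
print(lazy_result, lazy_visited)
```

Both strategies return the same two uids, but the eager one touches every node while the lazy one stops after two.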

The example above is simplified, but there are real cases where this could be a problem.

For example, with pagination, first and offset/after are dynamic (maybe passed in as GraphQL vars). In that case we wouldn’t be able to share the same func across multiple query blocks. For example:

query test($f1: int, $f2: int, $o1: int, $o2: int) {
  matches as var(func: term(name, 'text'))
  q1(func: uid(matches), first: $f1, offset: $o1) {
    ...
  }
  q2(func: uid(matches), first: $f2, offset: $o2) {
    ...
  }
}

The above query will hang if too many nodes match the term ‘text’.

Right now we would have to write:

query test($f1: int, $f2: int, $o1: int, $o2: int) {
  q1(func: term(name, 'text'), first: $f1, offset: $o1) {
    ...
  }
  q2(func: term(name, 'text'), first: $f2, offset: $o2) {
    ...
  }
}

This is slower than it could be, since it has to find the nodes with term ‘text’ twice.
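What I have in mind for the shared case looks roughly like the toy Python sketch below. Again, this is an illustration under assumptions, not Dgraph's query planner: `term_matches` and `shared_pages` are hypothetical helpers, list indexes stand in for uids, and it assumes both blocks want the matches in the same order.

```python
from itertools import islice


def term_matches(names, term):
    """Lazily yield indexes (stand-ins for uids) of names containing `term`."""
    for uid, name in enumerate(names):
        if term in name.split():
            yield uid


def shared_pages(matches, pages):
    """Serve several (first, offset) windows from a single scan of `matches`.

    Only max(offset + first) matches are pulled from the scan; each window
    is then sliced out of that shared prefix.
    """
    needed = max(offset + first for first, offset in pages)
    prefix = list(islice(matches, needed))
    return [prefix[offset:offset + first] for first, offset in pages]


# Hypothetical data: every even-numbered node contains the term "text".
names = ["some text here" if i % 2 == 0 else "other" for i in range(1000)]

# q1: first 2, offset 0; q2: first 3, offset 4
q1, q2 = shared_pages(term_matches(names, "text"), [(2, 0), (3, 4)])
print(q1)
print(q2)
```

The term lookup runs once and stops after 7 matches (the largest offset + first), instead of running twice or collecting every match up front.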

The same applies if you have two queries with different filters. Right now reusing a func is a bad idea…

Hey @zura,

That’s a great suggestion.

@pawan are we working on anything similar to this? If not, shall we create a feature request?

This is an interesting idea @zura, and it can be worked on as part of improving query planning. We are not planning to work on this right now, but might take it up in the next quarter.
