Thoughts on adding nil values in response of query ordered by indexed predicate

Problem

As reported by @ahsan here, Dgraph returns inconsistent responses for sort queries. Consider the query below:

{
  data(func: has(age)) {
    t as uid
  }
  task(func: uid(t), orderasc: class) {
    uid
    name
  }
}

If class is an indexed predicate, the response can differ from the one returned when class is a non-indexed predicate.
When ordering by an indexed predicate, we call both sortWithIndex and sortWithoutIndex, and the response from whichever call finishes first is returned to the user.
sortWithIndex uses badger iteration to accumulate uids in sorted order. Uids that have a nil value for the orderby predicate are never encountered during badger iteration, so they are not included in the result. sortWithoutIndex takes the uids from uidMatrix, fetches their values for the orderby predicate from badger, and then sorts them. Uids that have a nil value for the orderby predicate are included in this response.
The difference in sorting mechanisms of sortWithIndex and sortWithoutIndex leads to inconsistent responses.

Solution approaches

  1. Remove sortWithIndex. Let sortWithoutIndex run in all cases.
  2. Add nil values manually to the result returned by sortWithIndex.

Explanation

The second approach makes no sense. With it, there is no way to find out which uids have nil values other than actually fetching the values for all the uids from badger. If we are going to fetch all the values from badger anyway, we might as well use sortWithoutIndex.

Any more thoughts on this? @pawan @mrjn @martinmr

I had raised this earlier in the expected behaviour of sort queries.

I guess it is easier to modify sortWithoutIndex to drop nil values. That way we keep the performance benefit we get from sortWithIndex and bring the two results in line with each other.

FWIW, SQL DBs do return nodes with nil values, though.
