Offset appears not to work with multiple order statements

Using this query:

bills(func: has(billsType), orderdesc: date, offset: 5, first: 5) {
  expand(_all_)
}

Everything works as expected (I get result 5 to 10). But when I have:

bills(func: has(billsType), orderdesc: date, orderdesc: time, offset: 5, first: 5) {
   expand(_all_)
}

I get the same results whether offset is set to 0 or to 5. Have I made a mistake with the syntax for orderdesc? Do I need to group the order statements somehow for offset to take both into account?

UPDATE REF: Offset fails with multiple order statements Ā· Issue #3366 Ā· dgraph-io/dgraph Ā· GitHub

One rule for sorting is to have any value indexed. If not so, Dgraph will ignore them. Add any dummy date to the other nodes.

1 Like

Hey Michel, thanks for replying!

Both ā€˜dateā€™ and ā€˜timeā€™ values are indexed. Offset works fine if I use only one or the other for ordering, but not when I use both. It really seems to be the combination of using both that causes the issue.

Whatā€™s more, the ordering is fine; the bills are ordered first by date, and then by time (as expected). It is ā€˜offsetā€™ that isnā€™t working. If I use offset: 0, first: 10 I do indeed get the first ten values. But if I change the value of offset nothing happens. That is unless I go over a certain value, which happens to correspond to the number of bills that have the same date.

This suggests to me that, though the results are ordered correctly, offset only discards the first N items of the results ordered only by the first orderdesc statement (the ā€˜dateā€™ statement).

The idea of ā€‹ā€‹pagination is simple. If you have 10 nodes and use ā€œoffset: 0, first: 5ā€ you will have two pages. Because the pagination will divide 10 nodes by 5. If you use ā€œoffset: 5*, first: 5ā€ you will be on the second page. And if you use ā€œoffset: 10*, first: 5ā€ will not return anything, for there are no more pages.

Remembering that the paging system is zero based. So the first page is offset: 0

Maybe I did not understand your issue, if my answer is not satisfactory please provide examples and steps to reproduce.

*Iā€™ve fixed some concepts in this response.

Hey there Michel!

I think you havenā€™t understood. First of all, what you say is inaccurate. Using ā€œoffset: 1, first: 5ā€ will show me results 2-6 as opposed to 1-5. To display the second page I would use ā€œoffset: 5, first: 5ā€.

The problem is that offset doesnā€™t work when using multiple order statements. To demonstrate this Iā€™ll give you a test mutation:

{
	"set": [
    {
      "date": "2019-04-30",
      "time": "1970-01-01T09:45:12",
      "num": 1
		},
    {
      "date": "2019-04-30",
      "time": "1970-01-01T12:02:39",
      "num": 2
    },
    {
      "date": "2019-04-30",
      "time": "1970-01-01T14:20:06",
      "num": 3
    },
    {
      "date": "2019-04-30",
      "time": "1970-01-01T19:29:57",
      "num": 4
    },
    {
      "date": "2019-05-01",
      "time": "1970-01-01T09:05:50",
      "num": 5
    },
    {
      "date": "2019-05-01",
      "time": "1970-01-01T11:24:51",
      "num": 6
    },
    {
      "date": "2019-05-01",
      "time": "1970-01-01T14:25:03",
      "num": 7
    },
    {
      "date": "2019-05-01",
      "time": "1970-01-01T16:18:40",
      "num": 8
    },
    {
      "date": "2019-05-01",
      "time": "1970-01-01T19:41:37",
      "num": 9
    },
    {
      "date": "2019-05-01",
      "time": "1970-01-01T21:30:16",
      "num": 10
    },
    {
      "date": "2019-05-01",
      "time": "1970-01-01T22:15:49",
      "num": 11
    }
  ]
}

Donā€™t forget to set the index for ā€œdateā€ and ā€œtimeā€ predicates to ā€œdatetimeā€.

So first of all letā€™s just order by ā€œnumā€ and see that offset is working fine:

query {
  fun(func: has(date), orderdesc: num, offset: 0, first: 5) {
    num
  }
}

Provides the result:

"data": {
    "fun": [
      {
        "num": 11
      },
      {
        "num": 10
      },
      {
        "num": 9
      },
      {
        "num": 8
      },
      {
        "num": 7
      }
    ]
  }

Now letā€™s use ā€œoffset: 2, first: 5ā€ which I in theory should give us, in descending order, nums 9-5, which indeed it does:

query {
  fun(func: has(date), orderdesc: num, offset: 2, first: 5) {
    num
  }
}
"data": {
    "fun": [
      {
        "num": 9
      },
      {
        "num": 8
      },
      {
        "num": 7
      },
      {
        "num": 6
      },
      {
        "num": 5
      }
    ]
  }

So far everything is looking great.

So now letā€™s start ordering by date and time. I have them in separate predicates for reasons of searching. Anyway, if I just use ā€œorderdesc: dateā€ we get chaos!

query {
  fun(func: has(date), orderdesc: date) {
    num
  }
}
"data": {
    "fun": [
      {
        "num": 10
      },
      {
        "num": 9
      },
      {
        "num": 11
      },
      {
        "num": 5
      },
      {
        "num": 6
      },
      {
        "num": 7
      },
      {
        "num": 8
      },
      {
        "num": 2
      },
      {
        "num": 3
      },
      {
        "num": 4
      },
      {
        "num": 1
      }
    ]
  }

But this of course makes sense - a lot of the nodes have the same date. We need to sort by time as well:

query {
  fun(func: has(date), orderdesc: date, orderdesc: time) {
    num
  }
}
"data": {
    "fun": [
      {
        "num": 11
      },
      {
        "num": 10
      },
      {
        "num": 9
      },
      {
        "num": 8
      },
      {
        "num": 7
      },
      {
        "num": 6
      },
      {
        "num": 5
      },
      {
        "num": 4
      },
      {
        "num": 3
      },
      {
        "num": 2
      },
      {
        "num": 1
      }
    ]
  }

Now thatā€™s what we expect. But now when I use offset, curiously enough itā€™s fine when I have ā€œoffset: 0, first: 5ā€ because we get the expected result:

query {
  fun(func: has(date), orderdesc: date, orderdesc: time, offset: 0, first: 5) {
    num
  }
}
"data": {
    "fun": [
      {
        "num": 11
      },
      {
        "num": 10
      },
      {
        "num": 9
      },
      {
        "num": 8
      },
      {
        "num": 7
      }
    ]
  }

But look what happens when I change offset:

query {
  fun(func: has(date), orderdesc: date, orderdesc: time, offset: 2, first: 5) {
    num
  }
}
"data": {
    "fun": [
      {
        "num": 11
      },
      {
        "num": 8
      },
      {
        "num": 7
      },
      {
        "num": 6
      },
      {
        "num": 5
      }
    ]
  }

Or if we try another value for offset:

query {
  fun(func: has(date), orderdesc: date, orderdesc: time, offset: 6, first: 5) {
    num
  }
}
"data": {
    "fun": [
      {
        "num": 8
      },
      {
        "num": 4
      },
      {
        "num": 3
      },
      {
        "num": 2
      },
      {
        "num": 1
      }
    ]
  }

It appears to be random! But if you play around with it, things become clear.

Remember earlier when I ordered only by date? Well the ā€œoffsetā€ value that Iā€™m using is discarding the first N results of that result, but then everything afterwards, including the value of ā€œfirstā€ and the ā€œorderdesc: timeā€ is working as expected.

When I used ā€œoffset: 2, first:5ā€ with ā€œorderdesc: date, orderdesc: timeā€ we jump straight from 11 to 8 because 9 and 10 have been discarded by offset. How do I know 9 and 10 are discarded? Well when I do ā€œorderdesc:dateā€ alone those are the first two values returned.

Now because of the nature of dgraph the specifics you get if you run this may be different, but the behaviour is consistent. ā€œOffsetā€ appears not to take into account order calls subsequent to the first.

Either that or Iā€™m doing something completely wrong. But then why do the two orderdesc statements work correctly in every other aspect apart from ā€œoffsetā€?

Sorry if that explanation was very longwinded! I just though an exhaustive example would make the problem as clear as possible.

Youā€™re right. I confused myself. In fact we have to do the division and use the multiples manually.
If you divide a chain of nodes by 5 (first: 5 of 25 nodes), you have to use 0, 5, 10, 15, 20 to page - so you have 5 pages. If you use a chain of nodes divided by first: 2 the pages will be 0, 2, 4, 6, 8, 10 so on.

{
  q(func: has(position),   orderasc: pos,  first: 2,   offset: 2){
    uid
    position
    pos
  }
}
{
  "set": [
      {
        "position": "First",
        "pos": "1"
      },
      {
        "position": "second",
        "pos": "2"
      },
      {
        "position": "third",
        "pos": "3"
      },
      {
        "position": "fourth",
        "pos": "4"
      },
      {
        "position": "fifth",
        "pos": "5"
      },
      {
        "position": "sixth",
        "pos": "6"
      },
      {
        "position": "seventh",
        "pos": "7"
      },
      {
        "position": "eighth",
        "pos": "8"
      },
      {
        "position": "ninth",
        "pos": "9"
      },
      {
        "position": "tenth",
        "pos": "10"
      }
    ]
}

Now back to your case, Iā€™ll make some tests. Need to check if the issues would be related to datetime only. I guess is the case. Cuz I see the ā€œnumā€: 9, ā€œnumā€: 10 and ā€œnumā€: 11 with perfect date and time alignment. It feels like a bug.

In fact detailed explanations are the best :stuck_out_tongue:

Hi Michel,

I donā€™t thing that it is related only with datetime. I have similar problem with string predicates.

Selmeci is right. Iā€™ve just run the same test using strings rather than datetime. You can try yourself, using this data:

{
	"set": [
    {
      "string1": "A",
      "string2": "A",
      "num": 1
		},
    {
      "string1": "A",
      "string2": "B",
      "num": 2
    },
    {
      "string1": "A",
      "string2": "C",
      "num": 3
    },
    {
      "string1": "A",
      "string2": "D",
      "num": 4
    },
    {
      "string1": "B",
      "string2": "A",
      "num": 5
    },
    {
      "string1": "B",
      "string2": "B",
      "num": 6
    },
    {
      "string1": "B",
      "string2": "C",
      "num": 7
    },
    {
      "string1": "B",
      "string2": "D",
      "num": 8
    },
    {
      "string1": "B",
      "string2": "E",
      "num": 9
    },
    {
      "string1": "B",
      "string2": "F",
      "num": 10
    },
    {
      "string1": "B",
      "string2": "G",
      "num": 11
    }
  ]
}

Now these two queries will both provide the results in the same order:

query {
  fun(func: has(num), orderdesc: num) {
    expand(_all_)
  }
}
query {
  fun(func: has(num), orderdesc: string1, orderdesc: string2) {
    expand(_all_)
  }
}

But if I use offset on the second query, I get the same incorrect results as I laid out in my earlier post. And once again, the errors correspond to ā€œoffsetā€ discarding the first N items of only the first order command, and not the combined order commands.

Iā€™m just surprised this hasnā€™t come up earlier. I canā€™t be the first person to try to paginate data with multiple order statements?

I guess if the strings have the right order will be okay tho. Your example confuses Dgraphā€™s sorting.

Page 1

{
  fun(func: has(num), orderasc: string1, orderasc: string2, 
  first: 5, offset: 0) {
    expand(_all_)
  }
}

Result

{
  "data": {
    "fun": [
      {
        "num": 1,
        "string1": "L",
        "string2": "A",
        "uid": "0x7"
      },
      {
        "num": 2,
        "string1": "M",
        "string2": "B",
        "uid": "0x8"
      },
      {
        "num": 3,
        "string1": "N",
        "string2": "C",
        "uid": "0x1"
      },
      {
        "num": 4,
        "string1": "O",
        "string2": "D",
        "uid": "0x2"
      },
      {
        "num": 5,
        "string1": "P",
        "string2": "E",
        "uid": "0x3"
      }
    ]
  }
}

Page 2

{
  fun(func: has(num), orderasc: string1, orderasc: string2, 
  first: 5, offset: 5) {
    expand(_all_)
  }
}

Result

{
  "data": {
    "fun": [
      {
        "num": 6,
        "string1": "Q",
        "string2": "F"
      },
      {
        "num": 7,
        "string1": "R",
        "string2": "G"
      },
      {
        "num": 8,
        "string1": "S",
        "string2": "H"
      },
      {
        "num": 9,
        "string1": "T",
        "string2": "I"
      },
      {
        "num": 10,
        "string1": "U",
        "string2": "J"
      }
    ]
  }
}

The data youā€™ve used makes the second order command redundant. The second order command only comes into play when the first order command returns an equal result. You would need several entries that have equal values for ā€œstring1ā€ for the second order command to be used.

If you use the data example I provided, you can see this working as intended. Everything working as intended, apart from ā€œoffsetā€.

I made like this to show how it behaves. I did a continuous ordering via two different predicates. That way Iā€™m pretty sure Dgraph is applying the two parameters correctly.

By my tests everything is normal. Dgraph first resolves the order of ā€œstring1ā€ and then ā€œstring2ā€.

BTW, you have to put in mind that this sorting are basically string (alphabet) ordering. The numbers (added to represent position) would not always correspond to the alphabet ordering when you have 2 or more referentials. And It often has repeated values in some predicates and some others do not. This confuses the ordering. It will work by ā€œweightā€ and will depends on the size of the ā€œpageā€.

The biggest issue here is to harmonize multiple (concurrent) ordering sources. So the result is computed in a way that takes in consideration both/all sources of order.

Take this example:

In the example below I added ā€œnumā€ as sorting main weight. That way in the pagination of larger numbers (multiples) you will have ā€œnumā€ as the pivot of the ordering. If you decrease the size of the pagination (from 5 to 3 or 2 in ā€œfirstā€ param) you will have a bigger weight dispute between the three parameters. So it gets odd (But it is expected, since the predicates have different or repeated characters).

using these parameters below you will have the correct order. Precisely because ā€œnumā€ was added as an important weight for sorting .
page 1 = ā€œoffset: 0, first: 5ā€
page 2 = ā€œoffset: 5, first: 5ā€
page 3 = ā€œoffset: 10, first: 5ā€

{
  fun(func: has(num), 
  orderdesc: num, 
  orderdesc: string1, 
  orderdesc: string2,
  offset: 5, first: 5
) {
    expand(_all_)
  }
}

If you still think we have a Bug or something. Please feel free to open a issue so a Core Dev can investigate deeply.

Ok. Iā€™m 95% sure this is a bug, and I guess I should open an issue with the Core Dev, but I still consider myself a dgraph noob and I want to convince you first, Michel, because youā€™ve been so helpful all along.

Maybe this makes more sense with real data that makes sense, so how about this? We have users where their first name and last name are stored separately. Imagine we have two users, one called ā€œMichel Beesonā€ and one called ā€œMichel Conradā€. Now in dgraph, if we were to order only by first name, thereā€™s no guarantee that Michel Beeson should appear before Michel Conrad. We would have to say ā€œfirst, order by first name, and if the first name is the same, then order by last nameā€. So hereā€™s a mutation that creates a bunch of users, who are either called ā€œMichelā€ or ā€œSusanā€, but they have different surnames.

I will also add a predicate called ā€œnaturalOrderā€ which corresponds to the order that you would expect (ascending).

{
	"set": [
    {
      "firstName": "Michel",
      "lastName": "Antwerp",
      "naturalOrder": 1
		},
     {
      "firstName": "Michel",
      "lastName": "Beeson",
      "naturalOrder": 2
		},
     {
      "firstName": "Michel",
      "lastName": "Conrad",
      "naturalOrder": 3
		},
     {
      "firstName": "Michel",
      "lastName": "Fittspatrick",
      "naturalOrder": 4
		},
     {
      "firstName": "Michel",
      "lastName": "Wallace",
      "naturalOrder": 5
		},
     {
      "firstName": "Michel",
      "lastName": "Zabala",
      "naturalOrder": 6
		},
    {
      "firstName": "Susan",
      "lastName": "Button",
      "naturalOrder": 7
		},
    {
      "firstName": "Susan",
      "lastName": "Dillan",
      "naturalOrder": 8
		},
    {
      "firstName": "Susan",
      "lastName": "Grande",
      "naturalOrder": 9
		},
    {
      "firstName": "Susan",
      "lastName": "Jackson",
      "naturalOrder": 10
		},
    {
      "firstName": "Susan",
      "lastName": "Lawrence",
      "naturalOrder": 11
		},
    {
      "firstName": "Susan",
      "lastName": "Palmer",
      "naturalOrder": 12
		},
    {
      "firstName": "Susan",
      "lastName": "Ralley",
      "naturalOrder": 13
		},
    {
      "firstName": "Susan",
      "lastName": "Trenton",
      "naturalOrder": 14
		}
    ]
}

Now Iā€™m going to do some simple queries without pagination, just to be extra annoying and exhaustive. If I just do:

query {
  users(func: has(firstName)) {
    expand(_all_)
  }
}

I get:

"data": {
    "users": [
      {
        "lastName": "Beeson",
        "naturalOrder": 2,
        "firstName": "Michel"
      },
      {
        "lastName": "Wallace",
        "naturalOrder": 5,
        "firstName": "Michel"
      },
      {
        "lastName": "Button",
        "naturalOrder": 7,
        "firstName": "Susan"
      },
      {
        "lastName": "Dillan",
        "naturalOrder": 8,
        "firstName": "Susan"
      },
      {
        "lastName": "Conrad",
        "naturalOrder": 3,
        "firstName": "Michel"
      },
      {
        "lastName": "Jackson",
        "naturalOrder": 10,
        "firstName": "Susan"
      },
      {
        "lastName": "Trenton",
        "naturalOrder": 14,
        "firstName": "Susan"
      },
      {
        "lastName": "Zabala",
        "naturalOrder": 6,
        "firstName": "Michel"
      },
      {
        "lastName": "Grande",
        "naturalOrder": 9,
        "firstName": "Susan"
      },
      {
        "lastName": "Lawrence",
        "naturalOrder": 11,
        "firstName": "Susan"
      },
      {
        "lastName": "Antwerp",
        "naturalOrder": 1,
        "firstName": "Michel"
      },
      {
        "lastName": "Fittspatrick",
        "naturalOrder": 4,
        "firstName": "Michel"
      },
      {
        "lastName": "Palmer",
        "naturalOrder": 12,
        "firstName": "Susan"
      },
      {
        "lastName": "Ralley",
        "naturalOrder": 13,
        "firstName": "Susan"
      }
    ]
  }

Totally random and as it should be. So lets sort by firstName descending:

query {
  users(func: has(firstName), orderdesc: firstName) {
    expand(_all_)
  }
}

The results are:

"data": {
    "users": [
      {
        "lastName": "Grande",
        "naturalOrder": 9,
        "firstName": "Susan",
        "uid": "0x2724"
      },
      {
        "lastName": "Palmer",
        "naturalOrder": 12,
        "firstName": "Susan",
        "uid": "0x2728"
      },
      {
        "lastName": "Button",
        "naturalOrder": 7,
        "firstName": "Susan",
        "uid": "0x271e"
      },
      {
        "lastName": "Dillan",
        "naturalOrder": 8,
        "firstName": "Susan",
        "uid": "0x271f"
      },
      {
        "lastName": "Lawrence",
        "naturalOrder": 11,
        "firstName": "Susan",
        "uid": "0x2725"
      },
      {
        "lastName": "Jackson",
        "naturalOrder": 10,
        "firstName": "Susan",
        "uid": "0x2721"
      },
      {
        "lastName": "Trenton",
        "naturalOrder": 14,
        "firstName": "Susan",
        "uid": "0x2722"
      },
      {
        "lastName": "Ralley",
        "naturalOrder": 13,
        "firstName": "Susan",
        "uid": "0x2729"
      },
      {
        "lastName": "Zabala",
        "naturalOrder": 6,
        "firstName": "Michel",
        "uid": "0x2723"
      },
      {
        "lastName": "Conrad",
        "naturalOrder": 3,
        "firstName": "Michel",
        "uid": "0x2720"
      },
      {
        "lastName": "Antwerp",
        "naturalOrder": 1,
        "firstName": "Michel",
        "uid": "0x2726"
      },
      {
        "lastName": "Fittspatrick",
        "naturalOrder": 4,
        "firstName": "Michel",
        "uid": "0x2727"
      },
      {
        "lastName": "Wallace",
        "naturalOrder": 5,
        "firstName": "Michel",
        "uid": "0x271d"
      },
      {
        "lastName": "Beeson",
        "naturalOrder": 2,
        "firstName": "Michel",
        "uid": "0x271c"
      }
    ]
  }

Well we get the ā€œSusansā€ before the ā€œMichelsā€, but when it comes to their last names, the order is random. Thatā€™s not what we want. When the first names are the same, we want to sort by last name, so we use the following query:

query {
  users(func: has(firstName), orderdesc: firstName, orderdesc: lastName) {
    expand(_all_)
  }
}

And we get the result:

"data": {
    "users": [
      {
        "naturalOrder": 14,
        "firstName": "Susan",
        "lastName": "Trenton"
      },
      {
        "naturalOrder": 13,
        "firstName": "Susan",
        "lastName": "Ralley"
      },
      {
        "naturalOrder": 12,
        "firstName": "Susan",
        "lastName": "Palmer"
      },
      {
        "naturalOrder": 11,
        "firstName": "Susan",
        "lastName": "Lawrence"
      },
      {
        "naturalOrder": 10,
        "firstName": "Susan",
        "lastName": "Jackson"
      },
      {
        "naturalOrder": 9,
        "firstName": "Susan",
        "lastName": "Grande"
      },
      {
        "naturalOrder": 8,
        "firstName": "Susan",
        "lastName": "Dillan"
      },
      {
        "naturalOrder": 7,
        "firstName": "Susan",
        "lastName": "Button"
      },
      {
        "naturalOrder": 6,
        "firstName": "Michel",
        "lastName": "Zabala"
      },
      {
        "naturalOrder": 5,
        "firstName": "Michel",
        "lastName": "Wallace"
      },
      {
        "naturalOrder": 4,
        "firstName": "Michel",
        "lastName": "Fittspatrick"
      },
      {
        "naturalOrder": 3,
        "firstName": "Michel",
        "lastName": "Conrad"
      },
      {
        "naturalOrder": 2,
        "firstName": "Michel",
        "lastName": "Beeson"
      },
      {
        "naturalOrder": 1,
        "firstName": "Michel",
        "lastName": "Antwerp"
      }
    ]
  }

Aha! Perfect ordering! This is exactly what I want. I want to emphasize that the ordering is not confused. Dgraph has it exactly right. The bug that Iā€™m talking about is only an issue when it comes to pagination. So letā€™s try some pagination now. I wonā€™t bother with ā€œoffset: 0ā€ because then there is no problem. Letā€™s try this:

query {
  users(func: has(firstName), orderdesc: firstName, orderdesc: lastName, offset: 3, first: 8) {
    expand(_all_)
  }
}

The results are:

 "data": {
    "users": [
      {
        "lastName": "Trenton",
        "naturalOrder": 14,
        "firstName": "Susan"
      },
      {
        "lastName": "Ralley",
        "naturalOrder": 13,
        "firstName": "Susan"
      },
      {
        "lastName": "Lawrence",
        "naturalOrder": 11,
        "firstName": "Susan"
      },
      {
        "lastName": "Jackson",
        "naturalOrder": 10,
        "firstName": "Susan"
      },
      {
        "lastName": "Dillan",
        "naturalOrder": 8,
        "firstName": "Susan"
      },
      {
        "lastName": "Zabala",
        "naturalOrder": 6,
        "firstName": "Michel"
      },
      {
        "lastName": "Wallace",
        "naturalOrder": 5,
        "firstName": "Michel"
      },
      {
        "lastName": "Fittspatrick",
        "naturalOrder": 4,
        "firstName": "Michel"
      }
    ]
  }

What happened?! Exactly what I pointed out in earlier posts. Offset is discarding the first 3 results of ā€œorderdesc: firstNameā€ and ignoring the second order command. This is not an ordering issue; when I perform the query without pagination the order is correct. The problem is with offset.

Please try this with the dataset I have provided. Youā€™ll see that the issue is real. Tomorrow Iā€™ll happily open this as an issue with the Core Dev, just as soon as I figure out how to do that. But Michel, youā€™ve been a real super hero for me, so I feel itā€™s only right that I get you on my side.

1 Like

I can confirm that the problem is related to offset. I have a similar problem on my data sample.

Iā€™ve filed a bug report of the dgraph github page. Fingers crossed it works! Iā€™m still willing to believe that thereā€™s a mistake in my syntax

Any chance of some comment on this? Is it indeed a bug? If not, could you tell me how to paginate data ordered in the way I outline in my last example (firstName, lastName)?

Iā€™ll investigate carefully what you wrote.

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.