Hey there. I came across this in the docs regarding sorting by facet:
https://dgraph.io/docs/query-language/#sorting-using-facets
Sorting is possible for a facet on a uid edge.
I was hoping someone could explain why sorting by facet is not supported for scalar predicates (for example, whether it is hard to implement) and, if there is a reason behind it, what the best way to work around it would be. I have found a few similar questions in the past (e.g. [Question] How to store a list of values? · Issue #1034 · dgraph-io/dgraph · GitHub), but I’m still unclear as to whether this is intentionally unsupported or might be supported in the future.
To give an example: if I have a schema
lines [string] .
type Haiku {
lines
}
A haiku has 3 lines, and the order is important. I would therefore expect to be able to insert a Haiku using the triples
_:haiku <lines> "I would like to be" (index=0) .
_:haiku <lines> "able to sort by facets" (index=1) .
_:haiku <lines> "even on scalars" (index=2) .
And then when I perform a query
haiku {
lines @facets(orderasc: index)
}
I would expect to get the lines in order. However, the current behavior is to return something like
"haiku": {
"lines": ["able to sort by facets", "I would like to be", "even on scalars"],
"lines|index": {
"0": 1,
"1": 0,
"2": 2
}
}
Sorting could be done on the client side by matching the lines|index object against the lines array, but it feels messy to have to implement this every time we run into this case. If you treat the response as unstructured JSON, the error handling leads to very verbose code and a lot of effort for not much gain. (For reference, I am experimenting with writing a GraphQL API in Rust/Juniper that wraps the Dgraph GraphQL± API while providing input validation, business logic, etc.)
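To make the client-side matching concrete, here is a minimal Python sketch. It assumes the response has exactly the shape shown above, where each key of lines|index is the string position of an entry in lines. The helper name sort_by_facet is hypothetical and not part of any Dgraph client library.

```python
def sort_by_facet(values, facet_map):
    """Order `values` by the facet attached to each position.

    `facet_map` is assumed to map the string position of each element
    in `values` (e.g. "0") to that element's facet value.
    """
    keyed = []
    for pos, value in enumerate(values):
        key = facet_map.get(str(pos))
        if key is None:
            # A value without a facet cannot be placed; fail loudly.
            raise KeyError(f"no facet for element at position {pos}")
        keyed.append((key, value))
    # Sort on the facet only, so ties keep their response order.
    keyed.sort(key=lambda pair: pair[0])
    return [value for _, value in keyed]


# Response fragment in the assumed shape.
response = {
    "haiku": {
        "lines": [
            "able to sort by facets",
            "I would like to be",
            "even on scalars",
        ],
        "lines|index": {"0": 1, "1": 0, "2": 2},
    }
}

haiku = response["haiku"]
ordered = sort_by_facet(haiku["lines"], haiku["lines|index"])
print(ordered)
```

Even in this small form the helper needs an explicit failure path for a missing facet, which is the kind of error handling that becomes verbose when repeated for every list-valued predicate.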
In the other cases I have seen (such as the fruits example in the GitHub issue linked above), re-examining the structure of the data proved useful. In that case, the facet made more sense as a predicate on a node rather than a facet on an edge, as the size of the fruit is a property of the fruit rather than of the relationship. However, in the case of an ordered list (such as lines in a haiku), it feels wasteful to make a node for each line just to hold an index predicate. Alternatively, we might make three string predicates on the Haiku type: line1, line2 and line3. Again, this feels like an unnecessary workaround, and it would not work for an example where the size of the list is not constant. Another suggested approach was to serialize the list as a single value. This works for the Haiku example, but it reduces flexibility (e.g. searching for a haiku with the term “X” in line 1 only) and is a poor workaround for other hypothetical cases where you might also want to append to the list or perform an aggregate operation (e.g. sum) on a sequence of numbers.
To summarize: intuitively, I would expect sorting by facets to work on scalar predicates as well as on uid edges, and I would like to better understand the roadblocks that make it impossible. While workarounds exist, each comes with disadvantages that make it unattractive, so I would also like to understand what the recommended approach is for storing and retrieving sequential lists of scalar values.