Moved from GitHub dgraph/4160
Posted by campoy:
What version of Dgraph are you using?
master
Have you tried reproducing the issue with the latest release?
yes
What is the hardware spec (RAM, OS)?
n/a
Steps to reproduce the issue (command/config used to run Dgraph).
Given the dataset generated by this mutation:
{
  set {
    _:a <name> "Anne" .
    _:b <name> "Brian" .
    _:jp <name> "Jurassic Park" .
    _:ij <name> "Indiana Jones" .
    _:a <rated> _:jp (rating=5) .
    _:a <rated> _:ij (rating=2) .
    _:b <rated> _:ij (rating=2) .
  }
}
If you run the following request:
{
  q(func: has(rated)) {
    name
    rated @facets(r as rating)
    partial_sum: sum(val(r))
  }
  sum() {
    total_sum: sum(val(r))
  }
}
Expected behaviour and actual result.
I’d expect partial_sum to be 7 for Anne and 2 for Brian, and total_sum to be 9.
Instead, the result is as follows:
{
  "data": {
    "q": [
      {
        "name": "Anne",
        "rated": [
          {
            "rated|rating": 5
          },
          {
            "rated|rating": 2
          }
        ],
        "partial_sum": 9
      },
      {
        "name": "Brian",
        "rated": [
          {
            "rated|rating": 2
          }
        ],
        "partial_sum": 4
      }
    ],
    "sum": [
      {
        "total_sum": 9
      }
    ]
  }
}
I have a theory about why we’re getting these weird numbers.
Variables attach values to a single UID. In this case that’s not the right behavior: the rating belongs neither to the person’s UID nor to the movie’s UID, but to the combination of both, linked by the predicate.
You can see the artifact by reading the variable’s value on every node:
{
  var(func: has(rated)) {
    rated @facets(r as rating)
  }
  sum(func: has(name)) {
    name
    val(r)
  }
}
returns
{
"data": {
"sum": [
{
"name": "Jurassic Park",
"val(r)": 5
},
{
"name": "Indiana Jones",
"val(r)": 4
},
{
"name": "Anne"
},
{
"name": "Brian"
}
]
}
}
This shows that the variable r has been attached to the movie UIDs, and that its value for each movie is the sum of the rating facets on all edges pointing to that movie.
Once we understand this, the numbers make sense. Anne’s partial_sum is 9 instead of 7 because it adds the per-movie totals for the two movies she rated: 5 for Jurassic Park plus 4 for Indiana Jones. Likewise, Brian’s partial_sum is 4 instead of 2 because it is the per-movie total for Indiana Jones.
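The arithmetic can be reproduced outside Dgraph with a short Python sketch. This is only a model of the behavior theorized here, not Dgraph’s actual implementation; the names edges, var_by_uid and partial_sum are made up for illustration.

```python
from collections import defaultdict

# (source, target, rating) triples taken from the mutation in this report.
edges = [
    ("Anne", "Jurassic Park", 5),
    ("Anne", "Indiana Jones", 2),
    ("Brian", "Indiana Jones", 2),
]

# Assumed model: the facet variable is keyed by the target uid alone,
# so ratings on different edges pointing at the same movie are added up.
var_by_uid = defaultdict(int)
for _src, dst, rating in edges:
    var_by_uid[dst] += rating


def partial_sum(person):
    """Sum the uid-keyed variable over the movies rated by `person`."""
    return sum(var_by_uid[dst] for src, dst, _rating in edges if src == person)


print(dict(var_by_uid))      # {'Jurassic Park': 5, 'Indiana Jones': 4}
print(partial_sum("Anne"))   # 9
print(partial_sum("Brian"))  # 4
```

Under this model the per-movie values match the val(r) output and the per-person sums match the observed partial_sum values.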
Fixing this might be complicated, as it might imply making facet variables work as a map from <uid, uid> pairs to values rather than from a single uid to a value.
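To make that idea concrete, here is a minimal Python sketch of a variable keyed by the (source uid, target uid) pair. It is a hypothetical model rather than Dgraph code; var_by_edge and partial_sum_by_edge are illustrative names only.

```python
# (source, target, rating) triples taken from the mutation in this report.
edges = [
    ("Anne", "Jurassic Park", 5),
    ("Anne", "Indiana Jones", 2),
    ("Brian", "Indiana Jones", 2),
]

# Hypothetical fix: key the variable by the whole edge, so each rating
# stays tied to the person who gave it.
var_by_edge = {(src, dst): rating for src, dst, rating in edges}


def partial_sum_by_edge(person):
    """Sum only the ratings on edges that start at `person`."""
    return sum(v for (src, _dst), v in var_by_edge.items() if src == person)


total_sum = sum(var_by_edge.values())

print(partial_sum_by_edge("Anne"))   # 7
print(partial_sum_by_edge("Brian"))  # 2
print(total_sum)                     # 9
```

With edge-keyed values, the per-person sums come out as 7 and 2, and the total is still 9, which is the expected behaviour described in this report.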