pjebs commented :
My current workaround is: lemma-chain/find_chain.go at master · thehonestscoop/lemma-chain · GitHub
Is this not good enough?
pjebs commented :
My current workaround is: lemma-chain/find_chain.go at master · thehonestscoop/lemma-chain · GitHub
Is this not good enough?
pawanrawal commented :
Yeah, you can use MaxUint64 value but should use a lower value if you have an idea of the max expected depth. The problem with using MaxUint64 is that if your data does have loops, you’ll end up getting a lot of results (capped by query_edge_limit
flag of alpha) which you don’t care about.
campoy commented :
Ok, so I do not think this is a bug at all. Let me explain.
When setting loop: false
we’re asking Dgraph to stop expanding after a node that has already been seen is found.
This allows us to find loops by giving us the element that we already found. This repeated element is expanded, but no further elements will be expanded.
Let’s see an example!
Given this dataset:
{
set {
_:a <name> "a" .
_:b <name> "b" .
_:c <name> "c" .
_:d <name> "d" .
_:z <name> "z" .
_:a <id> "1"^^<xs:int> .
_:b <id> "2"^^<xs:int> .
_:c <id> "3"^^<xs:int> .
_:d <id> "4"^^<xs:int> .
_:a <knows> _:b .
_:b <knows> _:c .
_:c <knows> _:a .
_:c <knows> _:d .
_:a <type> _:z .
_:b <type> _:z .
_:c <type> _:z .
_:d <type> _:z .
}
}
This draws a graph as follows:
As you can see, a, b, and c form a loop. Also a, b, c, and d point to z with their “type” edges.
Let’s recurse starting from a
using name, knows, id, and type with the following query:
note: the name
predicate has an exact
predicate <name>: string @index(exact) .
{
q(func: eq(name, "a")) @recurse(loop: true, depth: 5) {
name
id
knows
type
}
}
The result is:
{
"data": {
"q": [
{
"name": "a",
"id": 1,
"knows": [
{
"name": "b",
"id": 2,
"knows": [
{
"name": "c",
"id": 3,
"knows": [
{
"name": "a",
"id": 1,
"knows": [
{
"name": "b",
"id": 2
}
],
"type": [
{
"name": "z"
}
]
},
{
"name": "d",
"id": 4,
"type": [
{
"name": "z"
}
]
}
],
"type": [
{
"name": "z"
}
]
}
],
"type": [
{
"name": "z"
}
]
}
],
"type": [
{
"name": "z"
}
]
}
]
},
"extensions": { }
}
This contains two paths a - b - c - a - b
and a - b - c - d
, and for each of these nodes all the predicates (name, id, knows, type) are expanded.
Notice that while the ending node d
still has likes
expanded to the name z
.
But the end of the path with the loop b
does not expand type
but it does expand name
and id
. Why is this?
The behavior is actually the same for name
and type
, but while expanding name
gives a value predicate of type string, type
goes to another uid
for which we don’t expand anything anymore since we’ve reached the maximum recursion depth. Since the expansion doesn’t occur, the object inside of type
(and knows
) is empty and therefore the predicate type
is dropped completely.
A very similar behavior occurs when setting loop: false
as the last node to be expanded is the one that was already included in the traversal. In this case, that’s a
.
{
q(func: eq(name, "a")) @recurse(loop: false) {
name
id
knows
type
}
}
This will return:
{
"data": {
"q": [
{
"name": "a",
"id": 1,
"knows": [
{
"name": "b",
"id": 2,
"knows": [
{
"name": "c",
"id": 3,
"knows": [
{
"name": "a",
"id": 1
},
{
"name": "d",
"id": 4,
"type": [
{
"name": "z"
}
]
}
],
"type": [
{
"name": "z"
}
]
}
],
"type": [
{
"name": "z"
}
]
}
],
"type": [
{
"name": "z"
}
]
}
]
},
"extensions": { }
}
Here you can see that while d
is expanded and you can see the name
for the node pointed by type
, a
is expanded but the nodes pointed by it via knows
and type
will not be expanded, generating an empty object, which will be removed as well as the predicates knows
and type
themselves. This is not different to the behavior with b
when the maximum recursion depth in the case above.
The behavior above matches what you see in your example. The first time you see the node is the first element in the loop and we expand it until we find it again, in which case we stop the expansion at that node (value predicates will appear, but object predicates will be expanded but empty therefore dropped).
I hope this helps understand the behavior.
I do no think this is a bug, although it might be useful to improve the documentation regarding the behavior of @recurse
when loop: false
is set and a loop is found or the maximum recursion depth is reached.
pjebs commented :
What you are saying is correct. What I suggested in recursive query not expanding · Issue #3634 · dgraph-io/dgraph · GitHub is that it may very well be working as intended - with the obvious improvement being to clarify the behaviour in documentation.
I am saying that the intended bahviour is not consistent with what a user would expect from such a query such as:
q = `
query withvar($hashid: string) {
chain(func: eq(node.hashid, $hashid)) @recurse(loop:false) {
uid
node.owner
user.name
node.hashid
node.xdata
node.parent @facets @facets(eq(facet, "required") or eq(facet, "recommended"))
}
}
`
I expect each result to satisfy the “footprint” as stated in the query of uid
, node.owner
, user.name
, node.hashid
, node.xdata
and most importantly node.parent
if available.
Currently if the parent node has been visited, it is not displaying it. It should return that value. Currently I am doing it client side, when it can be very easily performed in server-side. @pawanrawal can probably explain it better than I.
Further to this, if the node has been visited before, then node.parent
is not returned. I then have no ability to know if that node has a parent. This is obviously an error.
campoy commented :
I wonder if it would make sense for you to perform this query in a slightly different way then.
You could have the recursion to identify what are the nodes involved, and a different query in the same block to obtain the data related to each.
So following the dataset I presented above, instead of the query right below for which you’d be missing the type information on nodes already visited:
{
q(func: eq(name, "a")) @recurse(loop: false) {
name
id
knows
type
}
}
You could fetch the data in two queries as such:
{
tree(func: eq(name, "a")) @recurse(loop: false) {
nodes as uid
knows
}
nodes(func: uid(nodes)) {
name
id
type {
name
}
}
}
This would return both the structure of your tree and the information attached to each node (only once).
The result would be a smaller response, probably.
{
"data": {
"tree": [
{
"uid": "0x3",
"knows": [
{
"uid": "0x4",
"knows": [
{
"uid": "0x5",
"knows": [
{
"uid": "0x1"
},
{
"uid": "0x3"
}
]
}
]
}
]
}
],
"nodes": [
{
"name": "d",
"id": 4,
"type": [
{
"name": "z"
}
]
},
{
"name": "a",
"id": 1,
"type": [
{
"name": "z"
}
]
},
{
"name": "b",
"id": 2,
"type": [
{
"name": "z"
}
]
},
{
"name": "c",
"id": 3,
"type": [
{
"name": "z"
}
]
}
]
},
"extensions": { ... }
}
We will update the documentation, but first I want to make sure this solution meets your needs.
campoy commented :
After discussing this issue with @pawanrawal we’ve reached the conclusion that there’s a possible enhancement to be done here which would target your concerns.
I’m going to keep this issue open but change it to be an enhancement, as the current behavior matches the specs. That said, we’ll try to implement this in future releases.
The solution would be to stop recursion once a node has been seen in the current path. This is different to the current behavior where a node is not expanded once it has been seen at any point in the traversal.
Let’s see this with one example:
{
set {
_:a <name> "a" .
_:b <name> "b" .
_:c <name> "c" .
_:d <name> "d" .
_:e <name> "e" .
_:f <name> "f" .
_:z <name> "z" .
_:f <knows> _:a .
_:f <knows> _:d .
_:d <knows> _:e .
_:e <knows> _:a .
_:a <knows> _:b .
_:a <knows> _:e .
_:b <knows> _:c .
_:a <type> _:z .
_:b <type> _:z .
_:c <type> _:z .
_:d <type> _:z .
_:e <type> _:z .
_:f <type> _:z .
}
}
This would generate a graph like this:
If you got to start a recursive query from f
:
{
q(func: eq(name, "f")) @recurse {
name
knows
type
}
}
Currently, you would get something like this:
{
"data": {
"q": [
{
"knows": [
{
"knows": [
{
"knows": [
{
"name": "c",
"type": [
{
"name": "z"
}
]
}
],
"name": "b",
"type": [
{
"name": "z"
}
]
}
],
"name": "a",
"type": [
{
"name": "z"
}
]
},
{
"knows": [
{
"knows": [
{
"name": "a"
}
],
"name": "e"
}
],
"name": "d",
"type": [
{
"name": "z"
}
]
}
],
"name": "f",
"type": [
{
"name": "z"
}
]
}
]
}
}
Note how once we reached a
in the path starting with f -> d
, the recursion stops.
We propose enhancing the loop detection to keep on going as a
has not been seen in that path yet, so the output would look like the following:
{
"data": {
"q": [
{
"knows": [
{
"knows": [
{
"knows": [
{
"name": "c",
"type": [
{
"name": "z"
}
]
}
],
"name": "b",
"type": [
{
"name": "z"
}
]
}
],
"name": "a",
"type": [
{
"name": "z"
}
]
},
{
"knows": [
{
"knows": [
{
"knows": [
{
"knows": [
{
"name": "c",
"type": [
{
"name": "z"
}
]
}
],
"name": "b",
"type": [
{
"name": "z"
}
]
}
],
"name": "a",
"type": [
{
"name": "z"
}
]
}
],
"name": "e"
}
],
"name": "d",
"type": [
{
"name": "z"
}
]
}
],
"name": "f",
"type": [
{
"name": "z"
}
]
}
]
}
}
I think this would address your concerns, do you agree? @pjebs @power-f-GOD
Thanks for the report and for helping us understand your needs!
pjebs commented :
I’ll have a look at it when I get back to this project. I still think that the queries return “footprint” should always be returned when possible.