Multiple recursion happening on same node with @recurse(loop: False)

Good morning Dgraph team! I think I have found a pretty significant bug in recursion behavior.

What version of Dgraph are you using?

Dgraph Version
$ dgraph version

Dgraph version   : v21.12.0
Dgraph codename  : zion
Dgraph SHA-256   : 078c75df9fa1057447c8c8afc10ea57cb0a29dfb22f9e61d8c334882b4b4eb37
Commit SHA-1     : d62ed5f15
Commit timestamp : 2021-12-02 21:20:09 +0530
Branch           : HEAD
Go version       : go1.17.3
jemalloc enabled : true

For Dgraph official documentation, visit https://dgraph.io/docs.
For discussions about Dgraph     , visit http://discuss.dgraph.io.
For fully-managed Dgraph Cloud   , visit https://dgraph.io/cloud.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2021 Dgraph Labs, Inc.

Have you tried reproducing the issue with the latest release?

Yes! As of posting I used the :latest docker image.

What is the hardware spec (RAM, OS)?

“Intel(R) Core™ i9-10885H CPU @ 2.40GHz”
Ubuntu 20.04

Steps to reproduce the issue (command/config used to run Dgraph).

Run Dgraph: docker run --rm -it -p 8000:8000 -p 8080:8080 -p 9080:9080 dgraph/standalone:latest

You should be able to just run this:

import pydgraph
import json

def make_client(address):
    client_stub = pydgraph.DgraphClientStub(
        address,
    )
    return pydgraph.DgraphClient(client_stub)

client = make_client("localhost:9080")

def write_database(nquads):
    clear_op = pydgraph.Operation(drop_all=True)
    client.alter(clear_op)

    schema = """
    <targets>: [uid] .
    <name>: string @index(exact) .
    """
    schema_op = pydgraph.Operation(schema=schema)
    client.alter(schema_op)

    create_txn = client.txn()
    try:
        create_txn.mutate(set_nquads=nquads, commit_now=True)
    finally:
        create_txn.discard()

get_alpha_query = """
{
  node(func: eq(name, "Alpha")) @recurse(loop: False) {
    name,
    targets,
  }
}
"""

def get_alpha():
    result = None
    get_alpha_txn = client.txn()
    try:
        result = json.loads(get_alpha_txn.query(get_alpha_query).json)
    finally:
        get_alpha_txn.discard()
    return result

ok_nquads = """
_:a <name> "Alpha" .
_:b <name> "Beta" .
_:c <name> "Gamma" .
_:d <name> "Delta" .
_:e <name> "Epsilon" .
_:t <name> "Theta" .

_:a <targets> _:b .
_:a <targets> _:c .
_:b <targets> _:d .
_:c <targets> _:t .
_:t <targets> _:d .
_:d <targets> _:e .
"""

not_ok_nquads = """
_:a <name> "Alpha" .
_:b <name> "Beta" .
_:c <name> "Gamma" .
_:d <name> "Delta" .
_:e <name> "Epsilon" .

_:a <targets> _:b .
_:a <targets> _:c .
_:b <targets> _:d .
_:c <targets> _:d .
_:d <targets> _:e .
"""

dramatic_nquads = """
_:a <name> "Alpha" .
_:b <name> "Beta" .
_:c <name> "Gamma" .
_:d <name> "Delta" .
_:e <name> "Epsilon" .
_:f <name> "Zeta" .
_:g <name> "Eta" .
_:h <name> "Theta" .
_:i <name> "Iota" .
_:x <name> "X" .

_:a <targets> _:b .
_:a <targets> _:c .
_:a <targets> _:d .
_:b <targets> _:e .
_:c <targets> _:e .
_:d <targets> _:e .
_:e <targets> _:f .
_:e <targets> _:g .
_:e <targets> _:h .
_:f <targets> _:i .
_:g <targets> _:i .
_:h <targets> _:i .
_:i <targets> _:x .
"""

write_database(ok_nquads)
print("OK case:")
print(json.dumps(get_alpha(), indent=2))

write_database(not_ok_nquads)
print("Not OK case:")
print(json.dumps(get_alpha(), indent=2))

write_database(dramatic_nquads)
print("Dramatic case:")
print(json.dumps(get_alpha(), indent=2))

Expected behaviour and actual result.

I expect @recurse(loop: False) to only recurse along each node a maximum of one time. It seems though, that when a node is reached in recursion by multiple paths that take an equal number of steps, they are both recursed - and because these create duplicate recursion paths, also with equal lengths, to downstream nodes, you can even get exponential growth of recursion with an option (loop: False) which is supposed to recurse just once for each found node.

{
  node(func: eq(name, "Alpha")) @recurse(loop: False) {
    name,
    targets,
  }
}

This graph (edges directed from left to right) recurses from A just fine:

  B ___
 /      \
A        D __ E
 \      /
  G __ T

{
  "node": [
    {
      "name": "Alpha",
      "targets": [
        {
          "name": "Beta",
          "targets": [
            {
              "name": "Delta",
              "targets": [
                {
                  "name": "Epsilon"
                }
              ]
            }
          ]
        },
        {
          "name": "Gamma",
          "targets": [
            {
              "name": "Theta",
              "targets": [
                {
                  "name": "Delta"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}

But this graph, with equal length paths to D, recurses on D twice (E is just there to show if D is being recursed on)

  B
 / \
A   D __ E
 \ /
  G

{
  "node": [
    {
      "name": "Alpha",
      "targets": [
        {
          "name": "Beta",
          "targets": [
            {
              "name": "Delta",
              "targets": [
                {
                  "name": "Epsilon"
                }
              ]
            }
          ]
        },
        {
          "name": "Gamma",
          "targets": [
            {
              "name": "Delta",
              "targets": [
                {
                  "name": "Epsilon"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}

My attached script contains another simple case where you can see the recursion grow exponentially - I work with Dgraph at my job and found this bug because I was encountering over-recursed results that were much larger than the actual number of unique nodes traversed.

My expectation from reading https://dgraph.io/docs/query-language/recurse-query/ is that this should never be possible, it seems to be a bug. My guess as someone that hasn’t seen the dgraph codebase is that recursion is done in “passes” and it looks here like results found in the same “pass” aren’t collected together and made unique, maybe because of some optimization for parallel execution.

3 Likes

I’ve ran into the same issue, and have been watching this thread hoping for the developers to respond. It does seem like a bug to me as well. I’m wondering though maybe this is more of a design defect in DGraph, so it is not really even fixable. Hence it being ignored.