Cascade Directive with Pagination produces unexpected results

Moved from GitHub dgraph/5930

Posted by amaster507:

What version of Dgraph are you using?

docker image dgraph :master v2.0.0-rc1-448-gd5892dc0c

Have you tried reproducing the issue with the latest release?

N/A, as @cascade is only available for GraphQL in 20.07+.

What is the hardware spec (RAM, OS)?

AWS ec2.large (8gb RAM, Ubuntu)

Steps to reproduce the issue (command/config used to run Dgraph).

I have a query that returns around 15K nodes, and I want the first 5 that also satisfy the @cascade directive. The query returns 0 results. However, if I remove the pagination limit, I get every node that satisfies @cascade.

I believe the problem is that pagination happens before the cascade directive at the root of the query. Is there a way to reverse this order? I don’t see why anyone would want to paginate first and then apply a cascade directive.

This will return all of the completed Tasks:

query {
  queryTask @cascade {
    name
    completed
  }
}

This should return the first 5 completed Tasks, but it instead returns an empty set:

query {
  queryTask(first: 5) @cascade {
    name
    completed
  }
}

Expected behaviour and actual result.

I expect pagination to work in conjunction with cascade, but that does not seem to be the case. I understand that, for performance, pagination reduces the load much faster than cascade does, but I do not believe that is the effect users expect.

I did test on Dgraph in Ratel and got similar results. So this issue is not specific to the graphql endpoint.
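
The ordering problem described above can be sketched with a toy model. This is not Dgraph's implementation: plain Python lists stand in for nodes, and the names `tasks`, `cascade`, and `paginate` are hypothetical helpers made up for illustration. It assumes that the first few Task nodes happen to lack a `completed` value.

```python
# Toy model only: dicts stand in for Task nodes, not Dgraph internals.
tasks = [{"name": f"task-{i}"} for i in range(10)]
# Assumption for illustration: only the last three tasks have `completed` set.
for t in tasks[7:]:
    t["completed"] = True

REQUESTED = ("name", "completed")

def cascade(nodes, fields=REQUESTED):
    """Keep only nodes that have a value for every requested field."""
    return [n for n in nodes if all(f in n for f in fields)]

def paginate(nodes, first):
    """Keep the first `first` nodes."""
    return nodes[:first]

# Order reported in this issue: the page is cut first, then cascade prunes it.
paginate_then_cascade = cascade(paginate(tasks, first=5))
# Order the reporter expected: cascade prunes first, then the page is cut.
cascade_then_paginate = paginate(cascade(tasks), first=5)

print(len(paginate_then_cascade))                   # 0
print([t["name"] for t in cascade_then_paginate])   # ['task-7', 'task-8', 'task-9']
```

In this sketch the first page contains only tasks without `completed`, so cascading afterwards leaves nothing, which mirrors the empty result reported above.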

josh-mercarto commented :

Also having this issue

quotationmarks-jzj commented :

Also having this issue

@amaster507 I think this is GraphQL, right? It should be moved to GraphQL issues.

I believe it applies to both. I discovered it with GraphQL but also compared it with DQL, and I have moved it now.

Any news on this?

I think this is possibly related to Filter on non-scalar fields or relations (Types)

I.e. when we have more powerful filters (filter on child-child-child… properties), we wouldn’t need @cascade in most cases.

Knowing how cascade works leads to an understanding that this bug may not be fixable. The first argument is applied before the cascade, because cascade runs near the end of the process tree. This makes the query faster but gives “wrong” results. If the cascade were applied first, the query would be less efficient but would give “correct” results. Some users may expect the “wrong” results, so fixing this may actually break it for them. I hope to get filters on edges, which would largely remove the need for cascade. The has filter already helps quite a bit, but it is still not perfect.

Is there some kind of fix for this? This is a pretty catastrophic issue for me.

No, there is no fix for this issue as of now.

If it solves your problem, you may try using a has filter at every level of your query.
But note that doing so is not the same as @cascade. For example, consider these 2 DQL queries:

Query-1

query {
  queryPerson(func: type(Person)) @cascade(fields: ["name","friends"]) {
    name
    friends {
      name
    }
  }
}

Query-2

query {
  queryPerson(func: type(Person)) @filter(has(name) AND has(friends)) {
    name
    friends @filter(has(name) AND has(friends)) {
      name
    }
  }
}

Given this DQL schema:

type Person {
  name
  friends
}
name: string .
friends: [uid] .

And the following data-set:

_:a <name> "Alice" .
_:b <name> "Bob" .
_:c <name> "Charlie" .

_:a <friends> _:b .
_:b <friends> _:c .

Then the result of the two queries will be like this:

Query-1 response

{
  "queryPerson": [
    {
      "name": "Alice",
      "friends": [
        {
          "name": "Bob"
        }
      ]
    }
  ]
}

Query-2 response

{
  "queryPerson": [
    {
      "name": "Alice",
      "friends": [
        {
          "name": "Bob"
        }
      ]
    },
    {
      "name": "Bob",
      "friends": []
    }
  ]
}

Note that the 2nd query will work correctly with pagination, but it will not remove those parents for which a deep descendant was missing some data.
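
The behaviour of Query-2 can be sketched as a toy model over the sample data above. This is only an illustration, not how Dgraph evaluates the query: the `people` dict and the helpers `has_name_and_friends` and `query_2` are hypothetical names, and the optional `first` parameter is added here to show where a page limit would apply.

```python
# Toy model of Query-2 only; dicts stand in for nodes, not Dgraph internals.
people = {
    "a": {"name": "Alice", "friends": ["b"]},
    "b": {"name": "Bob", "friends": ["c"]},
    "c": {"name": "Charlie"},
}

def has_name_and_friends(node):
    """Mimics @filter(has(name) AND has(friends)) for a single node."""
    return "name" in node and "friends" in node

def query_2(first=None):
    # The root filter runs before the page limit, so pagination sees only matches.
    roots = [p for p in people.values() if has_name_and_friends(p)]
    if first is not None:
        roots = roots[:first]
    result = []
    for p in roots:
        friends = [people[uid] for uid in p["friends"]]
        # The nested filter drops non-matching friends but never removes the parent.
        kept = [{"name": f["name"]} for f in friends if has_name_and_friends(f)]
        result.append({"name": p["name"], "friends": kept})
    return result

print(query_2())
print(query_2(first=1))
```

In this sketch Bob stays in the result with an empty `friends` list, because filtering a child level does not propagate upwards the way @cascade does.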

Hello,

Thanks so much for the response! Dgraph is really amazing but these small issues (that seem to have big implementation ramifications) really hurt the usability.

Unfortunately, I was relying on the cascade to remove the parent after applying a bunch of filters to the parent-relationship.

I wish there was some kind of caveat in the documentation that this doesn’t work with pagination before I designed my data model! There is a lot of guidance on this discuss forum that points towards using cascade without any kind of warning.

Also, just out of curiosity, is this slated to be fixed in the foreseeable future? If it is, I can just wait for it.

Yes, we have started working on a fix, which should be available as part of the next release, i.e., 21.03.

Thank you so much!

This has been fixed in master. Please see this PR for more details.
