Cascade does not work with pagination

I expect pagination to work in conjunction with cascade but that does not seem to be true (at least in GraphQL).

I have a query that returns around 15K nodes, and I want the first 5 of those that also satisfy the @cascade directive. The result is 0 nodes. However, if I remove the pagination limit, I get all the nodes that match cascade.

I believe the problem is that pagination happens before the cascade directive at the root of the query. Is there a way to reverse this order? I don’t see why anyone would want to paginate first and then apply a cascade directive :thinking:

This will return all of the completed Tasks:

query {
  queryTask @cascade {
    name
    completed
  }
}

This should return the first 5 completed Tasks, but it instead returns an empty set:

query {
  queryTask(first: 5) @cascade {
    name
    completed
  }
}

I understand that, for performance, pagination reduces the load much faster than cascade does, but I do not believe that is the expected behavior. This will probably be a pain to fix and may cause performance issues :frowning:


I did test on Dgraph in Ratel and got similar results. So this issue is not specific to the graphql endpoint.

Edit: I also just read through the docs. While one could infer this limitation from them, I did not find it explicitly stated anywhere that using these two together produces odd results.

I was really looking forward to using @cascade more after it gets parameterized, but I will not be able to unless there is some way to apply pagination after cascade.

"Pagination allows returning only a portion, rather than the whole, result set. This can be useful for top-k style queries as well as to reduce the size of the result set for client side processing or to allow paged access to results." - Query Language docs

"With the @cascade directive, nodes that don’t have all predicates specified in the query are removed." - Query Language docs

^ Read together, these suggest that pagination is applied first and cascade is then applied to the paginated results.


Yeah, this doesn’t look right. @pawan, @Paras, you’ve both worked in here recently. Can you provide some insight?


This is because cascade is a post processing step. @pawan , right?


Yeah, that is right. @cascade is a post-processing step, so it may exclude some nodes that were part of the paginated result. This should qualify as a bug in Dgraph and we can look into how to fix it. Feel free to open a bug, @amaster507.
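
To make the ordering concrete, here is a minimal Python sketch that models both orders on an in-memory list. It is not Dgraph code: the `cascade` and `paginate` helpers and the sample data are hypothetical, and the only assumption taken from this thread is that @cascade drops nodes missing a requested field and runs after pagination.

```python
from typing import Any, Dict, List

Node = Dict[str, Any]


def cascade(nodes: List[Node], fields: List[str]) -> List[Node]:
    """Keep only nodes that have a value for every requested field."""
    return [n for n in nodes if all(n.get(f) is not None for f in fields)]


def paginate(nodes: List[Node], first: int, offset: int = 0) -> List[Node]:
    """Return up to `first` nodes starting at `offset`."""
    return nodes[offset:offset + first]


# Hypothetical data: the first 10 tasks have no `completed` value, the next 10 do.
tasks: List[Node] = [{"name": f"task-{i}"} for i in range(10)]
tasks += [{"name": f"task-{i}", "completed": True} for i in range(10, 20)]

fields = ["name", "completed"]

# Order described in this thread: paginate first, then cascade as post-processing.
paginate_then_cascade = cascade(paginate(tasks, first=5), fields)

# Order the original post expected: cascade first, then paginate.
cascade_then_paginate = paginate(cascade(tasks, fields), first=5)

print(len(paginate_then_cascade))  # 0
print(len(cascade_then_paginate))  # 5
```

In this model the first page happens to contain only nodes that cascade removes, so the paginated query comes back empty even though matching nodes exist further down the list.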


Done. Thank you!

https://github.com/dgraph-io/dgraph/issues/5930


The bug that was opened for this on GitHub circles back to this forum message. Is there another bug that is tracking this issue? This is a blocker for me. Anyone else?


Sorry @matthewmcneely, this one fell through the cracks in the GitHub to Discuss migration. I have marked this as accepted and we’ll look into working on it next month.


@minhaj has started working on this and should have a fix for it soon. I expect this to be fixed for the 21.03 release.


While we are working on solving this issue, here is an alternate way to do this query which works well for a single level. We have recently added support for has to take a list of arguments. The query below should have the same effect as @cascade.

query {
  queryTask(filter: {has: [name, completed]}, first: 5) {
    name
    completed
  }
}
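
As a rough illustration of why this workaround behaves like @cascade at a single level, the Python sketch below applies a has-style filter before pagination. It is a model, not Dgraph code: `has_all`, `query_with_has`, and the sample tasks are hypothetical names and data.

```python
from typing import Any, Dict, List

Node = Dict[str, Any]


def has_all(node: Node, fields: List[str]) -> bool:
    """Model of `has: [...]`: true when every listed field has a value."""
    return all(node.get(f) is not None for f in fields)


def query_with_has(nodes: List[Node], fields: List[str], first: int) -> List[Node]:
    """The filter is part of the query, so pagination only sees matching nodes."""
    matching = [n for n in nodes if has_all(n, fields)]
    return matching[:first]


# Hypothetical data: some tasks have no `completed` value at all.
tasks: List[Node] = [
    {"name": "a"},
    {"name": "b", "completed": True},
    {"name": "c"},
    {"name": "d", "completed": False},
    {"name": "e", "completed": True},
]

page = query_with_has(tasks, ["name", "completed"], first=2)
print([t["name"] for t in page])  # ['b', 'd']
```

Because the filter only inspects fields on the root type, this model would not drop a node whose nested child is missing a field, which is presumably why the workaround is described as working well for a single level.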

This has been fixed in master. Please look at this PR.


Hello! I stumbled into this issue, but I’m a bit confused as to why, because I’m using v21.03.0. Here’s my query:

queryPerformance(filter: { isUploaded: true }, first: 2, offset: 0) @cascade(fields: ["song"]) {
  id
  song(filter: { id: ["0x9c43"] }) {
    title
    artist
  }
}

The query returns 0 records. If I set first to 4 I get one record. So it seems that the behavior in v21.03.0 is unchanged, despite what’s written in the changelog.

EDIT:

Interesting: it seems to be fixed in the Docker tag v21.03-slash, but not in v21.03.0.
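
The numbers reported above (0 records with first: 2, one record with first: 4) are what one would expect if cascade still ran after pagination. The Python sketch below reproduces that pattern on made-up data. It is only a model: the helper names and the four performances are hypothetical, and it assumes the single matching song sits on the fourth uploaded performance.

```python
from typing import Any, Dict, List, Optional

Node = Dict[str, Any]


def filter_song(perf: Node, song_ids: List[str]) -> Node:
    """Model of the nested `song(filter: {id: [...]})`: drop a non-matching song."""
    song: Optional[Node] = perf.get("song")
    kept = song if song is not None and song["id"] in song_ids else None
    return {**perf, "song": kept}


def cascade_on(nodes: List[Node], fields: List[str]) -> List[Node]:
    """Model of `@cascade(fields: [...])`: drop nodes missing any listed field."""
    return [n for n in nodes if all(n.get(f) is not None for f in fields)]


def run(perfs: List[Node], first: int, cascade_last: bool) -> List[Node]:
    uploaded = [p for p in perfs if p["isUploaded"]]
    shaped = [filter_song(p, ["0x9c43"]) for p in uploaded]
    if cascade_last:
        # Behaviour reported above for v21.03.0: paginate, then cascade.
        return cascade_on(shaped[:first], ["song"])
    # Expected behaviour: cascade, then paginate.
    return cascade_on(shaped, ["song"])[:first]


# Hypothetical data: only the fourth uploaded performance has the wanted song.
performances: List[Node] = [
    {"id": "0x1", "isUploaded": True, "song": {"id": "0xaaaa", "title": "A"}},
    {"id": "0x2", "isUploaded": True, "song": {"id": "0xbbbb", "title": "B"}},
    {"id": "0x3", "isUploaded": True, "song": {"id": "0xcccc", "title": "C"}},
    {"id": "0x4", "isUploaded": True, "song": {"id": "0x9c43", "title": "D"}},
]

print(len(run(performances, first=2, cascade_last=True)))   # 0
print(len(run(performances, first=4, cascade_last=True)))   # 1
print(len(run(performances, first=2, cascade_last=False)))  # 1
```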
