WIP: cascade with pagination

Introduction

The @cascade directive gives wrong results when it is used in a query with pagination. For example, given the following GraphQL schema:

type School {
	name: String! @id
	affln: String
	principal: String
	classes: [Class]
}
type Class {
	std:  Int! @id
	teacher: String
	students: [Student]
}

type Student {
	name: String! @id
	age: Int
}

the following GraphQL query:

query{
  querySchool(first: 2, offset: 5) @cascade{
    name
    affln
    classes(first: 5, offset: 3){
      std
      teacher
      students(first: 10, offset: 10){
        name
        age
      }
    }
  }
} 

which is equivalent to the DQL query below, gives wrong results:

{
  querySchool(func: type(School), first: 2, offset: 5) @cascade {
    School.name : School.name
    School.affln : School.affln
    School.classes : School.classes (first: 5, offset: 3) {
      Class.std : Class.std
      Class.teacher : Class.teacher
      Class.students : Class.students (first: 10, offset: 10) {
        Student.name : Student.name
        Student.age : Student.age
        dgraph.uid : uid
      }
      dgraph.uid : uid
    }
    dgraph.uid : uid
  }
}

Underlying Problem

The issue arises because @cascade is a post-processing step. For the query above, Dgraph first fetches 2 School entries at an offset of 5, then 5 classes at an offset of 3 for each of them, and so on. Only then does it remove nodes with null entries, bottom-up: first any Student whose name or age is null, then any Class and any School left incomplete. Because this pruning happens after pagination, the result can contain fewer entries than requested and differs from what is expected.

The semantics of the query should be: give me the first 2 schools at an offset of 5 from the list of schools whose requested fields are all non-null, and likewise at every nested level.
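
A toy illustration of why the order of the two steps matters, using plain Python lists in place of Dgraph results. The sample data and the helper names `cascade` and `paginate` are made up for this sketch and are not Dgraph APIs; only one level is modelled.

```python
def cascade(items, required):
    """Keep only the items in which every required field is non-null."""
    return [it for it in items if all(it.get(f) is not None for f in required)]


def paginate(items, first, offset):
    """Return `first` items starting at position `offset`."""
    return items[offset:offset + first]


# Hypothetical schools; B and D have no affiliation.
schools = [
    {"name": "A", "affln": "X"},
    {"name": "B", "affln": None},
    {"name": "C", "affln": "Y"},
    {"name": "D", "affln": None},
    {"name": "E", "affln": "Z"},
]
required = ["name", "affln"]

# Current behaviour: paginate first, prune afterwards.
current = cascade(paginate(schools, first=2, offset=1), required)

# Expected semantics: prune first, paginate the survivors.
expected = paginate(cascade(schools, required), first=2, offset=1)

print([s["name"] for s in current])   # ['C']
print([s["name"] for s in expected])  # ['C', 'E']
```

With pagination applied first, the page of two schools shrinks to one after pruning, even though two complete schools exist at that offset.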

Possible Solutions

Based on my understanding, there are two possible ways to fix this issue:

Query Rewriting

To get the desired result, the query can instead be written using inverse (reverse) edges:

{
    stdnts as var(func: type(Student)) @filter(has(Student.age) AND has(Student.name)) {
        cls as ~Class.students @filter(has(Class.teacher) AND has(Class.std)) {
            sch as ~School.classes
        }
    }

    query(func: uid(sch), first: 2, offset: 5) {
        School.name
        School.affln
        School.classes(first: 5, offset: 3) @filter(uid(cls)) {
            Class.std
            Class.teacher
            Class.students(first: 10, offset: 10) @filter(uid(stdnts)) {
                Student.name
                Student.age
            }
        }
    }
}

Fixing Pagination While Applying @cascade

To get the correct result for the DQL/GraphQL query in the problem statement, pagination must be removed from every level of the query during processing whenever @cascade is applied at any level. The query then fetches all the data without pagination, the @cascade filter is applied, and only afterwards is the result paginated.
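
A minimal sketch of this second approach on nested data, in Python rather than inside Dgraph. The selection format (`scalars`, `children`, `first`, `offset`) and the function names are assumptions made for this illustration. Note one open choice: here a parent survives @cascade if it has at least one complete child, even if that child later falls outside the child page.

```python
def prune(node, sel):
    """Return the selected fields of node, or None if @cascade would drop it."""
    out = {}
    for field in sel["scalars"]:
        if node.get(field) is None:
            return None
        out[field] = node[field]
    for field, child_sel in sel.get("children", {}).items():
        children = cascade(node.get(field) or [], child_sel)
        if not children:
            return None
        out[field] = children
    return out


def cascade(nodes, sel):
    """Step 1: prune bottom-up over the full, unpaginated data."""
    pruned = (prune(n, sel) for n in nodes)
    return [n for n in pruned if n is not None]


def paginate(nodes, sel):
    """Step 2: apply first/offset top-down to the pruned result."""
    offset = sel.get("offset", 0)
    first = sel.get("first")
    page = nodes[offset:] if first is None else nodes[offset:offset + first]
    children = sel.get("children", {})
    return [{**n, **{f: paginate(n[f], cs) for f, cs in children.items()}} for n in page]


# Hypothetical two-level selection and data.
selection = {
    "scalars": ["name"], "first": 1, "offset": 1,
    "children": {"classes": {"scalars": ["std", "teacher"], "first": 1, "offset": 0}},
}
schools = [
    {"name": "A", "classes": [{"std": 1, "teacher": None}]},
    {"name": "B", "classes": [{"std": 2, "teacher": "T2"}]},
    {"name": "C", "classes": [{"std": 3, "teacher": None}, {"std": 4, "teacher": "T4"}]},
]

result = paginate(cascade(schools, selection), selection)
print(result)  # [{'name': 'C', 'classes': [{'std': 4, 'teacher': 'T4'}]}]
```

School A is dropped because its only class is incomplete, so the page at offset 1 is taken from the complete schools B and C.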

As discussed with @abhimanyusinghgaur, the first approach of query rewriting becomes complicated when there is more than one non-scalar predicate at a level. For example, suppose the type Class is modified slightly, like this:

type Class {
   std: Int! @id
   teacher: String
   students: [Student]
   subject: [Subject]
}

type Subject {
   name: String
   referenceBook: String
}

Then we need to query the uids of classes from both Subject and Student before writing the rest of the query. This can become complicated at each level.
We can, however, optimize the second approach of paginating after applying @cascade by using has filters at each level. This reduces the amount of over-fetched data.
The DQL query that needs to be executed initially should be:

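{
  querySchool(func: type(School)) @filter(has(School.name) AND (has(School.affln) AND has(School.classes))) @cascade {
    School.name
    School.affln
    School.classes @filter(has(Class.std) AND (has(Class.teacher) AND has(Class.students))) {
      Class.std
      Class.teacher
      Class.students @filter(has(Student.name) AND has(Student.age)) {
        Student.name
        Student.age
      }
    }
  }
}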

Applying pagination top-down to that result then gives the desired output.
The overall procedure for the second approach will be:

  • the user only gives a DQL query with @cascade
  • we add has filters to the DQL appropriately and remove pagination before executing it
  • then we apply pagination to the result
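
The rewriting step in this procedure could look roughly like the Python sketch below, which builds the has filters from a description of the selection and emits a DQL string with no pagination arguments. The selection format and the names `has_filter` and `render` are hypothetical; the `Type.field` predicate naming follows the DQL shown earlier in this post.

```python
def has_filter(type_name, sel):
    """Build a @filter requiring every selected field, scalar or nested, to exist."""
    fields = sel["scalars"] + list(sel.get("children", {}))
    return "@filter(" + " AND ".join(f"has({type_name}.{f})" for f in fields) + ")"


def render(head, type_name, sel, depth=1):
    """Render one block as lines of DQL; first/offset are deliberately never emitted."""
    pad = "  " * depth
    lines = [f"{pad}{head} {{"]
    for field in sel["scalars"]:
        lines.append(f"{pad}  {type_name}.{field}")
    for field, (child_type, child_sel) in sel.get("children", {}).items():
        child_head = f"{type_name}.{field} {has_filter(child_type, child_sel)}"
        lines += render(child_head, child_type, child_sel, depth + 1)
    lines.append(f"{pad}}}")
    return lines


# Selection mirroring the School / Class / Student query from the problem statement.
student = {"scalars": ["name", "age"]}
klass = {"scalars": ["std", "teacher"], "children": {"students": ("Student", student)}}
school = {"scalars": ["name", "affln"], "children": {"classes": ("Class", klass)}}

root = f"querySchool(func: type(School)) {has_filter('School', school)} @cascade"
dql = "\n".join(["{"] + render(root, "School", school) + ["}"])
print(dql)
```

Pagination would then be applied to the returned result in a separate top-down pass.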

@minhaj, do you plan to make this public?

There was a recent Discuss post in which a user is facing a similar issue. It would be easier to redirect users to this RFC to keep them updated about the progress being made.

As already mentioned, this gets complicated, but wouldn’t it also require reverse edges on the underlying schema? From my understanding, with a GraphQL schema no reverse directives are applied to the generated DQL schema.

Yeah, reverse edges would also be needed to make this query run.
