Parameterized Cascade

Current

Currently, the @cascade directive does not take any parameters. This means that, for a node, all its children in the sub-graph must be present for it to be part of the response. The directive is implied at every level from the one where it is declared downward.

This is too strict for use cases where @cascade should apply to only some fields or children, not all of them.

The proposal is to introduce parameterized cascade(param1, param2, ...).

Proposal

The proposed new behavior will be as follows:

  1. When @cascade(param1, param2) is declared at a level, a node at that level is part of the response only if both param1 and param2 are present. Otherwise it is filtered out.
  2. All lower levels have an implied @cascade (without params), so they are stricter: every requested field must be present for a lower-level node to be part of the response.
  3. If a lower level also declares @cascade(param3, param4), that declaration overrides the implied @cascade at that level.

In other words, @cascade or @cascade(param1, param2) at any level is applied to that level. All lower levels implicitly have the @cascade unless it is overridden by @cascade(param3, param4).

For example, here is a simple graph.

{
  set {
    _:alice1 <name> "Alice 1" .
    _:alice1 <age> "23" .

    _:alice2 <name> "Alice 2" .

    _:alice3 <name> "Alice 3" .
    _:alice3 <age> "32" .

    _:bob <name> "Bob" .

    _:chris <name> "Chris" .

    _:dave <name> "Dave" .

    _:alice1 <friend> _:bob (close=true) .
    _:alice1 <friend> _:dave .

    _:alice2 <friend> _:chris (close=false) .

    _:bob <friend> _:chris .
  }
}

Example 1

Query 1: parameterized @cascade(name) at root. Implied @cascade at lower levels 2 and 3.

{
  q(func: anyoftext(name, "Alice")) @cascade(name) { # parameterized cascade
    name
    age
    friend { # implied `@cascade`
      name
      age
      friend { # implied `@cascade`
        name
        age
      }
    }
  }
}

Response Query 1

  "data": {
    "q": [
      {
        "name": "Alice 1",
        "age": "23"
      },
      {
        "name": "Alice 2"
      },
      {
        "name": "Alice 3",
        "age": "32"
      }
    ]
  }

Example 2

Query 2: @cascade(age) at root and @cascade(name) at lower level 2 (this overrides the implied @cascade), and implied @cascade at level 3.

{
  q(func: anyoftext(name, "Alice")) @cascade(age) { # parameterized cascade
    name
    age
    friend @cascade(name) { # parameterized cascade
      name
      age
      friend { # implied @cascade
        name
        age
      }
    }
  }
}

Response Query 2

  "data": {
    "q": [
      {
        "name": "Alice 1",
        "age": "23",
        "friend": [
          {
            "name": "Dave"
          },
          {
            "name": "Bob"
          }
        ]
      },
      {
        "name": "Alice 3",
        "age": "32"
      }
    ]
  }

Example 3

Query 3: The root level has a parameterized @cascade(friend) on a UID predicate instead of a value predicate.

{
  q(func: anyoftext(name, "Alice")) @cascade(friend) { # parameterized on a UID predicate
    name
    age
    friend { # implied @cascade here.
      name
      friend { # implied @cascade here.
        name
      }
    }
  }
}

Response Query 3

  "data": {
    "q": [
      {
        "name": "Alice 1",
        "age": "23",
        "friend": [
          {
            "name": "Bob",
            "friend": [
              {
                "name": "Chris"
              }
            ]
          }
        ]
      }
    ]
  }
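
To make the three examples easier to check by hand, here is a minimal Python sketch of the proposed rules. It is only a model of the semantics described in this post, not Dgraph's implementation. The `run` function, the dict-based query shape, and the `ALL` constant are all made up for illustration, and the root function `anyoftext(name, "Alice")` is replaced by a hard-coded list of the three Alice nodes. It also assumes that a cascade param always names a field requested at that level.

```python
ALL = "__all__"  # stands in for a bare @cascade; the constant name is made up here

# The sample graph from this post: scalar predicates hold values, `friend` holds uids.
GRAPH = {
    "alice1": {"name": "Alice 1", "age": "23", "friend": ["bob", "dave"]},
    "alice2": {"name": "Alice 2", "friend": ["chris"]},
    "alice3": {"name": "Alice 3", "age": "32"},
    "bob": {"name": "Bob", "friend": ["chris"]},
    "chris": {"name": "Chris"},
    "dave": {"name": "Dave"},
}


def run(uids, block, inherited=False):
    """Evaluate one query level and return the surviving nodes.

    block = {"fields": {name: sub_block or None}, "cascade": None | ALL | [params]}
    inherited is True once any ancestor level carried a cascade directive.
    """
    directive = block.get("cascade")
    if directive is None and inherited:
        directive = ALL  # rule 2: lower levels get an implied bare @cascade
    below = inherited or directive is not None
    out = []
    for uid in uids:
        node, result = GRAPH[uid], {}
        for field, sub in block["fields"].items():
            if field not in node:
                continue
            if sub is None:
                result[field] = node[field]  # scalar predicate
            else:
                children = run(node[field], sub, below)
                if children:  # an edge counts as present only if a child survives
                    result[field] = children
        # Bare cascade requires every requested field; params require only those named.
        required = list(block["fields"]) if directive == ALL else (directive or [])
        if all(f in result for f in required):
            out.append(result)
    return out


ALICES = ["alice1", "alice2", "alice3"]  # stands in for anyoftext(name, "Alice")

leaf = {"fields": {"name": None, "age": None}}
query1 = {"cascade": ["name"], "fields": {
    "name": None, "age": None,
    "friend": {"fields": {"name": None, "age": None, "friend": leaf}}}}
query3 = {"cascade": ["friend"], "fields": {
    "name": None, "age": None,
    "friend": {"fields": {"name": None, "friend": {"fields": {"name": None}}}}}}

print(run(ALICES, query1))
print(run(ALICES, query3))
```

Under these assumptions, `query1` keeps all three Alice nodes without any `friend` entries, and `query3` keeps only Alice 1 with Bob and Chris nested below, which matches the responses listed above.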

References:

  1. https://github.com/dgraph-io/dgraph/pull/5607

I am for this as well, because we may want to require only some fields or sub-graphs but not all the way down.

FWIW, I think that the @cascade params should specifically include the sub-graphs to include as well.

{
  q(func: anyoftext(name, "Alice")) @cascade(name, friend) { # parameterized cascade
    name
    age
    children {   # not implied `@cascade`
      name
      age
      friend {   # not implied `@cascade`
        name
        age
      }
    }
    friend {     # `@cascade` by param
      name
      age
      friend { # implied `@cascade`
        name
        age
      }
    }
  }
}

This would allow a query to require friends without requiring children, while still returning children if they exist, all without a second query.

By implied, I mean that the child level will have that directive. It does not affect the parent’s cascade. The parent’s cascade is what is mentioned or implied at the parent.

In your example above, the root level nodes will be returned as long as they have name and friend. If there is no children or age on any root level nodes, it is still ok since we are not cascading on those.

That is what I kind of expected. In your examples, I did not see any param pointing to a sub-graph but only individual properties. I guess it sort of works the same way though. Because a predicate either points to one or more individual properties or points to one or more uids.

Right. Maybe I will add an example where it points to a UID and not a value field.

IMO, as discussed offline, cascade (with and without parameters) should cascade to all levels below unless overridden at a particular level. In case of an override, the overriding cascade should then cascade down to the next levels.

See Example 3. It is cascading on a UID node, and hence Alice 3 is filtered out. Alice 2 is also filtered out, because its friend (Chris) does not have a friend.

This is a welcome addition! :+1: I have been burned a lot by “losing” a node because it has a nullable predicate.

cc @pawan

Pawan and I discussed this, and we consciously chose the approach as it stands right now.

One reason not to cascade the params is that the lower levels may not even have those params. Second, it preserves the current @cascade behavior (without params), which trickles all the way down. Third, the parameterized cascade should only override that level.

In other words, this feature simply modifies the current @cascade behavior by allowing an override with params at that level only.

Hey @Paras, thanks for your reply. A few points:

How do you differentiate nodes which do not have the param from nodes which ideally should have that param but have it missing? Maybe I misunderstood this. Can you give me an example?

Yes, but then since you are introducing parametrized cascade, preserving old behavior is moot. No? It becomes non-intuitive. Thinking from how CSS does it, only the latest rules are cascaded to all the levels below them. Not the original rules.

Why?

Yeah. Once you parameterize it, it should be recursively applied with that parameter, until told otherwise some level below.

We don’t differentiate. It is obviated if we have this behavior.

It is not moot. Note that this parameterized cascade is non-existent in the GraphQL world where they will still use the current @cascade behavior. This is a GraphQL± feature only.

This seems more intuitive and less error-prone to me. Over-ride with parameters when needed at a particular level; implied @cascade otherwise.

Trying to answer the question of whether we should cascade the parameters as well.

From the user’s perspective, let’s look at a query like the one below.

{
  me(func: type(User)) {
    name
    friend @cascade(name, school) {
      name
      age
      school {
        name
        classes {
          name
          numStudents
        }
      }
    }
  }
}

I want all users and their friends who have a name and go to a school. Now if we apply these parameters implicitly to all levels below i.e. to school and classes, we wouldn’t get any results. The parameters are contextual to where we are in the query and making them cascade won’t make much sense. It only makes sense if you have a pretty symmetric example where all levels of the query are requesting the same information like the friends query posted above.
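
As a rough illustration of that point, here is a tiny Python check. The `satisfies` helper and the sample data are hypothetical and exist only to mirror the shape of the query above. It tests the friend-level params against a friend node and then against the school node they would be inherited by.

```python
def satisfies(node, required):
    """Return True when every required predicate is present on the node."""
    return all(pred in node for pred in required)


# Hypothetical data shaped like the query above (not from any real dataset).
school = {"name": "Central High", "classes": [{"name": "Math", "numStudents": 30}]}
friend = {"name": "Bob", "age": 24, "school": school}

params = ["name", "school"]  # from friend @cascade(name, school)

print(satisfies(friend, params))  # True: the friend has both name and school
print(satisfies(school, params))  # False: a school node has no `school` predicate
```

If the params were inherited, the school would be dropped for lacking a `school` predicate, and the friend would then fail its own cascade because its school is gone.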

The default behavior that we have above means that the user can just specify the required parameters at the level that they want to consider a subset and not write anything for the other levels (school, classes). The alternative would be something like below which means you are specifying the parameters at pretty much all levels. Is this better or worse?

{
  me(func: type(User)) {
    name
    friend @cascade(name, school) {
      name
      age
      # or school @cascade(_all_)
      school @cascade(name, classes) {
        name
        classes @cascade(name, numStudents) {
          name
          numStudents
        }
      }
    }
  }
}

Cascade without param is the equivalent of cascade(all). So, we don’t need to specify params at every level.

After friend @cascade(name, school) you could set school @cascade, so it takes all the fields into account.

A counter example is where you only want to apply cascade to name. You should be able to do that with @cascade(name) and it should only apply name recursively all the way down. Not switch to @cascade after the top level.

Good points!

Yeah, true, this can be achieved with less code if we cascade down the params.

I guess we can switch the behaviour to make it more explicit than implicit by cascading the params. It’s probably easier for the user to understand it this way.

Also easier to explain. Once you set a param, that param gets cascaded all the way down, until you override explicitly.

Yes, agreed. This is a good point which I missed because I was thinking in terms of nodes which only relate to similar nodes. Person --> friends (who are also person) and so on. In which case one param makes sense for all the recursive nodes down the path.

Now modifying your example a little, I want those users and their schools, cities etc. which have age filled in.

But this would return nil if we go ahead with what I proposed.

{
  me(func: type(User)) @cascade(age) {
    name
    age
    school  {
      name
      city {
         name
         country
      }
    }
  }  
}

Three solutions:

  1. Link the param inside cascade to the node type, i.e., apply it only if the nodes are also of type User. This brings in the contextual information.
  2. (Easier) Introduce a directive at the school level to not apply any cascade below it.
  3. (Easiest) Apply parameterized cascades only at the level where they are mentioned, with NO cascade below that level (neither parameterized nor cascade(_all_)). This also enforces contextual behavior. Vanilla @cascade stays as is (applied recursively to all levels).

@mrjn Would you want the name param to be cascaded to the User, School, and City node types? What about queries where I need users with a name, but their schools/cities may or may not have a name? I guess a directive to stop cascade at a level could handle these queries, or the third point which I mentioned.

Also on this:

I personally never take care of this point. Is there any guideline around this? From what I understand we ship features to GQL± and our awesome GraphQL folks take a call which one they want to support or omit. :slight_smile:

Please! Once this gets on GraphQL± it should make its way into GraphQL world quickly thereafter. Maybe with a slight syntax modification: @cascade(fields: [String])

Also in regards to:

Will there be a way to override to a nil form of @cascade? Say I want to make sure only top level predicates are non nil. Maybe something like:

{
  me(func: type(User)) @cascade {
    name
    age
    school @cascade(__none__)  {
      name
      city {
         name
         country
      }
    }
  }  
}

The @cascade(__all__) could then be cascaded down until it meets a different cascade, and if it meets __none__ it turns off cascading until it meets the next @cascade. This would get schools if they exist, but either the name or the city or both could be nil.
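
One way to picture the "inherit until overridden" rule is as a walk down the levels of a query, carrying the last explicit directive along. The Python sketch below is only a model of that idea. The `effective` function is made up for illustration, and `__all__` and `__none__` are just the sentinel names floated in this thread, not confirmed syntax.

```python
ALL, NONE = "__all__", "__none__"  # sentinels named after the ones floated in this thread


def effective(declared_per_level):
    """Resolve the directive in force at each level, root first.

    Each entry is None (nothing declared), ALL (bare @cascade),
    NONE (@cascade(__none__)), or a list of params.
    """
    current, resolved = NONE, []  # above the first directive, nothing is cascaded
    for declared in declared_per_level:
        if declared is not None:
            current = declared  # an explicit directive overrides the inherited one
        resolved.append(current)
    return resolved


# me @cascade  ->  school @cascade(__none__)  ->  city (nothing declared)
print(effective([ALL, NONE, None]))  # ['__all__', '__none__', '__none__']

# root @cascade(name) with nothing declared below
print(effective([["name"], None, None]))  # [['name'], ['name'], ['name']]
```

The first call mirrors the query above: cascade applies at the top level and is switched off for school and city. The second mirrors the request that @cascade(name) keep applying name all the way down.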

This behavior seems to be the consensus. I will make the changes accordingly. Thank you all.

Any ETA for this coming to the graphql endpoint?

I am building client-created filter logic, and without this I have to make each filter its own query and then combine all of the filters together.