RFC: Nested Filters in GraphQL

Yeah, this design will cost more in terms of performance, as Abhimanyu already mentioned. We will be discussing other possible solutions internally with the team. One good solution is definitely to make use of the inverse edges and reduce the universe at the root, as you already mentioned.

But for that, we need inverse edges in the schema, and rewriting the query that way seems much more difficult than this approach. We will explore it. Currently, there are many requests for this feature, and there are many use cases that are otherwise not possible with GraphQL. So to allow those use cases, I guess we can go with this approach, but we will discuss and explore all the possible optimizations before implementing it.

Thanks so much for this.

I don’t really have an issue with using cascade and relying on developers to have a stronger conception of the shape of their graph when defining queries in the near term. A lot of us really need this functionality. I do think intelligently minimizing the node universe by tracking node counts and measuring inverse edges is really important, but it seems to me like more of a feature upgrade than a different approach.

I also think it’s important to provide some functional filters on a set of connected nodes…things like contains, every, and count/length. For instance, being able to efficiently filter for entities that have at least one connected node with property1 and at least one connected node with property2 is very core to my needs (and blocking at scale).
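
To make the requested set-level filters concrete, here is a minimal in-memory sketch in Python. It is not Dgraph code: the `entities` data and the helper names `contains`, `every`, and `count` are made up for illustration, and only show the semantics being asked for.

```python
# Hypothetical in-memory graph: each entity maps to its connected nodes.
entities = {
    "e1": [{"property1": True}, {"property2": True}],
    "e2": [{"property1": True}],
    "e3": [{"property1": True, "property2": True}],
    "e4": [],
}


def contains(nodes, pred):
    """True if at least one connected node satisfies pred."""
    return any(pred(n) for n in nodes)


def every(nodes, pred):
    """True if all connected nodes satisfy pred (vacuously true when empty)."""
    return all(pred(n) for n in nodes)


def count(nodes, pred):
    """Number of connected nodes that satisfy pred."""
    return sum(1 for n in nodes if pred(n))


def has_p1(node):
    return node.get("property1", False)


def has_p2(node):
    return node.get("property2", False)


# Entities with at least one property1 node AND at least one property2 node.
# The two conditions may be satisfied by different connected nodes.
both = sorted(
    key
    for key, nodes in entities.items()
    if contains(nodes, has_p1) and contains(nodes, has_p2)
)
```

Note that `e1` qualifies even though no single connected node has both properties, which is the case that a plain per-node filter cannot express.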

One question - for layered filter queries that can be handled by DQL today, does the logic stop traversal on ‘dead-end’ paths once a node fails a filter?

Currently I end up combining multiple queries using @cascade as a kind of workaround. So e.g. to get all tasks that are either assigned to no user or to a specified user having either specified tags or no tags this is the query I’d end up with. Please let me know if this can be done differently:

query tasks($filter: AssignableFilter! = { has: name }, $users: UserFilter = { has: name }, $tags: TagFilter! = { has: name }) {
  assignedTasks: queryAssignable(filter: {and: [{ has: assignedTo }, {has: tags}, $filter]}) @cascade(fields: ["assignedTo", "tags"]) {
    id
    name
    assignedTo(filter: $users) {
      id
      name
    }
    tags(filter: $tags) {
      name
    }
    pastExecutions {
      user {
        id
        name
      }
      timestamp
    }
    effort
    ... on Chore {
      interval
    }
    ... on Task {
      due
    }
  }
  assignedTasksWithoutTags: queryAssignable(filter: {and: [{ has: assignedTo }, { not: { has: tags }}, $filter]}) @cascade(fields: ["assignedTo"]) {
    id
    name
    assignedTo(filter: $users) {
      id
      name
    }
    pastExecutions {
      user {
        id
        name
      }
      timestamp
    }
    effort
    ... on Chore {
      interval
    }
    ... on Task {
      due
    }
  }
  unassignedTasks: queryAssignable(filter: {and: [{not: {has: assignedTo}}, {has: tags}, $filter]}) @cascade(fields: ["tags"]) {
    id
    name
    tags(filter: $tags) {
      name
    }
    pastExecutions {
      user {
        id
        name
      }
      timestamp
    }
    effort
    ... on Chore {
      interval
    }
    ... on Task {
      due
    }
  }
  unassignedTasksWithoutTags: queryAssignable(filter: {not: {or: [{ has: assignedTo }, { has: tags }]}}) {
    id
    name
    pastExecutions {
      user {
        id
        name
      }
      timestamp
    }
    effort
    ... on Chore {
      interval
    }
    ... on Task {
      due
    }
  }
}

I think this might be a deal-breaker for many. At least I think it will be for us.

The major use case for us is to let the front-end developers work separately from the back-end and get a flexible API with real-time capabilities that doesn’t require us to do any custom development for loading data, e.g. for tables. Without proper nested filtering capabilities, that is guaranteed to come back to bite us, since we have a lot of relations, which is what makes a graph database a great fit. The experience so far in the POC we are doing has been really great; I just took for granted that the filtering would be there, since it seems like such a crucial feature.

Really hoping this will be added (or planned) before we have to make a decision, as the product seems amazing except for this pretty big (IMO) shortcoming. For some reason it works in the @auth directive: if the filter on a child returns 0 rows, it filters out the parent. I’m not sure why.

Hi @CosmicPangolin1, thanks for your opinion on this. Currently, aggregate queries like count are not available in filters; once we have them, it will be easy to write filters on connected nodes as you described. For now, that can be achieved with custom DQL. See this:
Filter by counts in GraphQL

I didn’t fully get what you mean by the logic stopping traversal on ‘dead-end’ paths once a node fails a filter.

But I guess you are asking what happens when there is a filter on nested nodes and a parent has no matching nested nodes. Take the DQL query below as an example. The first block goes through all nodes with dgraph.type Author and selects the Authors who have a post titled Dgraph. Without @cascade, that block would also return Authors who have no matching posts, with Author.posts equal to null. This is where @cascade comes to the rescue: it filters out those nodes and leaves only the Authors who have a post titled Dgraph.

The second block then takes those Authors and filters them by name. In short, nodes that have no matching nested nodes are filtered out by @cascade at the root.

query {
  post1 as var(func: type(Author)) @cascade {
    Author.posts : Author.posts @filter(eq(Post.title, "Dgraph")) {
      uid
    }
  }
  queryAuthor(func: type(Author)) @filter(eq(Author.name, "Alice") and uid(post1)) {
    Author.name : Author.name
    Author.posts : Author.posts {
      Post.title : Post.title
      Post.text : Post.text
      dgraph.uid : uid
    }
    dgraph.uid : uid
  }
}
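
The two-block pattern above can be mimicked in a few lines of Python. This is only an in-memory sketch of the semantics as described in this post, not how Dgraph executes the query; the `authors` data and the function names `var_block` and `query_author` are invented for illustration.

```python
# Hypothetical data standing in for Author nodes and their Author.posts edges.
authors = [
    {"uid": "0x1", "name": "Alice", "posts": [{"title": "Dgraph"}, {"title": "Other"}]},
    {"uid": "0x2", "name": "Alice", "posts": []},
    {"uid": "0x3", "name": "Bob", "posts": [{"title": "Dgraph"}]},
    {"uid": "0x4", "name": "Alice", "posts": [{"title": "Other"}]},
]


def var_block(nodes, title):
    """Mimic the first block: filter posts by title, then apply @cascade."""
    kept = set()
    for author in nodes:
        matching = [p for p in author["posts"] if p["title"] == title]
        if matching:  # @cascade drops authors whose filtered posts are empty
            kept.add(author["uid"])
    return kept


def query_author(nodes, name, uids):
    """Mimic the second block: name filter AND uid(post1).

    Posts are not filtered here, so every post of a kept author is returned.
    """
    return [a for a in nodes if a["name"] == name and a["uid"] in uids]


post1 = var_block(authors, "Dgraph")
result = query_author(authors, "Alice", post1)
```

In this sketch the first step still visits every Author before pruning, which is the performance concern raised earlier in the thread.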

I hope that answers your question. Thanks.

Yeah, you can use the above workaround if it works for your use case. But I see some limitations with it:

  1. Every parameter inside @cascade works as an AND connective. For example, in your first query we can’t do an OR of assignedTo and tags.

  2. We can only use original field names in @cascade; we can’t use aliases. If we want to use two or more filters on the same nested field, that’s not possible unless aliases are allowed inside @cascade.
    See this: GraphQL: Connected filter on non-scalar list element fields - #11 by pawan
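
The first limitation can be illustrated with a small Python sketch. It is an in-memory approximation of the behaviour described here, with made-up task data; `cascade_any` is a hypothetical OR variant that @cascade does not offer.

```python
# Hypothetical tasks after the nested filters have already been applied.
tasks = [
    {"id": "t1", "assignedTo": [{"name": "ann"}], "tags": [{"name": "home"}]},
    {"id": "t2", "assignedTo": [{"name": "ann"}], "tags": []},
    {"id": "t3", "assignedTo": [], "tags": [{"name": "home"}]},
    {"id": "t4", "assignedTo": [], "tags": []},
]


def cascade(nodes, fields):
    """Keep nodes where every listed field is non-empty (AND semantics)."""
    return [n for n in nodes if all(n[f] for f in fields)]


def cascade_any(nodes, fields):
    """Hypothetical OR variant: keep nodes where any listed field is non-empty."""
    return [n for n in nodes if any(n[f] for f in fields)]


and_ids = [n["id"] for n in cascade(tasks, ["assignedTo", "tags"])]
or_ids = [n["id"] for n in cascade_any(tasks, ["assignedTo", "tags"])]
```

Getting the OR result today means splitting the request into several aliased queries, as in the workaround posted above.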

Thanks, @sebwalle, for highlighting the importance of nested filters. We will surely prioritize it and try to add it soon.

Can you give an example schema and query? I want to see how it works with @auth.

Sure, below is the schema. I just started testing the directive and will add some more conditions, such as $ORDER_SUPER_USER etc., hence the or filter. I haven’t done the JWT parsing yet. But this correctly filters out any orders that don’t have fund_id 1 or 67 in the allocations block. The logic is that you can only see orders that are at least partially allocated to funds you have permission for.

type Order @withSubscription @auth(
  query: { rule: """
    query {
      queryOrder {
        id
        allocations(filter: { or: [{ fund_id: { in: [67, 1] } }] }) {
          id
        }
      }
    }"""
  }
) {
  id: String! @id
  quantity: Float
  instrument: Instrument
  order_type: String @search(by: [exact])
  allocations: [Allocation] @hasInverse(field: order)
}

type Instrument @withSubscription {
  id: String! @id
  name: String!
  instrument_type: String @search
}

type Allocation {
  id: String! @id
  quantity: Float
  fund_id: Int64 @search
  order: Order
}

This is the query:

query {
  queryOrder {
    id
    quantity
    instrument {
      id
      name
    }
    allocations {
      id
      fund_id
      quantity
      order{
        id
      }
    }
  }
}

Thanks, @sebwalle. That works if you just want to filter the parent node based on the query result, because auth rules are run in a mode that requires all fields in the rule to find a value in order to succeed, as mentioned in the docs: https://dgraph.io/docs/graphql/authorization/directive/
We automatically add the @cascade directive at the root while converting the GraphQL query to a DQL query.
You can get the same effect without auth rules:

query {
  queryOrder @cascade {
    id
    quantity
    instrument {
      id
      name
    }
    allocations {
      id
      fund_id
      quantity
      order{
        id
      }
    }
  }
}

Auth rules are designed that way, but they don’t provide any filtering functionality beyond what we currently have.

Fantastic news!! Thank you!

@abhimanyusinghgaur

I know you guys are working on this for 21.07. Is this possible with var instead of @cascade, as @amaster507 mentioned?

I have several uses for this as well (like many people), and the performance concerns me too.

Thanks,

J

Hello, is there a time frame on when this will be available in Dgraph Cloud?

@minhaj is anybody currently working on this or is this stalled?

My team and I are also extremely interested in this functionality. Would love an update as we did not see it in the 2021 roadmap. Thanks for all the great work!

@JatinDevDG just checking to see if there has been any progress with this?

Is this coming in v21.07? A response would be great.

I believe this was the last official mention of this incredibly important feature:

But Dgraph has since had some turnover and employee changes. We don’t know where their priorities lie.

This feature is obviously big and a necessity for many.

Then we got this:

Which obviously didn’t happen, nor has there been any sign of GraphQL commits to the repository with the small exception of:

So, it does not look good at this point.

I would ask that someone from Dgraph responds to this, but it is pulling hair to get that at this point.

J

During our testing for v21.09 (we decided to call it v21.09, not v21.08), we found a few issues that were blocking the release. This led us to shift our timelines by a few weeks. We recently released the RC-2 tag for v21.09. We are constantly working on making this release stable, and we will release across the board soon.

As for the feature, I can’t commit to any timeline for this. I am adding it to my to-do list and will try to come up with something concrete soon.

@aman-bansal - Thank you for giving us an update on this. We understand things take time to be done right, but the updates let us know Dgraph is continually making the product better.


@amaster507

I have been thinking about this for a while. If (or when) nested filters are implemented, it would still be on you to create the inverse relationship. And if the inverse relationship were in place, even with @cascade, it would already fix your problem by shrinking the traversal from 81,050 to 1,621 nodes:

type Contact {
  address: [Address]
  name: String
  ...
}
type Address {
  state: State
  contacts: [Contact] @hasInverse(field: address)
  ...
}
type State {
  code: ID!
  name: String @id
  addresses: [Address] @hasInverse(field: state)
  ...
}

Forward:

query {
  queryContact(filter: { address: { state: { code: "TX" } } }) {
    id
    name
    address {
      state {
        name
        ...
      }
      ...
    }
  }
}

Note: Your example is also two nested levels deep, which we hope will be an option as well.

VERSUS Reverse:

query {
  getState(code: "TX") @cascade {
    addresses {
      contacts {
        name
        address {
          state {
            name
            ...
          }
        }
      }
    }
  }
}

This looks like it would be the worse of the two, but it is actually faster (since the other version would use @cascade under the hood anyway).

I assume nested filters will be similar to Hasura’s nested objects.

So I don’t see being able to go the reverse direction under the hood since the types won’t be there with all of the reverse triples, but you should be able to get this faster query working now.
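
As a rough illustration of why starting from the State node touches fewer nodes, here is a small in-memory Python sketch. The data, the `touched` counters, and the function names `forward` and `reverse` are all hypothetical; it only models the traversal order, not Dgraph internals.

```python
# Hypothetical tiny dataset: contact -> addresses -> state.
address_state = {"a1": "TX", "a2": "CA", "a3": "TX", "a4": "NY"}
contact_addresses = {
    "c1": ["a1"],
    "c2": ["a2"],
    "c3": ["a3", "a4"],
    "c4": ["a4"],
    "c5": [],
}

# Inverse edges, standing in for what @hasInverse keeps in sync.
state_addresses = {}
for addr, state_code in address_state.items():
    state_addresses.setdefault(state_code, []).append(addr)

address_contacts = {}
for contact, addrs in contact_addresses.items():
    for addr in addrs:
        address_contacts.setdefault(addr, []).append(contact)


def forward(state_code):
    """Start from every contact and prune; returns (matches, nodes touched)."""
    touched = 0
    matches = set()
    for contact, addrs in contact_addresses.items():
        touched += 1  # the contact node
        for addr in addrs:
            touched += 1  # each address reached from it
            if address_state[addr] == state_code:
                matches.add(contact)
    return matches, touched


def reverse(state_code):
    """Start from the single state node and follow inverse edges."""
    touched = 1  # the state node itself
    matches = set()
    for addr in state_addresses.get(state_code, []):
        touched += 1
        for contact in address_contacts.get(addr, []):
            touched += 1
            matches.add(contact)
    return matches, touched


fwd_matches, fwd_touched = forward("TX")
rev_matches, rev_touched = reverse("TX")
```

Both directions return the same contacts in this sketch; the reverse walk just never visits contacts or addresses outside the chosen state.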

Correct me if I’m wrong?

J

There is a lot to digest here, and there are a lot of unknown dependencies. There is a big if: whether hasInverse and reverse edges will be, or even can be, factored into the algorithm, which really makes a big difference.

I had to get this working at the surface level of my app. So what I did is move the filters into one GraphQL query that returns only the ids of the base type I am looking for, and then run another GraphQL query that uses those ids as a filter to get the actual fields I wanted.

So I am basically doing the nested var block filtering just all in the UI instead of the db. It is a LOT OF CODE that I would like to get rid of though. I had to do it in GraphQL and not DQL because I need to honor auth rules along the way.
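
The two-step approach described above can be sketched in Python roughly as follows. This is an assumption-laden outline, not the actual app code: `fetch_ids` and `fetch_fields` are hypothetical callables that would each wrap one GraphQL request, and the `CONTACTS` data and fake fetchers exist only so the sketch runs on its own.

```python
from typing import Callable, Iterable, List


def two_step_query(
    fetch_ids: Callable[[], Iterable[str]],
    fetch_fields: Callable[[List[str]], List[dict]],
) -> List[dict]:
    """Run the id-only filter query, then fetch full fields for those ids."""
    ids = sorted(set(fetch_ids()))
    if not ids:
        # Skip the second round trip entirely when nothing matched.
        return []
    return fetch_fields(ids)


# In-memory stand-ins for the two GraphQL requests (hypothetical data).
CONTACTS = {
    "0x1": {"id": "0x1", "name": "Ann", "state": "TX"},
    "0x2": {"id": "0x2", "name": "Bo", "state": "CA"},
    "0x3": {"id": "0x3", "name": "Cy", "state": "TX"},
}


def fake_fetch_ids():
    """Stands in for the first query, which returns only matching ids."""
    return [c["id"] for c in CONTACTS.values() if c["state"] == "TX"]


def fake_fetch_fields(ids):
    """Stands in for the second query, which filters by id and returns fields."""
    return [{"id": i, "name": CONTACTS[i]["name"]} for i in ids]


rows = two_step_query(fake_fetch_ids, fake_fetch_fields)
```

Because both steps go through GraphQL in the real setup, auth rules still apply to each request, at the cost of an extra round trip and the client-side glue shown here.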