The performance of @filter!

zhaojiangkun · July 30, 2023, 9:31am

I would like to know how @filter performs when it is used in the query conditions compared with when it is used on the result set.
In the query conditions:

{
  scott(func: eq(name@en, "Ridley Scott")) @filter(le(initial_release_date, "2000")) {
    name@en
    initial_release_date
    director.film {
      name@en
      initial_release_date
    }
  }
}

On the result set:

{
  scott(func: eq(name@en, "Ridley Scott")) {
    name@en
    initial_release_date
    director.film @filter(le(initial_release_date, "2000")) {
      name@en
      initial_release_date
    }
  }
}

Which of these two is better, and what’s the difference?

MichelDiz · July 30, 2023, 4:00pm

I believe these queries are incorrect for what they intend to do.
The first one filters on ‘initial_release_date’, but the filter sits at the root, so it is applied to the director node. People don’t have a release date; movies do.

The second query correctly filters the movie’s release date. The first query is conceptually incorrect. They aren’t the same and therefore can’t be compared.
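To make the difference concrete, here is a toy Python model of the two queries. It is only a sketch and not how Dgraph evaluates anything: the NODES dictionary, the film names and dates, and the helper functions are all hypothetical, and the cutoff is written as an ISO date string for simplicity.

```python
# Toy in-memory graph. All film names and dates below are hypothetical.
NODES = {
    "scott": {"name": "Ridley Scott", "director.film": ["film_a", "film_b"]},
    "film_a": {"name": "Film A", "initial_release_date": "1979-01-01"},
    "film_b": {"name": "Film B", "initial_release_date": "2015-01-01"},
}


def le_release_date(node, cutoff):
    """Stand-in for le(initial_release_date, cutoff).

    ISO date strings compare correctly as plain strings. A node without
    the predicate never matches.
    """
    date = node.get("initial_release_date")
    return date is not None and date <= cutoff


def filter_at_root(uid, cutoff):
    """First query: the filter tests the root (director) node itself."""
    director = NODES[uid]
    if not le_release_date(director, cutoff):
        return []  # a person has no release date, so nothing is returned
    films = [NODES[f]["name"] for f in director["director.film"]]
    return [{"name": director["name"], "films": films}]


def filter_on_edge(uid, cutoff):
    """Second query: the filter tests each film reached via director.film."""
    director = NODES[uid]
    films = [
        NODES[f]["name"]
        for f in director["director.film"]
        if le_release_date(NODES[f], cutoff)
    ]
    return [{"name": director["name"], "films": films}]


print(filter_at_root("scott", "2000-01-01"))
print(filter_on_edge("scott", "2000-01-01"))
```

In this model the root-level filter returns an empty list, while the edge-level filter returns Ridley Scott with only "Film A".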

MichelDiz · July 31, 2023, 2:05am

To ensure you’re not left empty-handed, here are valid examples.

{
  scott(func: eq(name@en, "Ridley Scott")) {
    name@en
    initial_release_date
    director.film @filter(le(initial_release_date, "2000")) {
      name@en
      initial_release_date
    }
  }
}

Compared with

{
  scott(func: eq(name@en, "Ridley Scott")) @cascade {
    name@en
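    # initial_release_date is not requested on the director: people don't have one,
    # and @cascade drops any node that is missing a requested predicate.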
    director.film @filter(le(initial_release_date, "2000")) {
      name@en
      initial_release_date
    }
  }
}

That is, the query above only returns Ridley Scott if he has at least one film that passes the date filter.
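As a rough illustration of what @cascade changes, here is a small Python sketch. It is a toy model under my own assumptions, not Dgraph's implementation: the DIRECTORS list, the `query` helper and every name and date in it are made up.

```python
# Toy model of @cascade on a nested filter. Directors and films are hypothetical.
DIRECTORS = [
    {"name": "Director A", "films": [{"name": "Film 1", "date": "1979-01-01"}]},
    {"name": "Director B", "films": [{"name": "Film 2", "date": "2015-01-01"}]},
]


def query(directors, cutoff, cascade):
    """Filter each director's films by date; optionally drop incomplete nodes."""
    results = []
    for d in directors:
        films = [f["name"] for f in d["films"] if f["date"] <= cutoff]
        if cascade and not films:
            continue  # the requested film edge came back empty, so drop the node
        node = {"name": d["name"]}
        if films:
            node["films"] = films
        results.append(node)
    return results


print(query(DIRECTORS, "2000-01-01", cascade=False))
print(query(DIRECTORS, "2000-01-01", cascade=True))
```

Without cascade, "Director B" still appears, just with no films. With cascade, only "Director A" is left.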

{
  F as var(func: has(initial_release_date)) @filter(le(initial_release_date, "2000"))

  scott(func: eq(name@en, "Ridley Scott")) @cascade {
    name@en
    # initial_release_date is not requested on the director, since @cascade would drop a node that lacks it.
    director.film @filter(uid(F)) {
      name@en
      initial_release_date
    }
  }
}

The above has the same effect as the previous one, but takes a different approach that may perform better in some cases.
This is because the filters are computed concurrently, either by threads or by separate Alphas if your cluster has more than one Alpha instance. Even with has(), it may perform better than @cascade or nested filters.

{
  F as var(func: has(initial_release_date)) @filter(le(initial_release_date, "2000"))

  scott(func: eq(name@en, "Ridley Scott")) @filter(uid_in(director.film, uid(F))) {
    name@en
    initial_release_date
    director.film {
      name@en
      initial_release_date
    }
  }
}

The above example is a variation of the previous one, but without using @cascade. Note that it returns all of Ridley Scott’s films, because director.film itself is not filtered.
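The two variable-based variants differ in where the variable is applied, which the following Python sketch tries to show. Again this is a toy model with made-up uids, dates and helper names, not Dgraph internals: F is modelled as a plain set of film uids.

```python
# Toy model only. Uids, dates and the director/film mapping are hypothetical.
FILM_DATES = {"0x1": "1979-01-01", "0x2": "2015-01-01"}
DIRECTOR_FILMS = {"0xa": ["0x1", "0x2"], "0xb": ["0x2"]}


def var_block(cutoff):
    """F as var(...): the set of film uids released on or before the cutoff."""
    return {uid for uid, date in FILM_DATES.items() if date <= cutoff}


def filter_edge_with_uid(director, F):
    """uid(F) on the film edge: keep only the films that are in F."""
    return [f for f in DIRECTOR_FILMS[director] if f in F]


def filter_root_with_uid_in(director, F):
    """uid_in at the root: keep the director if any of their films is in F.

    The film edge itself is unfiltered, so every film is returned.
    """
    films = DIRECTOR_FILMS[director]
    return films if any(f in F for f in films) else None


F = var_block("2000-01-01")
print(filter_edge_with_uid("0xa", F))
print(filter_root_with_uid_in("0xa", F))
print(filter_root_with_uid_in("0xb", F))
```

Here the edge filter returns only "0x1" for director "0xa", the root-level check returns both of that director's films, and director "0xb" is dropped because none of their films is in F.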

zhaojiangkun · July 31, 2023, 3:05am

That’s great. Thanks.
