The performance of @filter!

I would like to know how @filter performs when it is used in the query condition versus in the result set.
In the query condition:

{
  scott(func: eq(name@en, "Ridley Scott")) @filter(le(initial_release_date, "2000")) {
    name@en
    initial_release_date
    director.film {
      name@en
      initial_release_date
    }
  }
}

In the result set:

{
  scott(func: eq(name@en, "Ridley Scott")) {
    name@en
    initial_release_date
    director.film @filter(le(initial_release_date, "2000")) {
      name@en
      initial_release_date
    }
  }
}

Which of these two is better, and what’s the difference?

I believe these queries are incorrect for what they intend to do.
The first one filters on ‘initial_release_date’, but the filter is applied at the root, that is, to the director node. People don’t have a release date; movies do.

The second query correctly filters on the movie’s release date. Since the first query is conceptually incorrect, the two aren’t equivalent and therefore can’t be compared.
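To make the difference concrete, here is a toy in-memory model in Python. It is only a sketch of the semantics, not how Dgraph evaluates queries: nodes are plain dicts, years are integers instead of datetimes, and the helper `le_release` is a hypothetical stand-in for `le(initial_release_date, ...)`.

```python
# Toy model (not Dgraph): nodes are dicts, an edge is a list of nodes.
films = [
    {"name": "Alien", "initial_release_date": 1979},
    {"name": "Blade Runner", "initial_release_date": 1982},
    {"name": "The Martian", "initial_release_date": 2015},
]
# The director node has no initial_release_date of its own.
scott = {"name": "Ridley Scott", "director.film": films}


def le_release(node, year):
    """Hypothetical stand-in for le(initial_release_date, year).

    A node that lacks the predicate never matches.
    """
    value = node.get("initial_release_date")
    return value is not None and value <= year


# First query: the filter is applied to the root node (the director).
root_filtered = [n for n in [scott] if le_release(n, 2000)]

# Second query: the filter is applied to nodes reached via director.film.
nested_filtered = [
    {**n, "director.film": [f for f in n["director.film"] if le_release(f, 2000)]}
    for n in [scott]
]
nested_names = [f["name"] for f in nested_filtered[0]["director.film"]]

print(root_filtered)
print(nested_names)
```

In this model the root-level filter removes the director entirely, while the nested filter keeps him and trims his film list.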

To ensure you’re not left empty-handed, here are valid examples.

{
  scott(func: eq(name@en, "Ridley Scott")) {
    name@en
    initial_release_date
    director.film @filter(le(initial_release_date, "2000")) {
      name@en
      initial_release_date
    }
  }
}

Compared with

{
  scott(func: eq(name@en, "Ridley Scott")) @cascade {
    name@en
    initial_release_date
    director.film @filter(le(initial_release_date, "2000")) {
      name@en
      initial_release_date
    }
  }
}

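That is, with @cascade the query above returns Ridley Scott only if at least one of his films passes the filter. Keep in mind that @cascade removes any node that lacks one of the requested predicates, so asking for initial_release_date at the director level can also drop the director node if that predicate isn’t set on it.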
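The effect of @cascade can be sketched with the same kind of toy model. This is a simplification under stated assumptions: the data and the helpers `nested_filter` and `cascade` are hypothetical, and the `required` list stands in for the predicates requested in the query.

```python
# Toy model (not Dgraph) with made-up data.
directors = [
    {"name": "Director A",
     "director.film": [{"name": "Old Film", "initial_release_date": 1990}]},
    {"name": "Director B",
     "director.film": [{"name": "New Film", "initial_release_date": 2010}]},
]


def nested_filter(nodes, year):
    """Keep every director, but trim director.film to films released up to `year`."""
    result = []
    for node in nodes:
        kept = [f for f in node["director.film"] if f["initial_release_date"] <= year]
        result.append({**node, "director.film": kept})
    return result


def cascade(nodes, required):
    """Drop nodes for which any required predicate is missing or empty."""
    return [n for n in nodes if all(n.get(p) for p in required)]


without_cascade = nested_filter(directors, 2000)
with_cascade = cascade(without_cascade, ["name", "director.film"])

print([d["name"] for d in without_cascade])
print([d["name"] for d in with_cascade])
```

Without the cascade step both directors come back, one of them with an empty film list; with it, only the director who still has a matching film remains.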

{
  # F holds the films (nodes with a release date) released up to 2000.
  F as var(func: has(initial_release_date)) @filter(le(initial_release_date, "2000"))

  scott(func: eq(name@en, "Ridley Scott")) @cascade {
    name@en
    initial_release_date
    director.film @filter(uid(F)) {
      name@en
      initial_release_date
    }
  }
}

The above has the same effect as the previous one but uses a different approach that may be more performant in some cases.
This is because the filters are computed concurrently, whether by threads or by separate Alphas if you have more than one Alpha instance in your cluster. Even with has(), it may be more performant than using cascade or nested filters.
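As a rough illustration of the variable-based approach, the sketch below treats the var block as a set of uids that is computed on its own, after which the nested filter becomes a membership test. The uids and years are made up, and this says nothing about Dgraph's actual execution plan.

```python
# Toy model (not Dgraph): hypothetical film uids mapped to release years.
film_years = {"0x1": 1979, "0x2": 1982, "0x3": 2015, "0x4": 1995}

# Hypothetical director uid mapped to the uids of that director's films.
director_films = {"0xa": ["0x1", "0x2", "0x3"]}

# var block: computed over all films, independently of any director.
F = {uid for uid, year in film_years.items() if year <= 2000}

# main block: director.film @filter(uid(F)) reduces to a set membership test.
kept = [uid for uid in director_films["0xa"] if uid in F]

print(sorted(F))
print(kept)
```

Note that F also contains "0x4", a film by someone else. The var block does its work for the whole dataset, which is one reason this approach helps in some cases and not in others.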

{
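  # F holds the films (nodes with a release date) released up to 2000.
  F as var(func: has(initial_release_date)) @filter(le(initial_release_date, "2000"))

  # uid_in takes an edge predicate, so it checks director.film against F.
  scott(func: eq(name@en, "Ridley Scott")) @filter(uid_in(director.film, uid(F))) {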
    name@en
    initial_release_date
    director.film {
      name@en
      initial_release_date
    }
  }
}

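The above example is a variation of the previous one, but without using @cascade. Note that the nested director.film block is not filtered here, so all of his films are returned whenever the director himself passes the filter.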


That’s great. Thanks.