Finding the intersection of many edges from node A within node B

What I want to do

Find the intersection of two sets of edges, A and B.

Suppose we have two movie genres: Thriller and Comedy. Each genre is a unique node, so a movie has one edge to a genre node for every genre it belongs to.

I want to find only the movies that have both of these genres. My first attempt was:

{
  movies(func: eq(dgraph.type, "Movies")) @cascade {
    id
    genre @filter(eq(name, "Thriller") AND eq(name, "Comedy"))
  }
}

This returns an empty result, unlike the same filter written with OR. That makes sense: the filter is evaluated against each genre node on its own, and name is a single-valued predicate, not a list. A genre node can be named Thriller or Comedy, but never both.
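To make the difference concrete, here is a small in-memory sketch of the two readings of AND. It does not talk to Dgraph; the Movie and Genre shapes and the sample data are hypothetical and only mirror the structure described above.

```typescript
// Hypothetical in-memory model: each movie has edges to genre nodes.
interface Genre { name: string; }
interface Movie { id: string; genre: Genre[]; }

const movies: Movie[] = [
  { id: "m1", genre: [{ name: "Thriller" }, { name: "Comedy" }] },
  { id: "m2", genre: [{ name: "Thriller" }] },
  { id: "m3", genre: [{ name: "Comedy" }, { name: "Drama" }] },
];

const wanted = ["Thriller", "Comedy"];

// Per-node AND (what the filter above asks for): a single genre node
// whose name equals every wanted value. No node can satisfy this.
const perNodeAnd = movies.filter((m) =>
  m.genre.some((g) => wanted.every((n) => g.name === n)),
);

// Per-movie AND (the intersection I actually want): for every wanted
// name, the movie has at least one genre node with that name.
const perMovieAnd = movies.filter((m) =>
  wanted.every((n) => m.genre.some((g) => g.name === n)),
);

console.log(perNodeAnd.map((m) => m.id));  // []
console.log(perMovieAnd.map((m) => m.id)); // ["m1"]
```

The second form is the one the query needs to express, which is why a single @filter on the genre edge is not enough.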

So my question is: given that, how can I find the intersection of these two edge results without consuming too many resources? Below is my current solution. It could become a very expensive operation if a single query has to intersect many elements at once.

What I did

{
  movies1 as var(func: eq(dgraph.type, "Movies")) @cascade {
    genre @filter(eq(name, "Thriller"))
  }

  movies2 as var(func: uid(movies1)) @cascade {
    genre @filter(eq(name, "Comedy"))
  }

  movies3 as var(func: uid(movies2)) @cascade {
    genre @filter(eq(name, "Drama"))
  }

  result(func: uid(movies3)) {
    id
  }
}

I hope my explanation is clear. The query above solves the problem, but I would love some help making it more efficient.

Thanks a lot in advance for your help.

Have a nice evening.

Dgraph metadata

dgraph version

v20.11.1

This query works with the schema and data on play.dgraph.io. There are 470 movies in that dataset that match it:

{
  var(func:eq(name@en,"Thriller")) {
    movies1 as ~genre
  }
  var(func:eq(name@en,"Comedy")) {
    movies2 as ~genre @filter(uid(movies1))
  }
  var(func:eq(name@en,"Drama")) {
    movies3 as ~genre @filter(uid(movies2))
  }
  result(func:uid(movies3)) @filter(type(Film)) {
    uid
    name@en
  }
}

Improvements made to query

  • Does not use @cascade to post-process results
  • Uses reverse edges so that each block starts from a small, known root (a single genre node) instead of from every movie
  • Builds each block on the result of the previous one
  • The final result block puts the uid filter in the root function and applies the type filter as an extra filter
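If the list of genres is not fixed, the chained blocks can be generated instead of written by hand. Below is a minimal TypeScript sketch, for example for a Node client. The function name buildGenreIntersectionQuery is made up for illustration, and it assumes the same predicate names as the query above (name@en, ~genre, type Film). It only builds the query string; it does not send it to Dgraph, and it says nothing about how well very long chains perform.

```typescript
// Builds a DQL query that intersects the movies of every genre in `genres`,
// following the chained reverse-edge pattern shown above.
// Assumptions (not part of any Dgraph API): the function name, and that
// name@en, ~genre and type Film match the target schema.
function buildGenreIntersectionQuery(genres: string[]): string {
  if (genres.length === 0) {
    throw new Error("at least one genre is required");
  }

  const blocks = genres.map((genre, i) => {
    // JSON.stringify adds the quotes and escapes embedded quotes and
    // backslashes; assumed to be good enough for plain genre names.
    const name = JSON.stringify(genre);
    // Every block after the first keeps only the movies that are
    // already in the previous block's variable.
    const filter = i === 0 ? "" : ` @filter(uid(movies${i}))`;
    return [
      `  var(func: eq(name@en, ${name})) {`,
      `    movies${i + 1} as ~genre${filter}`,
      `  }`,
    ].join("\n");
  });

  // The last variable holds the intersection of all genres.
  const result = [
    `  result(func: uid(movies${genres.length})) @filter(type(Film)) {`,
    `    uid`,
    `    name@en`,
    `  }`,
  ].join("\n");

  return ["{", ...blocks, result, "}"].join("\n");
}

console.log(buildGenreIntersectionQuery(["Thriller", "Comedy", "Drama"]));
```

Called with Thriller, Comedy and Drama, this produces the same blocks as the query above, with one var block per genre and each block filtered by the previous variable.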

Hi @verneleem, thanks a lot for your reply. This is very helpful, since it reduces resource usage. However, do you think this is the only way to find the intersection of many elements?

Imagine I have 100k elements and I want to find the intersection of all of them. If I am using Node, for instance, I would have to build a string that contains the whole query. On my side I can probably handle a loop that large, but is this an efficient enough implementation in Dgraph?