Query in tour of dgraph

I’m completely new to graph databases (not to graphs, though), but I’ve been spending a bit of time recently tinkering and learning about dgraph and I’ve come across a query in the Tour of Dgraph I don’t quite follow, most likely because I’m missing something fundamental.

{
  PJ as var(func:allofterms(name@en, "Peter Jackson")) @normalize @cascade {
    F as director.film
  }

  peterJ(func: uid(PJ)) @normalize @cascade {
    name : name@en
    actor.film {
      performance.film @filter(uid(F)) {
        film_name: name@en
      }
      performance.character {
        character: name@en
      }
    }
  }
}

The way I understand it is the first query will match all nodes that match the allofterms(...) function results. This will give me a list of uids of all Peter Jackson nodes (or at least the one matching the PJ query)

Now, the next query (peterJ) is then “run” for each of the matched nodes (uid(PJ)). The query “follows” the actor.film edge, and this is where I get really confused: the actor.film edge will lead to a list of (presumably film) nodes, but how do I know that those matched nodes might have performance.film and performance.character predicates?

I did a quick query for schema in the playground but I’m none the wiser. How do the authors of the tour know there will be these predicates? Is there a way to discover them?

I assumed actor.film would “lead” to (match) a film node, so it would never have occurred to me that a film node could have an edge to another film node, although apparently it’s possible. Either way, how do I know what predicates the films might have?

I’m sure I’m missing something fundamental and my mental model is not quite right. I’d appreciate if someone could shed some light on this.

Right. This block is doing a kind of “pre-filter” and capturing uids to apply to the next block.

yep

In general, the type schema can help. Go to the Schema panel, select “Types”, and you will see the defined types. Look for “Performance”, and it will list all the predicates that belong to that type.
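
If you prefer a query over the UI, something like the sketch below should return the same information. It assumes the type is really named Performance in this dataset and that your Dgraph version supports querying types in the schema block (older releases may not).

```
# Ask for the definition of one type.
# "Performance" is the type name mentioned above; adjust it if yours differs.
schema(type: Performance) {}
```

The response should list the predicates declared for that type, which is what you need in order to know what can be requested under actor.film.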

Another way to find out is to use the expand(_all_) function: https://dgraph.io/tour/schema/9/

E.g., remove @normalize and @cascade from the peterJ block for a moment:

{
  PJ as var(func:allofterms(name@en, "Peter Jackson")) @normalize @cascade {
   director.film { uid }
  }

  peterJ(func: uid(PJ)) {
    actor.film {
      expand(_all_) {
        uid
      }
    }
  }
}

Using

      expand(_all_) {
        uid
      }

Or

      expand(_all_) {
        expand(_all_)
      }

you are able to “dig” and find out in an indirect way which edges are there.
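
A related trick, as a rough sketch: ask for dgraph.type on the nodes behind actor.film. This only returns something if the dataset assigned types to those nodes, which I am assuming here, and it needs a Dgraph version that has the type system.

```
{
  PJ as var(func: allofterms(name@en, "Peter Jackson")) @cascade {
    director.film { uid }
  }

  peterJ(func: uid(PJ)) {
    actor.film {
      uid
      # Shows the type(s) of each node that actor.film points to,
      # if any were set when the data was loaded.
      dgraph.type
    }
  }
}
```

Once you know the type name, you can look it up in the Types list of the Schema panel to see its predicates.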

In fact, whoever designed this dataset (I think it comes from Freebase or something) created a set of intermediate objects/nodes. An actor can have many different performances, participations, etc., so whoever designed the dataset was thinking about simplifying the actor node.

Intermediate nodes are basically a “list of things that belong to a single or multiple entities”. In this case, the list belongs to that person/actor, and each node in it has the type Performance and holds the information about one performance.

This is a common way to decrease the burden of predicates and edges that an entity has. Instead of having the Actor type with

performance.actor
performance.character
performance.character_note
performance.film
performance.special_performance_type
plus all the other edges from the Actor type

You create an intermediate type that is between the Actor and the movie. So you don’t have crowded information in a single type.
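
As a rough sketch of what that looks like in a schema (this is my guess at the shape, not the actual definition shipped with the dataset), the intermediate type would carry the performance.* predicates listed above. The syntax is the one where a type only lists predicate names; older versions wrote type definitions differently.

```
# Hypothetical definition of the intermediate type.
# The predicate names are the ones listed above; their value types
# are declared separately and are not shown here.
type Performance {
  performance.actor
  performance.character
  performance.character_note
  performance.film
  performance.special_performance_type
}
```

The actor node then only needs the actor.film edge, which points to nodes of this type, and each of those points on to the film and the character.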

It’s good to subdivide them from time to time, because it makes the information more reliable and makes it easier to apply functions between edges, which can make queries faster. Intermediate nodes are also good for creating input and output facets (but that is not the case here) or “queues”.

If you designed your dataset, at some point you documented it. Other than that, you have the ways mentioned above like expand functions.

Thank you, this is super useful!

I did make a schema query, but it was the “generic” one, i.e. schema {}, which was only partially helpful. Looks like it’s a good starting point, but the expand(_all_) queries are the perfect follow-up.

I did try the expand option, but I think what I was doing was I was keeping both @cascade and @normalize in which was not giving me the results I was hoping for.

Thanks again!
