Dgraph tour Moredata/2: how are schema deductions made?

This section starts with:

Dgraph queries, Schema Types and the visualization will help us understand the schema of the movies dataset.

OK.
However, I don’t see how the deductions are made:

Checking the Type Schema tells us that directors as a “Person Type” have a name and are connected to the movies they directed via director.film.

I see from the schema that the “Person Type” has name and director.film (and also actor.film), but I don’t see how it tells me that director.film leads to a “Movie Type”.

The same goes for:

We can see a movie has a name and also genre and starring edges leading to the genres the movie is classified in and the people in the movie.

I can guess from the names that the genre edge leads to nodes of type Genre, but how can I tell that from the schema?

Investigate the starring edge too and you’ll learn that nodes reached via that edge represent the performance of an actor as a particular character in the movie.

How can I get “nodes reached via that edge” and know it is a performance?

For all of those deductions, I can make a guess but I have no certainty.

Hi there

So let me walk you through Moredata/2:

First, here’s the schema:

# Define Directives and index

director.film: [uid] @reverse .
actor.film: [uid] @count .
genre: [uid] @reverse .
initial_release_date: dateTime @index(year) .
name: string @index(exact, term) @lang .
starring: [uid] .
performance.film: [uid] .
performance.character_note: string .
performance.character: [uid] .
performance.actor: [uid] .
performance.special_performance_type: [uid] .
type: [uid] .

# Define Types

type Person {
    name
    director.film
    actor.film
}

type Movie {
    name
    initial_release_date
    genre
    starring
}

type Genre {
    name
}

type Performance {
    performance.film
    performance.character
    performance.actor
}

I see from the schema that the “Person Type” has name and director.film (and also actor.film), but I don’t see how it tells me that director.film leads to a “Movie Type”.

You are right. If you look at this:

director.film: [uid] @reverse .

You can see that its type is [uid], meaning it can point to any node. Nothing in the predicate definition says which type of node sits at the other end, so a bit of a leap in inference is required. Dgraph is fundamentally untyped at its core. We usually prefer typed nodes, so let’s only consider those. You could of course assign a Person-typed node to director.film. But don’t do that.
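To see for yourself that the predicate definitions carry no target type, you can ask Dgraph for the schema directly. This is only a sketch: it assumes a Dgraph version that supports schema queries for both predicates and types, and the fields requested are a subset picked for illustration.

```
schema(pred: [director.film, starring, genre]) {
  type
  list
  reverse
}
```

A second request, run separately, lists the fields declared on the types:

```
schema(type: [Person, Movie]) {}
```

In both cases the answer should only mention uid for these edges and never a type such as Movie, which is why the target type has to be inferred or checked on the data.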

I can guess from the names that the genre edge leads to nodes of type Genre, but how can I tell that from the schema?

So what I’d do is look at starring:

starring: [uid] .

Nothing in this schema tells us that it’s supposed to be a node that is typed by Person. However as a schema designer you would have that in mind.

The paradigm we prefer in Dgraph is to treat it as a dynamically typed language. You can always enforce that starring is a Person in your queries:

{
  q(func: type(Movie)) {
    name
    starring @filter(type(Person)) {
      name
    }
  }
}

I’ll discuss why this was chosen in the next post.

Of course if you run the query above, nothing will show up, because starring is not a Person. Rather it’s a Performance.

How can I get “nodes reached via that edge” and know it is a performance?

You get the nodes reached via that edge by putting it in your query. And you know its type by using the <dgraph.type> field:

{
  q(func: type(Movie)) {
    starring { # adding this field queries the nodes reached by the `starring` edge
      <dgraph.type> # tells you the type of the nodes reached via `starring`
    }
  }
}

So you’re right, a lot of it is guesswork/intelligent inference.


Just adding to Chew’s answer.

director.film is just an edge. You can’t determine which type sits on which side from that predicate alone. The Schema Types can tell you: director.film connects the Person type with the Movie type.

type Person {
    name
    director.film
    actor.film
}

type Movie {
    name
    initial_release_date
    genre
    starring
}

It is by reading the type definitions above as a whole that you get the full picture, not by looking at individual edges, right?

The genre predicate is just an edge that connects two types of objects.

All “nodes” have a dgraph.type predicate (a value edge). You can simply add dgraph.type to a block of the query to find out the type of the objects in that block.
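As a sketch of that idea applied to director.film, the query below follows the edge from a few Person nodes and asks for dgraph.type on whatever it reaches. It assumes the movie nodes in the dataset actually carry a dgraph.type value; the limits of 5 and 3 are arbitrary and only keep the result small.

```
{
  q(func: type(Person), first: 5) @filter(has(director.film)) {
    name@en
    director.film (first: 3) {
      name@en
      dgraph.type
    }
  }
}
```

If every node reached through director.film reports Movie, the deduction made in the tour is confirmed on the data rather than read from the schema.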

Thank you for the response. I still have questions though…

––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––

Why shouldn’t I assign a Person-typed node to director.film? Is it simply not idiomatic, or is there a practical reason?

––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––

The suggested query on starring with <dgraph.type> returns nothing.

However, it works for genre:

{
  q(func: type(Movie)) {
    genre {
      <dgraph.type>
    }
  }
}

What’s happening?

––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––

Consider this query with expand:

{
  q(func: allofterms(name@., "The Secret Life of Words")) {
    expand(_all_){
      expand(_all_){
        expand(_all_)
      }
    }
  }
}

Why doesn’t it return the starring edge? This is the result:

{
  "data": {
    "q": [
      {
        "name@en": "The Secret Life of Words",
        "name@it": "La vita segreta delle parole",
        "name@de": "Das geheime Leben der Worte",
        "initial_release_date": "2005-09-01T00:00:00Z",
        "genre": [
          {
            "name@en": "Drama"
          },
          {
            "name@en": "Romance Film"
          }
        ]
      }
    ]
  }
}

And this one:

{
  q(func: allofterms(name@., "The Secret Life of Words")) {
    starring {
      expand(_all_) {
        expand(_all_)
      }
    }
  }
}

returns nothing, while this one works:

{
  q(func: allofterms(name@., "The Secret Life of Words")) {
    starring {
      performance.actor {
        expand(_all_)
      }
    }
  }
}

Do dots in predicate names have any syntactic significance? I thought they did not, but I am not sure anymore…

Sometimes it is a loose object, without dgraph.type. You may find some of them, but it is not usual. It should not happen, though.

I think it should be the Person type, or maybe there is an extra edge. Try adding uid, e.g.:

{
  q(func: type(Movie)) {
    starring {
      uid
      <dgraph.type>
    }
  }
}

If there’s no UID we might have a problem in the dataset.

If an edge is not returned, it is because that object does not have that edge. An object is not required to have all the edges listed in its Type Definition.

The dot is just a naming “convention” that tells you the direction of the relation. For example, the edge state.city goes from the object “state” to the object “city”. The convention isn’t mandatory, but you will find it everywhere.

Returning to this point, notice that there is a set of intermediate nodes. That’s why nothing appeared in your query (starring { <dgraph.type> }). The nodes in this set have no dgraph.type. I will look into this.

I get the uids but still no type.

But it does have that edge:

{
  q(func: allofterms(name@., "The Secret Life of Words")) {
    starring {
      performance.actor {
        expand(_all_)
      }
    }
  }
}

returns:

{
  "data": {
    "q": [
      {
        "starring": [
          {
            "performance.actor": [
              {
                "name@en": "Javier Cámara"
              }
            ]
          },
         # More starring…
        ]
      }
    ]
  }
}

Is there no way to list the edges, even when they lead to nodes, without expanding them? I thought this would work, but it returns nothing:

{
  q(func: allofterms(name@., "The Secret Life of Words")) {
    starring {
      expand(_all_) {
        uid
      }
    }
  }
}

Since the nodes at that level have no dgraph.type, the expand function won’t work there. I’m going to fix it soon.

Try this query:

{
  q(func: allofterms(name@., "The Secret Life of Words")) @recurse(depth: 4, loop: false) {
    uid
    starring
    performance.actor
    name@en
  }
}
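Another option, while the intermediate nodes have no dgraph.type, is to skip expand(_all_) and name the predicates explicitly. The sketch below only uses predicates that appear in the schema posted earlier in this thread; whether each one holds data for this particular movie, and whether characters have an English name, is an assumption to check against the result.

```
{
  q(func: allofterms(name@., "The Secret Life of Words")) {
    name@en
    starring {
      uid
      performance.actor {
        name@en
      }
      performance.character {
        name@en
      }
    }
  }
}
```

Because every edge is listed by name, this does not depend on the type information that expand(_all_) relies on.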