Nodes (models) sharing predicates - DGraph schema design best practice


(Cameron Batt) #1

Below is a simplified setup to focus the discussion.

Imagine these two models with the following schemas:

// Animal model schema
{
name: string @index(fulltext) .
age: int .
desexed: bool .
}

// Person model schema
{
name: string @index(fulltext) .
age: int .
married: string .
animals: uid .
}

And the following query:

query users($a: string) {
  usersbyname(func: eq(name, $a)) {
    name
    age
    sex
  }
}

Now imagine the database is full of data, with users and animals that both have the name “Bob”.

Currently it seems I will get back both animals and users with this query. When I first tried this I really only wanted users, and it surprised me that it returned both. On reflection it makes sense, as they both use the same predicate, “name” (hopefully I am using this term correctly).

As animals and users have slightly different shapes, the results of this query will not have consistent fields, even if I had enforced each shape during creation.

So my question is what is the best way to handle this? My thoughts are:

  • prefix: userName animalName (naive approach?)
  • facets: add user and animal as a facet to name and filter on the facet

Here is a concrete example you can run now that highlights the issue.

The query below is just an edit of the first query in the “Functions” section of the docs website. I just changed it from “jones indiana” to “Romance”. You can see it returns what I believe is a Genre as a primary result, which has a different shape as it does not have genres of its own.

https://docs.dgraph.io/query-language/#functions


(Sharon) #2

Add a Type edge to store an identifiable type for the node.

Type: string @index(hash) .	

Now you can create a node, and assign a Type edge to it on creation. And then query against the Type edge to filter specific models.
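A minimal sketch of this approach, assuming the Type predicate above. The blank node _:bob, the age and the value "Person" are illustrative and not from the original question. The mutation tags the node on creation:

```
{
  set {
    # Illustrative data: a person named Bob, tagged with a Type value.
    _:bob <name> "Bob" .
    _:bob <age> "32" .
    _:bob <Type> "Person" .
  }
}
```

A query on the shared name predicate can then be narrowed to one model with a filter:

```
{
  # "Person" is an assumed Type value; animals named Bob are filtered out.
  usersbyname(func: eq(name, "Bob")) @filter(eq(Type, "Person")) {
    name
    age
  }
}
```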

I can’t seem to find any information regarding indexing properties for facets. I am not sure if they are indexed at all.

Edit: After reading @pawan’s reply, I see that the above approach is not recommended. Instead, the recommendation is to create a distinct predicate for each type you want to represent in your model.


(Pawan Rawal) #3

Hey @vespertilian

We have some recommendations on associating types with nodes at https://docs.dgraph.io/howto/#giving-nodes-a-type.

Ideally, you want to associate a unique predicate with a node and use the has function to filter nodes. So, you could get all Persons by using has(animals) (assuming all persons do have an animal edge?) and similarly something for an animal.
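As a rough sketch of that suggestion against the schema from the first post: the query below keeps the shared name predicate and uses has(animals) as the filter. It assumes every person has at least one animals edge and that no animal has one, which the schema alone does not guarantee.

```
{
  # Only nodes with an animals edge (assumed: persons) pass the filter.
  usersbyname(func: eq(name, "Bob")) @filter(has(animals)) {
    name
    age
    animals {
      name
    }
  }
}
```

A person without any animals would be missed by this filter, so in practice the predicate chosen for has() should be one that every node of that type is guaranteed to carry.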


(Cameron Batt) #4

Ok,

Thanks Pawan.

Is there any performance hit by sharing the predicate name and using a second predicate to filter the type?

So
3 People [“Bob”, “Sarah”, “Helena”]
3 Animals [“Bob”, “Spot”, “Angus”]

If they share the “name” predicate and I filter using a type predicate [“person”, “animal”], would this not be slower than using an animalName and a userName predicate from the get go, as you skip the filter step?

Or is the point that names are names, and maybe I want to query animals and people by name one day? Using the same predicate would give me that ability.


(Pawan Rawal) #5

Having 2 different predicates (userName, animalName) would be better as

  1. Data for both can lie on different machines as data is sharded by predicates.
  2. Fewer conflicts while doing mutations.
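A sketch of what the split could look like, reusing the userName and animalName names from this thread. The hash index is an assumption made here so that eq() can be used at the query root; the first post used fulltext instead.

```
# One name predicate per model, so each can be sharded separately.
# The index choice is illustrative.
userName: string @index(hash) .
animalName: string @index(hash) .
```

The original query then needs no type filter, because only persons carry userName:

```
query users($a: string) {
  # Matches persons only; animals named "Bob" store their name in animalName.
  usersbyname(func: eq(userName, $a)) {
    userName
    age
    married
  }
}
```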

We recommend against having a generic Type edge.

