Problem
Recently, my colleague and I have run into some performance issues when querying Dgraph. We are curious whether we have taken the right approach to our Dgraph schema design.
enum Fruit {
  apple
  orange
  banana
}

enum Color {
  red
  blue
  green
  yellow
}

type Person {
  id: String! @id
  favoriteColor: Color @search(by: [hash])
  favoriteFruit: Fruit @search(by: [hash])
  favoriteWord: String @search(by: [hash])
  age: Int! @search
  ...
  ...
  ...
}
{
  q1(func: eq(Person.favoriteFruit, "apple")) @filter(
    eq(Person.favoriteColor, "red") AND ge(Person.age, 10)
  ) {
    Person.id
  }
}
- As you can see, favoriteFruit and favoriteColor seem to be predicates (not edges) of a node called Person.
- We have found that this query is really slow, and we think it is because Dgraph has to scan all Person nodes.
- We hypothesize that we would benefit from a new Dgraph schema design. We plan to replace predicates with edges. favoriteColor and favoriteFruit are now edges (not predicates) because they connect to new nodes, FruitNode and ColorNode. In this case, the number of nodes scanned is reduced to only those connected to FruitNode and ColorNode.
type FruitNode {
  name: Fruit @search(by: [hash])
}

type ColorNode {
  name: Color @search(by: [hash])
}

enum Fruit {
  apple
  orange
  banana
}

enum Color {
  red
  blue
  green
  yellow
}

type Person {
  id: String! @id
  favoriteColor: ColorNode
  favoriteFruit: FruitNode
  favoriteWord: String @search(by: [hash])
  age: Int! @search
  ...
  ...
  ...
}
{
  q1(func: ge(Person.age, 10)) @normalize {
    id: Person.id
    Person.favoriteFruit @filter(eq(FruitNode.name, "apple")) {
      fruit: FruitNode.name
    }
    Person.favoriteColor @filter(eq(ColorNode.name, "red")) {
      color: ColorNode.name
    }
  }
}
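To make the hypothesis concrete, here is a minimal sketch of how the edge-based design could be queried starting from the fruit node rather than from Person. It assumes a hypothetical inverse field, people, on FruitNode (for example declared as people: [Person] @hasInverse(field: favoriteFruit)), which is not part of the schema above. Whether this is actually faster than the original query is exactly what we are unsure about and would need to be measured.

```
{
  # Start from the FruitNode whose name is "apple", using its hash index.
  # @cascade drops any node that is missing one of the requested predicates,
  # so people without a matching favoriteColor are removed from the result.
  q1(func: eq(FruitNode.name, "apple")) @cascade @normalize {
    # FruitNode.people is a hypothetical inverse edge (assumption, see above).
    FruitNode.people @filter(ge(Person.age, 10)) {
      id: Person.id
      Person.favoriteColor @filter(eq(ColorNode.name, "red")) {
        color: ColorNode.name
      }
    }
  }
}
```

The idea is that the traversal only touches Person nodes already linked to the "apple" node, instead of filtering every Person that satisfies the age condition.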
Question
- In the context of the problem, does replacing a predicate with an edge make sense for improving query performance?
- When should we choose to define a field as a predicate or as an edge (in other words, when should I use a node attribute and when a new node)?
- As an aside, what if we compare favoriteWord <> String and favoriteColor <> Enum Color? Are there performance improvements from using enums rather than Strings?
Any input will be much appreciated. Thank you.