I have two nodes, let's call them Person & Company.
The Person has an edge named Worker pointing to Company.
The Company has a bool predicate isDeleted.
I wish to retrieve a Person by its uid and return nothing if the Company's isDeleted predicate is set to true.
Preferably by using the root func to get the Person, rather than starting from the Company and then recursing through all descendants.
{
  # UID of naman
  foo(func: uid(0xfffd8d67d83ccb99)) @cascade {
    name
    worker @filter(eq(isDeleted, false)) {
      name
      uid
    }
  }
}
If the person also works for another company whose isDeleted is false and whose name predicate is set, then you would get only that matching company. With @cascade, if any predicate requested in the query is missing along the traversal, the whole node is removed from the result.
Both of the solutions will work fine. But they will be slow on a large amount of data if the isDeleted predicate is indexed (reference), so it's better to keep isDeleted unindexed.
Well, it has more to do with the number of different values a predicate can have than with whether it is used in the filter. For example, in your case isDeleted can only have two different values, true or false.
Let's say you have 10 million nodes in your dataset. If the values are split roughly evenly, the posting list of (<isDeleted>, true) will have almost 5 million uids, and the same goes for (<isDeleted>, false). Hence iterating over them will be a very expensive operation.
If you remove the index from isDeleted, it'll be stored as <predicate, uid> => [value1, value2 ....], which in your case is <isDeleted, 0x123> => [true], and accessing this value will be very efficient.
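As a rough sketch of what this means for the schema (using the predicate name from the question; this has not been run against your data), the unindexed declaration and the indexed one it replaces would look something like this:

```
# Unindexed: the value is stored per node, e.g. <isDeleted, 0x123> => [true].
# It can still be used in @filter, but not in a root func.
isDeleted: bool .

# Indexed alternative, shown only for comparison (the slow case on large data):
# isDeleted: bool @index(bool) .
```

With the unindexed form, the query keeps the uid lookup as its root func and only checks isDeleted on the companies reached through the worker edge.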
In the reference I provided above, the query takes around 300 ms when the sex predicate is indexed, whether sex is used in the @filter or in the main function. It takes around 3 ms when sex is not indexed (in that case sex can only be used in the @filter, because the predicate used in the main function is required to be indexed).