antblood commented :
If a predicate is indexed, it is stored in the following way:
<predicate, predicate_value> => [uid1, uid2, uid3 .....]
Hence, in this case, the predicate sex will be stored as:
<sex, "f"> => [0x1, 0x3, 0x5, ....]
<sex, "m"> => [0x2, 0x4, 0x6 ....]
and the predicate entity_key as:
<entity_key, "entity1"> => [0x1]
<entity_key, "entity2"> => [0x2]
<entity_key, "entity3"> => [0x3]
...
When we try to find out whether the person with entity_key "entity800000" has "f" as the value of the sex predicate, we first get its uid from <entity_key, "entity800000"> => [0xC3500], then we traverse the list <sex, "f"> => [0x1, 0x3, 0x5, ....] to check whether the same uid is present in it. Traversing a long list makes this operation very slow.
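The indexed lookup can be sketched roughly as follows. This is a hypothetical in-memory model in Python, not Dgraph's actual storage code: the names build_index and has_value_via_index are made up for illustration, and it assumes the posting list is checked with a plain linear scan, as described above.

```python
from collections import defaultdict

def build_index(triples):
    """Map (predicate, value) -> list of uids, e.g. <sex, "f"> => [0x1, 0x3, ...]."""
    index = defaultdict(list)
    for uid, predicate, value in triples:
        index[(predicate, value)].append(uid)
    return index

def has_value_via_index(index, key, predicate, value):
    """Resolve the uid through entity_key, then scan the posting list for it."""
    uids = index.get(("entity_key", key), [])
    if not uids:
        return False
    uid = uids[0]
    # Linear scan (an assumption of this sketch): the cost grows with the
    # number of nodes that share this predicate value.
    return any(candidate == uid for candidate in index.get((predicate, value), []))

# Small illustrative data set: odd uids are "f", even uids are "m".
triples = []
for uid in range(1, 7):
    triples.append((uid, "entity_key", f"entity{uid}"))
    triples.append((uid, "sex", "f" if uid % 2 else "m"))

index = build_index(triples)
```

With only two distinct values for sex, each posting list holds roughly half of all uids, so the scan touches a large share of the data set on every lookup.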
When a predicate is not indexed, it is stored in the following way:
<predicate, uid> => [value1, value2 ....]
Hence, in this case:
<sex, "0x1"> => ["f"]
<sex, "0x2"> => ["m"]
<sex, "0x3"> => ["f"]
...
In this case, checking the value of the sex predicate for a node is very fast, as we only need to read the single value stored for that node.
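A rough Python sketch of the non-indexed layout, for comparison. Again this is only an illustrative in-memory model, not Dgraph's implementation; build_forward_store and has_value_direct are hypothetical names.

```python
def build_forward_store(triples):
    """Map (predicate, uid) -> list of values, e.g. <sex, 0x1> => ["f"]."""
    store = {}
    for uid, predicate, value in triples:
        store.setdefault((predicate, uid), []).append(value)
    return store

def has_value_direct(store, uid, predicate, value):
    """One keyed lookup, then a check over the node's own (short) value list."""
    return value in store.get((predicate, uid), [])

triples = [(0x1, "sex", "f"), (0x2, "sex", "m"), (0x3, "sex", "f")]
store = build_forward_store(triples)
```

Here the work per lookup depends only on how many values a single node has for the predicate, not on how many nodes exist.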
Hence, it's better not to index predicates that can only take a few different values. In this case, for example, the sex predicate has only two values, "f" and "m".
Time taken for the query when we index the sex predicate: 300 ms
Time taken for the query when we don't index the sex predicate: 3 ms