Understanding indexing better

Dgraph is a pretty interesting project. I was curious how indexing is done and how queries are run over the index.

Where in the code can I read more about it?

applyMutations in worker/draft.go is a good entry point for seeing how mutations are applied. If an index exists on the predicate being mutated, new index entries are added during this process as well. Another file to check is posting/index.go.

For the querying part, ProcessQuery in query/query.go is a good starting point. Also look at worker/task.go, in particular the methods ProcessTaskOverNetwork and helpProcessTask.

In general, what we do is map each token to a list of UIDs. For example, if you index a predicate using the term tokenizer and add a triple whose value is “Anne Smith”, two lists will be generated (one for Anne and another for Smith), each of which contains the UID of the new triple's subject.

During queries, Dgraph consults those lists to reduce the number of UIDs it needs to look at in order to retrieve results.
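
To make the idea concrete, here is a minimal, self-contained sketch of a token-to-UID-list index in Go. It is not Dgraph's actual code: the names termTokens, index, add, allOfTerms and intersect are made up for illustration, and the tokenizer is assumed to simply lowercase the value and split it on whitespace. It only shows the general shape: one sorted UID list per token, and a query that intersects those lists instead of scanning every UID.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// termTokens is a simplified stand-in for a term tokenizer (an assumption,
// not Dgraph's implementation): lowercase the value, split on whitespace.
func termTokens(value string) []string {
	return strings.Fields(strings.ToLower(value))
}

// index maps each token to a sorted list of the UIDs whose value contains it.
type index map[string][]uint64

// add records uid under every token of value, keeping each list sorted
// and free of duplicates.
func (ix index) add(uid uint64, value string) {
	for _, tok := range termTokens(value) {
		list := ix[tok]
		// Find the position where uid belongs in the sorted list.
		i := sort.Search(len(list), func(i int) bool { return list[i] >= uid })
		if i < len(list) && list[i] == uid {
			continue // already indexed under this token
		}
		list = append(list, 0)
		copy(list[i+1:], list[i:])
		list[i] = uid
		ix[tok] = list
	}
}

// allOfTerms returns the UIDs present in the list of every query token.
// Only the per-token lists are read; no other UIDs are examined.
func (ix index) allOfTerms(query string) []uint64 {
	toks := termTokens(query)
	if len(toks) == 0 {
		return nil
	}
	result := ix[toks[0]]
	for _, tok := range toks[1:] {
		result = intersect(result, ix[tok])
	}
	return result
}

// intersect merges two sorted UID lists and keeps the common elements.
func intersect(a, b []uint64) []uint64 {
	var out []uint64
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

func main() {
	ix := index{}
	ix.add(1, "Anne Smith") // hypothetical UID 1
	ix.add(2, "Anne Jones") // hypothetical UID 2

	fmt.Println(ix["anne"])                  // prints [1 2]
	fmt.Println(ix.allOfTerms("anne smith")) // prints [1]
}
```

Keeping each list sorted is what makes the intersection a single linear pass over the two lists. The real index code in posting/index.go is considerably more involved, so treat this only as a mental model.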

