Implementing indexing and filtering

Sorry, should have done this earlier.

This pertains to indexing predicates specified by the user.

We use Bleve indices: http://www.blevesearch.com/

Indices will run in two different modes: backfill and frontfill.

Backfill

For backfill, the user gives us a posting store (which could, in future, be a snapshot with a timestamp). Then, for each predicate the user wants indexed, we seek to the right place in the posting store and add the entries to the indices. We also shard by predicate, but for now the user has to decide the number of shards at the outset. Note: we do not deal with gotomic hashes / dirty mutations; we only consume from RocksDB.

One may ask: instead of seeking to each predicate, why not just read every row of RocksDB and add it to the index? That code could be reused for frontfill as well. However, we expect it to be quite common for users to add an index for a new predicate over time, and we would like that to be efficient. Iterating over the whole table each time seems too time-consuming.
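Roughly, the per-predicate seek could look like the sketch below. This assumes a gorocksdb-style iterator and a key layout where the predicate is the key prefix; both are assumptions on my part, not the actual wrapper we use.

```go
import "github.com/tecbot/gorocksdb"

// backfillPredicate seeks straight to pred's key range instead of
// scanning the whole table. Assumes posting-store keys are prefixed
// with the predicate name.
func backfillPredicate(db *gorocksdb.DB, pred string, fn func(key, val []byte)) {
	ro := gorocksdb.NewDefaultReadOptions()
	defer ro.Destroy()
	it := db.NewIterator(ro)
	defer it.Close()

	prefix := []byte(pred)
	for it.Seek(prefix); it.ValidForPrefix(prefix); it.Next() {
		k, v := it.Key(), it.Value()
		fn(k.Data(), v.Data())
		k.Free()
		v.Free()
	}
}
```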

Frontfill

For frontfill, the design is not fixed. Ideally, you want a log (maybe not the commit logs) that keeps the mutation changes so that we can replay them. This would be useful if we want to add an index while keeping dgraph running: make a snapshot, remember its timestamp, copy the snapshot, and replay mutations from the logs from that timestamp onwards.

However, as warned, the commit logs will be under heavy construction from the RAFT work, so it is best not to meddle with them for now.

Hence, I propose a very simple frontfill: as mutations come in, we update the indices. We will not try to read from logs. This does not support replay, but it handles incoming mutations for now.

Wonder what you all think @core-devs ? Thanks.


To rephrase Backfill: you’re given a list of predicates that the user wants indexed. This is the flow of events:

  1. Server starts with a posting store.
  2. You go through the list of predicates and run an iterator over RocksDB for each of them. Note that each predicate will have many keys, one key per subject.
  3. Read the key-values for these predicates and apply them to the Bleve index (see the sketch just below).
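Step 3 might look something like this; the document shape, and treating the posting value as the text to index, are my assumptions:

```go
import "github.com/blevesearch/bleve"

// indexKV applies one key-value from the posting store to a Bleve
// index, using the subject as the document ID.
func indexKV(idx bleve.Index, pred, subject, value string) error {
	doc := map[string]interface{}{pred: value}
	// With the subject as the ID, re-indexing the same subject
	// overwrites the old document instead of duplicating it.
	return idx.Index(subject, doc)
}
```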

The questions are: if the Bleve index already exists, what happens? Do we overwrite it? Or would these changes be idempotent?

For Frontfill, I think that for now, passing the mutations to a channel and having a background thread read from that channel and add them to the index is probably sufficient. Going forward, as we implement RAFT, whatever mechanisms are put in place for PLs can be reused for the Bleve indices.
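A minimal sketch of that channel setup, assuming a made-up Mutation type (the real mutation structs will differ):

```go
import (
	"log"

	"github.com/blevesearch/bleve"
)

// Mutation is a stand-in for whatever struct the mutation path uses.
type Mutation struct {
	Subject, Predicate, Value string
}

// startFrontfill returns a channel; a background goroutine drains it
// and applies each mutation to the index.
func startFrontfill(idx bleve.Index) chan<- Mutation {
	ch := make(chan Mutation, 1024) // buffered so writers rarely block
	go func() {
		for m := range ch {
			doc := map[string]interface{}{m.Predicate: m.Value}
			if err := idx.Index(m.Subject, doc); err != nil {
				log.Printf("frontfill index error: %v", err)
			}
		}
	}()
	return ch
}
```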

Part of the RAFT implementation is also replay after a snapshot, so we’ll have to put this in place anyway. I think Bleve frontfill can fit right in once that mechanism is set.

The Bleve index is idempotent: if you call Index twice with the same key and value, you get the same result. But if you call it with the same key and a different value, the old value is overwritten.
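For example (illustrative IDs and values):

```go
// Same ID, same value: calling Index again changes nothing.
idx.Index("0x01", map[string]interface{}{"name": "Alice"})
idx.Index("0x01", map[string]interface{}{"name": "Alice"})

// Same ID, new value: the old document is replaced, not duplicated.
idx.Index("0x01", map[string]interface{}{"name": "Bob"})
```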

For Backfill, I added a new binary, dgraphindex, which should be run on a static snapshot. For now, we will have to run it before dgraph starts, or before any mutations happen.

Right now, dgraphindex crashes if the indices folder is already there. I can modify it to overwrite instead of crashing, since Bleve is idempotent. Another thing I plan to add to dgraphindex is indexing additional predicates without rebuilding the whole thing, and updating the JSON schema accordingly.

And yes, I am assuming a JSON config file for now. This can be merged with some other schema we have in the future.
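Something along these lines; this is purely illustrative and none of the field names are fixed yet:

```json
{
  "numshards": 6,
  "predicates": ["type.object.name.en"]
}
```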

When dgraph is started, it just reads in all the indices if they are there. One basic fact I forgot to mention: we create at least one Bleve index for each predicate, and more if the user wants more shards.
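Shard selection could be as simple as hashing the subject, as in this sketch (the hash choice is arbitrary):

```go
import "hash/fnv"

// shardFor picks which of a predicate's Bleve indices a subject
// lands in. numShards must match what was chosen at backfill time.
func shardFor(subject string, numShards int) int {
	h := fnv.New32a()
	h.Write([]byte(subject))
	return int(h.Sum32()) % numShards
}
```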


Let’s try editing here. Indexing can take a while: with 6 shards on my desktop, it takes over 1 minute 40 seconds just to index “type.object.name.en”. With more predicates, I can imagine it taking around 5 minutes.

I should mention that backfill should be thought of as a one-off process, or something that is not run very often.

The dgraphindex binary is probably useful for debugging, but I think Backfill will have to be an integral part of Dgraph server initialization. We’ll have to synchronize PL → Bleve every time the server starts, before it goes live. Do you have any idea how fast Bleve can index? If it’s really fast, we can avoid having to maintain and move the Bleve index, and instead just index from scratch.

Indexing from scratch has the benefit that we’ll always be eventually consistent, with a simpler design. Of course, it might be slower, depending upon how fast the indexing can run.

Update: maybe try running dgraphindex over all the names we have (4-some million edges) and see how long it takes to index them.


Btw, you could just edit your reply and add more thoughts as you deem fit, instead of creating multiple replies. It helps keep the thread small, and is the Discourse way :-).
