Cancel an ongoing index creation

Moved from GitHub dgraph/3061

Posted by EshwarSR:

Cancel an ongoing index creation:

Accidentally triggering index creation can lock the database, even for reads. Restarting the DB did not stop the index creation; reindexing started again after the restart.

I think it would be a good feature to have an API to view ongoing index creations, with an option to cancel the ones that are not required.

I think it would also be good to have asynchronous index creation that does not block reads to the database.

manishrjain commented :

Reads block because index regeneration happens serially with other writes, so until it finishes, subsequent writes can't happen (which in turn blocks reads with a higher timestamp). This issue could provide a way to read data without blocking: Support Best Effort Read (dgraph-io/dgraph issue #3064).

We can change Dgraph's behavior so that if it sees an indexing proposal for a predicate, it cancels any ongoing indexing proposal for that predicate. This would allow a user to send an unindexed schema for the predicate, causing the ongoing indexing to be cancelled and its index data deleted.

martinmr commented :

What version are you using? Version 1.0.12 (still a release candidate) has a change to perform partial re-indexing so that the process no longer performs unnecessary work. This should speed up re-indexing and hopefully make the pause in writes less noticeable.

EshwarSR commented :

I’m using the latest stable version.

Dgraph version : v1.0.11
Commit SHA-1 : b2a09c5b
Commit timestamp : 2018-12-17 09:50:56 -0800
Branch : HEAD
Go version : go1.11.1

mangalaman93 commented :

We have been working hard behind the scenes to figure out how to cancel an ongoing index creation while still keeping strong consistency semantics in Dgraph. As Manish explained, the challenge in cancelling indexing is that subsequent proposals are not seen until index creation is complete, i.e. an alpha will not see a proposal for index deletion (an unindexed schema) until the index is fully computed. Looking ahead in the proposals would be complex and possibly inefficient to implement.

What we plan to do instead is compute the indexes in the background. This has its own challenges too. But once we do this, we can continue applying the rest of the proposals while the indexes are computed in the background. Then, if one of those later proposals deletes the index (or cancels it, i.e. an unindexed schema), we can stop the background computation of the index.
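
A minimal Go sketch of this idea, for illustration only. It is not Dgraph's implementation: the Indexer type, its methods, and the channel of keys are all hypothetical, and it ignores Raft, transactions, and consistency entirely. It only shows the control flow: an indexing proposal starts a background build and returns immediately, and any later schema proposal for the same predicate cancels the in-flight build and discards its index data.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// build tracks one in-flight background index computation (hypothetical).
type build struct {
	cancel context.CancelFunc
	done   chan struct{}
}

// Indexer is a toy stand-in for the component that applies schema proposals.
type Indexer struct {
	mu     sync.Mutex
	builds map[string]*build   // latest build per predicate
	index  map[string][]string // completed index data per predicate
}

func NewIndexer() *Indexer {
	return &Indexer{builds: map[string]*build{}, index: map[string][]string{}}
}

// Apply handles a schema proposal for pred without waiting for indexing.
// indexed=false models an unindexed schema, which only cancels and deletes.
func (ix *Indexer) Apply(pred string, indexed bool, keys <-chan string) {
	// Cancel any ongoing build for this predicate and wait for it to exit.
	ix.mu.Lock()
	old := ix.builds[pred]
	delete(ix.builds, pred)
	ix.mu.Unlock()
	if old != nil {
		old.cancel()
		<-old.done
	}

	ix.mu.Lock()
	defer ix.mu.Unlock()
	delete(ix.index, pred) // drop existing index data
	if !indexed {
		return
	}
	ctx, cancel := context.WithCancel(context.Background())
	b := &build{cancel: cancel, done: make(chan struct{})}
	ix.builds[pred] = b
	go func() {
		defer close(b.done)
		var partial []string
		for {
			select {
			case <-ctx.Done():
				return // cancelled: partial index data is discarded
			case k, ok := <-keys:
				if !ok { // input exhausted: publish unless cancelled meanwhile
					ix.mu.Lock()
					if ctx.Err() == nil {
						ix.index[pred] = partial
					}
					ix.mu.Unlock()
					return
				}
				partial = append(partial, k)
			}
		}
	}()
}

// Wait blocks until the current build for pred, if any, has finished.
func (ix *Indexer) Wait(pred string) {
	ix.mu.Lock()
	b := ix.builds[pred]
	ix.mu.Unlock()
	if b != nil {
		<-b.done
	}
}

// Lookup returns the completed index for pred, if one exists.
func (ix *Indexer) Lookup(pred string) ([]string, bool) {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	v, ok := ix.index[pred]
	return v, ok
}

func main() {
	ix := NewIndexer()
	slow := make(chan string) // never closed: a build that never finishes
	ix.Apply("name", true, slow)
	slow <- "alice"              // the build is in progress
	ix.Apply("name", false, nil) // unindexed schema cancels it
	_, ok := ix.Lookup("name")
	fmt.Println("index exists after cancel:", ok)
}
```

The point of the sketch is that Apply never blocks on the index computation itself, so later proposals, including the one that cancels the index, can still be processed.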

We are still working out how to compute the indexes in the background. We have a proposal in place, and we are implementing it and testing the code. I will keep you posted on the progress. Once again, thanks for filing the issue.

mangalaman93 commented :

PR for building indexes in the background: #4819