Index consistency during shard population

Hey @minions, @jchiu,

I’m trying to figure out whether our shard move would keep both the shard and the index consistent.

  • When we copy over the shard, we copy the RocksDB data directly. This means that some PLs (posting lists) might still be in memory, not yet flushed out to RocksDB, when we do the copy. However, when the mutations flow through, they should be able to fill in the missing data and bring the shard back to the right state. So, I think the shard move works.


There are three ways to handle the index.


  • We copy over the indexing data as well. Note that the index is generated via goroutines, so its updates might be written out later than the data. So, the indexing data wouldn’t be consistent either. Now, when the mutations flow through, they would modify the index, but I’m not sure those modifications would be idempotent. For example:

Data: SET [1, name, Tom hanks]
Now assume that this part of the data was copied over just fine during the RocksDB copy, but the index wasn’t. So, the PL exists with “Tom hanks” as its value.

Now if we run AddMutationWithIndex with this fact, it would see that the value already exists and not mutate it, which means it would also not update the index.

This then leads to an inconsistent index, which would never be fixed.
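
To make the failure concrete, here is a toy sketch of that scenario. It is not the real code: the `store` type and the lower-case `addMutationWithIndex` are hypothetical stand-ins, and the "skip if the value already exists" behavior is assumed from the description above.

```go
package main

import "fmt"

// store is a hypothetical stand-in for one shard: posting-list values for a
// single predicate, plus a value -> UIDs index over them.
type store struct {
	data  map[uint64]string
	index map[string]map[uint64]bool
}

func newStore() *store {
	return &store{
		data:  map[uint64]string{},
		index: map[string]map[uint64]bool{},
	}
}

// addMutationWithIndex mimics the assumed behavior: if the value is already
// present, the mutation is a no-op and the index is never touched.
func (s *store) addMutationWithIndex(uid uint64, val string) bool {
	old, ok := s.data[uid]
	if ok && old == val {
		return false // nothing to mutate, so no index update either
	}
	if ok {
		delete(s.index[old], uid) // drop the old index entry
	}
	s.data[uid] = val
	if s.index[val] == nil {
		s.index[val] = map[uint64]bool{}
	}
	s.index[val][uid] = true
	return true
}

func main() {
	dst := newStore()
	// The RocksDB copy brought the data over, but not the index entry.
	dst.data[1] = "Tom hanks"

	// Replaying SET [1, name, Tom hanks] does not repair the index.
	changed := dst.addMutationWithIndex(1, "Tom hanks")
	fmt.Println("mutation applied:", changed)                  // false
	fmt.Println("index has entry:", dst.index["Tom hanks"][1]) // false
}
```

However many times the same SET is replayed, the index entry for “Tom hanks” never shows up.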

The second way is:

  • We ignore all the indexing data and force everything through AddMutationWithIndex. That would require us to stop writing directly to RocksDB and pass everything as a mutation instead, which is going to be a lot more expensive than what we’re currently doing.

A third way is to have a syncIndex function which can scan all the data and sync up the index. We can then run this after the move, and also periodically in the background, to ensure no inconsistencies remain between data and index.
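
A rough sketch of what such a function might look like, assuming the index is a simple value -> UIDs map for one predicate. The `index` type, the signature, and the returned counts are all made up for illustration; the real thing would iterate over RocksDB rather than in-memory maps.

```go
package main

import "fmt"

// index maps an indexed value to the set of UIDs holding that value.
// Hypothetical type, for illustration only.
type index map[string]map[uint64]bool

// syncIndex reconciles idx with data in two passes and reports how many
// entries it added and removed. It works on an empty, partial, or stale index.
func syncIndex(data map[uint64]string, idx index) (added, removed int) {
	// Pass 1: drop index entries that no longer match the data.
	for val, uids := range idx {
		for uid := range uids {
			if cur, ok := data[uid]; !ok || cur != val {
				delete(uids, uid)
				removed++
			}
		}
		if len(uids) == 0 {
			delete(idx, val)
		}
	}
	// Pass 2: add entries that the data implies but the index lacks.
	for uid, val := range data {
		if idx[val] == nil {
			idx[val] = map[uint64]bool{}
		}
		if !idx[val][uid] {
			idx[val][uid] = true
			added++
		}
	}
	return added, removed
}

func main() {
	data := map[uint64]string{1: "Tom hanks", 2: "Bradley Cooper"}
	// Index copied mid-flight: UID 1 is still listed under its old value.
	idx := index{
		"Robert de-niro": {1: true},
		"Bradley Cooper": {2: true},
	}

	added, removed := syncIndex(data, idx)
	fmt.Println(added, removed) // 1 1

	// A second run finds nothing to do, so running it periodically is safe.
	added, removed = syncIndex(data, idx)
	fmt.Println(added, removed) // 0 0
}
```

Since it only touches entries that disagree with the data, starting from a copied (partially correct) index should mean less work than rebuilding from scratch.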


Quote: When the mutations flow through, they should be able to rectify the missing data.

I wonder if these mutations can somehow include index mutations as well. In other words, can index mutations be replayed just like other mutations?

For that to happen, we’ll have to append the index mutation to the data mutation before proposing them together to the cluster. The problem with that approach is that we’ll have to do the read in advance to determine what the index mutations should be. That means if two data mutations come in really quickly:

Existing: Robert de-niro
SET → 1, name, Tom hanks
SET → 1, name, Bradley Cooper

Then both of these would read [Robert de-niro] and attempt to delete it from the index, and each would add 1 under its own new value, so 1 ends up indexed under both Tom hanks and Bradley Cooper.

That would still cause inconsistent behavior.
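
Here is a small deterministic simulation of that interleaving. The `proposal` and `state` types are hypothetical; the only assumption carried over from the discussion is that the index deletion is computed from a read done before proposing.

```go
package main

import "fmt"

// proposal bundles a data mutation with index mutations that were computed
// from a read done before proposing. Hypothetical type.
type proposal struct {
	uid      uint64
	newVal   string
	delIndex string // index term to delete, taken from the earlier read
}

type state struct {
	data  map[uint64]string
	index map[string]map[uint64]bool
}

func newState() *state {
	return &state{
		data:  map[uint64]string{},
		index: map[string]map[uint64]bool{},
	}
}

// prepare does the read in advance and fixes the index mutation up front.
func (s *state) prepare(uid uint64, val string) proposal {
	return proposal{uid: uid, newVal: val, delIndex: s.data[uid]}
}

// apply replays a proposal exactly as bundled, without re-reading the data.
func (s *state) apply(p proposal) {
	delete(s.index[p.delIndex], p.uid)
	s.data[p.uid] = p.newVal
	if s.index[p.newVal] == nil {
		s.index[p.newVal] = map[uint64]bool{}
	}
	s.index[p.newVal][p.uid] = true
}

func main() {
	s := newState()
	s.data[1] = "Robert de-niro"
	s.index["Robert de-niro"] = map[uint64]bool{1: true}

	// Both proposals are prepared before either is applied.
	p1 := s.prepare(1, "Tom hanks")
	p2 := s.prepare(1, "Bradley Cooper")
	s.apply(p1)
	s.apply(p2)

	fmt.Println("data:", s.data[1])                               // Bradley Cooper
	fmt.Println("stale index entry:", s.index["Tom hanks"][1])    // true
	fmt.Println("current index entry:", s.index["Bradley Cooper"][1]) // true
}
```

The second proposal deletes Robert de-niro again instead of Tom hanks, so the Tom hanks entry is left behind.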

For the third way, do we skip copying the index data and just rebuild it? If so, this seems like the most direct way and is harder to get wrong. Do we let “subsequent mutations flow through” only after fully syncing the index?

Yeah, this shard copy process is a preparation step for the server before it serves any real queries, so there’s no issue with pending mutations.

So, what we could do is copy over the index data as well, and then run syncIndex; this might speed up our sync process. Also, for syncIndex to be run periodically, it would have to deal with existing index data anyway.
