Backup and Restore with operations like DROP_ALL

At present, Dgraph Enterprise has Backup and Restore operations which mostly work fine, except for a few edge cases. I will list those cases, but first, let's take a look at the backup and restore process. Here are the steps to follow:

  1. Start Dgraph cluster.
  2. Insert some data.
  3. Perform a backup operation: This creates a full backup.
  4. Keep adding data to your cluster.
  5. Perform a backup again at the same location: This creates an incremental backup.
  6. Assume that your cluster got corrupted, so now you have to restore.
  7. Run restore operation using the backed-up data.
  8. Everything works fine, you get all the data you had backed-up. No problems!
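The steps above can be sketched as a toy model (in Python, with illustrative names; this is not Dgraph's actual implementation): a full backup captures every key, an incremental backup captures only keys versioned after the previous backup's sinceTs, and restore replays the series in order.

```python
def take_backup(db, since_ts):
    """Capture every (key, value) whose commit version is newer than since_ts."""
    return {k: (v, ts) for k, (v, ts) in db.items() if ts > since_ts}

def restore(backups):
    """Replay a backup series in order to rebuild the DB."""
    db = {}
    for backup in backups:
        db.update(backup)
    return db

# Steps 1-8: insert, full backup, insert more, incremental backup, restore.
db = {"0x1|name": ("Alice", 1)}          # key -> (value, commit version)
full = take_backup(db, since_ts=0)       # full backup: everything
db["0x2|name"] = ("Bob", 2)              # keep adding data
incr = take_backup(db, since_ts=1)       # incremental: only the new key

restored = restore([full, incr])
assert restored == db                    # no drops involved: exact replica
```

As long as every change leaves a newer version of some key behind, the series composes cleanly.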

Now consider a scenario where you perform a DROP_ALL operation after your first full backup, then add some data and take an incremental backup. Restoring from such a backup series will bring back even the dropped data, which is not what anyone expects. The reason is that the backup operation only records updates to keys that still exist in the underlying badger DB after the last backup. Since DROP_ALL removes all keys from badger, the backup can't know that a key was updated; the key simply no longer exists. The same behavior occurs with all of the operations below:

  • DROP_ALL: All the dropped data, types, and predicates will be restored.
  • DROP_DATA: All the dropped data will be restored.
  • DROP_PREDICATE: The predicate which was dropped will be restored.
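The same toy model shows the problem: a drop removes keys without leaving any newer version behind, so the next incremental backup records nothing about them and restore resurrects the dropped key. (Again, the names here are illustrative, not Dgraph internals.)

```python
def take_backup(db, since_ts):
    """Capture only keys with a commit version newer than since_ts."""
    return {k: (v, ts) for k, (v, ts) in db.items() if ts > since_ts}

def restore(backups):
    """Replay a backup series in order to rebuild the DB."""
    db = {}
    for b in backups:
        db.update(b)
    return db

db = {"0x1|name": ("Alice", 1)}
full = take_backup(db, since_ts=0)

db.clear()                               # DROP_ALL: the keys simply vanish
db["0x2|name"] = ("Bob", 2)              # add fresh data afterwards
incr = take_backup(db, since_ts=1)       # only sees Bob; no trace of the drop

restored = restore([full, incr])
assert "0x1|name" in restored            # the dropped key comes back
assert restored != db                    # restore(backup(cluster)) != cluster
```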

Note that DROP_TYPE works correctly with backup and restore, because the backup operation has always written the current type keys to the backup, ignoring the sinceTs. For data keys, however, only the ones updated after the sinceTs were written to the backup.

The underlying reason is the same for all of the above operations: the key gets deleted from badger, after which the backup can't detect any changes to the key because it no longer knows the key exists.

On the other hand, if you consider a delete mutation of the following form:

{
  delete {
    <uid> <predicate> * .
  }
}

Then, in this case, the <uid+predicate> key is not deleted from badger but is instead overwritten with a nil value. This ensures that when a backup runs, it picks up the new version of the key containing the nil value, and during restore that nil value is written back. So backups work with delete mutations, but not with the DROP_* operations.
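In the same toy model, the delete mutation behaves differently: it overwrites the key with a nil value at a new commit version, so the incremental backup carries the tombstone and restore applies it.

```python
def take_backup(db, since_ts):
    """Capture only keys with a commit version newer than since_ts."""
    return {k: (v, ts) for k, (v, ts) in db.items() if ts > since_ts}

def restore(backups):
    """Replay a backup series in order to rebuild the DB."""
    db = {}
    for b in backups:
        db.update(b)
    return db

db = {"0x1|name": ("Alice", 1)}
full = take_backup(db, since_ts=0)

db["0x1|name"] = (None, 2)               # delete mutation: nil value, NEW version
incr = take_backup(db, since_ts=1)       # the tombstone is captured

restored = restore([full, incr])
assert restored["0x1|name"][0] is None   # the delete survives restore
```

The difference between the two cases is exactly whether the change leaves a versioned key behind for the next backup to see.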

To fix this, we are thinking of the following:

  • Whenever a DROP kind of operation occurs, record it. This record will be kept in badger in reserved predicates like dgraph.drop.op.
  • Whenever a backup is triggered, also find out which DROP operations have run since the last backup and record them in the backup manifest.
  • When a restore is triggered, then while iterating over the backups in a backup series, it will first check whether the manifest of the backup currently being processed records any DROP operations. If yes, it will first perform those operations and then apply the backup. This way, restore will create an exact replica of the state machine as it was at the time of the backup.
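The proposal above can be sketched like this: each backup's manifest records the DROP operations seen since the last backup, and restore replays those drops before applying that backup's writes. The manifest shape and the names here are illustrative, not the final format.

```python
def take_backup(db, since_ts, drop_ops):
    """A backup is now a manifest (listing DROP ops) plus the data delta."""
    return {
        "manifest": {"drop_ops": list(drop_ops)},
        "data": {k: (v, ts) for k, (v, ts) in db.items() if ts > since_ts},
    }

def restore(backups):
    """Replay each backup: apply its recorded drops first, then its writes."""
    db = {}
    for b in backups:
        for op in b["manifest"]["drop_ops"]:
            if op == "DROP_ALL":
                db.clear()               # replay the drop first
        db.update(b["data"])             # then apply the backup's writes
    return db

db = {"0x1|name": ("Alice", 1)}
full = take_backup(db, since_ts=0, drop_ops=[])

db.clear()                               # DROP_ALL, but this time it is recorded
db["0x2|name"] = ("Bob", 2)
incr = take_backup(db, since_ts=1, drop_ops=["DROP_ALL"])

restored = restore([full, incr])
assert restored == db                    # exact replica this time
```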

FAQ

  • If someone did a DROP_ALL then all the dgraph.drop.op records will be lost. Will that affect backup?
    No, it won't. After a DROP_ALL, a new record is created immediately, so the backup will always know about the DROP operations performed after the last backup was taken.

Any feedback is welcome!

UPDATE [Nov 2, 2020]

After a discussion with @mrjn, we decided to go ahead with the approach explained above. We will never delete the dgraph.drop.op records, except when a DROP_ALL is triggered. Also, every backup, whether full or incremental, always has a sinceTs, so it will pick up only those dgraph.drop.op records that were not written in the last backup. During backup, we will treat the dgraph.drop.op predicate specially and record it in the backup manifest, in addition to the binary backup file. During restore, we will first apply the DROP operations described in the manifest of the backup being restored, and then perform the writes described by the backup.

This will make sure that even if someone triggers two different backups at two different locations, whether full or incremental, each one captures the full state difference between the last backup and the ongoing one. So, the restore operation can always recreate an exact replica of the DB.
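As a rough sketch of how the sinceTs filtering of dgraph.drop.op records could behave (the record shape here is hypothetical): drop records are append-only and timestamped, and each backup copies into its manifest only the records newer than its own sinceTs.

```python
def record_drop(db, op, ts):
    """Append a timestamped drop record; records are never deleted."""
    db.setdefault("dgraph.drop.op", []).append((op, ts))

def drop_ops_since(db, since_ts):
    """Drop records a backup with this since_ts would put in its manifest."""
    return [op for op, ts in db.get("dgraph.drop.op", []) if ts > since_ts]

db = {}
record_drop(db, "DROP_DATA", ts=5)
record_drop(db, "DROP_PREDICATE name", ts=9)

# A full backup (since_ts=0) records both drops in its manifest; a later
# incremental backup (since_ts=5) records only the newer one, so neither
# backup series misses a drop and none is recorded twice in one series.
assert drop_ops_since(db, 0) == ["DROP_DATA", "DROP_PREDICATE name"]
assert drop_ops_since(db, 5) == ["DROP_PREDICATE name"]
```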

cc: @mrjn, @pawan, @vvbalaji, @gja, @ibrahim


Wow, thanks for getting to the bottom of this.

Wow, this is incredibly dangerous. I wonder if we should disable DROP in Slash GraphQL until this is fixed, since restore(backup(cluster)) != cluster.

I have a dumb question: don't we stream the drop operation into the WAL, and therefore won't it go into the backup? Or is this handled differently in Dgraph? From the description, it feels almost like DROP is bypassing Dgraph and going straight to badger or something?

What if someone tried to drop dgraph.drop.op? Or do we disallow that?

Yeah! That would avoid it temporarily.
We could temporarily fix it by triggering a full backup every time someone does a DROP_* operation.

But we want the fix for this issue to go into 20.11, so we should have a permanent fix soon.

Yes, every proposal is written to the Raft WAL. But the contents of the proposal determine what gets written to badger, and the contents of badger determine what goes into the backup. This particular proposal for DROP_ALL dictates that the contents of badger be emptied. Since nothing is left in badger after this operation, nothing goes into the second backup, and during restore the old data is restored from the first full backup.

Basically, at present the backup does not record that such an operation has occurred, so the restore cannot construct an exact replica.

We will disallow that.