Does using EBS volume snapshots to back up data risk losing data or cluster information?

What I want to do

I want to use AWS EKS EBS volume snapshots to back up the data stored in the EBS volumes that are attached to pods through PVs claimed by PVCs, instead of using binary backup, which is a Dgraph enterprise feature and requires a license.

My Question

  1. What is the difference between using a Kubernetes volume snapshot and using binary backup?

  2. Is it possible to guarantee full data recovery using Kubernetes volume snapshots? If yes, what strategy is recommended to make sure that an Alpha or Zero pod can be rebound to the volumes newly created from a snapshot, and that the cluster will recover and continue to work normally?

  3. Is there a third, free option for backing up data and cluster information so that the whole cluster and its data can be fully recovered after it has been damaged?

In general, a binary backup performs some operations at the Badger layer. I don’t know the exact details, but I would guess that it compresses the data and normalizes it so that everything is consistent as of a single timestamp.

With snapshot backups, the Alpha members may not all be in sync. If you snapshot alpha-0, how do you know it holds the latest data, when alpha-2 may be ahead of it? If you want to do this safely, I would recommend resizing Dgraph to a single member, alpha-0, and then taking the snapshot. The restore process works the same way: resize to just one member, restore, then scale back up to 3+ members.
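To make the order of operations concrete, here is a rough Python sketch that only builds the `kubectl` commands and Kubernetes manifests involved; it does not run anything. The StatefulSet name, PVC name, snapshot class, storage class, and volume size are all hypothetical placeholders, and using `kubectl scale` on a StatefulSet is just one assumed way of resizing. It also assumes the EBS CSI driver and the `snapshot.storage.k8s.io/v1` snapshot CRDs are installed in the cluster.

```python
import json


def scale_command(statefulset, replicas, namespace="default"):
    """Return the kubectl argv that resizes a StatefulSet (assumed resize method)."""
    return ["kubectl", "scale", "statefulset", statefulset,
            f"--replicas={replicas}", "-n", namespace]


def volume_snapshot_manifest(name, pvc_name, snapshot_class, namespace="default"):
    """Build a VolumeSnapshot that captures the volume behind one PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def restore_pvc_manifest(pvc_name, snapshot_name, storage_class, size,
                         namespace="default"):
    """Build a PVC whose new volume is populated from an existing VolumeSnapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": pvc_name, "namespace": namespace},
        "spec": {
            "storageClassName": storage_class,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
            "dataSource": {
                "name": snapshot_name,
                "kind": "VolumeSnapshot",
                "apiGroup": "snapshot.storage.k8s.io",
            },
        },
    }


def backup_plan(statefulset, pvc_name, snapshot_class, full_size=3):
    """Ordered steps: shrink to one member, snapshot alpha-0, scale back up."""
    return [
        ("resize to a single member", scale_command(statefulset, 1)),
        ("snapshot the alpha-0 volume",
         volume_snapshot_manifest("alpha-0-snap", pvc_name, snapshot_class)),
        ("scale back up", scale_command(statefulset, full_size)),
    ]


# Hypothetical names; substitute the ones from your own deployment.
plan = backup_plan("dgraph-alpha", "datadir-dgraph-alpha-0", "ebs-snapclass")
for step, payload in plan:
    # Commands are argv lists; manifests are dicts that could be piped to
    # `kubectl apply -f -` as JSON.
    text = " ".join(payload) if isinstance(payload, list) else json.dumps(payload)
    print(f"{step}: {text}")
```

For the restore direction, `restore_pvc_manifest` shows the shape of a PVC that takes its data from the snapshot. With a StatefulSet, the PVC name has to match the name the pod expects for alpha-0 so that the pod binds to the restored volume, which is why the name is passed in explicitly.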

Another option is to use the export feature, which is free.
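As a minimal sketch of triggering an export, the snippet below posts the `export` mutation to an Alpha's GraphQL `/admin` endpoint. It assumes a Dgraph version that exposes that endpoint, that the Alpha is reachable at `http://localhost:8080` (for example through a port-forward), and that no ACL or auth token is required; the helper names are made up for illustration.

```python
import json
import urllib.request

# Export mutation sent to the Alpha's /admin GraphQL endpoint.
EXPORT_MUTATION = """
mutation {
  export(input: {format: "rdf"}) {
    response {
      message
      code
    }
  }
}
"""


def build_export_request(alpha_url="http://localhost:8080"):
    """Build (but do not send) the HTTP request that starts an export."""
    body = json.dumps({"query": EXPORT_MUTATION}).encode("utf-8")
    return urllib.request.Request(
        alpha_url.rstrip("/") + "/admin",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_export(alpha_url="http://localhost:8080"):
    """Send the export request and return the decoded JSON response."""
    request = build_export_request(alpha_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request has no side effects; call run_export() to trigger it.
req = build_export_request()
print(req.get_method(), req.full_url)
```

The exported files are written on the Alpha side, so in Kubernetes they still need to be copied out of the pod or its volume to somewhere durable.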
