The current implementation of encryption in Badger doesn’t allow enabling encryption on an unencrypted Alpha. This topic proposes ways of enabling encryption on an Alpha that already holds unencrypted data.
Encryption in Dgraph is supported via Badger. Badger stores the information about encryption in the key registry file and the manifest file. When creating a key registry file, we store a header in the file which denotes whether this Badger directory is encrypted or unencrypted.
This header is used to verify the encryption key when the DB is reopened. When an unencrypted directory is opened with an encryption key, this check fails and we return an error.
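To make the header check concrete, here is a minimal sketch of how such a verification could work. It is not Badger’s actual on-disk format or code: the names sanityText, newHeader and verifyHeader, and the header layout (a random IV followed by a known text, encrypted with AES-CTR when a key is set), are all assumptions made for illustration.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
)

// sanityText is a known plaintext stored in the hypothetical registry header.
var sanityText = []byte("registry-sanity-check")

// errKeyMismatch is returned when the supplied key does not match the header.
var errKeyMismatch = errors.New("encryption key mismatch")

// xorWithKey encrypts or decrypts data with AES in CTR mode (the operation is symmetric).
func xorWithKey(key, iv, data []byte) ([]byte, error) {
	block, err := aes.NewCipher(key) // key must be 16, 24 or 32 bytes
	if err != nil {
		return nil, err
	}
	out := make([]byte, len(data))
	cipher.NewCTR(block, iv).XORKeyStream(out, data)
	return out, nil
}

// newHeader builds a header: a random IV followed by the sanity text.
// The sanity text is encrypted only when a key is supplied.
func newHeader(key []byte) ([]byte, error) {
	iv := make([]byte, aes.BlockSize)
	if _, err := rand.Read(iv); err != nil {
		return nil, err
	}
	body := sanityText
	if len(key) > 0 {
		enc, err := xorWithKey(key, iv, sanityText)
		if err != nil {
			return nil, err
		}
		body = enc
	}
	return append(iv, body...), nil
}

// verifyHeader checks that the supplied key (possibly empty) matches the
// key that was in use when the header was written.
func verifyHeader(header, key []byte) error {
	if len(header) != aes.BlockSize+len(sanityText) {
		return errors.New("malformed header")
	}
	iv, body := header[:aes.BlockSize], header[aes.BlockSize:]
	if len(key) > 0 {
		dec, err := xorWithKey(key, iv, body)
		if err != nil {
			return err
		}
		body = dec
	}
	if !bytes.Equal(body, sanityText) {
		return errKeyMismatch
	}
	return nil
}
```

In this sketch, a header written without a key and later verified with one decrypts to garbage, so verifyHeader returns errKeyMismatch. That mirrors the error described above when an unencrypted directory is opened with an encryption key.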
Problem with Enabling Encryption on Existing Data
Badger writes are append only. We never modify a file once it is written to disk. If we were to support enabling encryption on an unencrypted data directory, only the new data would be encrypted and the existing data would remain stored in plain text. This is a serious problem. When someone enables encryption, they wouldn’t want half the data to be encrypted and half to be unencrypted. The old data might get garbage collected/compacted and rewritten in encrypted format, but there is a possibility that this might not happen for a long time, or ever. I therefore propose we do not allow users to enable encryption in place on an existing unencrypted data directory.
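The mixed state described above can be illustrated with a toy model. This is only a sketch: the store and file types and the file names are hypothetical and do not correspond to Badger’s internals. It shows that in an append-only layout, switching encryption on affects only files written afterwards.

```go
package main

// file models an immutable on-disk file. encrypted records whether
// encryption was enabled at the time the file was written.
type file struct {
	name      string
	encrypted bool
}

// store models an append-only data directory: files are added, never modified.
type store struct {
	files      []file
	encryption bool // current setting; applies to future writes only
}

// write appends a new file using the current encryption setting.
func (s *store) write(name string) {
	s.files = append(s.files, file{name: name, encrypted: s.encryption})
}

// plaintextFiles returns the names of files that are still stored unencrypted.
func (s *store) plaintextFiles() []string {
	var out []string
	for _, f := range s.files {
		if !f.encrypted {
			out = append(out, f.name)
		}
	}
	return out
}
```

Writing two files, setting encryption to true, and writing a third leaves plaintextFiles still reporting the first two. They stay in plain text until something rewrites them, which in Badger would be compaction or garbage collection.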
How Does Someone Enable Encryption with Existing Data?
I propose we allow enabling encryption on an Alpha with existing data in two ways:
Backup and Restore: Users take a backup of the unencrypted data in Dgraph and restore the backup with encryption enabled. This is similar to how ArangoDB allows enabling encryption (https://www.arangodb.com/docs/stable/security-encryption.html#limitations). Currently, restore supports only an unencrypted p dir; it doesn’t support an encrypted p dir. But this can be added via a flag.
Export and Bulk/Live loader: Users can also enable encryption by exporting and importing the data. The data can be exported from an already running Alpha and then imported by either the bulk or the live loader. The new Alpha has to be started with the --encryption_key xxx flag. The bulk command currently doesn’t support an encrypted p dir (@Paras can correct me if I’m wrong), but this can also be added with the help of command line flags.
I had a discussion with @Paras about this and we think allowing support for encryption via backup/restore and bulk/live import is the best way.