I’m considering using badger to store keys with multiple versions of values. It seems there are at least two approaches for this:
1. Make a composite key of (key, version), perhaps storing “Uint64Max - version” as the version so that versions sort in reverse chronological order. I can use badger out of the box, and badger will have no knowledge of my versions.
2. Use ManagedDB, which seems to let users specify the version directly, and which will (hopefully) work seamlessly with badger’s inner wiring.
I prefer (2), but I would like some feedback first:
Is this the right understanding?
Is there anything I need to pay special attention to?
I agree that option 2 is the best one for this. The one thing you must be aware of is that you have to manage the transaction timestamps yourself. Dgraph uses managed DB (opt. 2) and the Zero node handles the Ts accounting. So although you have more freedom, you also have to do a bit more work. Maybe someone else has another caveat, but that’s the one that comes to mind. Using managed DB also turns off a few convenience functions because they don’t make sense without implicit Ts.
I would suggest looking at our debug command; it has a lot of good hints about keys and the DB.
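To give a feel for the timestamp accounting mentioned above, here is a minimal single-process sketch of a timestamp allocator. It is not how Zero works; the tsOracle type and its methods are hypothetical names for this example. In managed mode, values like these would be what you pass to db.NewTransactionAt(readTs, update) and txn.CommitAt(commitTs, callback).

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// tsOracle is a hypothetical in-process timestamp allocator.
// It only guarantees that commit timestamps are unique and increasing.
type tsOracle struct {
	last atomic.Uint64
}

// NextCommitTs hands out the next commit timestamp.
func (o *tsOracle) NextCommitTs() uint64 { return o.last.Add(1) }

// ReadTs returns the most recently allocated commit timestamp.
// Assumption: the caller makes sure that commit has actually finished
// before reading at this timestamp; this sketch does not track that.
func (o *tsOracle) ReadTs() uint64 { return o.last.Load() }

func main() {
	var o tsOracle
	c1 := o.NextCommitTs()
	c2 := o.NextCommitTs()
	fmt.Println(c1, c2, o.ReadTs()) // prints "1 2 2"
}
```

A real setup would also need to persist the last timestamp across restarts and know which commits are still in flight, which is the "bit more work" part.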
Also, while hacking on the badger code, I noticed that in the current WriteBatch implementation the version is associated with each underlying Txn object (the commitTs), and for that reason I cannot include key/value pairs with different versions inside a single WriteBatch.
Is that understanding correct?
Of course this is not ideal, since a blind write may need to write different key/value pairs with multiple versions.
Do you have some thoughts on how to fix this?
You are correct, a batch is associated with a specific version. I think if you want to write data with different versions you will need to look at the stream API. Check the backup code here: badger/backup.go at master · dgraph-io/badger · GitHub
When you do a backup you need every version of every key, so it might be closer to what you want to do. I personally prefer using the stream API for any batch-type operations. That code came from our need in Dgraph to do these types of operations.
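Separately from the stream API, and only as a rough workaround sketch: since a batch is tied to one version, mixed-version data can be grouped by version first and each group written with its own batch. The code below is plain Go with no badger calls; versionedKV and groupByVersion are hypothetical names for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// versionedKV is a hypothetical record: one key/value pair plus the
// version (commit timestamp) it should be written at.
type versionedKV struct {
	Key, Value []byte
	Version    uint64
}

// groupByVersion buckets entries by version and returns the versions in
// ascending order, so each bucket can go into its own single-version batch.
func groupByVersion(kvs []versionedKV) ([]uint64, map[uint64][]versionedKV) {
	groups := make(map[uint64][]versionedKV)
	var versions []uint64
	for _, kv := range kvs {
		if _, seen := groups[kv.Version]; !seen {
			versions = append(versions, kv.Version)
		}
		groups[kv.Version] = append(groups[kv.Version], kv)
	}
	sort.Slice(versions, func(i, j int) bool { return versions[i] < versions[j] })
	return versions, groups
}

func main() {
	kvs := []versionedKV{
		{Key: []byte("a"), Value: []byte("1"), Version: 7},
		{Key: []byte("b"), Value: []byte("2"), Version: 3},
		{Key: []byte("a"), Value: []byte("3"), Version: 3},
	}
	versions, groups := groupByVersion(kvs)
	for _, v := range versions {
		// One batch per version would be created and flushed here.
		fmt.Println(v, len(groups[v]))
	}
	// prints "3 2" then "7 1"
}
```

This costs one batch per distinct version, so it is probably only reasonable when the number of versions in a write is small.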