Question: Badger key version consistency among raft followers

ozan · August 13, 2020, 11:52am

What version of Go are you using (`go version`)?

$ go version
go version go1.15 linux/amd64

What operating system are you using?

Ubuntu 18 GNU/Linux

What version of Badger are you using?

github.com/dgraph-io/badger/v2

What Badger options were set?

NumVersionsToKeep = math.MaxInt32

I have a simple question whose answer I could not find in the source code, so I would appreciate your expertise in heading off a future problem. I am building a basic, write-intensive distributed key-value store on top of badger and hashicorp’s raft packages, and I will allow multiple versions per key in badger. Follower nodes issue RPC commands to the leader for database updates. Some operations on a follower need to send badger’s key version with the command so that the leader can detect stale reads. A follower reads the key from its own replica database, not from the leader; only updates go to the leader. The leader then checks the key and the key version (if provided) to determine whether the request is based on a stale read (which can happen in normal operation).

In short, I want the replicas to have key versions consistent with the leader’s. If the same sequence of operations is executed, with some latency, on two different distributed badger databases (which is what the raft FSM does), will I get exactly the same item from every replica badger for a given key and version at any time (ignoring log replication latency)? Thank you.

ibrahim · August 14, 2020, 7:58am

Hey @ozan, this is a great question. Let me try to answer this.

If you run Badger in default mode (normal mode) and issue two write requests to two different Badger DBs, and one of them fails, your replicas will be out of sync. Each Badger DB maintains its own commit timestamp, so if requests arrive at one instance in a different order, its keys will get different versions from those on the other replicas. I think one way of handling this would be to ensure the leader doesn’t process the next request until the current one finishes, so you would have to wait until the commit completes on all the replicas.
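The ordering effect described above can be illustrated with a toy model. This is only a sketch: `replica`, `commit`, and `applyInOrder` are hypothetical names, and the counter merely stands in for a per-instance commit timestamp; it is not Badger’s actual code.

```go
package main

import "fmt"

// replica is a toy stand-in for one Badger instance in normal mode:
// it hands out its own monotonically increasing commit timestamp.
type replica struct {
	nextTs   uint64
	versions map[string]uint64 // key -> version of the latest write
}

func newReplica() *replica {
	return &replica{versions: make(map[string]uint64)}
}

// commit records a write and stamps it with the replica-local timestamp.
func (r *replica) commit(key string) uint64 {
	r.nextTs++
	r.versions[key] = r.nextTs
	return r.nextTs
}

// applyInOrder commits the keys in the given order on a fresh replica
// and returns the resulting key -> version map.
func applyInOrder(keys ...string) map[string]uint64 {
	r := newReplica()
	for _, k := range keys {
		r.commit(k)
	}
	return r.versions
}

func main() {
	a := applyInOrder("foo", "bar") // both writes arrive in the original order
	b := applyInOrder("bar", "foo") // "foo" failed once and was retried after "bar"
	fmt.Println(a["foo"], b["foo"]) // prints "1 2": same key, different versions
}
```

The same two writes leave `foo` at version 1 on one replica and version 2 on the other, which is why either strict ordering or externally assigned versions are needed.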

For dgraph, we run Badger in Managed Mode (badger.OpenManaged). In this mode, dgraph sends each key together with its version to a Badger instance. So you would say key:foo val:xx version:12 and Badger will store foo:xx:12. This way you control which version each Badger instance stores.

If a request fails for version x, you can retry it and be assured that, even if other requests have arrived in the meantime, key foo will get version x.

Here’s how we do this in dgraph

https://github.com/dgraph-io/dgraph/blob/5ecef43c70e60435099c426ef1d7ff6b20ddbbd1/posting/writer.go#L70-L85

          func (w *TxnWriter) update(commitTs uint64, f func(txn *badger.Txn) error) error {
          	if commitTs == 0 {
          		return nil
          	}
          	txn := w.db.NewTransactionAt(math.MaxUint64, true)
          	defer txn.Discard()
          
          	err := f(txn)
          	if err == badger.ErrTxnTooBig {
          		// continue to commit.
          	} else if err != nil {
          		return err
          	}
          	w.wg.Add(1)
          	return txn.CommitAt(commitTs, w.cb)
          }

Please be aware that in Managed mode you have to assign the timestamps to the keys yourself, and that requires proper handling. Dgraph performs its own conflict checking and other work to ensure we don’t commit at the wrong version.

If you’re building a distributed version of badger, you might be interested in mosuka/cete on GitHub, a distributed key-value store server written in Go and built on top of BadgerDB. It runs in normal mode, and I don’t know how they take care of versions.

ozan · August 14, 2020, 8:48am

Thank you @ibrahim for the comprehensive explanation. I’ve just checked mosuka’s cete package. It looks like it does not handle key versions, but I can benefit from it for other issues.

Dealing with managed transactions will bring more problems to solve, and I am reluctant to dive into it given the caveats you mentioned. The quickest, and maybe dirty, solution is to store a version along with the data and increment it per key whenever the key is set, so I need metadata that holds the version along with other things. Then I can process the metadata to derive the key version and compare it with the given version. This can solve my consistency problem. Please let me know whether you think this would work. I am not trying to reinvent “etcd” or “consul”; I just want an embedded solution that stores key-value pairs in a consistent manner.
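A minimal sketch of that idea, assuming an in-memory map in place of badger and hypothetical names (`store`, `entry`, `set`, `errStaleRead`). The version is bumped deterministically on every set, so replicas that apply the same log in the same order end up with the same per-key versions.

```go
package main

import (
	"errors"
	"fmt"
)

// errStaleRead is returned when the caller's version is out of date.
var errStaleRead = errors.New("stale read: key changed since the follower read it")

// entry is the stored record: an application-level version plus the data.
type entry struct {
	version uint64
	value   []byte
}

// store models the state machine that every raft node applies log entries to.
type store struct {
	data map[string]entry
}

func newStore() *store { return &store{data: make(map[string]entry)} }

// set applies one update. expected is the version the follower saw when it
// read the key; 0 means "no check" (an assumption of this sketch).
// On success it returns the new version of the key.
func (s *store) set(key string, value []byte, expected uint64) (uint64, error) {
	cur := s.data[key] // zero entry (version 0) if the key does not exist
	if expected != 0 && expected != cur.version {
		return cur.version, errStaleRead
	}
	next := cur.version + 1
	s.data[key] = entry{version: next, value: value}
	return next, nil
}

func main() {
	s := newStore()
	v1, _ := s.set("foo", []byte("a"), 0)   // unconditional write
	v2, _ := s.set("foo", []byte("b"), v1)  // based on a fresh read
	_, err := s.set("foo", []byte("c"), v1) // based on a stale read
	fmt.Println(v1, v2, err)
}
```

Because the check and the increment happen inside the state machine, the outcome depends only on the order of log entries, not on any replica-local timestamp.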

What if a problem occurs in a replica and it falls out of sync? I think this has to be solved using raft’s internals for this kind of problem; in the end, raft watches all commit results of followers. I will definitely simulate those errors to see what happens.
Thank you again.

ibrahim · August 17, 2020, 9:23am

@ozan, why do you need to store a version in the metadata? Badger already stores a version with each key. Wouldn’t that be enough?

https://github.com/dgraph-io/badger/blob/763e7d7303f616fb4cfd36e6e21110ec5649df39/iterator.go#L81-L83

          func (item *Item) Version() uint64 {
          	return item.version
          }

Please do let us know how your experiment goes. This is indeed an interesting project.

ozan · August 17, 2020, 10:12am

I cannot be sure that badger’s version counter stays consistent among replicas/followers across failures and leader changes, so I added a per-key version counter to the value. Since I had already added 1 byte of data type info to encode/decode many Go types, adding the version was easy. I am keeping badger’s user metadata in reserve for future requirements. I will use badger’s own item.Version() to track the history of a key or to replay in some scenarios.

Sure, I will post whatever is useful for badger and the community.
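For illustration, a value that carries a 1-byte type tag and a version counter could be laid out as below. The exact layout is not given in this thread, so the field order, the big-endian encoding, and the names `encode`, `decode`, and `headerLen` are all assumptions of this sketch.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerLen is the assumed header size: 1 byte type tag + 8 bytes version.
const headerLen = 1 + 8

// encode prepends the type tag and the per-key version to the payload.
func encode(typeTag byte, version uint64, payload []byte) []byte {
	buf := make([]byte, headerLen+len(payload))
	buf[0] = typeTag
	binary.BigEndian.PutUint64(buf[1:headerLen], version)
	copy(buf[headerLen:], payload)
	return buf
}

// decode splits a stored value back into type tag, version, and payload.
// The returned payload aliases buf; copy it if buf will be reused.
func decode(buf []byte) (typeTag byte, version uint64, payload []byte, err error) {
	if len(buf) < headerLen {
		return 0, 0, nil, errors.New("value shorter than header")
	}
	return buf[0], binary.BigEndian.Uint64(buf[1:headerLen]), buf[headerLen:], nil
}

func main() {
	raw := encode(0x01, 7, []byte("hello"))
	tag, ver, payload, err := decode(raw)
	fmt.Println(tag, ver, string(payload), err) // prints "1 7 hello <nil>"
}
```

A fixed-width header keeps decoding trivial and leaves badger’s user metadata byte untouched, at the cost of 9 extra bytes per value.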

ozan · August 17, 2020, 9:53pm

Hi @ibrahim ,

https://github.com/dgraph-io/badger/blob/4c8fe7fd63e36f9e39b788c084e18e293413a71d/options.go#L277-L285

// WithNumVersionsToKeep returns a new Options value with NumVersionsToKeep set to the given value.
//
// NumVersionsToKeep sets how many versions to keep per key at most.
//
// The default value of NumVersionsToKeep is 1.
func (opt Options) WithNumVersionsToKeep(val int) Options {
	opt.NumVersionsToKeep = val
	return opt
}

Badger internally keeps versions as uint64, but the option only accepts an int. Is there a specific reason for this? The documentation says dgraph sets the number of versions to infinity. With int, the upper bound is platform dependent. Is it possible to make it consistent and change Options.NumVersionsToKeep to uint64?
Cheers.

ibrahim · August 18, 2020, 5:37am

Hey @ozan, I am not sure why we did that but changes to the API will be a breaking change for badger users.

You can set the options value directly (skip the Withxxx method). So you can do opt.NumVersionsToKeep=someUint64

ozan · August 24, 2020, 8:59am

Hi, the type of NumVersionsToKeep is int in the struct as well, but anyway it is not important for me. Just for your information.
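Since the field is an int, a uint64 count cannot be assigned to it without a conversion, and a plain int(v) conversion can overflow. One possible workaround is to clamp first; `clampToInt` and `maxInt` are hypothetical helpers, not part of Badger.

```go
package main

import "fmt"

// maxInt is the largest value an int can hold on the current platform
// (2^31-1 on 32-bit builds, 2^63-1 on 64-bit builds).
const maxInt = int(^uint(0) >> 1)

// clampToInt converts v to int, saturating at maxInt instead of overflowing.
func clampToInt(v uint64) int {
	if v > uint64(maxInt) {
		return maxInt
	}
	return int(v)
}

func main() {
	fmt.Println(clampToInt(5))                    // prints "5"
	fmt.Println(clampToInt(^uint64(0)) == maxInt) // prints "true"
	// Hypothetical use: opt.NumVersionsToKeep = clampToInt(wanted)
}
```

The math.MaxInt32 value from the original post fits in an int on both 32-bit and 64-bit platforms, so it behaves the same everywhere.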

ibrahim · August 26, 2020, 9:57am

Thanks for pointing it out @ozan
