Looking for a faster way to get a new member to sync with the snapshot from the leader.
I was using Badger 1.6.2 and just upgraded to 3.2103.2.
I currently iterate over the keys with an iterator and pipe them to the new member.
I came across Badger Stream, which spins up multiple goroutines to transfer the data to the new member. Does it persist progress so it can survive restarts?
Can I directly transfer the snapshot over the network and use it on the new member?
Before we solve the problem, could you elaborate on the problem you are trying to solve? Did you try something that you found slow? How slow was it? Do you have any logs you could share?
Thanks @amanmangal.
I use badger as a persistence storage for my raft protocol.
I currently take a snapshot and iterate through the keys, sending them over a pipe to the member that is trying to catch up with the existing database. This is slow because it's a single goroutine with a small buffer. It took around 60 minutes for a 5 GB database.
That said, it's all custom logic on top of the snapshot.
I'm willing to try out anything Badger supports internally and can test it. I'm looking for a really quick way for the new member to catch up with the existing DB.
Sorry, I have been busy with the release. There are a few options you could try. You could use the Stream framework to read the data instead. How long does that take for you? You could take a look at how we do it in Dgraph here: dgraph/snapshot.go at main · dgraph-io/dgraph · GitHub.
Could you also measure your network throughput using iperf or a similar tool? 1 hour for 5 GB of data seems like a lot to me. Badger should be able to read the data much faster.
For surviving restarts, this seems a bit complex to me. When would you consider a key to be transferred to the other side: when you have sent it, or when the other side has acknowledged it? You will have to build a protocol that allows you to account for acknowledgements if needed. You could store the ranges of keys that have been sent/acknowledged on disk and modify the ChooseKey function accordingly. But in that case, you won't be able to use the optimization in the PR.