Understanding Raft WAL Logs

pawan · September 6, 2016, 11:10am

This is regarding the change to commit logs to make them compatible with Raft. I have been going through https://github.com/coreos/etcd/tree/master/contrib/raftexample to understand how Raft works in general and what part the WAL plays. The etcd authors have built an example with a key-value store backed by Raft. It took a while to wrap my head around it, but I feel I have a much better understanding of the code now.

From what I understand, the WAL is used to load data from disk (the WAL files) back into memory (raftStorage in the example) when a node starts or restarts: https://github.com/coreos/etcd/blob/master/contrib/raftexample/raft.go#L160
All requests that come to the master are written to the WAL and to raftStorage. Later, when they are committed (the proposal is accepted by a quorum), they are sent over the commit channel, which then updates the KV store.
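To make that flow concrete, here is a minimal sketch using only the standard library. It is not the raftexample code: `Entry`, `WAL`, `Node`, `Start`, `Propose` and `Apply` are hypothetical names, the WAL is a slice standing in for files on disk, and quorum acceptance is simulated by pushing the entry onto the commit channel by hand.

```go
package main

import "fmt"

// Entry is a hypothetical stand-in for a Raft log entry.
type Entry struct {
	Index uint64
	Key   string
	Value string
}

// WAL stands in for the on-disk log; a slice keeps the sketch self-contained.
type WAL struct{ entries []Entry }

// Save appends an entry to the log.
func (w *WAL) Save(e Entry) { w.entries = append(w.entries, e) }

// Node holds the in-memory log (raftStorage in raftexample) and the KV store.
type Node struct {
	wal     *WAL
	storage []Entry
	kv      map[string]string
}

// Start rebuilds the in-memory log by replaying everything in the WAL.
func Start(w *WAL) *Node {
	n := &Node{wal: w, kv: map[string]string{}}
	n.storage = append(n.storage, w.entries...)
	return n
}

// Propose records the entry in the WAL first, then in memory.
func (n *Node) Propose(key, value string) Entry {
	e := Entry{Index: uint64(len(n.storage)) + 1, Key: key, Value: value}
	n.wal.Save(e)
	n.storage = append(n.storage, e)
	return e
}

// Apply drains committed entries from the channel into the KV store.
func (n *Node) Apply(commitC <-chan Entry) {
	for e := range commitC {
		n.kv[e.Key] = e.Value
	}
}

func main() {
	wal := &WAL{}
	n := Start(wal)
	e := n.Propose("name", "alice")

	commitC := make(chan Entry, 1)
	commitC <- e // pretend a quorum accepted the proposal
	close(commitC)
	n.Apply(commitC)
	fmt.Println(n.kv["name"])

	restarted := Start(wal) // simulate a restart: memory is rebuilt from the WAL
	fmt.Println(len(restarted.storage))
}
```

The point of the split is that the KV store only ever sees committed entries, while the WAL and the in-memory log see every proposal.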

The WAL can also store snapshots: while replaying, you can open the WAL at a particular snapshot and read entries from that point on.
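As a rough illustration of replaying from a snapshot, the sketch below filters a log down to the entries recorded after a snapshot index. `Entry`, `Snapshot` and `ReplayFrom` are hypothetical names for this example only, not the API of the etcd wal package.

```go
package main

import "fmt"

// Entry is a hypothetical log entry.
type Entry struct {
	Index uint64
	Data  string
}

// Snapshot marks a point in the log whose state has been captured elsewhere.
type Snapshot struct {
	Index uint64
}

// ReplayFrom returns only the entries recorded after the snapshot.
func ReplayFrom(log []Entry, snap Snapshot) []Entry {
	var out []Entry
	for _, e := range log {
		if e.Index > snap.Index {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	log := []Entry{{1, "a"}, {2, "b"}, {3, "c"}, {4, "d"}}
	tail := ReplayFrom(log, Snapshot{Index: 2})
	fmt.Println(len(tail), tail[0].Index) // prints "2 3"
}
```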

The WAL files are 64MB each and, apart from the entries, also store HardState (https://godoc.org/github.com/coreos/etcd/raft/raftpb#HardState). I now have to figure out how to use and modify our commit logs so that they work with Raft.
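One way to picture entries and HardState sharing a log is the sketch below. The `HardState` struct copies the three fields of raftpb.HardState (Term, Vote, Commit); the `Record` type and the `Replay` function are assumptions made for this example and do not reflect the actual WAL file format.

```go
package main

import "fmt"

// Entry is a hypothetical log entry.
type Entry struct {
	Index uint64
	Data  string
}

// HardState mirrors the three fields of raftpb.HardState.
type HardState struct {
	Term   uint64
	Vote   uint64
	Commit uint64
}

// Record is a hypothetical WAL record: exactly one of the fields is set.
type Record struct {
	State *HardState
	Entry *Entry
}

// Replay scans the records in order, keeping every entry
// and only the most recent HardState.
func Replay(records []Record) (HardState, []Entry) {
	var hs HardState
	var ents []Entry
	for _, r := range records {
		switch {
		case r.State != nil:
			hs = *r.State
		case r.Entry != nil:
			ents = append(ents, *r.Entry)
		}
	}
	return hs, ents
}

func main() {
	records := []Record{
		{Entry: &Entry{Index: 1, Data: "a"}},
		{State: &HardState{Term: 1, Vote: 2, Commit: 1}},
		{Entry: &Entry{Index: 2, Data: "b"}},
		{State: &HardState{Term: 1, Vote: 2, Commit: 2}},
	}
	hs, ents := Replay(records)
	fmt.Println(hs.Commit, len(ents)) // prints "2 2"
}
```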

mrjn · September 6, 2016, 11:16am

This is pretty accurate. Our commit logs are also 50MB each or something.

pawan · September 6, 2016, 11:35am

@mrjn Could you also please summarize the discussion we had yesterday? One of the things we decided was that we would get rid of the cache.

Also for reference

mrjn · September 6, 2016, 11:37am

Yeah, we don’t need the cache because we rarely need to read the logs. I’ll summarize the discussion we had now.

pawan · September 7, 2016, 9:44am

Here is what I understand about Raft so far.

Every time a mutation comes in, it is written to the in-memory storage, similar to this: https://github.com/coreos/etcd/blob/master/contrib/raftexample/raft.go#L293. It is also written to the WAL. MemoryStorage implements the Storage interface using an in-memory array.

Whenever a server boots up, the logs from the WAL are replayed and the data is written to the memory storage. I think we need to implement the Storage interface (https://godoc.org/github.com/coreos/etcd/raft#Storage) for our WAL logs.
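A rough sketch of what such an implementation could look like is below. `LogStorage` is a trimmed-down, hypothetical interface with four read methods whose names echo the real Storage interface; the real one has more methods (initial state, snapshot) and different signatures, so treat this as an illustration of the shape only. `walStorage` assumes its entries were already loaded from our logs and are contiguous.

```go
package main

import (
	"errors"
	"fmt"
)

// Entry is a hypothetical log entry.
type Entry struct {
	Index uint64
	Term  uint64
	Data  string
}

// LogStorage is a simplified, hypothetical read-side interface.
type LogStorage interface {
	FirstIndex() (uint64, error)
	LastIndex() (uint64, error)
	Term(i uint64) (uint64, error)
	Entries(lo, hi uint64) ([]Entry, error) // returns entries in [lo, hi)
}

// ErrUnavailable is returned when the requested range is not in the log.
var ErrUnavailable = errors.New("requested entries are unavailable")

// walStorage serves reads from entries loaded out of the commit logs.
// The entries must be contiguous and in ascending index order.
type walStorage struct{ ents []Entry }

func (s *walStorage) FirstIndex() (uint64, error) {
	if len(s.ents) == 0 {
		return 0, ErrUnavailable
	}
	return s.ents[0].Index, nil
}

func (s *walStorage) LastIndex() (uint64, error) {
	if len(s.ents) == 0 {
		return 0, ErrUnavailable
	}
	return s.ents[len(s.ents)-1].Index, nil
}

func (s *walStorage) Term(i uint64) (uint64, error) {
	ents, err := s.Entries(i, i+1)
	if err != nil || len(ents) == 0 {
		return 0, ErrUnavailable
	}
	return ents[0].Term, nil
}

func (s *walStorage) Entries(lo, hi uint64) ([]Entry, error) {
	if len(s.ents) == 0 {
		return nil, ErrUnavailable
	}
	first := s.ents[0].Index
	last := s.ents[len(s.ents)-1].Index
	if lo < first || hi > last+1 || lo > hi {
		return nil, ErrUnavailable
	}
	return s.ents[lo-first : hi-first], nil
}

func main() {
	var s LogStorage = &walStorage{ents: []Entry{
		{Index: 5, Term: 1, Data: "a"},
		{Index: 6, Term: 1, Data: "b"},
		{Index: 7, Term: 2, Data: "c"},
	}}
	ents, _ := s.Entries(6, 8)
	term, _ := s.Term(7)
	fmt.Println(len(ents), term) // prints "2 2"
}
```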

pawan · September 9, 2016, 7:50am

My present understanding of this is as follows.

We have multiple Raft groups (each containing 3 nodes and storing data for a predicate). Note that a node doesn’t mean an instance here. An instance could, and typically would, act as a node for multiple predicates.
Each node has a memory storage and a WAL (this means one log per predicate). If I am thinking about this correctly, we could just use the wal package (github.com/coreos/etcd/wal). Whenever a mutation comes in, it is redirected to the master of the Raft group for the predicate. The master writes it to its log and sends it to the other nodes (which write it to their logs); once there is consensus, the mutation is applied on all nodes.
From what I understand, the Raft library can be configured to use the memory storage from the raft package (github.com/coreos/etcd/raft) for transmitting the entries to other instances. The WAL is used to load data into memory in case a node restarts.
Whenever a predicate has to be transferred to a new instance, we just make that instance part of the Raft group for the predicate, and the data can then be streamed using the Predicate RPC plus the storage.
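The routing and consensus steps above can be sketched roughly as follows. This is a toy model, not real Raft: there is no leader election, no terms and no retries, the master is assumed to be one of the nodes that are up, and `Node`, `Group`, `Cluster`, `Propose` and `Mutate` are hypothetical names chosen for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is one member of a Raft group, with its own log and applied state.
type Node struct {
	up      bool
	log     []string
	applied []string
}

// Group is the set of nodes serving one predicate.
type Group struct {
	nodes []*Node
}

// NewGroup creates a group whose nodes are all up.
func NewGroup(size int) *Group {
	g := &Group{}
	for i := 0; i < size; i++ {
		g.nodes = append(g.nodes, &Node{up: true})
	}
	return g
}

// Propose writes the mutation to the log of every reachable node and
// applies it only if a majority of the group acknowledged the write.
func (g *Group) Propose(m string) bool {
	acks := 0
	for _, n := range g.nodes {
		if n.up {
			n.log = append(n.log, m)
			acks++
		}
	}
	if acks <= len(g.nodes)/2 {
		return false // no quorum: the mutation is not applied
	}
	for _, n := range g.nodes {
		if n.up {
			n.applied = append(n.applied, m)
		}
	}
	return true
}

// Cluster maps each predicate to the group that stores it.
type Cluster struct{ groups map[string]*Group }

// ErrNoGroup is returned when no group serves the predicate.
var ErrNoGroup = errors.New("no Raft group serves this predicate")

// Mutate routes a mutation to the group for its predicate.
func (c *Cluster) Mutate(predicate, m string) (bool, error) {
	g, ok := c.groups[predicate]
	if !ok {
		return false, ErrNoGroup
	}
	return g.Propose(m), nil
}

func main() {
	c := &Cluster{groups: map[string]*Group{"friend": NewGroup(3)}}
	c.groups["friend"].nodes[2].up = false // one of three nodes is down
	ok, err := c.Mutate("friend", "alice friend bob")
	fmt.Println(ok, err) // prints "true <nil>"
}
```

With three nodes per group, a mutation still goes through with one node down but not with two, which is the main reason for the group size.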


