This is regarding the change to commit logs to make them compatible with Raft. I have been going through https://github.com/coreos/etcd/tree/master/contrib/raftexample to understand how Raft works in general and what part the WAL plays. They have built an example with a key-value store backed by Raft. It took rather long to wrap my head around it, but I feel I have a much better understanding of the code there now.
All requests that come to the master are written to the WAL and the raftStorage. Later, when they are committed (the proposal is accepted by a quorum), they are sent over the commit channel, which then updates the KV store.
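To make that flow concrete, here is a toy, single-process sketch of it. It does not use the etcd packages; `entry`, `node`, `propose`, `commit` and `applyCommitted` are names I made up for illustration, and the quorum check is just a majority count that the caller passes in.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for a Raft log entry carrying one KV write.
type entry struct {
	Index uint64
	Key   string
	Value string
}

// node is a toy model of the flow described above; it is not the etcd raft API.
type node struct {
	wal     []entry           // stand-in for the on-disk write-ahead log
	storage []entry           // stand-in for the in-memory raft storage
	commitC chan entry        // committed entries flow through here
	kv      map[string]string // the key-value store
}

func newNode() *node {
	return &node{commitC: make(chan entry, 16), kv: make(map[string]string)}
}

// propose records the entry in the WAL and in memory storage and returns
// its index. Nothing is applied to the KV store yet.
func (n *node) propose(key, value string) uint64 {
	e := entry{Index: uint64(len(n.wal) + 1), Key: key, Value: value}
	n.wal = append(n.wal, e)
	n.storage = append(n.storage, e)
	return e.Index
}

// commit pushes the entry at index onto the commit channel, but only if a
// majority of the group has acknowledged it.
func (n *node) commit(index uint64, acks, groupSize int) bool {
	if index == 0 || index > uint64(len(n.storage)) {
		return false
	}
	if acks < groupSize/2+1 {
		return false
	}
	n.commitC <- n.storage[index-1]
	return true
}

// applyCommitted drains the commit channel into the KV store.
func (n *node) applyCommitted() {
	for {
		select {
		case e := <-n.commitC:
			n.kv[e.Key] = e.Value
		default:
			return
		}
	}
}

func main() {
	n := newNode()
	idx := n.propose("name", "alice")
	fmt.Println(n.commit(idx, 1, 3)) // one ack out of three is not a quorum
	fmt.Println(n.commit(idx, 2, 3)) // two out of three is
	n.applyCommitted()
	fmt.Println(n.kv["name"])
}
```

The point is only the ordering: the entry is durable in the log before consensus, and the KV store changes only after the entry comes out of the commit channel.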
The WAL can also store snapshots: while replaying, you can open the WAL from a particular snapshot and read entries from that point on.
Whenever a server boots up, logs from the WAL are replayed and the data is written to the memory storage. I think we need to implement the Storage interface (https://godoc.org/github.com/coreos/etcd/raft#Storage) for our WAL logs.
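Roughly, replay-on-boot from a snapshot could look like the sketch below. This is a simplified model, not the real thing: `logStorage` is a reduced, hypothetical interface that only mirrors the idea of a few read methods (the actual raft Storage interface has more methods and uses protobuf types), and `replay` assumes the WAL entries have contiguous indexes.

```go
package main

import (
	"errors"
	"fmt"
)

// logEntry is a hypothetical stand-in for a Raft log entry.
type logEntry struct {
	Index uint64
	Term  uint64
	Data  string
}

// logStorage is a reduced, made-up interface for illustration only.
type logStorage interface {
	FirstIndex() uint64
	LastIndex() uint64
	Entries(lo, hi uint64) ([]logEntry, error)
}

var (
	errCompacted   = errors.New("requested index is covered by the snapshot")
	errUnavailable = errors.New("requested index is not in the log")
)

// memStorage holds the entries that come after the latest snapshot.
type memStorage struct {
	snapIndex uint64     // last index covered by the snapshot
	ents      []logEntry // entries with Index > snapIndex
}

// replay rebuilds the in-memory storage on boot: entries already covered
// by the snapshot are skipped, the rest are loaded in order.
func replay(wal []logEntry, snapIndex uint64) *memStorage {
	ms := &memStorage{snapIndex: snapIndex}
	for _, e := range wal {
		if e.Index > snapIndex {
			ms.ents = append(ms.ents, e)
		}
	}
	return ms
}

func (m *memStorage) FirstIndex() uint64 { return m.snapIndex + 1 }

func (m *memStorage) LastIndex() uint64 { return m.snapIndex + uint64(len(m.ents)) }

// Entries returns the entries in the half-open range [lo, hi).
func (m *memStorage) Entries(lo, hi uint64) ([]logEntry, error) {
	if lo < m.FirstIndex() {
		return nil, errCompacted
	}
	if lo > hi || hi > m.LastIndex()+1 {
		return nil, errUnavailable
	}
	off := m.FirstIndex()
	return m.ents[lo-off : hi-off], nil
}

func main() {
	var wal []logEntry
	for i := uint64(1); i <= 5; i++ {
		wal = append(wal, logEntry{Index: i, Term: 1, Data: fmt.Sprintf("mutation-%d", i)})
	}
	var s logStorage = replay(wal, 2) // snapshot covers indexes 1 and 2
	ents, err := s.Entries(s.FirstIndex(), s.LastIndex()+1)
	fmt.Println(len(ents), err)
}
```

For our WAL the interesting part would be backing these reads with our own log files instead of a slice.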
We have multiple Raft groups (each containing 3 nodes and storing data for a predicate). Note that a node doesn’t mean an instance here. An instance could, and typically would, act as a node for multiple predicates.
Now each node has a memory storage and a WAL (this means a log per predicate). If I am thinking correctly here, then we could just use the wal package (github.com/coreos/etcd/wal). Whenever a mutation comes in, it is redirected to the master of the group of nodes that form the Raft group for the predicate. The master writes it to its log, sends it to the other nodes (they write it to their logs), gets consensus, and then the mutation is applied on all nodes.
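As a rough picture of the predicate-to-group mapping and the redirect to the master, something like the following might do. All names here (`raftGroup`, `cluster`, `leaderFor`, `predicatesOn`, and the instance and predicate names in `exampleCluster`) are hypothetical, and the actual replication step is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// raftGroup is a hypothetical description of the Raft group for one predicate.
type raftGroup struct {
	members []string // instance IDs hosting a node of this group
	leader  string   // instance currently acting as master for the group
}

// cluster maps each predicate to its Raft group.
type cluster struct {
	groups map[string]raftGroup
}

// leaderFor returns the instance a mutation on the predicate is redirected to.
func (c cluster) leaderFor(predicate string) (string, bool) {
	g, ok := c.groups[predicate]
	if !ok {
		return "", false
	}
	return g.leader, true
}

// predicatesOn lists the predicates an instance hosts a node for, which is
// also the set of per-predicate WALs that instance would keep.
func (c cluster) predicatesOn(instance string) []string {
	var preds []string
	for p, g := range c.groups {
		for _, m := range g.members {
			if m == instance {
				preds = append(preds, p)
				break
			}
		}
	}
	sort.Strings(preds)
	return preds
}

// exampleCluster builds a made-up layout with four instances and two predicates.
func exampleCluster() cluster {
	return cluster{groups: map[string]raftGroup{
		"name":   {members: []string{"A", "B", "C"}, leader: "A"},
		"friend": {members: []string{"B", "C", "D"}, leader: "C"},
	}}
}

func main() {
	c := exampleCluster()
	leader, _ := c.leaderFor("friend")
	fmt.Println("mutation on friend goes to", leader)
	fmt.Println("instance B hosts nodes for", c.predicatesOn("B"))
}
```

Here instance B is a node in both groups, so it would keep two logs, one per predicate.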
From what I understand, the Raft library can be configured to use the memory storage (from the raft package, github.com/coreos/etcd/raft) for transmitting the entries to other instances. The WAL is used to load data into memory in case a node restarts.
Whenever a predicate has to be transferred to a new instance, we just make that instance part of the Raft group for the predicate, and the data can then be streamed using the Predicate RPC + the storage.