I removed the node that was damaged by the last panic (alpha-10) and re-added it with completely fresh storage. Here is what happened:
- alpha-10 crashloops
- removeNode on alpha-10
- new node comes up, gets assigned to the free spot in that group
- gets a snapshot from the leader
- after the snapshot, gets the txns from snapshotTS->now sent to it
- panics at the same timestamp as before
See the cutely annotated screenshot of this in Grafana:
(Update: a second removeNode and re-add, done after the next snapshot, worked as expected. This highlights that whatever the issue is, it is committed to the Raft WAL.)
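
To make the reasoning above concrete, here is a toy model of the sequence, not Dgraph's actual code. Everything in it is an assumption for illustration: the `Entry` type, the `poison` flag standing in for "whatever makes the alpha panic", the `join_with_fresh_storage` helper, and the made-up timestamps. It only shows why fresh storage does not help while the bad entry is still newer than the snapshot, and why a re-add after a later snapshot can succeed.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    ts: int               # commit timestamp of the txn (made-up values below)
    poison: bool = False  # hypothetical flag: applying this entry crashes the node


class ApplyPanic(Exception):
    """Stands in for the alpha process panicking while applying an entry."""


def join_with_fresh_storage(wal, snapshot_ts):
    """Simulate a new node: take the snapshot, then replay everything after it.

    Returns the last timestamp applied, or raises ApplyPanic on a poison entry.
    """
    applied = snapshot_ts  # state up to snapshot_ts arrives via the snapshot
    for entry in wal:
        if entry.ts <= snapshot_ts:
            continue  # covered by the snapshot, so it is never replayed
        if entry.poison:
            raise ApplyPanic(f"panic applying entry at ts={entry.ts}")
        applied = entry.ts
    return applied


# The leader's log, with one bad entry at ts=30.
wal = [Entry(10), Entry(20), Entry(30, poison=True), Entry(40), Entry(50)]

# First re-add: the snapshot predates the bad entry, so replay hits it again.
try:
    join_with_fresh_storage(wal, snapshot_ts=20)
    first_result = "ok"
except ApplyPanic as exc:
    first_result = str(exc)

# Second re-add: the next snapshot is past the bad entry, so it is skipped.
second_result = join_with_fresh_storage(wal, snapshot_ts=40)

print(first_result)
print(second_result)
```

In this sketch the first join fails at the same timestamp no matter how clean the new node's disk is, because the entry comes from the leader's log and not from local state. The second join only works because the snapshot boundary has moved past the bad entry, which matches what I saw.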