Error: state.commit xyz is out of range

Dgraph has been running stably on GKE for the last 6 weeks with the following config:

image:
  tag: "v22.0.2"
alpha:
  replicaCount: 9
  persistence:
    size: 250Gi
    storageClassName: premium-rwo # Doesn't work. Have to do patch. See setup docs.
  service:
    type: LoadBalancer
    annotations:
      cloud.google.com/load-balancer-type: "Internal"
zero:
  replicaCount: 3
  shardReplicaCount: 3
  persistence:
    size: 32Gi
    storageClassName: premium-rwo # Doesn't work. Have to do patch. See setup docs.
ratel:
  service:
    type: LoadBalancer
    annotations:
      cloud.google.com/load-balancer-type: "Internal"

However, 2 days ago we saw the following error:

I0301 09:40:26.686947      23 log.go:34] All 3153 tables opened in 919ms
I0301 09:40:26.691624      23 log.go:34] Discard stats nextEmptySlot: 0
I0301 09:40:26.694320      23 log.go:34] Set nextTxnTs to 24405663
I0301 09:40:26.704901      23 log.go:34] Deleting empty file: p/000230.vlog
I0301 09:40:26.705919      23 groups.go:100] Current Raft Id: 0x2
I0301 09:40:26.705940      23 groups.go:116] Sending member request to Zero: id:2 addr:"xyz-prod-dgraph-alpha-0.xyz-prod-dgraph-alpha-headless.xyz.svc.cluster.local:7080" 
I0301 09:40:26.706101      23 worker.go:114] Worker listening at address: [::]:7080
I0301 09:40:26.707415      23 run.go:567] Bringing up GraphQL HTTP API at 0.0.0.0:8080/graphql
I0301 09:40:26.707433      23 run.go:568] Bringing up GraphQL HTTP admin API at 0.0.0.0:8080/admin
I0301 09:40:26.707443      23 run.go:595] gRPC server started.  Listening on port 9080
E0301 09:40:26.707438      23 groups.go:1224] Error during SubscribeForUpdates for prefix "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x15dgraph.graphql.schema\x00": Unable to find any servers for group: 1. closer err: <nil>
I0301 09:40:26.707451      23 run.go:596] HTTP server started.  Listening on port 8080
I0301 09:40:26.807046      23 pool.go:164] CONN: Connecting to xyz-prod-dgraph-zero-0.xyz-prod-dgraph-zero-headless.xyz.svc.cluster.local:5080
I0301 09:40:26.814126      23 groups.go:135] Connected to group zero. Assigned group: 0
I0301 09:40:26.814148      23 groups.go:137] Raft Id after connection to Zero: 0x2
I0301 09:40:26.814200      23 pool.go:164] CONN: Connecting to xyz-prod-dgraph-alpha-6.xyz-prod-dgraph-alpha-headless.xyz.svc.cluster.local:7080
I0301 09:40:26.814215      23 pool.go:164] CONN: Connecting to xyz-prod-dgraph-alpha-8.xyz-prod-dgraph-alpha-headless.xyz.svc.cluster.local:7080
I0301 09:40:26.814242      23 pool.go:164] CONN: Connecting to xyz-prod-dgraph-alpha-7.xyz-prod-dgraph-alpha-headless.xyz.svc.cluster.local:7080
I0301 09:40:26.814280      23 pool.go:164] CONN: Connecting to xyz-prod-dgraph-alpha-4.xyz-prod-dgraph-alpha-headless.xyz.svc.cluster.local:7080
I0301 09:40:26.814303      23 pool.go:164] CONN: Connecting to xyz-prod-dgraph-alpha-3.xyz-prod-dgraph-alpha-headless.xyz.svc.cluster.local:7080
I0301 09:40:26.814341      23 pool.go:164] CONN: Connecting to xyz-prod-dgraph-alpha-5.xyz-prod-dgraph-alpha-headless.xyz.svc.cluster.local:7080
I0301 09:40:26.814379      23 pool.go:164] CONN: Connecting to xyz-prod-dgraph-alpha-2.xyz-prod-dgraph-alpha-headless.xyz.svc.cluster.local:7080
I0301 09:40:26.814409      23 pool.go:164] CONN: Connecting to xyz-prod-dgraph-alpha-1.xyz-prod-dgraph-alpha-headless.xyz.svc.cluster.local:7080
I0301 09:40:26.814498      23 pool.go:164] CONN: Connecting to xyz-prod-dgraph-zero-2.xyz-prod-dgraph-zero-headless.xyz.svc.cluster.local:5080
I0301 09:40:26.814526      23 pool.go:164] CONN: Connecting to xyz-prod-dgraph-zero-1.xyz-prod-dgraph-zero-headless.xyz.svc.cluster.local:5080
I0301 09:40:26.814574      23 draft.go:270] Node ID: 0x2 with GroupID: 1
I0301 09:40:26.814583      23 draft.go:279] RaftContext: id:2 group:1 addr:"xyz-prod-dgraph-alpha-0.xyz-prod-dgraph-alpha-headless.xyz.svc.cluster.local:7080" 
I0301 09:40:26.814671      23 node.go:153] Setting raft.Config to: &{ID:2 peers:[] learners:[] ElectionTick:20 HeartbeatTick:1 Storage:0xc00045aac0 Applied:22163404 MaxSizePerMsg:262144 MaxCommittedSizePerReady:67108864 MaxUncommittedEntriesSize:0 MaxInflightMsgs:256 CheckQuorum:false PreVote:true ReadOnlyOption:0 Logger:0x3113340 DisableProposalForwarding:false}
I0301 09:40:26.920997      23 node.go:312] Found Snapshot.Metadata: {ConfState:{Nodes:[2 3 4] Learners:[] XXX_unrecognized:[]} Index:22163404 Term:91 XXX_unrecognized:[]}
I0301 09:40:26.921030      23 node.go:323] Found hardstate: {Term:93 Vote:3 Commit:22171269 XXX_unrecognized:[]}
I0301 09:40:27.023525      23 node.go:328] Group 1 found 15827 entries
I0301 09:40:27.023564      23 draft.go:1828] Restarting node for group: 1
I0301 09:40:27.023573      23 node.go:190] Setting conf state to nodes:2 nodes:3 nodes:4 
2023/03/01 09:40:27 2 state.commit 22171269 is out of range [22163404, 22171268]
panic: 2 state.commit 22171269 is out of range [22163404, 22171268]

goroutine 3326 [running]:
log.Panicf({0x1f5c0b6?, 0xaf3305?}, {0xc00066ca40?, 0x1c949c0?, 0x7fcc51dbfa00?})

This happened for the first time 2 days ago. We solved it in the end with what feels like a very hacky solution: deleting the w directory from the PVC associated with the failing alpha (after backing it up, inspecting the files, and getting comfortable with any data loss).

A quick snapshot of the w folder for the failing alpha is here:

/pvc/w # ls -l
total 137664
-rw-r--r--    1 ***     ***     268435456 Feb 28 19:35 00930.wal
-rw-r--r--    1 ***     ***       1048576 Mar  1 09:45 wal.meta

Could someone a) explain more of the theory of what's going on, b) say what kind of data loss one would expect from deleting `w`, and c) suggest any way to read the WAL files? When I inspect the WAL file I can see which objects it references, but the data itself is gibberish.

What I did previously

Deleted the w directory from the failing alpha's PVC (after backing it up and inspecting the files), as described above, which brought the alpha back up.

Dgraph metadata

docker.io/dgraph/dgraph:v22.0.2

Hi @zooney,

For (a), we're still investigating, but the error seems to be coming from etcd/raft, the package we use for Raft consensus (raft/raft.go at d9907d6ac6baaebc3c9fd4e67acaa4154d2b3cd3 · etcd-io/raft · GitHub).
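
For reference, the check that fires looks roughly like the snippet below (paraphrased from etcd/raft's raft.go; the linked commit is the authoritative version, and r, raftLog, etc. are that package's internals). It runs when a node restarts and reloads its persisted HardState, and it verifies that the recorded commit index falls between what the log already considers committed (on a fresh restart, the snapshot index) and the last entry actually found on disk. Reading your log against it: the lower bound 22163404 matches the snapshot index at node.go:312, the upper bound 22171268 is the last entry present in the w directory, and the HardState at node.go:323 says 22171269 was committed, so the on-disk log appears to end one entry short of what Raft recorded as committed.

// Paraphrased from etcd/raft's raft.go; not guaranteed to match the
// linked commit line-for-line.
func (r *raft) loadState(state pb.HardState) {
	if state.Commit < r.raftLog.committed || state.Commit > r.raftLog.lastIndex() {
		r.logger.Panicf("%x state.commit %d is out of range [%d, %d]",
			r.id, state.Commit, r.raftLog.committed, r.raftLog.lastIndex())
	}
	r.raftLog.committed = state.Commit
	r.Term = state.Term
	r.Vote = state.Vote
}

A partially written or truncated tail of 00930.wal would produce exactly this kind of off-by-one mismatch, but that is speculation until we know more.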

For (b), the w dir holds Raft's write-ahead log (we use Raft to replicate and agree on data in a multi-node setup). It persists Raft entries to disk every time a node receives a write (if it's a leader) or a replication message from the leader (if it's a follower).
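
To make that concrete, here is the usual shape of an etcd/raft driving loop. This is a minimal sketch, not Dgraph's actual code in draft.go: the wal interface, the send and apply callbacks, the package name, and the go.etcd.io/etcd/raft/v3 import path are assumptions for illustration only.

package raftexample

import (
	"time"

	"go.etcd.io/etcd/raft/v3"
	pb "go.etcd.io/etcd/raft/v3/raftpb"
)

// wal stands in for the storage behind the "w" directory (hypothetical
// interface; Dgraph's real implementation lives in its raftwal package).
type wal interface {
	Save(hs pb.HardState, ents []pb.Entry) error
	SaveSnapshot(snap pb.Snapshot) error
}

// runRaftLoop drives a raft.Node: on every Ready, the HardState (term, vote,
// commit index) and any new entries are persisted to the write-ahead log
// before messages are sent to peers or committed entries are applied.
func runRaftLoop(n raft.Node, w wal, send func([]pb.Message), apply func([]pb.Entry)) {
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			n.Tick()
		case rd := <-n.Ready():
			// Durability step: this is what fills the w directory.
			if err := w.Save(rd.HardState, rd.Entries); err != nil {
				panic(err)
			}
			if !raft.IsEmptySnap(rd.Snapshot) {
				if err := w.SaveSnapshot(rd.Snapshot); err != nil {
					panic(err)
				}
			}
			send(rd.Messages)          // replicate to followers / ack the leader
			apply(rd.CommittedEntries) // apply to the state machine (Badger, the p dir)
			n.Advance()
		}
	}
}

Dgraph's real loop does more than this (proposals, snapshots, checksums), but the persist-before-apply ordering is the part that matters here.
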

So if you delete w and restart alpha, the data loss that you can expect are from writes that are not yet committed by zero and thus not written to disk yet. Because by that point it would’ve only been written in-memory and to the raft write ahead log. The in-memory data will be gone on restart since it’s not persisted yet while the raft write ahead log will be gone if you delete the w dir.