Error on zero startup: "ZeroProposal: illegal tag 0 (wire type 0)"

Report a Dgraph Bug

Not sure if this is a bug or user error, but I accidentally used the latest image tag on my 3-node Kubernetes deployment. Yesterday it upgraded from v20.07.2 to v20.11.0 and hit errors about the version number being 7 rather than 8. I understand that's down to storage compatibility, and it isn't the issue I'm reporting.

I set both the alpha and zero StatefulSets to 0 replicas, downgraded them back to v20.07.2, and then scaled the zero StatefulSet back up to 3 replicas. Now the first zero fails on startup. Here are its logs:

 Dgraph version   : v20.07.2
 Dgraph codename  : shuri-2
 Dgraph SHA-256   : a927845127dab735c24727d5a24af411168771b55236aec50f0b987e8c0ac910
 Commit SHA-1     : a7bc16d56
 Commit timestamp : 2020-10-22 10:17:53 -0700
 Branch           : HEAD
 Go version       : go1.14.4
 I0123 21:51:45.158537      19 run.go:127] Setting up grpc listener at:
 I0123 21:51:45.159438      19 run.go:127] Setting up http listener at:
 I0123 21:51:45.160751      19 run.go:298] Opening zero BadgerDB with options: {Dir:zw ValueDir:zw SyncWrites:false TableLoadingMode:2 ValueLogLoadingMode:2 NumVersionsToKeep:1 ReadOnly:false Truncate:true Logger:0xc000
 badger 2021/01/23 21:51:45 INFO: All 1 tables opened in 5ms
 badger 2021/01/23 21:51:45 INFO: Replaying file id: 6 at offset: 42785522
 badger 2021/01/23 21:51:45 INFO: Replay took: 4.921µs
 I0123 21:51:45.208055      19 node.go:149] Setting raft.Config to: &{ID:1 peers:[] learners:[] ElectionTick:20 HeartbeatTick:1 Storage:0xc000112050 Applied:1444061 MaxSizePerMsg:262144 MaxCommittedSizePerReady:67108864
 I0123 21:51:45.208368      19 node.go:307] Found Snapshot.Metadata: {ConfState:{Nodes:[1 2 3] Learners:[] XXX_unrecognized:[]} Index:1444061 Term:3 XXX_unrecognized:[]}
 I0123 21:51:45.208468      19 node.go:318] Found hardstate: {Term:7 Vote:0 Commit:1445344 XXX_unrecognized:[]}
 I0123 21:51:45.209336      19 node.go:327] Group 0 found 1283 entries
 I0123 21:51:45.209362      19 raft.go:515] Restarting node for dgraphzero
 I0123 21:51:45.209427      19 node.go:186] Setting conf state to nodes:1 nodes:2 nodes:3
 I0123 21:51:45.209593      19 pool.go:160] CONNECTING to dgraph-alpha-0.dgraph-alpha.default.svc.cluster.local:7080
 I0123 21:51:45.209619      19 pool.go:160] CONNECTING to dgraph-alpha-1.dgraph-alpha.default.svc.cluster.local:7080
 I0123 21:51:45.209642      19 pool.go:160] CONNECTING to dgraph-alpha-2.dgraph-alpha.default.svc.cluster.local:7080
 I0123 21:51:45.209665      19 pool.go:160] CONNECTING to dgraph-zero-1.dgraph-zero.default.svc.cluster.local:5080
 I0123 21:51:45.209724      19 pool.go:160] CONNECTING to dgraph-zero-2.dgraph-zero.default.svc.cluster.local:5080
 I0123 21:51:45.210211      19 log.go:34] 1 became follower at term 7
 I0123 21:51:45.210517      19 log.go:34] newRaft 1 [peers: [1,2,3], term: 7, commit: 1445344, applied: 1444061, lastindex: 1445344, lastterm: 7]
 [Sentry] 2021/01/23 21:51:45 Sending fatal event [1118306852724b18a70f595e9e64d2cc] to project: 1805390
 2021/01/23 21:51:45 proto: ZeroProposal: illegal tag 0 (wire type 0)
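
For anyone reading the last log line: protobuf field numbers start at 1, so a decoder reports "illegal tag 0 (wire type 0)" when the field key it reads decodes to 0, for example a zero byte where a message should begin. That fits the bytes of a raft entry not being a valid ZeroProposal, but it doesn't say why. A minimal Python sketch of the key decoding follows; decode_key is a hypothetical helper written for illustration, not Dgraph or protobuf library code.

```python
from typing import Tuple


def decode_key(buf: bytes) -> Tuple[int, int]:
    """Decode the first protobuf field key (a varint) into (field_number, wire_type).

    Hypothetical helper for illustration only; it mimics the check that
    produces the "illegal tag" message, not any real library's API.
    """
    key = 0
    shift = 0
    for b in buf:
        key |= (b & 0x7F) << shift   # low 7 bits carry the payload
        if b < 0x80:                 # high bit clear: last byte of the varint
            break
        shift += 7
    else:
        # Loop ended without finding a terminating byte (or buf was empty).
        raise ValueError("truncated varint")

    field_number = key >> 3          # upper bits: field number (the "tag")
    wire_type = key & 0x7            # lower 3 bits: wire type
    if field_number <= 0:
        raise ValueError(f"illegal tag {field_number} (wire type {wire_type})")
    return field_number, wire_type


if __name__ == "__main__":
    print(decode_key(b"\x08\x01"))   # a valid key: field 1, varint wire type
    try:
        decode_key(b"\x00")          # a zero byte where a key is expected
    except ValueError as err:
        print(err)
```

A zero byte in the key position is enough to trigger the message, which is why a zero-filled or differently encoded entry would produce this exact error.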

What is the hardware spec (RAM, OS)?

Each zero/alpha runs on a node with 4 CPUs and 16 GB of RAM.

Expected behaviour and actual result.

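Expected: I can downgrade back to v20.07.2. Actual: the first zero fails at startup with the ZeroProposal error shown at the end of the logs above.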

Also, I made the same mistake of using latest on 5 different clusters. Downgrading with the steps described above worked on 4 of them; only 1 is getting this error.