Logs from failed blockade test


(Lucas Wang) #1

I0308 17:44:40.660619 1 groups.go:826] Committed: 60052 -> 60053
I0308 17:44:40.660650 1 groups.go:826] Committed: 60054 -> 60056
I0308 17:44:40.660682 1 groups.go:826] Committed: 60059 -> 60060
I0308 17:44:40.660706 1 groups.go:826] Committed: 60061 -> 60062
I0308 17:44:40.660729 1 groups.go:826] Committed: 60063 -> 60065
I0308 17:44:40.660762 1 groups.go:826] Committed: 60066 -> 60067
I0308 17:44:40.660785 1 groups.go:826] Committed: 60068 -> 60070
I0308 17:44:40.682429 1 groups.go:266] group 1 checksum: 6533378544029036162
I0308 17:44:40.718330 1 node.go:83] 1 [term: 18] received a MsgAppResp message with higher term from 2 [term: 19]
I0308 17:44:40.718389 1 node.go:83] 1 became follower at term 19
I0308 17:44:40.718407 1 node.go:83] raft.node: 1 lost leader 1 at term 19
I0308 17:44:40.797864 1 node.go:83] 1 [logterm: 18, index: 484, vote: 0] rejected MsgPreVote from 3 [logterm: 18, index: 482] at term 19
I0308 17:44:40.798100 1 node.go:83] 1 [logterm: 18, index: 484, vote: 0] rejected MsgVote from 3 [logterm: 18, index: 482] at term 19
I0308 17:44:40.798339 1 node.go:83] found conflict at index 483 [existing term: 18, conflicting term: 19]
I0308 17:44:40.798369 1 node.go:83] replace the unstable entries from index 483
I0308 17:44:40.798394 1 node.go:83] raft.node: 1 elected leader 3 at term 19
2019/03/08 17:44:40 unexpected error (requested entry at index is unavailable)
panic: unexpected error (requested entry at index is unavailable)

goroutine 227 [running]:
log.Panicf(0x153c32f, 0x15, 0xc0132c7650, 0x1, 0x1)
/usr/local/go/src/log/log.go:340 +0xc0
github.com/dgraph-io/dgraph/conn.(*ToGlog).Panicf(0x219e0d8, 0x153c32f, 0x15, 0xc0132c7650, 0x1, 0x1)
/home/lucas/go/src/github.com/dgraph-io/dgraph/conn/node.go:89 +0x53
github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.(*raftLog).zeroTermOnErrCompacted(…)
/home/lucas/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/log.go:368
github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.(*raftLog).findConflict(0xc000276460, 0xc009c43b00, 0xe, 0x10, 0x0)
/home/lucas/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/log.go:133 +0x27f
github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.(*raftLog).maybeAppend(0xc000276460, 0x1e3, 0x13, 0x1f1, 0xc009c43b00, 0xe, 0x10, 0x7fed8c435fff, 0x7fed983e6ca8)
/home/lucas/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/log.go:91 +0xa1
github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.(*raft).handleAppendEntries(0xc0093c2000, 0x3, 0x1, 0x3, 0x13, 0x13, 0x1e3, 0xc009c43b00, 0xe, 0x10, …)
/home/lucas/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/raft.go:1314 +0xa7
github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.stepFollower(0xc0093c2000, 0x3, 0x1, 0x3, 0x13, 0x13, 0x1e3, 0xc009c43b00, 0xe, 0x10, …)
/home/lucas/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/raft.go:1265 +0x3fb
github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.(*raft).Step(0xc0093c2000, 0x3, 0x1, 0x3, 0x13, 0x13, 0x1e3, 0xc009c43b00, 0xe, 0x10, …)
/home/lucas/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/raft.go:971 +0x1396
github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.(*node).run(0xc0005ec480, 0xc0093c2000)
/home/lucas/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/node.go:357 +0x10e0
created by github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.RestartNode
/home/lucas/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/node.go:246 +0x31c


(Manish R Jain) #2

I was able to see this as well. After a snapshot from leader, the follower just got stuck, receiving nothing for a long time. To get it out of that state, I partitioned the leader, which then immediately caused this panic in the follower.

2019/03/09 02:54:23 slice[396,396) out of bound [396,393]
panic: slice[396,396) out of bound [396,393]

goroutine 326 [running]:
log.Panicf(0x14c48be, 0x21, 0xc438613400, 0x4, 0x4)
	/usr/lib/go-1.10/src/log/log.go:333 +0xda
github.com/dgraph-io/dgraph/conn.(*ToGlog).Panicf(0x1ba3d40, 0x14c48be, 0x21, 0xc438613400, 0x4, 0x4)
	/home/mrjn/go/src/github.com/dgraph-io/dgraph/conn/node.go:89 +0x53
github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.(*raftLog).mustCheckOutOfBounds(0xc4201c48c0, 0x18c, 0x18c, 0x0, 0x0)
	/home/mrjn/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/log.go:356 +0x24b
github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.(*raftLog).slice(0xc4201c48c0, 0x18c, 0x18c, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0)
	/home/mrjn/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/log.go:307 +0x5a
github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.(*raft).Step(0xc4203c8000, 0x0, 0x0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/home/mrjn/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/raft.go:913 +0xa4
github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.(*raft).tickElection(0xc4203c8000)
	/home/mrjn/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/raft.go:676 +0x102
github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.(*raft).(github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.tickElection)-fm()
	/home/mrjn/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/raft.go:709 +0x2a
github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.(*node).run(0xc4204c8f00, 0xc4203c8000)
	/home/mrjn/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/node.go:392 +0xbc4
created by github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft.RestartNode
	/home/mrjn/go/src/github.com/dgraph-io/dgraph/vendor/go.etcd.io/etcd/raft/node.go:246 +0x321


(Manish R Jain) #3

Found the issue. This was being caused by an incorrect cache update in Raft storage for lastIndex.