Alpha stuck at Raft.Ready took too long to process

Moved from GitHub dgraph/5946

Posted by jarifibrahim:

What version of Dgraph are you using?

https://github.com/dgraph-io/dgraph/commit/156bc23bca4ef941ed5b1e84638961764bd59f27

Have you tried reproducing the issue with the latest release?

Yes, master is on fix(dgraph): Fix snapshot calculation in ludicrous mode (#5585) · dgraph-io/dgraph@156bc23 · GitHub

Steps to reproduce the issue (command/config used to run Dgraph).

Run a 6-node Dgraph cluster and run flock against it.
Flock had been running for a week before it ran into this issue.

Expected behaviour and actual result.

I expected dgraph to work normally.

The actual result was that dgraph got stuck logging "Raft.Ready took too long to process" warnings forever. For the first few days, active mutations were failing. Later I stopped the mutations, hoping the cluster would recover, but it never did.

alpha3    | W0712 13:39:31.410790      19 draft.go:1222] Raft.Ready took too long to process: Timer Total: 984ms. Breakdown: [{advance 983ms} {disk 1ms} {proposals 0s}] Num entries: 0. MustSync: false
alpha3    | W0712 13:39:31.882951      19 draft.go:1222] Raft.Ready took too long to process: Timer Total: 472ms. Breakdown: [{advance 470ms} {disk 1ms} {proposals 0s}] Num entries: 1. MustSync: true
alpha1    | W0712 13:39:33.340034      19 draft.go:1222] Raft.Ready took too long to process: Timer Total: 563ms. Breakdown: [{advance 562ms} {disk 1ms} {proposals 0s}] Num entries: 0. MustSync: false
alpha3    | W0712 13:39:33.522478      19 draft.go:1222] Raft.Ready took too long to process: Timer Total: 1.035s. Breakdown: [{advance 1.034s} {disk 1ms} {proposals 0s}] Num entries: 0. MustSync: false
alpha3    | W0712 13:39:34.005682      19 draft.go:1222] Raft.Ready took too long to process: Timer Total: 483ms. Breakdown: [{advance 483ms} {disk 0s} {proposals 0s}] Num entries: 0. MustSync: false
alpha1    | W0712 13:39:34.553391      19 draft.go:1222] Raft.Ready took too long to process: Timer Total: 563ms. Breakdown: [{advance 562ms} {disk 0s} {proposals 0s}] Num entries: 0. MustSync: false
alpha1    | W0712 13:39:35.965456      19 draft.go:1222] Raft.Ready took too long to process: Timer Total: 856ms. Breakdown: [{advance 855ms} {disk 0s} {proposals 0s}] Num entries: 0. MustSync: false
alpha3    | W0712 13:39:36.064111      19 draft.go:1222] Raft.Ready took too long to process: Timer Total: 1.509s. Breakdown: [{advance 1.509s} {disk 0s} {proposals 0s}] Num entries: 0. MustSync: false
alpha1    | W0712 13:39:36.561155      19 draft.go:1222] Raft.Ready took too long to process: Timer Total: 596ms. Breakdown: [{advance 595ms} {disk 0s} {proposals 0s}] Num entries: 0. MustSync: false
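
In every warning above, the advance phase accounts for nearly all of the total time, while disk and proposals stay at or near zero. To check whether that holds across the full logs, here is a minimal Python sketch that tallies the dominant phase per node. It assumes the exact line format shown in the excerpt (other Dgraph versions may log differently), and the function names are hypothetical helpers, not part of Dgraph.

```python
import re
from collections import Counter, defaultdict

# Matches the warning format seen in the excerpt above (an assumption about
# the log layout, including the docker-compose style "node | " prefix).
READY_RE = re.compile(
    r"^(?P<node>\S+)\s*\|.*Raft\.Ready took too long to process: "
    r"Timer Total: (?P<total>\S+)\. Breakdown: \[(?P<breakdown>.*?)\] "
    r"Num entries: (?P<entries>\d+)\. MustSync: (?P<sync>true|false)"
)
PHASE_RE = re.compile(r"\{(\w+) ([^}]+)\}")
DURATION_RE = re.compile(r"^([\d.]+)(ms|us|µs|s)$")
TO_MS = {"s": 1000.0, "ms": 1.0, "us": 0.001, "µs": 0.001}


def to_ms(text):
    """Convert a simple duration such as '984ms' or '1.035s' to milliseconds.

    Compound durations (for example '1m2.5s') are not handled and raise.
    """
    m = DURATION_RE.match(text)
    if m is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return round(float(m.group(1)) * TO_MS[m.group(2)], 3)


def parse_ready_line(line):
    """Return a dict for one warning line, or None if the line does not match."""
    m = READY_RE.search(line)
    if m is None:
        return None
    phases = {name: to_ms(value) for name, value in PHASE_RE.findall(m.group("breakdown"))}
    return {
        "node": m.group("node"),
        "total_ms": to_ms(m.group("total")),
        "phases": phases,
        "entries": int(m.group("entries")),
        "must_sync": m.group("sync") == "true",
    }


def summarize(lines):
    """Per node: warning count, worst total, and how often each phase dominated."""
    stats = defaultdict(lambda: {"count": 0, "max_total_ms": 0.0, "dominant": Counter()})
    for line in lines:
        rec = parse_ready_line(line)
        if rec is None or not rec["phases"]:
            continue
        s = stats[rec["node"]]
        s["count"] += 1
        s["max_total_ms"] = max(s["max_total_ms"], rec["total_ms"])
        # The phase with the largest share of this Ready cycle.
        s["dominant"][max(rec["phases"], key=rec["phases"].get)] += 1
    return dict(stats)
```

Calling summarize on an open log file (any iterable of lines works) gives one entry per alpha, which makes it easy to see whether advance is the slow phase on every node or only on some.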

See logs here (truncated logs) - Flock raft.ready stuck · GitHub
