Error: A tick missed to fire. Node blocks too long!. How to resolve?

Valdanito · September 14, 2020, 4:35am

I have two groups, each with three alphas.

Each alpha in group 0 has 1.4T data and each alpha in group 1 has 1.1T data.

After rebalancing, one of the alphas shrank to 338GB, while the other alphas are still at 1.4T.

The 338GB alpha keeps logging the error "A tick missed to fire. Node blocks too long!".

Can I stop this alpha and copy the data from another alpha in the same group into it?

My Dgraph version: v20.07

aman-bansal · September 14, 2020, 5:51am

@Valdanito can you please share alpha and zero logs for further investigation? It would be really helpful in understanding what went wrong.

ibrahim · September 14, 2020, 9:18am

This happens when ticks are produced faster than the node can process them. All Raft operations depend on the ticker.

https://github.com/dgraph-io/dgraph/blob/1e179c175e7da3e4f18507feca84c7aa955d913c/worker/draft.go#L1041-L1048

    case <-ticker.C:
        n.Raft().Tick()

    case rd := <-n.Raft().Ready():
        timer.Start()
        _, span := otrace.StartSpan(n.ctx, "Alpha.RunLoop",
            otrace.WithSampler(otrace.ProbabilitySampler(0.001)))
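
To make "more ticks than we can process" concrete, here is a minimal toy model in Go. It is not Dgraph's or etcd's actual code: the names `node`, `newNode`, `Tick`, and `processOne` are made up for illustration, and the buffer size of 2 is arbitrary. It assumes ticks are queued on a bounded channel and that a tick is dropped, with a warning, when the queue is full because the processing loop is busy.

```go
package main

import "fmt"

// node is a toy stand-in for a Raft node (hypothetical type, for illustration).
// Ticks are queued on a buffered channel and drained by a processing loop.
type node struct {
	tickc chan struct{}
}

// newNode creates a node whose tick queue holds at most buf pending ticks.
func newNode(buf int) *node {
	return &node{tickc: make(chan struct{}, buf)}
}

// Tick queues one tick without blocking. It returns false when the queue
// is full, which is the situation the warning in this thread describes.
func (n *node) Tick() bool {
	select {
	case n.tickc <- struct{}{}:
		return true
	default:
		return false
	}
}

// processOne drains a single queued tick, as the run loop would.
// It returns false if no tick was pending.
func (n *node) processOne() bool {
	select {
	case <-n.tickc:
		return true
	default:
		return false
	}
}

func main() {
	n := newNode(2) // arbitrary small queue so it fills quickly

	// The loop that should drain ticks is "busy", so nothing is consumed.
	for i := 1; i <= 3; i++ {
		if !n.Tick() {
			fmt.Printf("tick %d missed: node blocks too long\n", i)
		}
	}

	// Once the loop catches up, new ticks are accepted again.
	n.processOne()
	fmt.Println("tick accepted after draining:", n.Tick())
}
```

In this model the warning is a symptom rather than the cause: something else (slow disk writes, a long-running `Ready()` batch, and so on) is keeping the loop from draining ticks.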

@Valdanito How long has the cluster been running?
We’ve seen these messages in zero nodes. Can you confirm you’re seeing it in alpha nodes?

Each node has its own state. You shouldn’t copy over the data directory. You could add a new node to the group and the leader will stream the data to the new node (which is much better than copy-pasting the directory since the leader will stream only the valid data and it will clean up your disk too). This operation is generally safe but you should take a backup.
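
Before and after adding a node, it can help to check which alpha is the leader of each group, since that is the node that will stream the data. Zero serves cluster membership as JSON on its `/state` HTTP endpoint. The sketch below parses such a document in Go. The struct shape (`groups` → `members` → `addr`, `leader`) is an assumption about the response layout, and the `sample` document and addresses are hand-written for illustration, not real output; check them against what your own Zero returns.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// member, group, and state mirror only the fields this sketch needs.
// The field names are an assumption about the /state JSON layout.
type member struct {
	Addr   string `json:"addr"`
	Leader bool   `json:"leader"`
}

type group struct {
	Members map[string]member `json:"members"`
}

type state struct {
	Groups map[string]group `json:"groups"`
}

// leaders returns a map from group ID to the address of that group's leader.
// Groups with no member marked as leader are left out of the result.
func leaders(raw []byte) (map[string]string, error) {
	var s state
	if err := json.Unmarshal(raw, &s); err != nil {
		return nil, err
	}
	out := make(map[string]string)
	for gid, g := range s.Groups {
		for _, m := range g.Members {
			if m.Leader {
				out[gid] = m.Addr
			}
		}
	}
	return out, nil
}

// sample is a hand-written, illustrative document, not real cluster output.
const sample = `{
  "groups": {
    "1": {"members": {
      "1": {"addr": "alpha1:7080", "leader": true},
      "2": {"addr": "alpha2:7080"}
    }},
    "2": {"members": {
      "4": {"addr": "alpha4:7080", "leader": true}
    }}
  }
}`

func main() {
	l, err := leaders([]byte(sample))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for gid, addr := range l {
		fmt.Printf("group %s leader: %s\n", gid, addr)
	}
}
```

In practice you would fetch the document from Zero's HTTP port and pass the response body to `leaders` instead of using `sample`.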

Also see
https://github.com/etcd-io/etcd/issues/9939
https://github.com/dgraph-io/dgraph/issues/2541

Valdanito · September 14, 2020, 10:08am

Thank you. The error goes away after a while, and I found nothing useful in the logs, but the data size stays at 338GB.

Valdanito · September 14, 2020, 10:09am

Thank you very much!

porsche · January 18, 2022, 5:37am

I just hit this error.
Any clue to resolve this?

Cloud: AKS
dgraph version: v21.12.0
cluster config: one zero, 3 alpha
Alpha: one NVMe SSD disk 1.6 T; 48 GB RAM
Zero: two NVMe SSD disks each 1.6 T, 128 GB RAM

I hit this blocking error after running the Live Loader for about 6 hours.
