Please help me: My cluster is blocked every 5 minutes

Version

[Decoder]: Using assembly version of decoder

Dgraph version   : v20.07.0
Dgraph codename  : shuri
Dgraph SHA-256   : 4cd320fc6eab163ab68602a5122a6c82c8467c2ed5ac93478d5f40d44eec71c4
Commit SHA-1     : d65e20530
Commit timestamp : 2020-07-28 15:31:37 -0700
Branch           : HEAD
Go version       : go1.14.4

For Dgraph official documentation, visit https://dgraph.io/docs/.
For discussions about Dgraph     , visit http://discuss.dgraph.io.
To say hi to the community       , visit https://dgraph.slack.com.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2020 Dgraph Labs, Inc.

Problem

In my cluster, the last_collect_time predicate (tablet) takes up a lot of space:

Groups sorted by size: [{gid:1 size:43091109854} {gid:3 size:43282091840} {gid:2 size:73910000764}]


The following appears in my logs every 5 minutes:

# in zero log
I0924 17:45:38.602550    2669 zero.go:706] Tablet: buy does not belong to group: 2. Sending delete instruction.
W0924 17:45:48.601989    2669 zero.go:670] While deleting predicates: rpc error: code = Canceled desc = context canceled

# in alpha log
I0924 17:40:53.961989    2693 schema.go:103] Deleting schema for predicate: [buy]
I0924 17:45:38.609925    2693 index.go:1238] Dropping predicate: [buy]
I0924 17:45:38.609947    2693 log.go:34] DropPrefix Called
I0924 17:45:38.609959    2693 log.go:34] Writes flushed. Stopping compactions now...
I0924 17:45:38.618228    2693 log.go:34] Dropping prefix at level 2 (1 tableGroups)
I0924 17:45:38.623543    2693 log.go:34] LOG Compact 2->2, del 1 tables, add 1 tables, took 5.293786ms
I0924 17:45:38.623564    2693 log.go:34] Dropping prefix at level 1 (1 tableGroups)
I0924 17:45:46.155157    2693 log.go:34] LOG Compact 1->1, del 1 tables, add 1 tables, took 7.531575833s
I0924 17:45:46.155190    2693 log.go:34] Got compaction priority: {level:0 score:1.74 dropPrefixes:[[0 0 3 98 117 121] [33 98 97 100 103 101 114 33 109 111 118 101 0 0 3 98 117 121]]}
I0924 17:45:46.155209    2693 log.go:34] Running for level: 0
I0924 17:45:53.775667    2693 log.go:34] LOG Compact 0->1, del 2 tables, add 1 tables, took 7.620436724s
I0924 17:45:53.775706    2693 log.go:34] Compaction for level: 0 DONE
I0924 17:45:53.775728    2693 log.go:34] DropPrefix done
I0924 17:45:53.775734    2693 log.go:34] Resuming writes
I0924 17:45:53.775748    2693 schema.go:103] Deleting schema for predicate: [buy]
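
For readers wondering what the dropPrefixes byte arrays in the compaction line refer to, here is a minimal sketch that renders them as text. The byte values are copied from the log above. The reading of the result is an assumption on my part: the leading `\x00\x00\x03` looks like a key-type byte followed by a two-byte length of the predicate name, and `!badger!move` looks like Badger's internal prefix for moved keys.

```python
# Byte arrays copied from the "Got compaction priority" log line above.
drop_prefixes = [
    [0, 0, 3, 98, 117, 121],
    [33, 98, 97, 100, 103, 101, 114, 33, 109, 111, 118, 101, 0, 0, 3, 98, 117, 121],
]


def render(prefix):
    # Printable ASCII is shown as-is, every other byte as a \xNN escape.
    return "".join(chr(b) if 32 <= b < 127 else f"\\x{b:02x}" for b in prefix)


decoded = [render(p) for p in drop_prefixes]
for text in decoded:
    print(text)
# \x00\x00\x03buy
# !badger!move\x00\x00\x03buy
```

Both prefixes end in "buy", so the compaction seems to be limited to keys of the predicate being dropped.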

Each drop blocks my cluster for a dozen seconds or so (about 15 seconds between "DropPrefix Called" and "Resuming writes" in the log above). I tried moving the buy tablet to group 2, but Zero still moves it to another group and then keeps dropping the predicate every 5 minutes:

# in zero log
Groups sorted by size: [{gid:1 size:42856229273} {gid:3 size:43282091840} {gid:2 size:74144881345}]

I0924 16:53:41.388683    2669 tablet.go:213] size_diff 31288652072
I0924 16:53:41.389454    2669 tablet.go:108] Going to move predicate: [buy], size: [235 MB] from group 2 to 1
I0924 16:53:41.389536    2669 tablet.go:135] Starting move: predicate:"buy" source_gid:2 dest_gid:1 txn_ts:5480274 
I0924 16:53:46.141297    2669 tablet.go:150] Move at Alpha done. Now proposing: tablet:<group_id:1 predicate:"buy" force:true space:234880581 move_ts:5480274 > 
I0924 16:53:46.143362    2669 tablet.go:156] Predicate move done for: [buy] from group 2 to 1
I0924 16:55:38.602391    2669 zero.go:706] Tablet: buy does not belong to group: 2. Sending delete instruction.
W0924 16:55:48.601753    2669 zero.go:670] While deleting predicates: rpc error: code = Canceled desc = context canceled
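
As a quick check on the numbers in the zero log, the sketch below recomputes size_diff from the "Groups sorted by size" line and applies the move of buy. That size_diff is the largest group minus the smallest is inferred from the figures matching, not taken from the Dgraph source.

```python
import re

# Copied from the zero log above (sizes just before "buy" was moved).
line = ("Groups sorted by size: [{gid:1 size:42856229273} "
        "{gid:3 size:43282091840} {gid:2 size:74144881345}]")

sizes = {int(g): int(s) for g, s in re.findall(r"gid:(\d+) size:(\d+)", line)}
size_diff = max(sizes.values()) - min(sizes.values())
print(size_diff)  # 31288652072, the same value as the logged size_diff

# "space" field of the proposed tablet, roughly 235 MB.
buy = 234880581

# Apply the logged move: buy leaves group 2 and lands in group 1.
after = dict(sizes)
after[2] -= buy
after[1] += buy
diff_after = max(after.values()) - min(after.values())
print(after, diff_after)
```

The sizes after the move match the ones reported at the top of this post, and the gap between the largest and smallest group only shrinks from about 31.3 GB to about 30.8 GB, so moving a 235 MB predicate barely changes the imbalance.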

# in alpha log
I0924 16:55:38.609622    2693 index.go:1238] Dropping predicate: [buy]
I0924 16:55:38.609646    2693 log.go:34] DropPrefix Called
I0924 16:55:38.609655    2693 log.go:34] Writes flushed. Stopping compactions now...
I0924 16:55:38.616727    2693 log.go:34] Dropping prefix at level 2 (1 tableGroups)
I0924 16:55:38.620798    2693 log.go:34] LOG Compact 2->2, del 1 tables, add 1 tables, took 4.05084ms
I0924 16:55:38.620816    2693 log.go:34] Dropping prefix at level 1 (1 tableGroups)
I0924 16:55:46.017755    2693 log.go:34] LOG Compact 1->1, del 1 tables, add 1 tables, took 7.396927672s
I0924 16:55:46.017782    2693 log.go:34] Got compaction priority: {level:0 score:1.74 dropPrefixes:[[0 0 3 98 117 121] [33 98 97 100 103 101 114 33 109 111 118 101 0 0 3 98 117 121]]}
I0924 16:55:46.017805    2693 log.go:34] Running for level: 0
I0924 16:55:53.811932    2693 log.go:34] LOG Compact 0->1, del 2 tables, add 1 tables, took 7.79410864s
I0924 16:55:53.811968    2693 log.go:34] Compaction for level: 0 DONE
I0924 16:55:53.811988    2693 log.go:34] DropPrefix done
I0924 16:55:53.811995    2693 log.go:34] Resuming writes
I0924 16:55:53.812005    2693 schema.go:103] Deleting schema for predicate: [buy]
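
To put a number on the stall, here is a small sketch that measures the time between "DropPrefix Called" and "Resuming writes". The two lines are copied from the alpha log above; treating that interval as the time writes are paused is my reading of the log messages, and the parser assumes both lines fall on the same day.

```python
from datetime import datetime

# Lines copied from the second alpha log excerpt above.
log = """\
I0924 16:55:38.609646    2693 log.go:34] DropPrefix Called
I0924 16:55:53.811995    2693 log.go:34] Resuming writes
"""


def stamp(line):
    # The second field of a glog line is the time of day, HH:MM:SS.micros.
    return datetime.strptime(line.split()[1], "%H:%M:%S.%f")


start = end = None
for line in log.splitlines():
    if "DropPrefix Called" in line:
        start = stamp(line)
    elif "Resuming writes" in line and start is not None:
        end = stamp(line)

stall = (end - start).total_seconds()
print(f"writes paused for {stall:.3f}s")  # writes paused for 15.202s
```

Feeding the full alpha log through the same loop (printing at each "Resuming writes") would show whether every 5-minute cycle costs about the same.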

How can I solve the problem?

You can try increasing the tablet (predicate) rebalance interval to a very large value, effectively infinity. That bypasses the balancing procedure hardcoded in the Zero instance. The balancing is based on disk usage, so also make sure disk usage is roughly even across the groups.
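
A minimal sketch of what that could look like, assuming the setting meant here is Zero's `--rebalance_interval` flag (a Go duration). There is no literal "infinity" value, so the ten-year interval below is only an illustrative choice, and the command shown omits whatever other flags your Zero already runs with.

```python
import shlex

# Assumption: an interval of about ten years means the periodic
# rebalance effectively never fires.
rebalance_interval = "87600h"

# Hypothetical invocation; add the flags your Zero normally uses.
zero_cmd = ["dgraph", "zero", f"--rebalance_interval={rebalance_interval}"]
print(shlex.join(zero_cmd))  # dgraph zero --rebalance_interval=87600h
```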

@MichelDiz Thanks.

I still have some questions. This link mentions that read requests are not blocked. From which version is this supported? And how can I trigger a log compaction online?