Batch mutations are very slow

What version of Dgraph are you using?

20.07.02
java client: 20.03.2

Have you tried reproducing the issue with the latest release?

This is the latest release

What is the hardware spec (RAM, OS)?

os: linux
cluster info : 3 machines with 140G memory and 16 cores

Steps to reproduce the issue (command/config used to run Dgraph).

zero

dgraph zero --my xxxx:5080 --replicas 3

alpha

dgraph alpha --whitelist 10.4.0.0:10.5.0.0 --lru_mb 42000 --my 10.4.31.22:7080 --zero 10.4.31.22:5080 --badger.vlog=disk --pending_proposals 32

I have 600 millions users and users relations will to insert dgraph. I have run my program 3 days. But I find mutations very slow. I use 5 threads and batch size 100 to upsert dgraph.
I finds some logs below.

A machine

I1120 09:47:41.694395    7160 log.go:34] 1 is starting a new election at term 606
I1120 09:47:41.694437    7160 log.go:34] 1 became pre-candidate at term 606
I1120 09:47:41.694447    7160 log.go:34] 1 received MsgPreVoteResp from 1 at term 606
I1120 09:47:41.694597    7160 log.go:34] 1 [logterm: 606, index: 2791343] sent MsgPreVote request to 2 at term 606
I1120 09:47:41.694705    7160 log.go:34] 1 [logterm: 606, index: 2791343] sent MsgPreVote request to 3 at term 606
I1120 09:47:41.694767    7160 log.go:34] raft.node: 1 lost leader 3 at term 606
I1120 09:47:41.762206    7160 log.go:34] 1 received MsgPreVoteResp from 2 at term 606
I1120 09:47:41.762257    7160 log.go:34] 1 [quorum:2] has received 2 MsgPreVoteResp votes and 0 vote rejections
I1120 09:47:41.762284    7160 log.go:34] 1 became candidate at term 607
I1120 09:47:41.762294    7160 log.go:34] 1 received MsgVoteResp from 1 at term 607
I1120 09:47:41.762469    7160 log.go:34] 1 [logterm: 606, index: 2791343] sent MsgVote request to 2 at term 607
I1120 09:47:41.762589    7160 log.go:34] 1 [logterm: 606, index: 2791343] sent MsgVote request to 3 at term 607
W1120 09:47:42.090461    7160 draft.go:1245] Raft.Ready took too long to process: Timer Total: 328ms. Breakdown: [{sync 9ms} {disk 0s} {proposals 0s} {advance 0s}] Num entries: 0. MustSync: true
I1120 09:47:42.114617    7160 log.go:34] 1 received MsgVoteResp from 2 at term 607
I1120 09:47:42.114911    7160 log.go:34] 1 [quorum:2] has received 2 MsgVoteResp votes and 0 vote rejections
I1120 09:47:42.115081    7160 log.go:34] 1 became leader at term 607
I1120 09:47:42.115675    7160 log.go:34] raft.node: 1 elected leader 1 at term 607
I1120 09:47:42.594558    7160 groups.go:888] Leader idx=0x1 of group=1 is connecting to Zero for txn updates
I1120 09:47:43.022847    7160 groups.go:900] Got Zero leader: 10.4.31.22:5080
W1120 09:47:50.196476    7160 draft.go:1245] Raft.Ready took too long to process: Timer Total: 202ms. Breakdown: [{disk 202ms} {proposals 0s} {advance 0s}] Num entries: 0. MustSync: false
W1120 09:48:24.337345    7160 draft.go:1245] Raft.Ready took too long to process: Timer Total: 346ms. Breakdown: [{proposals 342ms} {disk 0s} {advance 0s}] Num entries: 0. MustSync: false
I1120 09:48:29.668724    7160 log.go:34] 1 [logterm: 607, index: 2791418, vote: 1] rejected MsgPreVote from 3 [logterm: 607, index: 2791399] at term 607
W1120 09:48:37.431686    7160 draft.go:1245] Raft.Ready took too long to process: Timer Total: 525ms. Breakdown: [{disk 524ms} {sync 1ms} {proposals 0s} {advance 0s}] Num entries: 1. MustSync: true
I1120 09:48:38.657856    7160 groups.go:956] Zero leadership changed. Renewing oracle delta stream.
E1120 09:48:38.697401    7160 groups.go:932] Error in oracle delta stream. Error: rpc error: code = Canceled desc = context canceled
I1120 09:48:39.582348    7160 groups.go:888] Leader idx=0x1 of group=1 is connecting to Zero for txn updates
I1120 09:48:39.582391    7160 groups.go:900] Got Zero leader: 10.4.31.22:5080
I1120 09:48:45.909440    7160 log.go:34] 1 [logterm: 607, index: 2791454, vote: 1] cast MsgPreVote for 2 [logterm: 607, index: 2791454] at term 607
W1120 09:48:53.373391    7160 draft.go:1245] Raft.Ready took too long to process: Timer Total: 689ms. Breakdown: [{proposals 689ms} {disk 0s} {advance 0s}] Num entries: 0. MustSync: false
W1120 09:49:28.170497    7160 draft.go:1245] Raft.Ready took too long to process: Timer Total: 534ms. Breakdown: [{advance 461ms} {disk 73ms} {proposals 0s}] Num entries: 0. MustSync: false

B machine

W1120 09:47:03.353028    1889 draft.go:1245] Raft.Ready took too long to process: Timer Total: 299ms. Breakdown: [{disk 159ms} {proposals 139ms} {advance 0s}] Num entries: 0. MustSync: false
I1120 09:47:10.948928    1889 log.go:34] 2 [logterm: 605, index: 2791287, vote: 2] rejected MsgPreVote from 3 [logterm: 605, index: 2791285] at term 605
W1120 09:47:26.909217    1889 draft.go:1245] Raft.Ready took too long to process: Timer Total: 234ms. Breakdown: [{disk 234ms} {proposals 0s} {advance 0s}] Num entries: 0. MustSync: false
I1120 09:47:30.088970    1889 log.go:34] 2 [logterm: 605, index: 2791323, vote: 2] cast MsgPreVote for 3 [logterm: 605, index: 2791323] at term 605
I1120 09:47:30.089031    1889 log.go:34] 2 [term: 605] received a MsgVote message with higher term from 3 [term: 606]
I1120 09:47:30.089053    1889 log.go:34] 2 became follower at term 606
I1120 09:47:30.089272    1889 log.go:34] 2 [logterm: 605, index: 2791323, vote: 0] cast MsgVote for 3 [logterm: 605, index: 2791323] at term 606
I1120 09:47:30.089294    1889 log.go:34] raft.node: 2 lost leader 2 at term 606
I1120 09:47:30.089529    1889 log.go:34] raft.node: 2 elected leader 3 at term 606
E1120 09:47:30.294109    1889 groups.go:995] No longer the leader of group 1. Exiting
E1120 09:47:30.336331    1889 groups.go:932] Error in oracle delta stream. Error: rpc error: code = Canceled desc = context canceled
I1120 09:47:41.722897    1889 log.go:34] 2 [logterm: 606, index: 2791343, vote: 3] cast MsgPreVote for 1 [logterm: 606, index: 2791343] at term 606
I1120 09:47:42.057312    1889 log.go:34] 2 [term: 606] received a MsgVote message with higher term from 1 [term: 607]
I1120 09:47:42.057361    1889 log.go:34] 2 became follower at term 607
I1120 09:47:42.057510    1889 log.go:34] 2 [logterm: 606, index: 2791343, vote: 0] cast MsgVote for 1 [logterm: 606, index: 2791343] at term 607
I1120 09:47:42.057523    1889 log.go:34] raft.node: 2 lost leader 3 at term 607
I1120 09:47:42.166533    1889 log.go:34] raft.node: 2 elected leader 1 at term 607
W1120 09:48:01.292605    1889 draft.go:1245] Raft.Ready took too long to process: Timer Total: 700ms. Breakdown: [{disk 692ms} {sync 8ms} {proposals 0s} {advance 0s}] Num entries: 1. MustSync: true
I1120 09:48:29.668536    1889 log.go:34] 2 [logterm: 607, index: 2791418, vote: 1] rejected MsgPreVote from 3 [logterm: 607, index: 2791399] at term 607
W1120 09:48:42.081009    1889 draft.go:1245] Raft.Ready took too long to process: Timer Total: 378ms. Breakdown: [{proposals 338ms} {disk 40ms} {advance 0s}] Num entries: 0. MustSync: false
I1120 09:48:45.168502    1889 log.go:34] 2 is starting a new election at term 607
I1120 09:48:45.168544    1889 log.go:34] 2 became pre-candidate at term 607
I1120 09:48:45.168561    1889 log.go:34] 2 received MsgPreVoteResp from 2 at term 607
I1120 09:48:45.293819    1889 log.go:34] 2 [logterm: 607, index: 2791454] sent MsgPreVote request to 1 at term 607
I1120 09:48:45.293926    1889 log.go:34] 2 [logterm: 607, index: 2791454] sent MsgPreVote request to 3 at term 607
I1120 09:48:45.293984    1889 log.go:34] raft.node: 2 lost leader 1 at term 607
I1120 09:48:45.294093    1889 log.go:34] 2 became follower at term 607
I1120 09:48:45.294168    1889 log.go:34] raft.node: 2 elected leader 1 at term 607
W1120 09:48:57.260837    1889 draft.go:1245] Raft.Ready took too long to process: Timer Total: 344ms. Breakdown: [{disk 304ms} {sync 39ms} {proposals 0s} {advance 0s}] Num entries: 1. MustSync: true
W1120 09:49:16.897298    1889 draft.go:1245] Raft.Ready took too long to process: Timer Total: 306ms. Breakdown: [{advance 11ms} {disk 0s} {proposals 0s}] Num entries: 0. MustSync: false
W1120 09:49:26.763856    1889 draft.go:1245] Raft.Ready took too long to process: Timer Total: 218ms. Breakdown: [{proposals 203ms} {sync 15ms} {disk 0s} {advance 0s}] Num entries: 1. MustSync: true
W1120 09:50:06.468996    1889 draft.go:1245] Raft.Ready took too long to process: Timer Total: 227ms. Breakdown: [{disk 215ms} {sync 12ms} {proposals 0s} {advance 0s}] Num entries: 1. MustSync: true
I1120 09:50:18.471931    1889 draft.go:1545] Skipping snapshot at index: 2791442. Insufficient discard entries: 1. MinPendingStartTs: 3201118

C Machine

I1120 09:55:02.204864    3675 log.go:34] 3 [term: 609] ignored a MsgPreVoteResp message with lower term from 1 [term: 608]
I1120 09:55:02.204944    3675 log.go:34] 3 [term: 609] ignored a MsgVoteResp message with lower term from 1 [term: 608]
I1120 09:55:02.204985    3675 log.go:34] 3 [term: 609] ignored a MsgAppResp message with lower term from 1 [term: 608]
I1120 09:55:02.204997    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205049    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205067    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205097    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205109    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205146    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205158    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205200    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205213    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205243    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205254    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205296    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205310    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205348    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205360    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205397    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205409    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205448    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205460    3675 log.go:34] 3 [term: 609] ignored a MsgHeartbeatResp message with lower term from 1 [term: 608]
I1120 09:55:02.205677    3675 log.go:34] 3 [logterm: 609, index: 2792596, vote: 2] rejected MsgPreVote from 1 [logterm: 608, index: 2792593] at term 609
E1120 09:55:02.248389    3675 groups.go:995] No longer the leader of group 1. Exiting
E1120 09:55:02.250683    3675 groups.go:932] Error in oracle delta stream. Error: rpc error: code = Canceled desc = context canceled
W1120 09:55:19.835225    3675 draft.go:1245] Raft.Ready took too long to process: Timer Total: 649ms. Breakdown: [{disk 565ms} {sync 84ms} {proposals 0s} {advance 0s}] Num entries: 1. MustSync: true
I1120 09:55:25.305845    3675 log.go:34] 3 is starting a new election at term 609
I1120 09:55:25.305914    3675 log.go:34] 3 became pre-candidate at term 609
I1120 09:55:25.305925    3675 log.go:34] 3 received MsgPreVoteResp from 3 at term 609
I1120 09:55:25.306072    3675 log.go:34] 3 [logterm: 609, index: 2792638] sent MsgPreVote request to 2 at term 609
I1120 09:55:25.306159    3675 log.go:34] 3 [logterm: 609, index: 2792638] sent MsgPreVote request to 1 at term 609
I1120 09:55:25.306213    3675 log.go:34] raft.node: 3 lost leader 2 at term 609
I1120 09:55:25.721993    3675 log.go:34] 3 received MsgPreVoteResp from 1 at term 609
I1120 09:55:25.722050    3675 log.go:34] 3 [quorum:2] has received 2 MsgPreVoteResp votes and 0 vote rejections
I1120 09:55:25.722077    3675 log.go:34] 3 became candidate at term 610
I1120 09:55:25.722085    3675 log.go:34] 3 received MsgVoteResp from 3 at term 610
I1120 09:55:25.722193    3675 log.go:34] 3 [logterm: 609, index: 2792638] sent MsgVote request to 1 at term 610
I1120 09:55:25.722279    3675 log.go:34] 3 [logterm: 609, index: 2792638] sent MsgVote request to 2 at term 610
I1120 09:55:25.985965    3675 log.go:34] 3 received MsgVoteResp from 1 at term 610
I1120 09:55:25.986018    3675 log.go:34] 3 [quorum:2] has received 2 MsgVoteResp votes and 0 vote rejections
I1120 09:55:25.986048    3675 log.go:34] 3 became leader at term 610
I1120 09:55:25.986193    3675 log.go:34] raft.node: 3 elected leader 3 at term 610
W1120 09:55:30.034861    3675 draft.go:1245] Raft.Ready took too long to process: Timer Total: 3.973s. Breakdown: [{advance 3.749s} {proposals 216ms} {sync 8ms} {disk 0s}] Num entries: 1. MustSync: true
I1120 09:55:30.035166    3675 groups.go:888] Leader idx=0x3 of group=1 is connecting to Zero for txn updates
I1120 09:55:30.035182    3675 groups.go:900] Got Zero leader: 10.4.31.22:5080
I1120 09:55:30.038150    3675 log.go:34] 3 [logterm: 610, index: 2792639, vote: 3] cast MsgPreVote for 2 [logterm: 610, index: 2792639] at term 610
I1120 09:55:30.038189    3675 log.go:34] 3 [term: 610] received a MsgVote message with higher term from 2 [term: 611]
I1120 09:55:30.038211    3675 log.go:34] 3 became follower at term 611
I1120 09:55:30.038339    3675 log.go:34] 3 [logterm: 610, index: 2792639, vote: 0] cast MsgVote for 2 [logterm: 610, index: 2792639] at term 611
I1120 09:55:30.038360    3675 log.go:34] raft.node: 3 lost leader 3 at term 611
I1120 09:55:30.074023    3675 log.go:34] raft.node: 3 elected leader 2 at term 611
E1120 09:55:30.263403    3675 groups.go:995] No longer the leader of group 1. Exiting
E1120 09:55:30.263834    3675 groups.go:932] Error in oracle delta stream. Error: rpc error: code = Canceled desc = context canceled
W1120 09:55:30.350105    3675 draft.go:1245] Raft.Ready took too long to process: Timer Total: 312ms. Breakdown: [{sync 240ms} {disk 72ms} {proposals 0s} {advance 0s}] Num entries: 0. MustSync: true
W1120 09:56:15.523927    3675 draft.go:1245] Raft.Ready took too long to process: Timer Total: 261ms. Breakdown: [{disk 212ms} {sync 49ms} {proposals 0s} {advance 0s}] Num entries: 1. MustSync: true
W1120 09:57:04.805294    3675 draft.go:1245] Raft.Ready took too long to process: Timer Total: 207ms. Breakdown: [{sync 205ms} {proposals 1ms} {disk 0s} {advance 0s}] Num entries: 2. MustSync: true
I1120 09:58:21.062441    3675 log.go:34] Storing value log head: {Fid:1215 Len:29 Offset:9139140}
W1120 09:58:21.090172    3675 draft.go:1245] Raft.Ready took too long to process: Timer Total: 416ms. Breakdown: [{disk 416ms} {proposals 0s} {advance 0s}] Num entries: 0. MustSync: false
W1120 09:58:40.452997    3675 draft.go:1245] Raft.Ready took too long to process: Timer Total: 515ms. Breakdown: [{disk 510ms} {sync 5ms} {proposals 0s} {advance 0s}] Num entries: 1. MustSync: true
W1120 09:58:51.801853    3675 draft.go:1245] Raft.Ready took too long to process: Timer Total: 735ms. Breakdown: [{disk 726ms} {sync 9ms} {proposals 0s} {advance 0s}] Num entries: 1. MustSync: true
W1120 09:59:08.857891    3675 draft.go:1245] Raft.Ready took too long to process: Timer Total: 201ms. Breakdown: [{sync 201ms} {disk 0s} {proposals 0s} {advance 0s}] Num entries: 1. MustSync: true
W1120 09:59:27.014183    3675 draft.go:1245] Raft.Ready took too long to process: Timer Total: 2.028s. Breakdown: [{disk 2.024s} {sync 4ms} {proposals 0s} {advance 0s}] Num entries: 1. MustSync: true

from my program log I feel obvious pause. how can i do?

How many Alphas?

100 what? lines of RDF? 100 JSON objects? 100 entities in RDF?

@MichelDiz 3 alphas. 100 lines of RDF

Okay, thanks.

So, I personally would recommend the increasing of Alphas. Even in the same machines. Maybe 9 Alphas (so you have 3 groups) - so each machine would have 3 Alphas. After that, you could setup your client (Java client?) to distribute each transaction between these Alphas. Or you can use an LB (NGINX) here an example https://gist.github.com/MichelDiz/42954e321620159c872c35c20e9d85c6

That way I can guarantee that your mutations would be fine over time.

If you don’t wanna increase the number of Alphas, you can just try balancing at the client level.

@MichelDiz. Okay, thanks for your suggestions. But there have a question. Now our one machine with 16 cores and 128G RAM have only one Aplha. I have watched system load average is very higher. Used iostat shows.

[root@yzsjhl31-22 zhangpan]# iostat 1
Linux 3.10.0-1062.el7.x86_64 (yzsjhl31-22.opi.com) 	2020年11月20日 	_x86_64_	(16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          13.31    0.00    0.65    0.29    0.00   85.75

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              68.95        39.70      8979.12   22603521 5112307357
sdb               0.00         0.05         0.00      27969       2048

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          97.56    0.00    2.44    0.00    0.00    0.00

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               0.00         0.00         0.00          0          0
sdb               0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          97.25    0.00    2.44    0.00    0.00    0.31

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               0.00         0.00         0.00          0          0
sdb               0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          98.44    0.00    1.56    0.00    0.00    0.00

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               2.00         0.00       260.00          0        260
sdb               0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          92.50    0.00    7.31    0.00    0.00    0.19

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              59.00         0.00      1220.50          0       1220
sdb               0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          92.50    0.00    7.50    0.00    0.00    0.00

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               2.00         0.00         8.00          0          8
sdb               0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          90.25    0.00    9.75    0.00    0.00    0.00

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               0.00         0.00         0.00          0          0
sdb               0.00         0.00         0.00          0          0

top -H -p pid can shows

top - 11:35:17 up 6 days, 14:16,  1 user,  load average: 13.90, 41.06, 50.17
Threads: 137 total,   3 running, 134 sleeping,   0 stopped,   0 zombie
%Cpu(s): 18.8 us,  1.2 sy,  0.0 ni, 80.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 13182044+total, 61864784 free, 54971944 used, 14983712 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 76085848 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 7488 root      20   0   70.7g  54.1g   3.6g R 63.5 43.1  60:55.56 dgraph
 7466 root      20   0   70.7g  54.1g   3.6g R 48.5 43.1  59:55.68 dgraph
 7487 root      20   0   70.7g  54.1g   3.6g S 39.2 43.1  63:27.68 dgraph
 7471 root      20   0   70.7g  54.1g   3.6g S 28.2 43.1  62:25.07 dgraph
 7187 root      20   0   70.7g  54.1g   3.6g S 24.6 43.1  56:10.88 dgraph
 7286 root      20   0   70.7g  54.1g   3.6g S 18.3 43.1  60:50.44 dgraph
 7173 root      20   0   70.7g  54.1g   3.6g R 16.9 43.1  61:09.15 dgraph
 7569 root      20   0   70.7g  54.1g   3.6g S 10.6 43.1  59:35.69 dgraph
 7461 root      20   0   70.7g  54.1g   3.6g S 10.0 43.1  60:52.33 dgraph
 7282 root      20   0   70.7g  54.1g   3.6g S  9.0 43.1  57:23.60 dgraph
 7168 root      20   0   70.7g  54.1g   3.6g S  7.3 43.1  60:15.54 dgraph
 7460 root      20   0   70.7g  54.1g   3.6g S  3.0 43.1  63:41.97 dgraph
 7478 root      20   0   70.7g  54.1g   3.6g S  2.3 43.1  55:11.33 dgraph
 7469 root      20   0   70.7g  54.1g   3.6g S  2.0 43.1  64:16.86 dgraph
 7276 root      20   0   70.7g  54.1g   3.6g S  1.7 43.1  59:30.94 dgraph
 7288 root      20   0   70.7g  54.1g   3.6g S  1.7 43.1  56:58.55 dgraph
 7292 root      20   0   70.7g  54.1g   3.6g S  1.7 43.1  62:28.84 dgraph
 7314 root      20   0   70.7g  54.1g   3.6g S  1.7 43.1  60:11.76 dgraph
 7456 root      20   0   70.7g  54.1g   3.6g S  1.7 43.1  63:01.23 dgraph
 7477 root      20   0   70.7g  54.1g   3.6g S  1.7 43.1  61:01.50 dgraph
 7186 root      20   0   70.7g  54.1g   3.6g S  1.3 43.1  62:14.19 dgraph
 7266 root      20   0   70.7g  54.1g   3.6g S  1.3 43.1  60:50.34 dgraph
 7480 root      20   0   70.7g  54.1g   3.6g S  1.3 43.1  59:30.76 dgraph
 7490 root      20   0   70.7g  54.1g   3.6g S  1.3 43.1  63:54.28 dgraph
 7491 root      20   0   70.7g  54.1g   3.6g S  1.3 43.1  58:15.77 dgraph
 7165 root      20   0   70.7g  54.1g   3.6g S  1.0 43.1  58:07.94 dgraph
 7274 root      20   0   70.7g  54.1g   3.6g S  1.0 43.1  61:15.73 dgraph
 7281 root      20   0   70.7g  54.1g   3.6g S  1.0 43.1  61:35.70 dgraph
 7306 root      20   0   70.7g  54.1g   3.6g S  1.0 43.1  61:21.78 dgraph
 7465 root      20   0   70.7g  54.1g   3.6g S  1.0 43.1  37:23.55 dgraph
 7473 root      20   0   70.7g  54.1g   3.6g S  1.0 43.1  58:35.04 dgraph
 7162 root      20   0   70.7g  54.1g   3.6g S  0.7 43.1  19:47.45 dgraph
 7265 root      20   0   70.7g  54.1g   3.6g S  0.7 43.1  62:08.19 dgraph
 7311 root      20   0   70.7g  54.1g   3.6g S  0.7 43.1  64:04.45 dgraph
 7312 root      20   0   70.7g  54.1g   3.6g S  0.7 43.1  61:42.03 dgraph
 7462 root      20   0   70.7g  54.1g   3.6g S  0.7 43.1  60:44.91 dgraph

So if one machine have three alphas will cause system load higher. Right?

Not sure what I see, do you have 18 Dgraph nodes running? I see several PIDs.

It wasn’t 3 machines?

@MichelDiz My cluster with one zero and three alphas (3 machines, one machine with one alpha).

I used top -H -p PID to show current machine already have a high system load. One alpha will start about 138 threads to serve our clients(java client or go client or http client …).

So , my means one machine with one alpha already have a high system load. If I follow you one machine with three alphas, three machines with nine alphas, will not cause a higher system load?

Got it.

No, the same load or less. But as I mentioned before, if you distribute the load between the 3 machines, even maintaining the same setup. Will help.

@MichelDiz Thanks, I will try it at my workday. I will tell you experiment result. Waiting for my reply.