Starting Alpha takes nearly an hour, and queries hang whenever there are pending proposals


(Haitao) #1

Dgraph cluster:
zero: 1
alpha: 3
Server information:
CPU: 64 cores
Memory: 256 GB
Disk: 7 TB SSD
Data:
RDF: 8 billion
Edges: 18 billion

Question One:
I used the Dgraph bulk loader to generate the p/ directory and copied it to the three Alpha nodes. The p/ directory contains over 20,000 files. When I start up an Alpha node, it takes nearly 45 minutes. Why is it so slow?

Question Two:
After the Alpha nodes start, there are pending proposals and I cannot query anything. Why? This happens not only at startup: whenever there are pending proposals, all queries hang.


How can I ensure high availability when using the Go client for Dgraph?
(Haitao) #2

More information from when the cluster has pending proposals:


(Haitao) #3

It may be looping here:


(Manish R Jain) #4

For that proposal, as the comment says, Dgraph must continuously try to propose it. The suspicious things here are the startup time, which is very slow, and the number of pending proposals (are you running mutations?).
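The continuous-retry behaviour described above can be sketched as a backoff loop; this is only a minimal illustration, and `retryPropose`/`errOverloaded` are hypothetical stand-ins, not Dgraph internals:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// propose is a stand-in for submitting one proposal (hypothetical).
type propose func() error

// errOverloaded mimics Dgraph's "Server overloaded with pending proposals" error.
var errOverloaded = errors.New("server overloaded with pending proposals")

// retryPropose keeps re-submitting a proposal with exponential backoff,
// roughly the shape of the "Retrying..." log lines seen in this thread.
func retryPropose(p propose, maxAttempts int) error {
	backoff := 10 * time.Millisecond
	for i := 0; i < maxAttempts; i++ {
		if err := p(); err == nil {
			return nil
		}
		time.Sleep(backoff)
		backoff *= 2 // double the wait after each failure
	}
	return errOverloaded
}

func main() {
	attempts := 0
	err := retryPropose(func() error {
		attempts++
		if attempts < 3 { // first two attempts fail, third succeeds
			return errOverloaded
		}
		return nil
	}, 5)
	fmt.Println(attempts, err) // 3 <nil>
}
```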

If you’re on the community Slack channel, can you ping @dmai?


(Manish R Jain) #5

Also, can you zip up all the logs, and upload them here?


(Haitao) #6

Yes, I’m running mutations and queries. Here are the logs. I have three Alpha nodes; these are their log files:

Sorry, I don’t know how to upload log files here, so I saved them to my GitHub. Here is the link:

Thank You!


(Daniel Mai) #7

Thanks for sharing the logs for the first Alpha, @hhtlxhhxy. It’d be great if you could also share the complete logs for Zero and the other Alphas too.

From what I can see, the network is neither fast nor stable.

W0108 00:11:47.262794   43883 node.go:347] No healthy connection to node Id: 2 addr: [10.237.7.232:7083], err: Unhealthy connection
...
W0108 00:11:47.982048   43883 node.go:347] No healthy connection to node Id: 3 addr: [10.237.7.231:7083], err: Unhealthy connection

This Alpha isn’t able to stay connected to the other Alphas in the group.

I0108 05:25:44.423032   43883 groups.go:859] Got Zero leader: 10.237.7.231:5080
I0108 05:25:44.423097   43883 groups.go:913] Zero leadership changed. Renewing oracle delta stream.
...
I0108 05:25:45.142062   43883 groups.go:850] Leader idx=1 of group=1 is connecting to Zero for txn updates
I0108 05:25:45.142080   43883 groups.go:859] Got Zero leader: 10.237.7.231:5080

The Alpha leader isn’t able to stay connected to the Zero group of 1 member.

I0108 00:06:16.149643   43883 stream.go:240] Rolling up Time elapsed: 10s, bytes sent: 429 B, speed: 42 B/sec
...
I0108 08:30:02.787696   43883 stream.go:240] Rolling up Time elapsed: 19m44s, bytes sent: 22 MB, speed: 18 kB/sec

Speeds of 42 B/sec or 18 kB/sec are extremely slow, especially for the size of your data.

Is your network being throttled, causing these slowdowns?


(Haitao) #8

The Alpha nodes in the cluster are under the same switch; the ping latency between them is less than 0.1 ms and the NICs are 10GbE, so I don’t think the network is the limitation.
I suspect it has something to do with the load on the machines.

I don’t know why the Dgraph Alpha process holds so many file handles, as below:

and the handle details look like this:

It looks like the process has opened double the number of files in the p/ directory…

I picked one Alpha, 10.237.7.231; its machine state is as below:


(Haitao) #9

I have uploaded all the Alpha nodes’ logs and the Zero log to my GitHub; you can check them there.
Thank you very much!


(Daniel Mai) #10

There are also write-ahead log files in the w directory.

Have you seen any metrics on the instances that point to bottlenecks or resource saturation, showing the load on the system?


(Haitao) #11

Sorry, I haven’t looked at any other metrics. I just think that holding so many handles may cause problems at some point. Let me describe my doubts in detail.

  1. Around 2019/1/7 23:00, I finished copying the p/ directory to my three Alpha nodes and started them up; that took nearly an hour. After they started successfully, I could query quickly and everything looked normal.
  2. Around 2019/1/8 00:00, I started my own program (you can see it in Grafana): Dgraph started receiving mutations and accumulating pending proposals. Then I went to bed.
  3. When I woke up, I found my program had hung. Dgraph had pending proposals on 10.237.7.232, and I could not query or mutate.
  4. Around 2019/1/8 11:20, I restarted the Alpha node on 10.237.7.232 and restarted my program. After that I could query and mutate, and my program ran normally. But the memory curve of 10.237.7.232 looks abnormal, and all reads go to the other two Alpha nodes.
  5. Since then I can query and mutate in Dgraph normally, but I don’t think the cluster is healthy, because all queries go to 10.237.7.230/231 and the memory curve of 10.237.7.232 differs from the other two nodes.
  6. I also found that the sizes of the p/ and w/ directories differ across the three Alpha nodes:
    • 10.237.7.230: p/ is 1.8T, w/ is 5.9G
    • 10.237.7.231: p/ is 1.8T, w/ is 5.9G
    • 10.237.7.232: p/ is 2.6T, w/ is 579M
      Why? I’d expect the Alpha nodes to be equal, or at least not to differ this much.
  7. I also noticed that the read rate is related to the LRU cache: when the LRU cache shrinks, the read rate increases.

(Haitao) #12

:rofl::rofl: I plan to restart everything tonight, and I have added the Dgraph cluster to my Zabbix to collect more information.


(Haitao) #13

The same problem again… :cold_sweat::cold_sweat::cold_sweat:

  1. After I started the three Alpha nodes, without doing anything else, I checked the Zero log:
I0109 20:49:12.058289   88881 run.go:98] Setting up grpc listener at: 0.0.0.0:5080
I0109 20:49:12.058636   88881 run.go:98] Setting up http listener at: 0.0.0.0:6080
I0109 20:49:13.332560   88881 node.go:151] Setting raft.Config to: &{ID:1 peers:[] learners:[] ElectionTick:100 HeartbeatTick:1 Storage:0xc00044a960 Applied:0 MaxSizePerMsg:1048576 MaxInflightMsgs:256 CheckQuorum:false PreVote:true ReadOnlyOption:0 Logger:0x1f94370 DisableProposalForwarding:false}
I0109 20:49:13.333579   88881 node.go:281] Found hardstate: {Term:2 Vote:1 Commit:184319 XXX_unrecognized:[]}
I0109 20:49:13.443653   88881 node.go:290] Group 0 found 184320 entries
I0109 20:49:13.443689   88881 raft.go:398] Restarting node for dgraphzero
I0109 20:49:13.443840   88881 node.go:83] 1 became follower at term 2
I0109 20:49:13.451778   88881 node.go:83] newRaft 1 [peers: [], term: 2, commit: 184319, applied: 0, lastindex: 184319, lastterm: 2]
I0109 20:49:13.451980   88881 run.go:265] Running Dgraph Zero...
I0109 20:49:14.101756   88881 node.go:173] Setting conf state to nodes:1
I0109 20:49:14.102112   88881 raft.go:596] Done applying conf change at 1
I0109 20:49:14.452173   88881 node.go:83] 1 no leader at term 2; dropping index reading msg
I0109 20:49:17.212251   88881 node.go:83] 1 is starting a new election at term 2
I0109 20:49:17.212316   88881 node.go:83] 1 became pre-candidate at term 2
I0109 20:49:17.212403   88881 node.go:83] 1 received MsgPreVoteResp from 1 at term 2
I0109 20:49:17.212506   88881 node.go:83] 1 became candidate at term 3
I0109 20:49:17.212570   88881 node.go:83] 1 received MsgVoteResp from 1 at term 3
I0109 20:49:17.212670   88881 node.go:83] 1 became leader at term 3
I0109 20:49:17.212714   88881 node.go:83] raft.node: 1 elected leader 1 at term 3
I0109 20:49:17.213343   88881 raft.go:613] I've become the leader, updating leases.
I0109 20:49:17.213408   88881 assign.go:44] Updated Lease id: 1843150001. Txn Ts: 10001
W0109 20:49:17.452268   88881 node.go:550] [1] Read index context timed out
I0109 21:32:20.276455   88881 zero.go:368] Got connection request: cluster_info_only:true
I0109 21:32:20.277061   88881 zero.go:386] Connected: cluster_info_only:true
I0109 21:32:20.277974   88881 zero.go:368] Got connection request: addr:"10.237.7.230:7083"
I0109 21:32:20.296464   88881 pool.go:140] CONNECTED to 10.237.7.230:7083
I0109 21:32:20.297123   88881 zero.go:495] Connected: id:1 group_id:1 addr:"10.237.7.230:7083"
I0109 21:32:37.491400   88881 zero.go:368] Got connection request: cluster_info_only:true
I0109 21:32:37.491522   88881 zero.go:386] Connected: cluster_info_only:true
I0109 21:32:37.496051   88881 zero.go:368] Got connection request: addr:"10.237.7.232:7083"
I0109 21:32:37.500180   88881 pool.go:140] CONNECTED to 10.237.7.232:7083
I0109 21:32:37.500660   88881 zero.go:495] Connected: id:2 group_id:1 addr:"10.237.7.232:7083"
I0109 21:32:47.503063   88881 zero.go:368] Got connection request: cluster_info_only:true
I0109 21:32:47.503308   88881 zero.go:386] Connected: cluster_info_only:true
I0109 21:32:47.503987   88881 zero.go:368] Got connection request: addr:"10.237.7.231:7083"
I0109 21:32:47.505764   88881 pool.go:140] CONNECTED to 10.237.7.231:7083
I0109 21:32:47.517070   88881 zero.go:495] Connected: id:3 group_id:1 addr:"10.237.7.231:7083"
E0109 21:32:55.501003   88881 pool.go:206] Echo error from 10.237.7.232:7083. Err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0109 21:32:56.276996   88881 pool.go:217] Connection established with 10.237.7.232:7083

There is an RPC error.

  2. Then I started my program and began to query and mutate; more RPC errors occurred:
E0109 21:42:44.296710   88881 pool.go:206] Echo error from 10.237.7.230:7083. Err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0109 21:42:45.552491   88881 oracle.go:103] Purged below ts:10036, len(o.commits):18, len(o.rowCommit):274430
I0109 21:42:46.614435   88881 pool.go:217] Connection established with 10.237.7.230:7083
I0109 21:42:46.920082   88881 zero.go:368] Got connection request: cluster_info_only:true
I0109 21:42:46.920510   88881 zero.go:386] Connected: cluster_info_only:true
I0109 21:42:54.840951   88881 raft.go:552] Writing snapshot at index: 184356, applied mark: 184380
E0109 21:43:05.296569   88881 pool.go:206] Echo error from 10.237.7.230:7083. Err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0109 21:43:06.920731   88881 pool.go:217] Connection established with 10.237.7.230:7083
E0109 21:43:20.500682   88881 pool.go:206] Echo error from 10.237.7.232:7083. Err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0109 21:43:22.099175   88881 pool.go:217] Connection established with 10.237.7.232:7083
E0109 21:43:24.500484   88881 pool.go:206] Echo error from 10.237.7.232:7083. Err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0109 21:43:24.712885   88881 pool.go:217] Connection established with 10.237.7.232:7083
E0109 21:43:46.296684   88881 pool.go:206] Echo error from 10.237.7.230:7083. Err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0109 21:43:46.403857   88881 pool.go:217] Connection established with 10.237.7.230:7083
E0109 21:43:48.506003   88881 pool.go:206] Echo error from 10.237.7.231:7083. Err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
W0109 21:43:50.572058   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.592060   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.612062   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.632057   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.652054   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.672055   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.692047   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.712053   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.732049   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.752052   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.772047   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.792048   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.812048   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.832049   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.852052   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.872047   88881 node.go:85] A tick missed to fire. Node blocks too long!
W0109 21:43:50.892048   88881 node.go:85] A tick missed to fire. Node blocks too long!
  3. A few minutes after a pending proposal appears, I cannot query or mutate at all.
    I checked the Alpha log:
E0109 21:47:44.308302  101919 pool.go:206] Echo error from 10.237.7.231:5080. Err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0109 21:47:44.518611  101919 pool.go:217] Connection established with 10.237.7.231:5080
I0109 21:47:45.230305  101919 stream.go:240] Rolling up Time elapsed: 05m24s, bytes sent: 38 kB, speed: 118 B/sec
E0109 21:47:45.241915  101919 pool.go:206] Echo error from 10.237.7.231:7083. Err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0109 21:47:45.284401  101919 pool.go:217] Connection established with 10.237.7.231:7083
I0109 21:47:46.230332  101919 stream.go:240] Rolling up Time elapsed: 05m25s, bytes sent: 38 kB, speed: 117 B/sec
I0109 21:47:47.230277  101919 stream.go:240] Rolling up Time elapsed: 05m26s, bytes sent: 38 kB, speed: 117 B/sec
I0109 21:47:48.230260  101919 stream.go:240] Rolling up Time elapsed: 05m27s, bytes sent: 38 kB, speed: 117 B/sec
I0109 21:47:49.230249  101919 stream.go:240] Rolling up Time elapsed: 05m28s, bytes sent: 38 kB, speed: 116 B/sec
I0109 21:47:50.230283  101919 stream.go:240] Rolling up Time elapsed: 05m29s, bytes sent: 38 kB, speed: 116 B/sec
E0109 21:47:50.241831  101919 pool.go:206] Echo error from 10.237.7.231:7083. Err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0109 21:47:51.097758  101919 pool.go:217] Connection established with 10.237.7.231:7083
I0109 21:47:51.230298  101919 stream.go:240] Rolling up Time elapsed: 05m30s, bytes sent: 38 kB, speed: 115 B/sec
I0109 21:47:52.230275  101919 stream.go:240] Rolling up Time elapsed: 05m31s, bytes sent: 38 kB, speed: 115 B/sec
I0109 21:47:53.230309  101919 stream.go:240] Rolling up Time elapsed: 05m32s, bytes sent: 38 kB, speed: 115 B/sec
I0109 21:47:54.230269  101919 stream.go:240] Rolling up Time elapsed: 05m33s, bytes sent: 38 kB, speed: 114 B/sec
I0109 21:47:55.230300  101919 stream.go:240] Rolling up Time elapsed: 05m34s, bytes sent: 38 kB, speed: 114 B/sec
I0109 21:47:56.230259  101919 stream.go:240] Rolling up Time elapsed: 05m35s, bytes sent: 38 kB, speed: 114 B/sec
I0109 21:47:57.230323  101919 stream.go:240] Rolling up Time elapsed: 05m36s, bytes sent: 38 kB, speed: 113 B/sec
I0109 21:47:58.230332  101919 stream.go:240] Rolling up Time elapsed: 05m37s, bytes sent: 38 kB, speed: 113 B/sec
I0109 21:47:59.230268  101919 stream.go:240] Rolling up Time elapsed: 05m38s, bytes sent: 38 kB, speed: 113 B/sec
I0109 21:48:00.230276  101919 stream.go:240] Rolling up Time elapsed: 05m39s, bytes sent: 38 kB, speed: 112 B/sec
I0109 21:48:01.230285  101919 stream.go:240] Rolling up Time elapsed: 05m40s, bytes sent: 38 kB, speed: 112 B/sec
I0109 21:48:02.230270  101919 stream.go:240] Rolling up Time elapsed: 05m41s, bytes sent: 38 kB, speed: 112 B/sec
I0109 21:48:03.230285  101919 stream.go:240] Rolling up Time elapsed: 05m42s, bytes sent: 38 kB, speed: 111 B/sec
E0109 21:48:03.242137  101919 pool.go:206] Echo error from 10.237.7.231:7083. Err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0109 21:48:03.429639  101919 pool.go:217] Connection established with 10.237.7.231:7083
I0109 21:48:04.230317  101919 stream.go:240] Rolling up Time elapsed: 05m43s, bytes sent: 38 kB, speed: 111 B/sec
I0109 21:48:05.230292  101919 stream.go:240] Rolling up Time elapsed: 05m44s, bytes sent: 38 kB, speed: 111 B/sec
I0109 21:48:06.230283  101919 stream.go:240] Rolling up Time elapsed: 05m45s, bytes sent: 38 kB, speed: 110 B/sec
I0109 21:48:07.230276  101919 stream.go:240] Rolling up Time elapsed: 05m46s, bytes sent: 38 kB, speed: 110 B/sec
I0109 21:48:08.230266  101919 stream.go:240] Rolling up Time elapsed: 05m47s, bytes sent: 38 kB, speed: 110 B/sec
I0109 21:48:09.230267  101919 stream.go:240] Rolling up Time elapsed: 05m48s, bytes sent: 38 kB, speed: 109 B/sec
I0109 21:48:10.230277  101919 stream.go:240] Rolling up Time elapsed: 05m49s, bytes sent: 38 kB, speed: 109 B/sec
I0109 21:48:11.230287  101919 stream.go:240] Rolling up Time elapsed: 05m50s, bytes sent: 62 kB, speed: 177 B/sec
I0109 21:48:12.230349  101919 stream.go:240] Rolling up Time elapsed: 05m51s, bytes sent: 62 kB, speed: 177 B/sec
I0109 21:48:13.230333  101919 stream.go:240] Rolling up Time elapsed: 05m52s, bytes sent: 62 kB, speed: 176 B/sec
I0109 21:48:14.230277  101919 stream.go:240] Rolling up Time elapsed: 05m53s, bytes sent: 62 kB, speed: 176 B/sec
I0109 21:48:15.230281  101919 stream.go:240] Rolling up Time elapsed: 05m54s, bytes sent: 62 kB, speed: 175 B/sec
I0109 21:48:16.230252  101919 stream.go:240] Rolling up Time elapsed: 05m55s, bytes sent: 62 kB, speed: 175 B/sec
I0109 21:48:17.230257  101919 stream.go:240] Rolling up Time elapsed: 05m56s, bytes sent: 62 kB, speed: 174 B/sec
I0109 21:48:18.230301  101919 stream.go:240] Rolling up Time elapsed: 05m57s, bytes sent: 62 kB, speed: 174 B/sec
I0109 21:48:19.230301  101919 stream.go:240] Rolling up Time elapsed: 05m58s, bytes sent: 62 kB, speed: 173 B/sec

The rollup is so slow… I don’t know why… :triumph:


(Manish R Jain) #14

You’re on a slow disk. Can you run ioping while “Rolling up” or other stuff is happening?


(Haitao) #15

Yes, you are right: when there are pending proposals on 10.237.7.230, disk I/O is high, and when the I/O ops decline, the pending proposals disappear at once.

And I don’t know why disk I/O is so high during the first few hours after my program starts; my program does nothing special.


(Haitao) #16

Just now, at 12:35, the same problem occurred again on 10.237.7.231.


The log:

I0111 12:34:34.915164   73444 node.go:83] 2 received MsgVoteResp from 3 at term 27
I0111 12:34:34.915252   73444 node.go:83] 2 [quorum:2] has received 2 MsgVoteResp votes and 0 vote rejections
I0111 12:34:34.915489   73444 node.go:83] 2 became leader at term 27
I0111 12:34:34.915674   73444 node.go:83] raft.node: 2 elected leader 2 at term 27
I0111 12:34:35.543273   73444 groups.go:850] Leader idx=2 of group=1 is connecting to Zero for txn updates
I0111 12:34:35.543312   73444 groups.go:859] Got Zero leader: 10.237.7.231:5080
E0111 12:34:37.318400   73444 pool.go:206] Echo error from 10.237.7.231:5080. Err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0111 12:34:37.543142   73444 groups.go:913] Zero leadership changed. Renewing oracle delta stream.
E0112 12:34:37.546179   73444 groups.go:889] Error in oracle delta stream. Error: rpc error: code = Canceled desc = context canceled
I0111 12:34:38.028294   73444 pool.go:217] Connection established with 10.237.7.231:5080
I0111 12:34:38.830630   73444 groups.go:850] Leader idx=2 of group=1 is connecting to Zero for txn updates
I0111 12:34:38.830707   73444 groups.go:859] Got Zero leader: 10.237.7.231:5080
E0111 12:34:38.868130   73444 draft.go:457] Lastcommit 233229 > current 24806. This would cause some commits to be lost.
W0111 12:34:45.334840   73444 node.go:347] No healthy connection to node Id: 1 addr: [10.237.7.232:7083], err: Unhealthy connection
I0111 12:34:53.249047   73444 node.go:83] 2 [logterm: 27, index: 225777, vote: 2] cast MsgPreVote for 3 [logterm: 27, index: 225777] at term 27
I0111 12:34:56.951250   73444 node.go:83] 2 [logterm: 27, index: 225777, vote: 2] cast MsgPreVote for 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:00.672539   73444 node.go:83] 2 [logterm: 27, index: 225777, vote: 2] cast MsgPreVote for 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:04.391431   73444 node.go:83] 2 [logterm: 27, index: 225778, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:08.110002   73444 node.go:83] 2 [logterm: 27, index: 225779, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:11.854394   73444 node.go:83] 2 [logterm: 27, index: 225779, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:16.349051   73444 node.go:83] 2 [logterm: 27, index: 225780, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:20.079534   73444 node.go:83] 2 [logterm: 27, index: 225780, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:23.808705   73444 node.go:83] 2 [logterm: 27, index: 225780, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:27.541635   73444 node.go:83] 2 [logterm: 27, index: 225780, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
E0111 12:35:29.569689   73444 groups.go:972] While proposing delta: txns:<start_ts:233356 commit_ts:233357 > . Error=Server overloaded with pending proposals. Please retry later.. Retrying...
I0111 12:35:31.270707   73444 node.go:83] 2 [logterm: 27, index: 225781, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:34.987829   73444 node.go:83] 2 [logterm: 27, index: 225782, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:38.710686   73444 node.go:83] 2 [logterm: 27, index: 225782, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:42.439117   73444 node.go:83] 2 [logterm: 27, index: 225783, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:46.148824   73444 node.go:83] 2 [logterm: 27, index: 225783, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:49.872358   73444 node.go:83] 2 [logterm: 27, index: 225783, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:53.592726   73444 node.go:83] 2 [logterm: 27, index: 225783, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:35:57.313442   73444 node.go:83] 2 [logterm: 27, index: 225783, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
E0111 12:35:57.571941   73444 groups.go:972] While proposing delta: txns:<start_ts:233356 commit_ts:233357 > . Error=Server overloaded with pending proposals. Please retry later.. Retrying...
I0111 12:36:01.532763   73444 node.go:83] 2 [logterm: 27, index: 225784, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:36:05.249056   73444 node.go:83] 2 [logterm: 27, index: 225785, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:36:08.970436   73444 node.go:83] 2 [logterm: 27, index: 225785, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:36:13.331292   73444 node.go:83] 2 [logterm: 27, index: 225786, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:36:17.031526   73444 node.go:83] 2 [logterm: 27, index: 225786, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:36:20.769625   73444 node.go:83] 2 [logterm: 27, index: 225786, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:36:24.491165   73444 node.go:83] 2 [logterm: 27, index: 225786, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
E0111 12:36:25.572811   73444 groups.go:972] While proposing delta: txns:<start_ts:233356 commit_ts:233357 > . Error=Server overloaded with pending proposals. Please retry later.. Retrying...
E0111 12:36:27.546897   73444 pool.go:206] Echo error from 10.237.7.230:7083. Err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0111 12:36:29.843267   73444 pool.go:217] Connection established with 10.237.7.230:7083
I0111 12:36:31.807922   73444 node.go:83] 2 [logterm: 27, index: 225788, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:36:35.528859   73444 node.go:83] 2 [logterm: 27, index: 225788, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:36:39.248820   73444 node.go:83] 2 [logterm: 27, index: 225789, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:36:42.972735   73444 node.go:83] 2 [logterm: 27, index: 225789, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27
I0111 12:36:46.698027   73444 node.go:83] 2 [logterm: 27, index: 225789, vote: 2] rejected MsgPreVote from 3 [logterm: 27, index: 225777] at term 27

ioping result:

[root@A01-R06-I7-231-J33GRGY dgraph]# ioping -c 50 .
4 KiB from . (xfs /dev/sdb1): request=1 time=146 us
4 KiB from . (xfs /dev/sdb1): request=2 time=82 us
4 KiB from . (xfs /dev/sdb1): request=3 time=116 us
4 KiB from . (xfs /dev/sdb1): request=4 time=99 us
4 KiB from . (xfs /dev/sdb1): request=5 time=114 us
4 KiB from . (xfs /dev/sdb1): request=6 time=112 us
4 KiB from . (xfs /dev/sdb1): request=7 time=114 us
4 KiB from . (xfs /dev/sdb1): request=8 time=110 us
4 KiB from . (xfs /dev/sdb1): request=9 time=116 us
4 KiB from . (xfs /dev/sdb1): request=10 time=112 us
4 KiB from . (xfs /dev/sdb1): request=11 time=101 us
4 KiB from . (xfs /dev/sdb1): request=12 time=77 us
4 KiB from . (xfs /dev/sdb1): request=13 time=90 us
4 KiB from . (xfs /dev/sdb1): request=14 time=108 us
4 KiB from . (xfs /dev/sdb1): request=15 time=120 us
4 KiB from . (xfs /dev/sdb1): request=16 time=81 us
4 KiB from . (xfs /dev/sdb1): request=17 time=98 us
4 KiB from . (xfs /dev/sdb1): request=18 time=81 us
4 KiB from . (xfs /dev/sdb1): request=19 time=116 us
4 KiB from . (xfs /dev/sdb1): request=20 time=111 us
4 KiB from . (xfs /dev/sdb1): request=21 time=130 us
4 KiB from . (xfs /dev/sdb1): request=22 time=152 us
4 KiB from . (xfs /dev/sdb1): request=23 time=119 us
4 KiB from . (xfs /dev/sdb1): request=24 time=122 us
4 KiB from . (xfs /dev/sdb1): request=25 time=124 us
4 KiB from . (xfs /dev/sdb1): request=26 time=115 us
4 KiB from . (xfs /dev/sdb1): request=27 time=113 us
4 KiB from . (xfs /dev/sdb1): request=28 time=106 us
4 KiB from . (xfs /dev/sdb1): request=29 time=122 us
4 KiB from . (xfs /dev/sdb1): request=30 time=84 us
4 KiB from . (xfs /dev/sdb1): request=31 time=110 us
4 KiB from . (xfs /dev/sdb1): request=32 time=109 us
4 KiB from . (xfs /dev/sdb1): request=33 time=138 us
4 KiB from . (xfs /dev/sdb1): request=34 time=112 us
4 KiB from . (xfs /dev/sdb1): request=35 time=100 us
4 KiB from . (xfs /dev/sdb1): request=36 time=85 us
4 KiB from . (xfs /dev/sdb1): request=37 time=114 us
4 KiB from . (xfs /dev/sdb1): request=38 time=126 us
4 KiB from . (xfs /dev/sdb1): request=39 time=121 us
4 KiB from . (xfs /dev/sdb1): request=40 time=103 us
4 KiB from . (xfs /dev/sdb1): request=41 time=95 us
4 KiB from . (xfs /dev/sdb1): request=42 time=85 us
4 KiB from . (xfs /dev/sdb1): request=43 time=95 us
4 KiB from . (xfs /dev/sdb1): request=44 time=83 us
4 KiB from . (xfs /dev/sdb1): request=45 time=85 us
4 KiB from . (xfs /dev/sdb1): request=46 time=125 us
4 KiB from . (xfs /dev/sdb1): request=47 time=110 us
4 KiB from . (xfs /dev/sdb1): request=48 time=116 us
4 KiB from . (xfs /dev/sdb1): request=49 time=117 us
4 KiB from . (xfs /dev/sdb1): request=50 time=126 us

--- . (xfs /dev/sdb1) ioping statistics ---
50 requests completed in 49.0 s, 9.18 k iops, 35.9 MiB/s
min/avg/max/mdev = 77 us / 108 us / 152 us / 16 us

(system) closed #17

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.