Unable to run dgraph in a multi-node kubernetes cluster


(Pawan Rawal) #21
  1. Did “X less than minTs: y during query” help in understanding this issue?

  2. is related to https://github.com/dgraph-io/dgraph/issues/2047, a fix for which has been merged to master. Can you try the dgraph/dgraph:master image and see if you still face this issue?

  3. and 4 make me feel that there is something wrong with the config or that this is a bug. Could you share steps to reproduce? How many nodes are you running and when does this happen?


(Jzhu077) #22
  1. That makes sense.
  2. Looks like the issue no longer exists in the master branch, but I will let it run for a bit longer and see. When will you make another release? I prefer a release branch as tip sometimes leads to unexpected problems.
  3. It was a configuration error, which was resolved after I wiped the database and restarted the problematic pods.
    Cheers

(Pawan Rawal) #23

We’ll be doing a release on Thursday/Friday. Tip shouldn’t cause any issues as it only contains bug fixes on top of the last release.


(Jzhu077) #24

dgraph image dgraph/dgraph:master

I ran into a problem where none of the pods can join the cluster… I thought I had it working, but it turned out I was using an old image.

dgraph-zero-0 is fine, but the other zeros are not:

kubectl logs dgraph-zero-1
++ hostname
+ [[ dgraph-zero-1 =~ -([0-9]+)$ ]]
+ ordinal=1
+ idx=2
+ [[ 1 -eq 0 ]]
++ hostname -f
+ dgraph zero -o -2000 --replicas 3 --my=dgraph-zero-1.dgraph-zero.default.svc.cluster.local:5080 --peer dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080 --idx 2
Setting up grpc listener at: 0.0.0.0:3080
Setting up http listener at: 0.0.0.0:4080
2018/01/30 03:07:12 node.go:258: Group 0 found 0 entries
2018/01/30 03:07:12 pool.go:168: Echo error from dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:07:12 pool.go:118: == CONNECT ==> Setting dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080
2018/01/30 03:07:12 raft.go:442: Error while joining cluster rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:07:12 raft.go:442: Error while joining cluster rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:07:12 raft.go:442: Error while joining cluster rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:07:12 raft.go:442: Error while joining cluster rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:07:13 raft.go:442: Error while joining cluster rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:07:15 raft.go:442: Error while joining cluster rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:07:18 raft.go:442: Error while joining cluster rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:07:22 pool.go:168: Echo error from dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure

other dgraph-server:

+ offset=10
++ hostname
+ [[ dgraph-server-0 =~ -([0-9]+)$ ]]
+ ordinal=0
+ idx=10
++ hostname -f
+ dgraph server --my=dgraph-server-0.dgraph-server.default.svc.cluster.local:7090 --memory_mb 3036 --zero dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080 --idx 10 --port_offset 10 --debugmode
2018/01/30 03:13:22 groups.go:86: Current Raft Id: 10
2018/01/30 03:13:22 worker.go:99: Worker listening at address: [::]:7090
2018/01/30 03:13:22 gRPC server started.  Listening on port 9090
2018/01/30 03:13:22 HTTP server started.  Listening on port 8090
2018/01/30 03:13:22 pool.go:168: Echo error from dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:13:22 pool.go:118: == CONNECT ==> Setting dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080
2018/01/30 03:13:22 groups.go:102: Error while connecting with group zero: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:13:22 groups.go:102: Error while connecting with group zero: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:13:22 groups.go:102: Error while connecting with group zero: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:13:22 groups.go:102: Error while connecting with group zero: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:13:23 groups.go:102: Error while connecting with group zero: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:13:23 groups.go:102: Error while connecting with group zero: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:13:25 groups.go:102: Error while connecting with group zero: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:13:28 groups.go:102: Error while connecting with group zero: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:13:32 pool.go:168: Echo error from dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:13:35 groups.go:102: Error while connecting with group zero: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:13:42 pool.go:168: Echo error from dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/01/30 03:13:47 Unable to join cluster via dgraphzero

I have tested the exact same config with dgraph/dgraph:latest, and it works as expected.


(Pawan Rawal) #25

We changed the default ports for Zero to 5080 and 6080 in master, so you’d have to adjust the offsets accordingly. In most cases there is no need to pass -2000 any more.

With that offset, Zero ends up running on 3080 and 4080:

Setting up grpc listener at: 0.0.0.0:3080
Setting up http listener at: 0.0.0.0:4080
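
As a quick check of the arithmetic: with the new Zero defaults of 5080 and 6080 and an offset of -2000, the listeners land on 3080 and 4080, which matches the log lines, while the other nodes keep dialing 5080. Below is a minimal Python sketch. `zero_ports` is a hypothetical helper, not part of dgraph, and it assumes the offset is simply added to both default ports, with 5080 being the gRPC port and 6080 the HTTP port as the log suggests.

```python
# Hypothetical helper, not part of dgraph: models the offset as a plain
# addition to the default Zero ports mentioned in this thread (an assumption).
ZERO_GRPC_DEFAULT = 5080
ZERO_HTTP_DEFAULT = 6080


def zero_ports(offset=0):
    """Return (grpc_port, http_port) for a Zero started with the given offset."""
    return ZERO_GRPC_DEFAULT + offset, ZERO_HTTP_DEFAULT + offset


print(zero_ports(-2000))  # (3080, 4080): the ports seen in the log
print(zero_ports())       # (5080, 6080): what --peer and --zero point at
```

Dropping the offset puts Zero back on 5080, which is the address every `--peer` and `--zero` flag in the traces already uses.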

(Jzhu077) #26

Got it. Can’t believe I missed that. Thank you!


(Jzhu077) #27

I still encounter the undefined predicate issue after keeping the dgraph cluster running for a couple of days.
I also noticed that it only happens on certain pods. I have a load balancer sitting in front of all dgraph servers, and running the same query repeatedly yields different results, which means the undefined predicate issue only occurs on some pods.


(Pawan Rawal) #28

Can you check if the issue persists after a restart? So you are running 15 Dgraph servers divided into three groups, right?


(Jzhu077) #29

I can’t try that right now, as I have manually fixed the problem by re-indexing the predicates so that I can carry on. I will restart the nodes and report back here when I hit it again. I am now using 30 nodes divided into 10 groups.


(Pawan Rawal) #30

Also, make sure all nodes are using the dgraph/dgraph:master image.


(Jzhu077) #31

I hit another issue when I tried to reset the dgraph cluster. I deleted all dgraph services and persistent volumes, then tried to redeploy the cluster. There are 30 servers to be deployed, but dgraph-server-21 is having trouble with elections.

2018/02/01 02:07:59 node.go:246: Found hardstate: {Term:291 Vote:31 Commit:4 XXX_unrecognized:[]}
2018/02/01 02:07:59 node.go:258: Group 8 found 4 entries
2018/02/01 02:07:59 draft.go:657: Restarting node for group: 8
2018/02/01 02:07:59 raft.go:567: INFO: 1f became follower at term 291
2018/02/01 02:07:59 raft.go:315: INFO: newRaft 1f [peers: [], term: 291, commit: 4, applied: 0, lastindex: 4, lastterm: 2]
2018/02/01 02:07:59 groups.go:296: Asking if I can serve tablet for: _predicate_
2018/02/01 02:07:59 node.go:127: Setting conf state to nodes:31 
2018/02/01 02:07:59 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 02:07:59 pool.go:118: == CONNECT ==> Setting dgraph-server-23.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:07:59 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 02:07:59 pool.go:118: == CONNECT ==> Setting dgraph-server-22.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:07:59 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 02:07:59 node.go:127: Setting conf state to nodes:31 nodes:32 
2018/02/01 02:08:02 raft.go:749: INFO: 1f is starting a new election at term 291
2018/02/01 02:08:02 raft.go:580: INFO: 1f became candidate at term 292
2018/02/01 02:08:02 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 292
2018/02/01 02:08:02 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 292
2018/02/01 02:08:02 node.go:315: No healthy connection found to node Id: 32, err: Unhealthy connection
2018/02/01 02:08:04 raft.go:749: INFO: 1f is starting a new election at term 292
2018/02/01 02:08:04 raft.go:580: INFO: 1f became candidate at term 293
2018/02/01 02:08:04 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 293
2018/02/01 02:08:04 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 293
2018/02/01 02:08:07 raft.go:749: INFO: 1f is starting a new election at term 293

Similar logs carry on with an increasing term number…

That sends the next pod, dgraph-server-22, into a crash loop.

2018/02/01 02:09:53 node.go:258: Group 8 found 0 entries
2018/02/01 02:09:53 draft.go:668: New Node for group: 8
2018/02/01 02:09:53 pool.go:118: == CONNECT ==> Setting dgraph-server-16.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:09:53 Cannot retrieve snapshot from peer, error: rpc error: code = Unknown desc = Not leader of group: 8

Is there a way to recover from this, or do I have to delete all statefulsets again and wipe the persistent volumes?
I have tried restarting pods 21 and 22, but that doesn’t help.
PS: all groups before group 8 were set up correctly. This is the first time I have hit this while running the same setup script.
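
For anyone reading these logs, it helps to map pod names to the IDs that show up in them. The startup trace derives `--idx` as the pod ordinal plus an offset of 10, and the Raft lines appear to print node IDs in hexadecimal, so `1f` is node 31 (dgraph-server-21) and `20` is node 32 (dgraph-server-22). A small Python sketch of that mapping follows. All function names are hypothetical, and the group formula only mirrors the pattern seen in this thread (three consecutive IDs per group); the actual assignment is made by Zero.

```python
import re

SERVER_IDX_OFFSET = 10  # "offset=10" in the startup trace
REPLICAS = 3            # assumption: 30 servers in 10 groups, three per group


def server_idx(pod_name):
    """Derive the --idx value as the startup trace does: ordinal + offset."""
    match = re.search(r"-([0-9]+)$", pod_name)
    if match is None:
        raise ValueError(f"no ordinal in pod name: {pod_name!r}")
    return int(match.group(1)) + SERVER_IDX_OFFSET


def raft_hex(idx):
    """Format a node ID the way it appears in the raft.go log lines."""
    return format(idx, "x")


def observed_group(idx):
    """Group number matching the pattern in this thread; not dgraph's own logic."""
    return (idx - SERVER_IDX_OFFSET) // REPLICAS + 1


for pod in ("dgraph-server-21", "dgraph-server-22", "dgraph-server-23"):
    idx = server_idx(pod)
    print(pod, idx, raft_hex(idx), observed_group(idx))
```

Read this way, the election loop is node 31 repeatedly asking node 32 for a vote while 32 is unreachable.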


(Pawan Rawal) #32

Are these the full logs for dgraph-server-22?


(Jzhu077) #33
kubectl logs dgraph-server-22
+ offset=10
++ hostname
+ [[ dgraph-server-22 =~ -([0-9]+)$ ]]
+ ordinal=22
+ idx=32
++ hostname -f
+ dgraph server --my=dgraph-server-22.dgraph-server.default.svc.cluster.local:7090 --memory_mb 6000 --zero dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080 --idx 32 --port_offset 10 --debugmode
2018/02/01 02:20:08 gRPC server started.  Listening on port 9090
2018/02/01 02:20:08 HTTP server started.  Listening on port 8090
2018/02/01 02:20:08 groups.go:86: Current Raft Id: 32
2018/02/01 02:20:08 worker.go:99: Worker listening at address: [::]:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 02:20:08 groups.go:109: Connected to group zero. Connection state: member:<id:32 addr:"dgraph-server-22.dgraph-server.default.svc.cluster.local:7090" > state:<counter:120 groups:<key:1 value:<members:<key:10 value:<id:10 group_id:1 addr:"dgraph-server-0.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517449286 > > members:<key:11 value:<id:11 group_id:1 addr:"dgraph-server-1.dgraph-server.default.svc.cluster.local:7090" > > members:<key:12 value:<id:12 group_id:1 addr:"dgraph-server-2.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:2 value:<members:<key:13 value:<id:13 group_id:2 addr:"dgraph-server-3.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517449326 > > members:<key:14 value:<id:14 group_id:2 addr:"dgraph-server-4.dgraph-server.default.svc.cluster.local:7090" > > members:<key:15 value:<id:15 group_id:2 addr:"dgraph-server-5.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:3 value:<members:<key:16 value:<id:16 group_id:3 addr:"dgraph-server-6.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517449556 > > members:<key:17 value:<id:17 group_id:3 addr:"dgraph-server-7.dgraph-server.default.svc.cluster.local:7090" > > members:<key:18 value:<id:18 group_id:3 addr:"dgraph-server-8.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:4 value:<members:<key:19 value:<id:19 group_id:4 addr:"dgraph-server-9.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517449618 > > members:<key:20 value:<id:20 group_id:4 addr:"dgraph-server-10.dgraph-server.default.svc.cluster.local:7090" > > members:<key:21 value:<id:21 group_id:4 addr:"dgraph-server-11.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:5 value:<members:<key:22 value:<id:22 group_id:5 addr:"dgraph-server-12.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517449679 > > members:<key:23 value:<id:23 group_id:5 
addr:"dgraph-server-13.dgraph-server.default.svc.cluster.local:7090" > > members:<key:24 value:<id:24 group_id:5 addr:"dgraph-server-14.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:6 value:<members:<key:25 value:<id:25 group_id:6 addr:"dgraph-server-15.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517449877 > > members:<key:26 value:<id:26 group_id:6 addr:"dgraph-server-16.dgraph-server.default.svc.cluster.local:7090" > > members:<key:27 value:<id:27 group_id:6 addr:"dgraph-server-17.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:7 value:<members:<key:28 value:<id:28 group_id:7 addr:"dgraph-server-18.dgraph-server.default.svc.cluster.local:7090" last_update:1517449963 > > members:<key:29 value:<id:29 group_id:7 addr:"dgraph-server-19.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517449968 > > members:<key:30 value:<id:30 group_id:7 addr:"dgraph-server-20.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:8 value:<members:<key:31 value:<id:31 group_id:8 addr:"dgraph-server-21.dgraph-server.default.svc.cluster.local:7090" last_update:1517450006 > > members:<key:32 value:<id:32 group_id:8 addr:"dgraph-server-22.dgraph-server.default.svc.cluster.local:7090" > > members:<key:33 value:<id:33 group_id:8 addr:"dgraph-server-23.dgraph-server.default.svc.cluster.local:7090" > > tablets:<key:"_predicate_" value:<group_id:8 predicate:"_predicate_" > > > > zeros:<key:1 value:<id:1 addr:"dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080" leader:true > > zeros:<key:2 value:<id:2 addr:"dgraph-zero-1.dgraph-zero.default.svc.cluster.local:5080" > > zeros:<key:3 value:<id:3 addr:"dgraph-zero-2.dgraph-zero.default.svc.cluster.local:5080" > > zeros:<key:4 value:<id:4 addr:"dgraph-zero-3.dgraph-zero.default.svc.cluster.local:5080" > > zeros:<key:5 value:<id:5 addr:"dgraph-zero-4.dgraph-zero.default.svc.cluster.local:5080" > > maxTxnTs:10000 > 
2018/02/01 02:20:08 draft.go:139: Node ID: 32 with GroupID: 8
2018/02/01 02:20:08 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-23.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-0.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-18.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-14.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-21.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-zero-2.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-1.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-5.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-zero-3.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-zero-1.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-12.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-15.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-16.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-13.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-4.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-6.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-20.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-11.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-7.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-2.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-8.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-zero-4.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-10.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-17.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-3.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 node.go:258: Group 8 found 0 entries
2018/02/01 02:20:08 draft.go:668: New Node for group: 8
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-19.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 pool.go:118: == CONNECT ==> Setting dgraph-server-9.dgraph-server.default.svc.cluster.local:7090
2018/02/01 02:20:08 Cannot retrieve snapshot from peer, error: rpc error: code = Unknown desc = Not leader of group: 8

github.com/dgraph-io/dgraph/x.Fatalf
	/home/travis/gopath/src/github.com/dgraph-io/dgraph/x/error.go:103
github.com/dgraph-io/dgraph/worker.(*node).retrieveSnapshot
	/home/travis/gopath/src/github.com/dgraph-io/dgraph/worker/draft.go:434
github.com/dgraph-io/dgraph/worker.(*node).InitAndStartNode
	/home/travis/gopath/src/github.com/dgraph-io/dgraph/worker/draft.go:672
github.com/dgraph-io/dgraph/worker.StartRaftNodes
	/home/travis/gopath/src/github.com/dgraph-io/dgraph/worker/groups.go:120
runtime.goexit
	/home/travis/.gimme/versions/go1.9.2.linux.amd64/src/runtime/asm_amd64.s:2337

(Pawan Rawal) #34

This looks like a bug where, even though the node was previously part of the cluster, it is being treated as a new node on restart. Since it is counted as a new node, it tries to get a snapshot, which fails because there is no leader for group 8. Node server-21 can’t elect a leader because at least 2 out of 3 nodes need to be up. Is server-23 also stuck in a similar error loop?

I can push a temporary fix for you while we reproduce and fix this properly.
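
The majority rule mentioned above can be written down in a few lines. This is a generic Raft-style quorum sketch in Python, not dgraph code, and the function names are hypothetical.

```python
def quorum(group_size):
    """Smallest number of votes that forms a strict majority of the group."""
    return group_size // 2 + 1


def can_elect_leader(group_size, nodes_up):
    """A leader can only be elected if a majority of the group is reachable."""
    return nodes_up >= quorum(group_size)


print(quorum(3))               # 2
print(can_elect_leader(3, 1))  # False: a single node keeps restarting elections
print(can_elect_leader(3, 2))  # True
```

With only server-21 up in group 8, each election times out and the term number keeps increasing, which is what the logs show.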


(Jzhu077) #35

Yes, server-23 is also stuck in the same error loop.

A temporary fix would be awesome. It would be much better than resetting the cluster and likely hitting this again.


(Pawan Rawal) #36

I have pushed an updated docker image under dgraph/dgraph:master. Give it a try and see if you can start your cluster.


(Jzhu077) #37

Hitting a different error now.

$ docker images
REPOSITORY      TAG      IMAGE ID       CREATED         SIZE
dgraph/dgraph   master   bcf71ccbf50e   5 minutes ago   167 MB

kubectl logs dgraph-server-21
+ offset=10
++ hostname
+ [[ dgraph-server-21 =~ -([0-9]+)$ ]]
+ ordinal=21
+ idx=31
++ hostname -f
+ dgraph server --my=dgraph-server-21.dgraph-server.default.svc.cluster.local:7090 --memory_mb 6000 --zero dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080 --idx 31 --port_offset 10 --debugmode
2018/02/01 04:04:59 groups.go:86: Current Raft Id: 31
2018/02/01 04:04:59 gRPC server started.  Listening on port 9090
2018/02/01 04:04:59 HTTP server started.  Listening on port 8090
2018/02/01 04:04:59 worker.go:99: Worker listening at address: [::]:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 04:04:59 groups.go:109: Connected to group zero. Connection state: member:<id:31 addr:"dgraph-server-21.dgraph-server.default.svc.cluster.local:7090" > state:<counter:138 groups:<key:1 value:<members:<key:10 value:<id:10 group_id:1 addr:"dgraph-server-0.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517449286 > > members:<key:11 value:<id:11 group_id:1 addr:"dgraph-server-1.dgraph-server.default.svc.cluster.local:7090" > > members:<key:12 value:<id:12 group_id:1 addr:"dgraph-server-2.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:2 value:<members:<key:13 value:<id:13 group_id:2 addr:"dgraph-server-3.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517449326 > > members:<key:14 value:<id:14 group_id:2 addr:"dgraph-server-4.dgraph-server.default.svc.cluster.local:7090" > > members:<key:15 value:<id:15 group_id:2 addr:"dgraph-server-5.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:3 value:<members:<key:16 value:<id:16 group_id:3 addr:"dgraph-server-6.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517449556 > > members:<key:17 value:<id:17 group_id:3 addr:"dgraph-server-7.dgraph-server.default.svc.cluster.local:7090" > > members:<key:18 value:<id:18 group_id:3 addr:"dgraph-server-8.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:4 value:<members:<key:19 value:<id:19 group_id:4 addr:"dgraph-server-9.dgraph-server.default.svc.cluster.local:7090" last_update:1517457894 > > members:<key:20 value:<id:20 group_id:4 addr:"dgraph-server-10.dgraph-server.default.svc.cluster.local:7090" > > members:<key:21 value:<id:21 group_id:4 addr:"dgraph-server-11.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517457894 > > > > groups:<key:5 value:<members:<key:22 value:<id:22 group_id:5 addr:"dgraph-server-12.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517457853 > > members:<key:23 value:<id:23 group_id:5 
addr:"dgraph-server-13.dgraph-server.default.svc.cluster.local:7090" last_update:1517457851 > > members:<key:24 value:<id:24 group_id:5 addr:"dgraph-server-14.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:6 value:<members:<key:25 value:<id:25 group_id:6 addr:"dgraph-server-15.dgraph-server.default.svc.cluster.local:7090" last_update:1517456062 > > members:<key:26 value:<id:26 group_id:6 addr:"dgraph-server-16.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517456065 > > members:<key:27 value:<id:27 group_id:6 addr:"dgraph-server-17.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:7 value:<members:<key:28 value:<id:28 group_id:7 addr:"dgraph-server-18.dgraph-server.default.svc.cluster.local:7090" last_update:1517449963 > > members:<key:29 value:<id:29 group_id:7 addr:"dgraph-server-19.dgraph-server.default.svc.cluster.local:7090" last_update:1517454525 > > members:<key:30 value:<id:30 group_id:7 addr:"dgraph-server-20.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517455429 > > > > groups:<key:8 value:<members:<key:31 value:<id:31 group_id:8 addr:"dgraph-server-21.dgraph-server.default.svc.cluster.local:7090" last_update:1517450006 > > members:<key:32 value:<id:32 group_id:8 addr:"dgraph-server-22.dgraph-server.default.svc.cluster.local:7090" > > members:<key:33 value:<id:33 group_id:8 addr:"dgraph-server-23.dgraph-server.default.svc.cluster.local:7090" > > tablets:<key:"_predicate_" value:<group_id:8 predicate:"_predicate_" > > > > zeros:<key:1 value:<id:1 addr:"dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080" leader:true > > zeros:<key:2 value:<id:2 addr:"dgraph-zero-1.dgraph-zero.default.svc.cluster.local:5080" > > zeros:<key:3 value:<id:3 addr:"dgraph-zero-2.dgraph-zero.default.svc.cluster.local:5080" > > zeros:<key:4 value:<id:4 addr:"dgraph-zero-3.dgraph-zero.default.svc.cluster.local:5080" > > zeros:<key:5 value:<id:5 
addr:"dgraph-zero-4.dgraph-zero.default.svc.cluster.local:5080" > > maxTxnTs:10000 > 
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-0.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-1.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-2.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-4.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-5.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-3.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-6.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-7.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-8.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-10.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-11.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-9.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-13.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-14.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-12.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-16.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-17.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-15.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-18.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-19.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-20.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-22.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-server-23.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-zero-1.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-zero-2.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-zero-3.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 04:04:59 pool.go:118: == CONNECT ==> Setting dgraph-zero-4.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 04:04:59 draft.go:140: Node ID: 31 with GroupID: 8
2018/02/01 04:04:59 node.go:246: Found hardstate: {Term:2611 Vote:31 Commit:4 XXX_unrecognized:[]}
2018/02/01 04:04:59 node.go:258: Group 8 found 4 entries
2018/02/01 04:04:59 draft.go:661: Restarting node for group: 8
2018/02/01 04:04:59 raft.go:567: INFO: 1f became follower at term 2611
2018/02/01 04:04:59 raft.go:315: INFO: newRaft 1f [peers: [], term: 2611, commit: 4, applied: 0, lastindex: 4, lastterm: 2]
2018/02/01 04:04:59 node.go:127: Setting conf state to nodes:31 
2018/02/01 04:04:59 node.go:127: Setting conf state to nodes:31 nodes:32 
2018/02/01 04:05:03 raft.go:749: INFO: 1f is starting a new election at term 2611
2018/02/01 04:05:03 raft.go:580: INFO: 1f became candidate at term 2612
2018/02/01 04:05:03 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2612
2018/02/01 04:05:03 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2612
2018/02/01 04:05:03 node.go:315: No healthy connection found to node Id: 32, err: Unhealthy connection
2018/02/01 04:05:05 raft.go:749: INFO: 1f is starting a new election at term 2612
2018/02/01 04:05:05 raft.go:580: INFO: 1f became candidate at term 2613
2018/02/01 04:05:05 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2613
2018/02/01 04:05:05 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2613
2018/02/01 04:05:08 raft.go:749: INFO: 1f is starting a new election at term 2613
2018/02/01 04:05:08 raft.go:580: INFO: 1f became candidate at term 2614
2018/02/01 04:05:08 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2614
2018/02/01 04:05:08 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2614
2018/02/01 04:05:09 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:05:10 raft.go:749: INFO: 1f is starting a new election at term 2614
2018/02/01 04:05:10 raft.go:580: INFO: 1f became candidate at term 2615
2018/02/01 04:05:10 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2615
2018/02/01 04:05:10 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2615
2018/02/01 04:05:13 raft.go:749: INFO: 1f is starting a new election at term 2615
2018/02/01 04:05:13 raft.go:580: INFO: 1f became candidate at term 2616
2018/02/01 04:05:13 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2616
2018/02/01 04:05:13 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2616
2018/02/01 04:05:16 raft.go:749: INFO: 1f is starting a new election at term 2616
2018/02/01 04:05:16 raft.go:580: INFO: 1f became candidate at term 2617
2018/02/01 04:05:16 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2617
2018/02/01 04:05:16 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2617
2018/02/01 04:05:19 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:05:19 raft.go:749: INFO: 1f is starting a new election at term 2617
2018/02/01 04:05:19 raft.go:580: INFO: 1f became candidate at term 2618
2018/02/01 04:05:19 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2618
2018/02/01 04:05:19 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2618
2018/02/01 04:05:20 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:05:20 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:05:23 raft.go:749: INFO: 1f is starting a new election at term 2618
2018/02/01 04:05:23 raft.go:580: INFO: 1f became candidate at term 2619
2018/02/01 04:05:23 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2619
2018/02/01 04:05:23 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2619
2018/02/01 04:05:25 raft.go:749: INFO: 1f is starting a new election at term 2619
2018/02/01 04:05:25 raft.go:580: INFO: 1f became candidate at term 2620
2018/02/01 04:05:25 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2620
2018/02/01 04:05:25 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2620
2018/02/01 04:05:29 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:05:29 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:05:29 raft.go:749: INFO: 1f is starting a new election at term 2620
2018/02/01 04:05:29 raft.go:580: INFO: 1f became candidate at term 2621
2018/02/01 04:05:29 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2621
2018/02/01 04:05:29 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2621
2018/02/01 04:05:31 raft.go:749: INFO: 1f is starting a new election at term 2621
2018/02/01 04:05:31 raft.go:580: INFO: 1f became candidate at term 2622
2018/02/01 04:05:31 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2622
2018/02/01 04:05:31 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2622
2018/02/01 04:05:35 raft.go:749: INFO: 1f is starting a new election at term 2622
2018/02/01 04:05:35 raft.go:580: INFO: 1f became candidate at term 2623
2018/02/01 04:05:35 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2623
2018/02/01 04:05:35 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2623
2018/02/01 04:05:37 raft.go:749: INFO: 1f is starting a new election at term 2623
2018/02/01 04:05:37 raft.go:580: INFO: 1f became candidate at term 2624
2018/02/01 04:05:37 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2624
2018/02/01 04:05:37 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2624
2018/02/01 04:05:39 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:05:39 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:05:40 raft.go:749: INFO: 1f is starting a new election at term 2624
2018/02/01 04:05:40 raft.go:580: INFO: 1f became candidate at term 2625
2018/02/01 04:05:40 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2625
2018/02/01 04:05:40 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2625
2018/02/01 04:05:42 raft.go:749: INFO: 1f is starting a new election at term 2625
2018/02/01 04:05:42 raft.go:580: INFO: 1f became candidate at term 2626
2018/02/01 04:05:42 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2626
2018/02/01 04:05:42 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2626
2018/02/01 04:05:45 raft.go:749: INFO: 1f is starting a new election at term 2626
2018/02/01 04:05:45 raft.go:580: INFO: 1f became candidate at term 2627
2018/02/01 04:05:45 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2627
2018/02/01 04:05:45 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2627
2018/02/01 04:05:48 raft.go:749: INFO: 1f is starting a new election at term 2627
2018/02/01 04:05:48 raft.go:580: INFO: 1f became candidate at term 2628
2018/02/01 04:05:48 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2628
2018/02/01 04:05:48 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2628
2018/02/01 04:05:49 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:05:49 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:05:51 raft.go:749: INFO: 1f is starting a new election at term 2628
2018/02/01 04:05:51 raft.go:580: INFO: 1f became candidate at term 2629
2018/02/01 04:05:51 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2629
2018/02/01 04:05:51 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2629
2018/02/01 04:05:54 raft.go:749: INFO: 1f is starting a new election at term 2629
2018/02/01 04:05:54 raft.go:580: INFO: 1f became candidate at term 2630
2018/02/01 04:05:54 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2630
2018/02/01 04:05:54 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2630
2018/02/01 04:05:56 raft.go:749: INFO: 1f is starting a new election at term 2630
2018/02/01 04:05:56 raft.go:580: INFO: 1f became candidate at term 2631
2018/02/01 04:05:56 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2631
2018/02/01 04:05:56 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2631
2018/02/01 04:05:59 raft.go:749: INFO: 1f is starting a new election at term 2631
2018/02/01 04:05:59 raft.go:580: INFO: 1f became candidate at term 2632
2018/02/01 04:05:59 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2632
2018/02/01 04:05:59 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2632
2018/02/01 04:05:59 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:05:59 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:01 raft.go:749: INFO: 1f is starting a new election at term 2632
2018/02/01 04:06:01 raft.go:580: INFO: 1f became candidate at term 2633
2018/02/01 04:06:01 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2633
2018/02/01 04:06:01 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2633
2018/02/01 04:06:05 raft.go:749: INFO: 1f is starting a new election at term 2633
2018/02/01 04:06:05 raft.go:580: INFO: 1f became candidate at term 2634
2018/02/01 04:06:05 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2634
2018/02/01 04:06:05 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2634
2018/02/01 04:06:09 raft.go:749: INFO: 1f is starting a new election at term 2634
2018/02/01 04:06:09 raft.go:580: INFO: 1f became candidate at term 2635
2018/02/01 04:06:09 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2635
2018/02/01 04:06:09 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2635
2018/02/01 04:06:09 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:09 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:12 raft.go:749: INFO: 1f is starting a new election at term 2635
2018/02/01 04:06:12 raft.go:580: INFO: 1f became candidate at term 2636
2018/02/01 04:06:12 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2636
2018/02/01 04:06:12 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2636
2018/02/01 04:06:15 raft.go:749: INFO: 1f is starting a new election at term 2636
2018/02/01 04:06:15 raft.go:580: INFO: 1f became candidate at term 2637
2018/02/01 04:06:15 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2637
2018/02/01 04:06:15 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2637
2018/02/01 04:06:19 raft.go:749: INFO: 1f is starting a new election at term 2637
2018/02/01 04:06:19 raft.go:580: INFO: 1f became candidate at term 2638
2018/02/01 04:06:19 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2638
2018/02/01 04:06:19 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2638
2018/02/01 04:06:19 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:19 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:22 raft.go:749: INFO: 1f is starting a new election at term 2638
2018/02/01 04:06:22 raft.go:580: INFO: 1f became candidate at term 2639
2018/02/01 04:06:22 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2639
2018/02/01 04:06:22 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2639
2018/02/01 04:06:24 raft.go:749: INFO: 1f is starting a new election at term 2639
2018/02/01 04:06:24 raft.go:580: INFO: 1f became candidate at term 2640
2018/02/01 04:06:24 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2640
2018/02/01 04:06:24 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2640
2018/02/01 04:06:27 raft.go:749: INFO: 1f is starting a new election at term 2640
2018/02/01 04:06:27 raft.go:580: INFO: 1f became candidate at term 2641
2018/02/01 04:06:27 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2641
2018/02/01 04:06:27 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2641
2018/02/01 04:06:29 pool.go:168: Echo error from dgraph-server-10.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:29 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:29 raft.go:749: INFO: 1f is starting a new election at term 2641
2018/02/01 04:06:29 raft.go:580: INFO: 1f became candidate at term 2642
2018/02/01 04:06:29 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2642
2018/02/01 04:06:29 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2642
2018/02/01 04:06:29 node.go:343: Error while sending message to node with addr: dgraph-server-22.dgraph-server.default.svc.cluster.local:7090, err: rpc error: code = Unknown desc = No node has been set up yet
2018/02/01 04:06:32 raft.go:749: INFO: 1f is starting a new election at term 2642
2018/02/01 04:06:32 raft.go:580: INFO: 1f became candidate at term 2643
2018/02/01 04:06:32 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2643
2018/02/01 04:06:32 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2643
2018/02/01 04:06:32 node.go:343: Error while sending message to node with addr: dgraph-server-22.dgraph-server.default.svc.cluster.local:7090, err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:35 raft.go:749: INFO: 1f is starting a new election at term 2643
2018/02/01 04:06:35 raft.go:580: INFO: 1f became candidate at term 2644
2018/02/01 04:06:35 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2644
2018/02/01 04:06:35 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2644
2018/02/01 04:06:35 node.go:343: Error while sending message to node with addr: dgraph-server-22.dgraph-server.default.svc.cluster.local:7090, err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:38 raft.go:749: INFO: 1f is starting a new election at term 2644
2018/02/01 04:06:38 raft.go:580: INFO: 1f became candidate at term 2645
2018/02/01 04:06:38 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2645
2018/02/01 04:06:38 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2645
2018/02/01 04:06:38 node.go:343: Error while sending message to node with addr: dgraph-server-22.dgraph-server.default.svc.cluster.local:7090, err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:39 pool.go:168: Echo error from dgraph-server-10.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:39 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:39 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:42 raft.go:749: INFO: 1f is starting a new election at term 2645
2018/02/01 04:06:42 raft.go:580: INFO: 1f became candidate at term 2646
2018/02/01 04:06:42 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2646
2018/02/01 04:06:42 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2646
2018/02/01 04:06:42 node.go:315: No healthy connection found to node Id: 32, err: Unhealthy connection
2018/02/01 04:06:45 raft.go:749: INFO: 1f is starting a new election at term 2646
2018/02/01 04:06:45 raft.go:580: INFO: 1f became candidate at term 2647
2018/02/01 04:06:45 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2647
2018/02/01 04:06:45 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2647
2018/02/01 04:06:48 raft.go:749: INFO: 1f is starting a new election at term 2647
2018/02/01 04:06:48 raft.go:580: INFO: 1f became candidate at term 2648
2018/02/01 04:06:48 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2648
2018/02/01 04:06:48 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2648
2018/02/01 04:06:49 pool.go:168: Echo error from dgraph-server-10.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:49 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:51 raft.go:749: INFO: 1f is starting a new election at term 2648
2018/02/01 04:06:51 raft.go:580: INFO: 1f became candidate at term 2649
2018/02/01 04:06:51 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2649
2018/02/01 04:06:51 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2649
2018/02/01 04:06:51 node.go:343: Error while sending message to node with addr: dgraph-server-22.dgraph-server.default.svc.cluster.local:7090, err: rpc error: code = Unknown desc = No node has been set up yet
2018/02/01 04:06:53 raft.go:749: INFO: 1f is starting a new election at term 2649
2018/02/01 04:06:53 raft.go:580: INFO: 1f became candidate at term 2650
2018/02/01 04:06:53 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2650
2018/02/01 04:06:53 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2650
2018/02/01 04:06:53 node.go:343: Error while sending message to node with addr: dgraph-server-22.dgraph-server.default.svc.cluster.local:7090, err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:57 raft.go:749: INFO: 1f is starting a new election at term 2650
2018/02/01 04:06:57 raft.go:580: INFO: 1f became candidate at term 2651
2018/02/01 04:06:57 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2651
2018/02/01 04:06:57 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2651
2018/02/01 04:06:57 node.go:343: Error while sending message to node with addr: dgraph-server-22.dgraph-server.default.svc.cluster.local:7090, err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:59 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:59 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:07:01 raft.go:749: INFO: 1f is starting a new election at term 2651
2018/02/01 04:07:01 raft.go:580: INFO: 1f became candidate at term 2652
2018/02/01 04:07:01 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2652
2018/02/01 04:07:01 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2652
2018/02/01 04:07:01 node.go:315: No healthy connection found to node Id: 32, err: Unhealthy connection
2018/02/01 04:07:04 raft.go:749: INFO: 1f is starting a new election at term 2652
2018/02/01 04:07:04 raft.go:580: INFO: 1f became candidate at term 2653
2018/02/01 04:07:04 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2653
2018/02/01 04:07:04 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2653
2018/02/01 04:07:06 raft.go:749: INFO: 1f is starting a new election at term 2653
2018/02/01 04:07:06 raft.go:580: INFO: 1f became candidate at term 2654
2018/02/01 04:07:06 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2654
2018/02/01 04:07:06 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2654
2018/02/01 04:07:09 raft.go:749: INFO: 1f is starting a new election at term 2654
2018/02/01 04:07:09 raft.go:580: INFO: 1f became candidate at term 2655
2018/02/01 04:07:09 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2655
2018/02/01 04:07:09 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2655
2018/02/01 04:07:09 pool.go:168: Echo error from dgraph-server-22.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:07:09 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:07:12 raft.go:749: INFO: 1f is starting a new election at term 2655
2018/02/01 04:07:12 raft.go:580: INFO: 1f became candidate at term 2656
2018/02/01 04:07:12 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2656
2018/02/01 04:07:12 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2656
2018/02/01 04:07:15 raft.go:749: INFO: 1f is starting a new election at term 2656
2018/02/01 04:07:15 raft.go:580: INFO: 1f became candidate at term 2657
2018/02/01 04:07:15 raft.go:664: INFO: 1f received MsgVoteResp from 1f at term 2657
2018/02/01 04:07:15 raft.go:651: INFO: 1f [logterm: 2, index: 4] sent MsgVote request to 20 at term 2657

(Jzhu077) #38
kubectl logs dgraph-server-22
+ offset=10
++ hostname
+ [[ dgraph-server-22 =~ -([0-9]+)$ ]]
+ ordinal=22
+ idx=32
++ hostname -f
+ dgraph server --my=dgraph-server-22.dgraph-server.default.svc.cluster.local:7090 --memory_mb 6000 --zero dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080 --idx 32 --port_offset 10 --debugmode
2018/02/01 04:06:45 groups.go:86: Current Raft Id: 32
2018/02/01 04:06:45 gRPC server started.  Listening on port 9090
2018/02/01 04:06:45 HTTP server started.  Listening on port 8090
2018/02/01 04:06:45 worker.go:99: Worker listening at address: [::]:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 04:06:45 groups.go:109: Connected to group zero. Connection state: member:<id:32 addr:"dgraph-server-22.dgraph-server.default.svc.cluster.local:7090" > state:<counter:138 groups:<key:1 value:<members:<key:10 value:<id:10 group_id:1 addr:"dgraph-server-0.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517449286 > > members:<key:11 value:<id:11 group_id:1 addr:"dgraph-server-1.dgraph-server.default.svc.cluster.local:7090" > > members:<key:12 value:<id:12 group_id:1 addr:"dgraph-server-2.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:2 value:<members:<key:13 value:<id:13 group_id:2 addr:"dgraph-server-3.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517449326 > > members:<key:14 value:<id:14 group_id:2 addr:"dgraph-server-4.dgraph-server.default.svc.cluster.local:7090" > > members:<key:15 value:<id:15 group_id:2 addr:"dgraph-server-5.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:3 value:<members:<key:16 value:<id:16 group_id:3 addr:"dgraph-server-6.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517449556 > > members:<key:17 value:<id:17 group_id:3 addr:"dgraph-server-7.dgraph-server.default.svc.cluster.local:7090" > > members:<key:18 value:<id:18 group_id:3 addr:"dgraph-server-8.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:4 value:<members:<key:19 value:<id:19 group_id:4 addr:"dgraph-server-9.dgraph-server.default.svc.cluster.local:7090" last_update:1517457894 > > members:<key:20 value:<id:20 group_id:4 addr:"dgraph-server-10.dgraph-server.default.svc.cluster.local:7090" > > members:<key:21 value:<id:21 group_id:4 addr:"dgraph-server-11.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517457894 > > > > groups:<key:5 value:<members:<key:22 value:<id:22 group_id:5 addr:"dgraph-server-12.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517457853 > > members:<key:23 value:<id:23 group_id:5 
addr:"dgraph-server-13.dgraph-server.default.svc.cluster.local:7090" last_update:1517457851 > > members:<key:24 value:<id:24 group_id:5 addr:"dgraph-server-14.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:6 value:<members:<key:25 value:<id:25 group_id:6 addr:"dgraph-server-15.dgraph-server.default.svc.cluster.local:7090" last_update:1517456062 > > members:<key:26 value:<id:26 group_id:6 addr:"dgraph-server-16.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517456065 > > members:<key:27 value:<id:27 group_id:6 addr:"dgraph-server-17.dgraph-server.default.svc.cluster.local:7090" > > > > groups:<key:7 value:<members:<key:28 value:<id:28 group_id:7 addr:"dgraph-server-18.dgraph-server.default.svc.cluster.local:7090" last_update:1517449963 > > members:<key:29 value:<id:29 group_id:7 addr:"dgraph-server-19.dgraph-server.default.svc.cluster.local:7090" last_update:1517454525 > > members:<key:30 value:<id:30 group_id:7 addr:"dgraph-server-20.dgraph-server.default.svc.cluster.local:7090" leader:true last_update:1517455429 > > > > groups:<key:8 value:<members:<key:31 value:<id:31 group_id:8 addr:"dgraph-server-21.dgraph-server.default.svc.cluster.local:7090" last_update:1517450006 > > members:<key:32 value:<id:32 group_id:8 addr:"dgraph-server-22.dgraph-server.default.svc.cluster.local:7090" > > members:<key:33 value:<id:33 group_id:8 addr:"dgraph-server-23.dgraph-server.default.svc.cluster.local:7090" > > tablets:<key:"_predicate_" value:<group_id:8 predicate:"_predicate_" > > > > zeros:<key:1 value:<id:1 addr:"dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080" leader:true > > zeros:<key:2 value:<id:2 addr:"dgraph-zero-1.dgraph-zero.default.svc.cluster.local:5080" > > zeros:<key:3 value:<id:3 addr:"dgraph-zero-2.dgraph-zero.default.svc.cluster.local:5080" > > zeros:<key:4 value:<id:4 addr:"dgraph-zero-3.dgraph-zero.default.svc.cluster.local:5080" > > zeros:<key:5 value:<id:5 
addr:"dgraph-zero-4.dgraph-zero.default.svc.cluster.local:5080" > > maxTxnTs:10000 > 
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-11.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-9.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:168: Echo error from dgraph-server-10.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-10.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-13.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-14.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-12.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-15.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-16.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-17.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-20.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-18.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-19.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-21.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:168: Echo error from dgraph-server-23.dgraph-server.default.svc.cluster.local:7090. Err: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-23.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-0.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-1.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-2.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-5.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-3.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-4.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-6.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-7.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-server-8.dgraph-server.default.svc.cluster.local:7090
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-zero-3.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-zero-4.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-zero-1.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 04:06:45 pool.go:118: == CONNECT ==> Setting dgraph-zero-2.dgraph-zero.default.svc.cluster.local:5080
2018/02/01 04:06:45 draft.go:140: Node ID: 32 with GroupID: 8
2018/02/01 04:06:45 node.go:258: Group 8 found 0 entries
2018/02/01 04:06:45 draft.go:672: New Node for group: 8
2018/02/01 04:06:45 draft.go:679: While retrieving snapshot, err: Cannot retrieve snapshot from peer, error: rpc error: code = Unknown desc = Not leader of group: 8
2018/02/01 04:06:46 draft.go:679: While retrieving snapshot, err: Cannot retrieve snapshot from peer, error: rpc error: code = Unknown desc = Not leader of group: 8
2018/02/01 04:06:47 draft.go:679: While retrieving snapshot, err: Cannot retrieve snapshot from peer, error: rpc error: code = Unknown desc = Not leader of group: 8
2018/02/01 04:06:48 draft.go:679: While retrieving snapshot, err: Cannot retrieve snapshot from peer, error: rpc error: code = Unknown desc = Not leader of group: 8
2018/02/01 04:06:49 draft.go:679: While retrieving snapshot, err: Cannot retrieve snapshot from peer, error: rpc error: code = Unknown desc = Not leader of group: 8
2018/02/01 04:06:50 predicate.go:51: Getting SNAPSHOT: Time elapsed: 05s, bytes written: 0 B
2018/02/01 04:06:50 predicate.go:51: Getting SNAPSHOT: Time elapsed: 05s, bytes written: 0 B
2018/02/01 04:06:50 draft.go:642: Calling JoinCluster
2018/02/01 04:06:51 predicate.go:51: Getting SNAPSHOT: Time elapsed: 05s, bytes written: 0 B
2018/02/01 04:06:51 rpc error: code = DeadlineExceeded desc = context deadline exceeded
Error while joining cluster
github.com/dgraph-io/dgraph/x.Wrapf
	/home/pawan/go/src/github.com/dgraph-io/dgraph/x/error.go:90
github.com/dgraph-io/dgraph/x.Checkf
	/home/pawan/go/src/github.com/dgraph-io/dgraph/x/error.go:48
github.com/dgraph-io/dgraph/worker.(*node).joinPeers
	/home/pawan/go/src/github.com/dgraph-io/dgraph/worker/draft.go:649
github.com/dgraph-io/dgraph/worker.(*node).InitAndStartNode
	/home/pawan/go/src/github.com/dgraph-io/dgraph/worker/draft.go:684
github.com/dgraph-io/dgraph/worker.StartRaftNodes
	/home/pawan/go/src/github.com/dgraph-io/dgraph/worker/groups.go:120
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:2337

The log for server-23 is similar.
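For readers following the `+` trace lines at the top of this log: they suggest the startup script derives the Raft ID passed to `--idx` by adding a fixed offset to the StatefulSet ordinal taken from the pod hostname. Below is a minimal sketch of that calculation, assuming bash. The function name `raft_idx` and its structure are hypothetical, reconstructed from the trace only; this is not the actual script.

```shell
#!/usr/bin/env bash
# Hypothetical helper reconstructed from the `+` trace lines above;
# the name raft_idx and its argument order are assumptions.
raft_idx() {
  local host="$1" offset="$2" ordinal
  # StatefulSet pod names end in "-<ordinal>", e.g. dgraph-server-22.
  if [[ "$host" =~ -([0-9]+)$ ]]; then
    ordinal="${BASH_REMATCH[1]}"
    echo $(( ordinal + offset ))
  else
    echo "hostname '$host' has no trailing ordinal" >&2
    return 1
  fi
}

# Matches the trace: ordinal=22, offset=10, idx=32.
raft_idx dgraph-server-22 10   # prints 32
```

The dgraph-zero trace earlier in the thread (ordinal=1, idx=2) fits the same pattern with an offset of 1. Because the ID is tied to the pod name, a pod that restarts with an empty data directory comes back under the same Raft ID, which is consistent with `Group 8 found 0 entries` for node 32 above.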


(Pawan Rawal) #39

I see the temporary fix didn’t really solve this. We now need to figure out how your cluster got into this state. I’ll try to reproduce it and fix the cause of the issue.

I created an issue for this: https://github.com/dgraph-io/dgraph/issues/2072


(Jzhu077) #40

I was wondering how much CPU I should request per dgraph service. From monitoring on Google Cloud Platform, the peak CPU usage I have seen is 3 cores. Could it go higher than that, or does it depend on how complicated the query is?