Dgraph cluster fails to initialize on v20.11.0 and v20.11.1

Report a Dgraph Bug

What version of Dgraph are you using?

Dgraph Version
$ dgraph version
 
[Decoder]: Using assembly version of decoder
Page Size: 4096

Dgraph version   : v20.11.1
Dgraph codename  : tchalla-1
Dgraph SHA-256   : cefdcc880c0607a92a1d8d3ba0beb015459ebe216e79fdad613eb0d00d09f134
Commit SHA-1     : 7153d13fe
Commit timestamp : 2021-01-28 15:59:35 +0530
Branch           : HEAD
Go version       : go1.15.5
jemalloc enabled : true

For Dgraph official documentation, visit https://dgraph.io/docs/.
For discussions about Dgraph     , visit https://discuss.dgraph.io.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2020 Dgraph Labs, Inc.

Have you tried reproducing the issue with the latest release?

Yes, reproduced on Dgraph v20.11.1 with Helm chart v0.0.15.

What is the hardware spec (RAM, OS)?

3 nodes, each with 4 CPUs and 16 GB RAM

Steps to reproduce the issue (command/config used to run Dgraph).

Initialize the cluster from the Helm chart v0.0.15 with the following values:

values.yaml
alpha:
  antiAffinity: hard

  podAntiAffinitytopologyKey: topology.kubernetes.io/zone

  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - ss-worker-ru-central1-a-4
          - ss-worker-ru-central1-b-4
          - ss-worker-ru-central1-c-4
          
  tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/dgraph
    operator: Exists

  persistence:
    size: 32Gi
    storageClass: network-ssd

  livenessProbe:
    enabled: true
  readinessProbe:
    enabled: true

  replicaCount: 2

  extraFlags: "--whitelist 192.168.0.0:192.168.255.255"

zero:
  antiAffinity: hard

  podAntiAffinitytopologyKey: topology.kubernetes.io/zone

  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - ss-worker-ru-central1-a-4
          - ss-worker-ru-central1-b-4
          - ss-worker-ru-central1-c-4
          
  tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/dgraph
    operator: Exists

  persistence:
    size: 32Gi
    storageClass: network-ssd

  livenessProbe:
    enabled: true
  readinessProbe:
    enabled: true

  replicaCount: 3
  shardReplicaCount: 3

ratel:
  livenessProbe:
    enabled: true
  readinessProbe:
    enabled: true

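For context on the `extraFlags` above: Dgraph's `--whitelist` takes IP ranges in `lower:upper` form, so `192.168.0.0:192.168.255.255` admits the whole 192.168.0.0/16 block. A minimal Python sketch of how such a range check behaves (the helper `in_whitelist` is hypothetical, not Dgraph code):

```python
import ipaddress

def in_whitelist(ip: str, whitelist: str) -> bool:
    # whitelist is "lower:upper", e.g. "192.168.0.0:192.168.255.255"
    lower, upper = (ipaddress.ip_address(a) for a in whitelist.split(":"))
    return lower <= ipaddress.ip_address(ip) <= upper

print(in_whitelist("192.168.10.5", "192.168.0.0:192.168.255.255"))  # True
print(in_whitelist("10.0.0.1", "192.168.0.0:192.168.255.255"))      # False
```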
Expected behaviour and actual result.

Expected: initialization completes successfully. Actual: the Zero and Alpha pods cannot resolve each other's headless-service DNS names ("no such host" errors in the logs below), and the cluster never finishes initializing.

Zero pod logs

zero-0
++ hostname
+ [[ dgraph-zero-0 =~ -([0-9]+)$ ]]
+ ordinal=0
+ idx=1
+ [[ 0 -eq 0 ]]
++ hostname -f
++ awk '{gsub(/\.$/,""); print $0}'
+ exec dgraph zero --my=dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080 --idx 1 --replicas 3
[Decoder]: Using assembly version of decoder
Page Size: 4096
[Sentry] 2021/02/05 08:07:35 Integration installed: ContextifyFrames
[Sentry] 2021/02/05 08:07:35 Integration installed: Environment
[Sentry] 2021/02/05 08:07:35 Integration installed: Modules
[Sentry] 2021/02/05 08:07:35 Integration installed: IgnoreErrors
[Decoder]: Using assembly version of decoder
Page Size: 4096
[Sentry] 2021/02/05 08:07:35 Integration installed: ContextifyFrames
[Sentry] 2021/02/05 08:07:35 Integration installed: Environment
[Sentry] 2021/02/05 08:07:35 Integration installed: Modules
[Sentry] 2021/02/05 08:07:35 Integration installed: IgnoreErrors
I0205 08:07:36.111019      18 sentry_integration.go:48] This instance of Dgraph will send anonymous reports of panics back to Dgraph Labs via Sentry. No confidential information is sent. These reports help improve Dgraph. To opt-out, restart your instance with the --enable_sentry=false flag. For more info, see https://dgraph.io/docs/howto/#data-handling.
I0205 08:07:36.302465      18 init.go:107] 

Dgraph version   : v20.11.1
Dgraph codename  : tchalla-1
Dgraph SHA-256   : cefdcc880c0607a92a1d8d3ba0beb015459ebe216e79fdad613eb0d00d09f134
Commit SHA-1     : 7153d13fe
Commit timestamp : 2021-01-28 15:59:35 +0530
Branch           : HEAD
Go version       : go1.15.5
jemalloc enabled : true

For Dgraph official documentation, visit https://dgraph.io/docs/.
For discussions about Dgraph     , visit https://discuss.dgraph.io.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2020 Dgraph Labs, Inc.


I0205 08:07:36.302580      18 run.go:185] Setting Config to: {bindall:true portOffset:0 nodeId:1 numReplicas:3 peer: w:zw rebalanceInterval:480000000000 tlsClientConfig:<nil>}
I0205 08:07:36.302665      18 run.go:98] Setting up grpc listener at: 0.0.0.0:5080
I0205 08:07:36.303161      18 run.go:98] Setting up http listener at: 0.0.0.0:6080
I0205 08:07:36.306703      18 storage.go:132] Init Raft Storage with snap: 0, first: 1, last: 0
I0205 08:07:36.307110      18 node.go:152] Setting raft.Config to: &{ID:1 peers:[] learners:[] ElectionTick:20 HeartbeatTick:1 Storage:0xc000322ff0 Applied:0 MaxSizePerMsg:262144 MaxCommittedSizePerReady:67108864 MaxUncommittedEntriesSize:0 MaxInflightMsgs:256 CheckQuorum:false PreVote:true ReadOnlyOption:0 Logger:0x2e0ddf8 DisableProposalForwarding:false}
I0205 08:07:36.314184      18 node.go:326] Group 0 found 0 entries
I0205 08:07:36.314264      18 raft.go:602] Starting a brand new node
I0205 08:07:36.314303      18 log.go:34] 1 became follower at term 0
I0205 08:07:36.314312      18 log.go:34] newRaft 1 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
I0205 08:07:36.314322      18 log.go:34] 1 became follower at term 1
I0205 08:07:36.315241      18 run.go:303] Running Dgraph Zero...
E0205 08:07:36.315651      18 raft.go:486] While proposing CID: Not Zero leader. Aborting proposal: cid:"e2d67497-f426-4d3f-b0eb-2d1015144b2a" . Retrying...
I0205 08:07:36.315101      18 node.go:189] Setting conf state to nodes:1 
I0205 08:07:36.316511      18 raft.go:880] Done applying conf change at 0x1
E0205 08:07:39.316816      18 raft.go:486] While proposing CID: Not Zero leader. Aborting proposal: cid:"2abd98a5-80b8-454b-8c20-adce6cd9e261" . Retrying...
I0205 08:07:39.515088      18 log.go:34] 1 is starting a new election at term 1
I0205 08:07:39.515133      18 log.go:34] 1 became pre-candidate at term 1
I0205 08:07:39.515139      18 log.go:34] 1 received MsgPreVoteResp from 1 at term 1
I0205 08:07:39.515168      18 log.go:34] 1 became candidate at term 2
I0205 08:07:39.515175      18 log.go:34] 1 received MsgVoteResp from 1 at term 2
I0205 08:07:39.515193      18 log.go:34] 1 became leader at term 2
I0205 08:07:39.515204      18 log.go:34] raft.node: 1 elected leader 1 at term 2
I0205 08:07:39.515266      18 raft.go:845] I've become the leader, updating leases.
I0205 08:07:39.515288      18 assign.go:42] Updated Lease id: 1. Txn Ts: 1
I0205 08:07:42.317136      18 raft.go:479] CID set for cluster: de31e002-671e-4295-8aa4-69a9563db183
I0205 08:07:42.317401      18 license_ee.go:45] Enterprise trial license proposed to the cluster: license:<maxNodes:18446744073709551615 expiryTs:1615104462 > 
I0205 08:08:23.012336      18 pool.go:162] CONNECTING to dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:08:23.012373      18 node.go:586] Trying to add 0x2 to cluster. Addr: dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:08:23.012383      18 node.go:587] Current confstate at 0x1: nodes:1 
I0205 08:08:23.012621      18 node.go:189] Setting conf state to nodes:1 nodes:2 
I0205 08:08:23.012723      18 raft.go:880] Done applying conf change at 0x1
I0205 08:08:23.012815      18 node.go:756] [0x2] Done joining cluster with err: <nil>
W0205 08:08:23.022479      18 pool.go:267] Connection lost with dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local:5080. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local: no such host"
W0205 08:08:24.015266      18 node.go:420] Unable to send message to peer: 0x2. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local: no such host"
W0205 08:08:25.315353      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:27.315478      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:29.315658      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:31.315921      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:33.316142      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:34.015404      18 node.go:420] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0205 08:08:35.316318      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:37.316513      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:39.316747      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:41.316903      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:43.317072      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:44.115217      18 node.go:420] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0205 08:08:45.317272      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:47.317400      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:49.317557      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:51.317850      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:53.318133      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:54.115506      18 node.go:420] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0205 08:08:55.318393      18 node.go:681] [0x1] Read index context timed out
W0205 08:08:57.318572      18 node.go:681] [0x1] Read index context timed out
I0205 08:09:14.333816      18 pool.go:162] CONNECTING to dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:09:14.333850      18 node.go:586] Trying to add 0x3 to cluster. Addr: dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:09:14.333859      18 node.go:587] Current confstate at 0x1: nodes:1 nodes:2 
I0205 08:09:14.342163      18 node.go:189] Setting conf state to nodes:1 nodes:2 nodes:3 
I0205 08:09:14.343042      18 raft.go:880] Done applying conf change at 0x1
W0205 08:09:14.343007      18 pool.go:267] Connection lost with dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local: no such host"
I0205 08:09:14.343040      18 node.go:756] [0x3] Done joining cluster with err: <nil>
W0205 08:09:15.415479      18 node.go:420] Unable to send message to peer: 0x3. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local: no such host"
W0205 08:09:25.515169      18 node.go:420] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0205 08:09:35.615134      18 node.go:420] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0205 08:09:45.616477      18 node.go:420] Unable to send message to peer: 0x3. Error: Unhealthy connection
I0205 08:14:58.753230      18 zero.go:432] Got connection request: cluster_info_only:true 
I0205 08:14:58.758571      18 zero.go:450] Connected: cluster_info_only:true 
I0205 08:14:58.760373      18 zero.go:432] Got connection request: addr:"dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080" 
I0205 08:14:58.765469      18 pool.go:162] CONNECTING to dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080
I0205 08:14:58.770855      18 zero.go:584] Connected: id:1 group_id:1 addr:"dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080" 
W0205 08:14:58.775063      18 pool.go:267] Connection lost with dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local: no such host"
I0205 08:15:36.315323      18 raft.go:719] Skipping creating a snapshot. Num groups: 1, Num checkpoints: 0
I0205 08:19:07.543195      18 zero.go:432] Got connection request: cluster_info_only:true 
I0205 08:19:07.547784      18 zero.go:450] Connected: cluster_info_only:true 
I0205 08:19:07.552338      18 zero.go:432] Got connection request: addr:"dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080" 
I0205 08:19:07.557446      18 pool.go:162] CONNECTING to dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080
I0205 08:19:07.562332      18 zero.go:584] Connected: id:2 group_id:1 addr:"dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080" 
W0205 08:19:07.579684      18 pool.go:267] Connection lost with dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local: no such host"
zero-1
++ hostname
+ [[ dgraph-zero-1 =~ -([0-9]+)$ ]]
+ ordinal=1
+ idx=2
+ [[ 1 -eq 0 ]]
++ hostname -f
++ awk '{gsub(/\.$/,""); print $0}'
+ exec dgraph zero --my=dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local:5080 --peer dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080 --idx 2 --replicas 3
[Decoder]: Using assembly version of decoder
Page Size: 4096
[Sentry] 2021/02/05 08:08:22 Integration installed: ContextifyFrames
[Sentry] 2021/02/05 08:08:22 Integration installed: Environment
[Sentry] 2021/02/05 08:08:22 Integration installed: Modules
[Sentry] 2021/02/05 08:08:22 Integration installed: IgnoreErrors
[Decoder]: Using assembly version of decoder
Page Size: 4096
[Sentry] 2021/02/05 08:08:22 Integration installed: ContextifyFrames
[Sentry] 2021/02/05 08:08:22 Integration installed: Environment
[Sentry] 2021/02/05 08:08:22 Integration installed: Modules
[Sentry] 2021/02/05 08:08:22 Integration installed: IgnoreErrors
I0205 08:08:22.772812      18 sentry_integration.go:48] This instance of Dgraph will send anonymous reports of panics back to Dgraph Labs via Sentry. No confidential information is sent. These reports help improve Dgraph. To opt-out, restart your instance with the --enable_sentry=false flag. For more info, see https://dgraph.io/docs/howto/#data-handling.
I0205 08:08:22.970327      18 init.go:107] 

Dgraph version   : v20.11.1
Dgraph codename  : tchalla-1
Dgraph SHA-256   : cefdcc880c0607a92a1d8d3ba0beb015459ebe216e79fdad613eb0d00d09f134
Commit SHA-1     : 7153d13fe
Commit timestamp : 2021-01-28 15:59:35 +0530
Branch           : HEAD
Go version       : go1.15.5
jemalloc enabled : true

For Dgraph official documentation, visit https://dgraph.io/docs/.
For discussions about Dgraph     , visit https://discuss.dgraph.io.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2020 Dgraph Labs, Inc.


I0205 08:08:22.970397      18 run.go:185] Setting Config to: {bindall:true portOffset:0 nodeId:2 numReplicas:3 peer:dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080 w:zw rebalanceInterval:480000000000 tlsClientConfig:<nil>}
I0205 08:08:22.970449      18 run.go:98] Setting up grpc listener at: 0.0.0.0:5080
I0205 08:08:22.970994      18 run.go:98] Setting up http listener at: 0.0.0.0:6080
I0205 08:08:22.974161      18 storage.go:132] Init Raft Storage with snap: 0, first: 1, last: 0
I0205 08:08:22.974550      18 node.go:152] Setting raft.Config to: &{ID:2 peers:[] learners:[] ElectionTick:20 HeartbeatTick:1 Storage:0xc0000a81e0 Applied:0 MaxSizePerMsg:262144 MaxCommittedSizePerReady:67108864 MaxUncommittedEntriesSize:0 MaxInflightMsgs:256 CheckQuorum:false PreVote:true ReadOnlyOption:0 Logger:0x2e0ddf8 DisableProposalForwarding:false}
I0205 08:08:22.979772      18 node.go:326] Group 0 found 0 entries
I0205 08:08:22.980568      18 pool.go:162] CONNECTING to dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:08:23.017491      18 raft.go:598] [0x2] Starting node
I0205 08:08:23.017550      18 log.go:34] 2 became follower at term 0
I0205 08:08:23.017563      18 log.go:34] newRaft 2 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
I0205 08:08:23.017573      18 log.go:34] 2 became follower at term 1
I0205 08:08:23.018150      18 run.go:303] Running Dgraph Zero...
I0205 08:08:57.119888      18 log.go:34] 2 [term: 1] received a MsgHeartbeat message with higher term from 1 [term: 2]
I0205 08:08:57.119933      18 log.go:34] 2 became follower at term 2
I0205 08:08:57.119943      18 log.go:34] raft.node: 2 elected leader 1 at term 2
I0205 08:08:58.137699      18 node.go:189] Setting conf state to nodes:1 
I0205 08:08:58.137816      18 raft.go:880] Done applying conf change at 0x2
I0205 08:08:58.137879      18 node.go:189] Setting conf state to nodes:1 nodes:2 
I0205 08:08:58.137916      18 raft.go:880] Done applying conf change at 0x2
I0205 08:09:14.346267      18 node.go:189] Setting conf state to nodes:1 nodes:2 nodes:3 
I0205 08:09:14.346384      18 raft.go:880] Done applying conf change at 0x2
I0205 08:09:14.346388      18 pool.go:162] CONNECTING to dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080
W0205 08:09:14.380864      18 pool.go:267] Connection lost with dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local: no such host"
I0205 08:14:58.775182      18 pool.go:162] CONNECTING to dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080
W0205 08:14:58.807797      18 pool.go:267] Connection lost with dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local: no such host"
I0205 08:19:07.566808      18 pool.go:162] CONNECTING to dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080
W0205 08:19:07.597254      18 pool.go:267] Connection lost with dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local: no such host"
zero-2
++ hostname
+ [[ dgraph-zero-2 =~ -([0-9]+)$ ]]
+ ordinal=2
+ idx=3
+ [[ 2 -eq 0 ]]
++ awk '{gsub(/\.$/,""); print $0}'
++ hostname -f
+ exec dgraph zero --my=dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080 --peer dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080 --idx 3 --replicas 3
[Decoder]: Using assembly version of decoder
Page Size: 4096
[Sentry] 2021/02/05 08:09:13 Integration installed: ContextifyFrames
[Sentry] 2021/02/05 08:09:13 Integration installed: Environment
[Sentry] 2021/02/05 08:09:13 Integration installed: Modules
[Sentry] 2021/02/05 08:09:13 Integration installed: IgnoreErrors
[Decoder]: Using assembly version of decoder
Page Size: 4096
[Sentry] 2021/02/05 08:09:13 Integration installed: ContextifyFrames
[Sentry] 2021/02/05 08:09:13 Integration installed: Environment
[Sentry] 2021/02/05 08:09:13 Integration installed: Modules
[Sentry] 2021/02/05 08:09:13 Integration installed: IgnoreErrors
I0205 08:09:14.112743      18 sentry_integration.go:48] This instance of Dgraph will send anonymous reports of panics back to Dgraph Labs via Sentry. No confidential information is sent. These reports help improve Dgraph. To opt-out, restart your instance with the --enable_sentry=false flag. For more info, see https://dgraph.io/docs/howto/#data-handling.
I0205 08:09:14.298024      18 init.go:107] 

Dgraph version   : v20.11.1
Dgraph codename  : tchalla-1
Dgraph SHA-256   : cefdcc880c0607a92a1d8d3ba0beb015459ebe216e79fdad613eb0d00d09f134
Commit SHA-1     : 7153d13fe
Commit timestamp : 2021-01-28 15:59:35 +0530
Branch           : HEAD
Go version       : go1.15.5
jemalloc enabled : true

For Dgraph official documentation, visit https://dgraph.io/docs/.
For discussions about Dgraph     , visit https://discuss.dgraph.io.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2020 Dgraph Labs, Inc.


I0205 08:09:14.298141      18 run.go:185] Setting Config to: {bindall:true portOffset:0 nodeId:3 numReplicas:3 peer:dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080 w:zw rebalanceInterval:480000000000 tlsClientConfig:<nil>}
I0205 08:09:14.298198      18 run.go:98] Setting up grpc listener at: 0.0.0.0:5080
I0205 08:09:14.298733      18 run.go:98] Setting up http listener at: 0.0.0.0:6080
I0205 08:09:14.301832      18 storage.go:132] Init Raft Storage with snap: 0, first: 1, last: 0
I0205 08:09:14.302296      18 node.go:152] Setting raft.Config to: &{ID:3 peers:[] learners:[] ElectionTick:20 HeartbeatTick:1 Storage:0xc000112870 Applied:0 MaxSizePerMsg:262144 MaxCommittedSizePerReady:67108864 MaxUncommittedEntriesSize:0 MaxInflightMsgs:256 CheckQuorum:false PreVote:true ReadOnlyOption:0 Logger:0x2e0ddf8 DisableProposalForwarding:false}
I0205 08:09:14.308079      18 node.go:326] Group 0 found 0 entries
I0205 08:09:14.308301      18 pool.go:162] CONNECTING to dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:09:14.344859      18 raft.go:598] [0x3] Starting node
I0205 08:09:14.344906      18 log.go:34] 3 became follower at term 0
I0205 08:09:14.344920      18 log.go:34] newRaft 3 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
I0205 08:09:14.344930      18 log.go:34] 3 became follower at term 1
I0205 08:09:14.345335      18 run.go:303] Running Dgraph Zero...
I0205 08:09:49.618207      18 log.go:34] 3 [term: 1] received a MsgHeartbeat message with higher term from 1 [term: 2]
I0205 08:09:49.618253      18 log.go:34] 3 became follower at term 2
I0205 08:09:49.618267      18 log.go:34] raft.node: 3 elected leader 1 at term 2
I0205 08:09:50.631085      18 node.go:189] Setting conf state to nodes:1 
I0205 08:09:50.631210      18 raft.go:880] Done applying conf change at 0x3
I0205 08:09:50.631266      18 node.go:189] Setting conf state to nodes:1 nodes:2 
I0205 08:09:50.631375      18 raft.go:880] Done applying conf change at 0x3
I0205 08:09:50.631420      18 node.go:189] Setting conf state to nodes:1 nodes:2 nodes:3 
I0205 08:09:50.631467      18 raft.go:880] Done applying conf change at 0x3
I0205 08:09:50.631589      18 pool.go:162] CONNECTING to dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:14:58.772127      18 pool.go:162] CONNECTING to dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080
W0205 08:14:58.800904      18 pool.go:267] Connection lost with dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local: no such host"
I0205 08:19:07.564185      18 pool.go:162] CONNECTING to dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080
W0205 08:19:07.576005      18 pool.go:267] Connection lost with dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local: no such host"
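The init script visible at the top of each Zero log derives the `--idx` flag from the StatefulSet pod ordinal (idx = ordinal + 1, so `dgraph-zero-0` gets `--idx 1`). A standalone sketch of that mapping, with a hard-coded example hostname:

```shell
#!/usr/bin/env bash
# Sketch of the chart's init logic: extract the trailing ordinal from
# the StatefulSet pod hostname, then use ordinal + 1 as the Zero index.
hostname="dgraph-zero-2"   # hypothetical example hostname
if [[ $hostname =~ -([0-9]+)$ ]]; then
  ordinal="${BASH_REMATCH[1]}"
fi
idx=$((ordinal + 1))
echo "ordinal=$ordinal idx=$idx"
```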

Alpha pod logs

alpha-0
++ hostname -f
++ awk '{gsub(/\.$/,""); print $0}'
+ dgraph alpha --my=dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080 --zero dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080,dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local:5080,dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080 --whitelist 192.168.0.0:192.168.255.255
[Decoder]: Using assembly version of decoder
Page Size: 4096
[Sentry] 2021/02/05 08:14:57 Integration installed: ContextifyFrames
[Sentry] 2021/02/05 08:14:57 Integration installed: Environment
[Sentry] 2021/02/05 08:14:57 Integration installed: Modules
[Sentry] 2021/02/05 08:14:57 Integration installed: IgnoreErrors
[Decoder]: Using assembly version of decoder
Page Size: 4096
[Sentry] 2021/02/05 08:14:58 Integration installed: ContextifyFrames
[Sentry] 2021/02/05 08:14:58 Integration installed: Environment
[Sentry] 2021/02/05 08:14:58 Integration installed: Modules
[Sentry] 2021/02/05 08:14:58 Integration installed: IgnoreErrors
I0205 08:14:58.391933      17 sentry_integration.go:48] This instance of Dgraph will send anonymous reports of panics back to Dgraph Labs via Sentry. No confidential information is sent. These reports help improve Dgraph. To opt-out, restart your instance with the --enable_sentry=false flag. For more info, see https://dgraph.io/docs/howto/#data-handling.
W0205 08:14:58.392818      17 run.go:573] --lru_mb is deprecated, use --cache_mb instead
I0205 08:14:58.603221      17 init.go:107] 

Dgraph version   : v20.11.1
Dgraph codename  : tchalla-1
Dgraph SHA-256   : cefdcc880c0607a92a1d8d3ba0beb015459ebe216e79fdad613eb0d00d09f134
Commit SHA-1     : 7153d13fe
Commit timestamp : 2021-01-28 15:59:35 +0530
Branch           : HEAD
Go version       : go1.15.5
jemalloc enabled : true

For Dgraph official documentation, visit https://dgraph.io/docs/.
For discussions about Dgraph     , visit https://discuss.dgraph.io.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2020 Dgraph Labs, Inc.


I0205 08:14:58.603259      17 run.go:696] x.Config: {PortOffset:0 QueryEdgeLimit:1000000 NormalizeNodeLimit:10000 MutationsNQuadLimit:1000000 PollInterval:1s GraphqlExtension:true GraphqlDebug:false GraphqlLambdaUrl:}
I0205 08:14:58.603294      17 run.go:697] x.WorkerConfig: {TmpDir:t ExportPath:export NumPendingProposals:256 Tracing:0.01 MyAddr:dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080 ZeroAddr:[dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080 dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local:5080 dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080] TLSClientConfig:<nil> TLSServerConfig:<nil> RaftId:0 WhiteListedIPRanges:[{Lower:192.168.0.0 Upper:192.168.255.255}] MaxRetries:-1 StrictMutations:false AclEnabled:false AbortOlderThan:5m0s SnapshotAfter:10000 ProposedGroupId:0 StartTime:2021-02-05 08:14:57.932525096 +0000 UTC m=+0.023188101 LudicrousMode:false LudicrousConcurrency:2000 EncryptionKey:**** LogRequest:0 HardSync:false}
I0205 08:14:58.603372      17 run.go:698] worker.Config: {PostingDir:p PostingDirCompression:1 PostingDirCompressionLevel:0 WALDir:w MutationsMode:0 AuthToken: PBlockCacheSize:1395864371 PIndexCacheSize:751619276 WalCache:0 HmacSecret:**** AccessJwtTtl:0s RefreshJwtTtl:0s CachePercentage:0,65,35,0 CacheMb:0}
I0205 08:14:58.606535      17 storage.go:132] Init Raft Storage with snap: 0, first: 1, last: 0
I0205 08:14:58.607205      17 server_state.go:76] Setting Posting Dir Compression Level: 0
I0205 08:14:58.607254      17 server_state.go:120] Opening postings BadgerDB with options: {Dir:p ValueDir:p SyncWrites:false NumVersionsToKeep:2147483647 ReadOnly:false Logger:0x2e0ddf8 Compression:1 InMemory:false MemTableSize:67108864 BaseTableSize:2097152 BaseLevelSize:10485760 LevelSizeMultiplier:10 TableSizeMultiplier:2 MaxLevels:7 ValueThreshold:1024 NumMemtables:5 BlockSize:4096 BloomFalsePositive:0.01 BlockCacheSize:1395864371 IndexCacheSize:751619276 NumLevelZeroTables:5 NumLevelZeroTablesStall:15 ValueLogFileSize:1073741823 ValueLogMaxEntries:1000000 NumCompactors:4 CompactL0OnClose:false ZSTDCompressionLevel:0 VerifyValueChecksum:false EncryptionKey:[] EncryptionKeyRotationDuration:240h0m0s BypassLockGuard:false ChecksumVerificationMode:0 DetectConflicts:false managedTxns:false maxBatchCount:0 maxBatchSize:0}
I0205 08:14:58.636215      17 log.go:34] All 0 tables opened in 0s
I0205 08:14:58.639563      17 log.go:34] Discard stats nextEmptySlot: 0
I0205 08:14:58.639627      17 log.go:34] Set nextTxnTs to 0
I0205 08:14:58.641394      17 groups.go:99] Current Raft Id: 0x0
E0205 08:14:58.641377      17 groups.go:1143] Error during SubscribeForUpdates for prefix "\x00\x00\vdgraph.cors\x00": Unable to find any servers for group: 1. closer err: <nil>
I0205 08:14:58.642055      17 worker.go:104] Worker listening at address: [::]:7080
I0205 08:14:58.644267      17 run.go:519] Bringing up GraphQL HTTP API at 0.0.0.0:8080/graphql
I0205 08:14:58.644303      17 run.go:520] Bringing up GraphQL HTTP admin API at 0.0.0.0:8080/admin
E0205 08:14:58.644276      17 groups.go:1143] Error during SubscribeForUpdates for prefix "\x00\x00\x15dgraph.graphql.schema\x00": Unable to find any servers for group: 1. closer err: <nil>
I0205 08:14:58.644386      17 run.go:552] gRPC server started.  Listening on port 9080
I0205 08:14:58.644412      17 run.go:553] HTTP server started.  Listening on port 8080
I0205 08:14:58.742122      17 pool.go:162] CONNECTING to dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:14:58.771448      17 groups.go:127] Connected to group zero. Assigned group: 1
I0205 08:14:58.771509      17 groups.go:129] Raft Id after connection to Zero: 0x1
I0205 08:14:58.771677      17 pool.go:162] CONNECTING to dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:14:58.772209      17 pool.go:162] CONNECTING to dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:14:58.773345      17 draft.go:230] Node ID: 0x1 with GroupID: 1
I0205 08:14:58.773493      17 node.go:152] Setting raft.Config to: &{ID:1 peers:[] learners:[] ElectionTick:20 HeartbeatTick:1 Storage:0xc00009a2d0 Applied:0 MaxSizePerMsg:262144 MaxCommittedSizePerReady:67108864 MaxUncommittedEntriesSize:0 MaxInflightMsgs:256 CheckQuorum:false PreVote:true ReadOnlyOption:0 Logger:0x2e0ddf8 DisableProposalForwarding:false}
I0205 08:14:58.774403      17 node.go:326] Group 1 found 0 entries
I0205 08:14:58.774971      17 draft.go:1712] New Node for group: 1
I0205 08:14:58.775214      17 log.go:34] 1 became follower at term 0
I0205 08:14:58.775411      17 log.go:34] newRaft 1 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
I0205 08:14:58.775520      17 log.go:34] 1 became follower at term 1
I0205 08:14:58.775636      17 draft.go:180] Operation started with id: opRollup
I0205 08:14:58.775978      17 draft.go:1084] Found Raft progress: 0
I0205 08:14:58.776272      17 node.go:189] Setting conf state to nodes:1 
I0205 08:14:58.776753      17 groups.go:807] Got address of a Zero leader: dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:14:58.776906      17 groups.go:821] Starting a new membership stream receive from dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080.
I0205 08:14:58.777215      17 log.go:34] 1 is starting a new election at term 1
I0205 08:14:58.777247      17 log.go:34] 1 became pre-candidate at term 1
I0205 08:14:58.777255      17 log.go:34] 1 received MsgPreVoteResp from 1 at term 1
I0205 08:14:58.777472      17 log.go:34] 1 became candidate at term 2
I0205 08:14:58.777481      17 log.go:34] 1 received MsgVoteResp from 1 at term 2
I0205 08:14:58.777494      17 log.go:34] 1 became leader at term 2
I0205 08:14:58.777639      17 log.go:34] raft.node: 1 elected leader 1 at term 2
I0205 08:14:58.782339      17 groups.go:838] Received first state update from Zero: counter:7 groups:<key:1 value:<members:<key:1 value:<id:1 group_id:1 addr:"dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080" > > > > zeros:<key:1 value:<id:1 addr:"dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080" leader:true > > zeros:<key:2 value:<id:2 addr:"dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local:5080" > > zeros:<key:3 value:<id:3 addr:"dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080" > > maxRaftId:1 cid:"de31e002-671e-4295-8aa4-69a9563db183" license:<maxNodes:18446744073709551615 expiryTs:1615104462 enabled:true > 
I0205 08:14:59.642779      17 pool.go:162] CONNECTING to dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080
I0205 08:14:59.776308      17 groups.go:893] Leader idx=0x1 of group=1 is connecting to Zero for txn updates
I0205 08:14:59.776421      17 groups.go:905] Got Zero leader: dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:14:59.790928      17 groups.go:487] Serving tablet for: dgraph.cors
I0205 08:14:59.791391      17 draft.go:180] Operation started with id: opIndexing
I0205 08:14:59.791441      17 index.go:889] Computing prefix index for attr dgraph.cors and tokenizers []
I0205 08:14:59.791448      17 index.go:907] Deleting index for attr dgraph.cors and tokenizers [exact]
I0205 08:14:59.791517      17 index.go:779] Deleting indexes for dgraph.cors
I0205 08:14:59.791529      17 log.go:34] DropPrefix called for [dgraph.cors dgraph.cors dgraph.cors dgraph.cors]
I0205 08:14:59.791570      17 log.go:34] Writes flushed. Stopping compactions now...
I0205 08:14:59.791663      17 draft.go:124] Operation completed with id: opRollup
I0205 08:14:59.794378      17 log.go:34] DropPrefix done
I0205 08:14:59.794398      17 log.go:34] Resuming writes
I0205 08:14:59.820387      17 log.go:34] Rebuilding index for predicate dgraph.cors (1/2): Sent data of size 0 B
I0205 08:14:59.831542      17 index.go:942] Rebuilding index for attr dgraph.cors and tokenizers [exact]
I0205 08:14:59.839604      17 groups.go:487] Serving tablet for: dgraph.type
I0205 08:14:59.854852      17 log.go:34] Rebuilding index for predicate dgraph.cors (1/2): Sent data of size 0 B
I0205 08:14:59.862979      17 mutation.go:188] Done schema update predicate:"dgraph.cors" value_type:STRING directive:INDEX tokenizer:"exact" list:true upsert:true 
I0205 08:14:59.863333      17 draft.go:124] Operation completed with id: opIndexing
E0205 08:14:59.863168      17 draft.go:714] Applying proposal. Error: operation opIndexing is already running. Proposal: {"group_id:1 start_ts:2 schema:<predicate:\"dgraph.type\" value_type:STRING directive:INDEX tokenizer:\"exact\" list:true > " [] "<nil>" "" "<nil>" "<nil>" '\b' '\x00' "<nil>" {} "" '\x00'}.
E0205 08:14:59.863546      17 groups.go:260] Error while proposing initial schema: operation opIndexing is already running
I0205 08:14:59.964163      17 draft.go:180] Operation started with id: opIndexing
I0205 08:14:59.964249      17 index.go:889] Computing prefix index for attr dgraph.type and tokenizers []
I0205 08:14:59.964260      17 index.go:907] Deleting index for attr dgraph.type and tokenizers [exact]
I0205 08:14:59.964327      17 index.go:779] Deleting indexes for dgraph.type
I0205 08:14:59.964339      17 log.go:34] DropPrefix called for [dgraph.type dgraph.type dgraph.type dgraph.type]
I0205 08:14:59.964360      17 log.go:34] Writes flushed. Stopping compactions now...
I0205 08:14:59.989912      17 log.go:34] DropPrefix done
I0205 08:14:59.989972      17 log.go:34] Resuming writes
I0205 08:15:00.096132      17 log.go:34] Rebuilding index for predicate dgraph.type (1/2): Sent data of size 0 B
I0205 08:15:00.106508      17 index.go:942] Rebuilding index for attr dgraph.type and tokenizers [exact]
I0205 08:15:00.114464      17 groups.go:487] Serving tablet for: dgraph.drop.op
I0205 08:15:00.253789      17 log.go:34] Rebuilding index for predicate dgraph.type (1/2): Sent data of size 0 B
I0205 08:15:00.267557      17 mutation.go:188] Done schema update predicate:"dgraph.type" value_type:STRING directive:INDEX tokenizer:"exact" list:true 
I0205 08:15:00.267994      17 draft.go:124] Operation completed with id: opIndexing
E0205 08:15:00.267988      17 draft.go:714] Applying proposal. Error: operation opIndexing is already running. Proposal: {"group_id:1 start_ts:3 schema:<predicate:\"dgraph.drop.op\" value_type:STRING > " [] "<nil>" "" "<nil>" "<nil>" '\r' '\x00' "<nil>" {} "" '\x00'}.
E0205 08:15:00.268221      17 groups.go:260] Error while proposing initial schema: operation opIndexing is already running
I0205 08:15:00.369159      17 draft.go:180] Operation started with id: opIndexing
I0205 08:15:00.369241      17 index.go:779] Deleting indexes for dgraph.drop.op
I0205 08:15:00.369440      17 draft.go:124] Operation completed with id: opIndexing
I0205 08:15:00.377470      17 groups.go:487] Serving tablet for: dgraph.graphql.schema
I0205 08:15:00.377640      17 draft.go:180] Operation started with id: opIndexing
I0205 08:15:00.377722      17 index.go:779] Deleting indexes for dgraph.graphql.schema
I0205 08:15:00.377941      17 draft.go:124] Operation completed with id: opIndexing
I0205 08:15:00.385259      17 groups.go:487] Serving tablet for: dgraph.graphql.xid
I0205 08:15:00.385738      17 draft.go:180] Operation started with id: opIndexing
I0205 08:15:00.385863      17 index.go:889] Computing prefix index for attr dgraph.graphql.xid and tokenizers []
I0205 08:15:00.385911      17 index.go:907] Deleting index for attr dgraph.graphql.xid and tokenizers [exact]
I0205 08:15:00.385999      17 index.go:779] Deleting indexes for dgraph.graphql.xid
I0205 08:15:00.386012      17 log.go:34] DropPrefix called for [dgraph.graphql.xid dgraph.graphql.xid dgraph.graphql.xid dgraph.graphql.xid]
I0205 08:15:00.386033      17 log.go:34] Writes flushed. Stopping compactions now...
I0205 08:15:00.462746      17 log.go:34] DropPrefix done
I0205 08:15:00.462819      17 log.go:34] Resuming writes
I0205 08:15:00.462938      17 index.go:942] Rebuilding index for attr dgraph.graphql.xid and tokenizers [exact]
I0205 08:15:00.470640      17 groups.go:487] Serving tablet for: dgraph.graphql.schema_history
I0205 08:15:00.625717      17 log.go:34] Rebuilding index for predicate dgraph.graphql.xid (1/2): Sent data of size 0 B
I0205 08:15:00.644274      17 mutation.go:188] Done schema update predicate:"dgraph.graphql.xid" value_type:STRING directive:INDEX tokenizer:"exact" upsert:true 
E0205 08:15:00.644425      17 draft.go:714] Applying proposal. Error: operation opIndexing is already running. Proposal: {"group_id:1 start_ts:6 schema:<predicate:\"dgraph.graphql.schema_history\" value_type:STRING > " [] "<nil>" "" "<nil>" "<nil>" '\x17' '\x00' "<nil>" {} "" '\x00'}.
I0205 08:15:00.644539      17 draft.go:124] Operation completed with id: opIndexing
E0205 08:15:00.644573      17 groups.go:260] Error while proposing initial schema: operation opIndexing is already running
I0205 08:15:00.745310      17 draft.go:180] Operation started with id: opIndexing
I0205 08:15:00.745421      17 index.go:779] Deleting indexes for dgraph.graphql.schema_history
I0205 08:15:00.745542      17 draft.go:124] Operation completed with id: opIndexing
I0205 08:15:00.755122      17 groups.go:487] Serving tablet for: dgraph.graphql.schema_created_at
I0205 08:15:00.755679      17 draft.go:180] Operation started with id: opIndexing
I0205 08:15:00.755742      17 index.go:779] Deleting indexes for dgraph.graphql.schema_created_at
I0205 08:15:00.755940      17 draft.go:124] Operation completed with id: opIndexing
I0205 08:15:00.762676      17 groups.go:487] Serving tablet for: dgraph.graphql.p_query
I0205 08:15:00.763174      17 draft.go:180] Operation started with id: opIndexing
I0205 08:15:00.763318      17 index.go:779] Deleting indexes for dgraph.graphql.p_query
I0205 08:15:00.763566      17 draft.go:124] Operation completed with id: opIndexing
I0205 08:15:00.770488      17 groups.go:487] Serving tablet for: dgraph.graphql.p_sha256hash
I0205 08:15:00.770682      17 draft.go:180] Operation started with id: opIndexing
I0205 08:15:00.770782      17 index.go:889] Computing prefix index for attr dgraph.graphql.p_sha256hash and tokenizers []
I0205 08:15:00.770796      17 index.go:907] Deleting index for attr dgraph.graphql.p_sha256hash and tokenizers [exact]
I0205 08:15:00.770850      17 index.go:779] Deleting indexes for dgraph.graphql.p_sha256hash
I0205 08:15:00.770862      17 log.go:34] DropPrefix called for [edgraph.graphql.p_sha256hash edgraph.graphql.p_sha256hash edgraph.graphql.p_sha256hash edgraph.graphql.p_sha256hash]
I0205 08:15:00.770887      17 log.go:34] Writes flushed. Stopping compactions now...
I0205 08:15:00.821329      17 log.go:34] DropPrefix done
I0205 08:15:00.821405      17 log.go:34] Resuming writes
I0205 08:15:00.821684      17 index.go:942] Rebuilding index for attr dgraph.graphql.p_sha256hash and tokenizers [exact]
I0205 08:15:00.826367      17 groups.go:159] Server is ready
I0205 08:15:00.826541      17 access_ee.go:390] ResetAcl closed
I0205 08:15:00.826703      17 access_ee.go:311] RefreshAcls closed
I0205 08:15:00.840753      17 graphql.go:41] ResetCors closed
I0205 08:15:00.842584      17 run.go:813] Updating cors from subscription.
I0205 08:15:00.842616      17 run.go:837] Updating cors origins: [*]
I0205 08:15:00.876431      17 log.go:34] Rebuilding index for predicate dgraph.graphql.p_sha256hash (1/2): Sent data of size 0 B
I0205 08:15:00.893284      17 mutation.go:188] Done schema update predicate:"dgraph.graphql.p_sha256hash" value_type:STRING directive:INDEX tokenizer:"exact" 
I0205 08:15:00.893402      17 draft.go:124] Operation completed with id: opIndexing
I0205 08:15:03.646354      17 admin.go:697] No GraphQL schema in Dgraph; serving empty GraphQL API
I0205 08:15:09.863600      17 draft.go:180] Operation started with id: opRollup
I0205 08:15:58.776549      17 draft.go:606] Creating snapshot at Index: 45, ReadTs: 15
I0205 08:19:07.595885      17 pool.go:162] CONNECTING to dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080
I0205 08:19:07.595917      17 node.go:586] Trying to add 0x2 to cluster. Addr: dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080
I0205 08:19:07.595923      17 node.go:587] Current confstate at 0x1: nodes:1 
I0205 08:19:07.596398      17 node.go:189] Setting conf state to nodes:1 nodes:2 
I0205 08:19:07.596475      17 node.go:756] [0x2] Done joining cluster with err: <nil>
W0205 08:19:07.599286      17 pool.go:267] Connection lost with dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local: no such host"
W0205 08:19:08.676254      17 node.go:420] Unable to send message to peer: 0x2. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local: no such host"
W0205 08:19:18.776099      17 node.go:420] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0205 08:19:28.876267      17 node.go:420] Unable to send message to peer: 0x2. Error: Unhealthy connection
E0205 08:19:36.603184      17 groups.go:1035] While proposing delta with MaxAssigned: 18 and num txns: 0. Error=Server overloaded with pending proposals. Please retry later. Retrying...
W0205 08:19:38.976199      17 node.go:420] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0205 08:19:49.076213      17 node.go:420] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0205 08:19:59.176216      17 node.go:420] Unable to send message to peer: 0x2. Error: Unhealthy connection
E0205 08:20:04.603922      17 groups.go:1035] While proposing delta with MaxAssigned: 18 and num txns: 0. Error=Server overloaded with pending proposals. Please retry later. Retrying...
W0205 08:20:09.276313      17 node.go:420] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0205 08:20:19.376156      17 node.go:420] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0205 08:20:29.376171      17 node.go:420] Unable to send message to peer: 0x2. Error: Unhealthy connection
E0205 08:20:32.604804      17 groups.go:1035] While proposing delta with MaxAssigned: 18 and num txns: 0. Error=Server overloaded with pending proposals. Please retry later. Retrying...
W0205 08:20:39.476150      17 node.go:420] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0205 08:20:49.576104      17 node.go:420] Unable to send message to peer: 0x2. Error: Unhealthy connection

Pod logs

alpha-1
++ hostname -f
++ awk '{gsub(/\.$/,""); print $0}'
+ dgraph alpha --my=dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080 --zero dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080,dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local:5080,dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080 --whitelist 192.168.0.0:192.168.255.255
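The `++` lines above are the chart's startup wrapper computing `--my` from the pod's FQDN; the awk expression simply strips the trailing dot that `hostname -f` can return. A standalone illustration of that step (hostname taken from the log above):

```shell
# Strip the trailing dot from a fully-qualified hostname,
# exactly as the startup wrapper's awk expression does.
echo "dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local." \
  | awk '{gsub(/\.$/,""); print $0}'
# prints the same name without the trailing dot
```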
[Decoder]: Using assembly version of decoder
Page Size: 4096
[Sentry] 2021/02/05 08:19:06 Integration installed: ContextifyFrames
[Sentry] 2021/02/05 08:19:06 Integration installed: Environment
[Sentry] 2021/02/05 08:19:06 Integration installed: Modules
[Sentry] 2021/02/05 08:19:06 Integration installed: IgnoreErrors
[Decoder]: Using assembly version of decoder
Page Size: 4096
[Sentry] 2021/02/05 08:19:06 Integration installed: ContextifyFrames
[Sentry] 2021/02/05 08:19:06 Integration installed: Environment
[Sentry] 2021/02/05 08:19:06 Integration installed: Modules
[Sentry] 2021/02/05 08:19:06 Integration installed: IgnoreErrors
I0205 08:19:07.200584      17 sentry_integration.go:48] This instance of Dgraph will send anonymous reports of panics back to Dgraph Labs via Sentry. No confidential information is sent. These reports help improve Dgraph. To opt-out, restart your instance with the --enable_sentry=false flag. For more info, see https://dgraph.io/docs/howto/#data-handling.
W0205 08:19:07.201166      17 run.go:573] --lru_mb is deprecated, use --cache_mb instead
I0205 08:19:07.390165      17 init.go:107] 

Dgraph version   : v20.11.1
Dgraph codename  : tchalla-1
Dgraph SHA-256   : cefdcc880c0607a92a1d8d3ba0beb015459ebe216e79fdad613eb0d00d09f134
Commit SHA-1     : 7153d13fe
Commit timestamp : 2021-01-28 15:59:35 +0530
Branch           : HEAD
Go version       : go1.15.5
jemalloc enabled : true

For Dgraph official documentation, visit https://dgraph.io/docs/.
For discussions about Dgraph     , visit https://discuss.dgraph.io.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2020 Dgraph Labs, Inc.


I0205 08:19:07.390212      17 run.go:696] x.Config: {PortOffset:0 QueryEdgeLimit:1000000 NormalizeNodeLimit:10000 MutationsNQuadLimit:1000000 PollInterval:1s GraphqlExtension:true GraphqlDebug:false GraphqlLambdaUrl:}
I0205 08:19:07.390267      17 run.go:697] x.WorkerConfig: {TmpDir:t ExportPath:export NumPendingProposals:256 Tracing:0.01 MyAddr:dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080 ZeroAddr:[dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080 dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local:5080 dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080] TLSClientConfig:<nil> TLSServerConfig:<nil> RaftId:0 WhiteListedIPRanges:[{Lower:192.168.0.0 Upper:192.168.255.255}] MaxRetries:-1 StrictMutations:false AclEnabled:false AbortOlderThan:5m0s SnapshotAfter:10000 ProposedGroupId:0 StartTime:2021-02-05 08:19:06.772898192 +0000 UTC m=+0.018507592 LudicrousMode:false LudicrousConcurrency:2000 EncryptionKey:**** LogRequest:0 HardSync:false}
I0205 08:19:07.390330      17 run.go:698] worker.Config: {PostingDir:p PostingDirCompression:1 PostingDirCompressionLevel:0 WALDir:w MutationsMode:0 AuthToken: PBlockCacheSize:1395864371 PIndexCacheSize:751619276 WalCache:0 HmacSecret:**** AccessJwtTtl:0s RefreshJwtTtl:0s CachePercentage:0,65,35,0 CacheMb:0}
I0205 08:19:07.393678      17 storage.go:132] Init Raft Storage with snap: 0, first: 1, last: 0
I0205 08:19:07.394310      17 server_state.go:76] Setting Posting Dir Compression Level: 0
I0205 08:19:07.394350      17 server_state.go:120] Opening postings BadgerDB with options: {Dir:p ValueDir:p SyncWrites:false NumVersionsToKeep:2147483647 ReadOnly:false Logger:0x2e0ddf8 Compression:1 InMemory:false MemTableSize:67108864 BaseTableSize:2097152 BaseLevelSize:10485760 LevelSizeMultiplier:10 TableSizeMultiplier:2 MaxLevels:7 ValueThreshold:1024 NumMemtables:5 BlockSize:4096 BloomFalsePositive:0.01 BlockCacheSize:1395864371 IndexCacheSize:751619276 NumLevelZeroTables:5 NumLevelZeroTablesStall:15 ValueLogFileSize:1073741823 ValueLogMaxEntries:1000000 NumCompactors:4 CompactL0OnClose:false ZSTDCompressionLevel:0 VerifyValueChecksum:false EncryptionKey:[] EncryptionKeyRotationDuration:240h0m0s BypassLockGuard:false ChecksumVerificationMode:0 DetectConflicts:false managedTxns:false maxBatchCount:0 maxBatchSize:0}
I0205 08:19:07.422463      17 log.go:34] All 0 tables opened in 0s
I0205 08:19:07.425788      17 log.go:34] Discard stats nextEmptySlot: 0
I0205 08:19:07.425842      17 log.go:34] Set nextTxnTs to 0
I0205 08:19:07.427798      17 groups.go:99] Current Raft Id: 0x0
E0205 08:19:07.428109      17 groups.go:1143] Error during SubscribeForUpdates for prefix "\x00\x00\vdgraph.cors\x00": Unable to find any servers for group: 1. closer err: <nil>
I0205 08:19:07.428929      17 worker.go:104] Worker listening at address: [::]:7080
I0205 08:19:07.430300      17 run.go:519] Bringing up GraphQL HTTP API at 0.0.0.0:8080/graphql
I0205 08:19:07.430381      17 run.go:520] Bringing up GraphQL HTTP admin API at 0.0.0.0:8080/admin
I0205 08:19:07.430447      17 run.go:552] gRPC server started.  Listening on port 9080
I0205 08:19:07.430481      17 run.go:553] HTTP server started.  Listening on port 8080
E0205 08:19:07.430579      17 groups.go:1143] Error during SubscribeForUpdates for prefix "\x00\x00\x15dgraph.graphql.schema\x00": Unable to find any servers for group: 1. closer err: <nil>
I0205 08:19:07.528384      17 pool.go:162] CONNECTING to dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:19:07.564137      17 groups.go:127] Connected to group zero. Assigned group: 1
I0205 08:19:07.564165      17 groups.go:129] Raft Id after connection to Zero: 0x2
I0205 08:19:07.564225      17 pool.go:162] CONNECTING to dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080
I0205 08:19:07.564269      17 pool.go:162] CONNECTING to dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:19:07.564316      17 pool.go:162] CONNECTING to dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:19:07.564351      17 draft.go:230] Node ID: 0x2 with GroupID: 1
I0205 08:19:07.565250      17 node.go:152] Setting raft.Config to: &{ID:2 peers:[] learners:[] ElectionTick:20 HeartbeatTick:1 Storage:0xc000090690 Applied:0 MaxSizePerMsg:262144 MaxCommittedSizePerReady:67108864 MaxUncommittedEntriesSize:0 MaxInflightMsgs:256 CheckQuorum:false PreVote:true ReadOnlyOption:0 Logger:0x2e0ddf8 DisableProposalForwarding:false}
I0205 08:19:07.566986      17 node.go:326] Group 1 found 0 entries
I0205 08:19:07.567286      17 draft.go:1650] Calling IsPeer
I0205 08:19:07.593339      17 draft.go:1655] Done with IsPeer call
I0205 08:19:07.593373      17 draft.go:1712] New Node for group: 1
I0205 08:19:07.593385      17 draft.go:1719] Trying to join peers.
I0205 08:19:07.593396      17 draft.go:1633] Calling JoinCluster via leader: dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080
I0205 08:19:07.598730      17 draft.go:1637] Done with JoinCluster call
I0205 08:19:07.598791      17 log.go:34] 2 became follower at term 0
I0205 08:19:07.598804      17 log.go:34] newRaft 2 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
I0205 08:19:07.598812      17 log.go:34] 2 became follower at term 1
I0205 08:19:07.598872      17 draft.go:180] Operation started with id: opRollup
I0205 08:19:07.598988      17 draft.go:1084] Found Raft progress: 0
I0205 08:19:07.599131      17 groups.go:807] Got address of a Zero leader: dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080
I0205 08:19:07.599272      17 groups.go:821] Starting a new membership stream receive from dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080.
I0205 08:19:07.607478      17 groups.go:838] Received first state update from Zero: counter:22 groups:<key:1 value:<members:<key:1 value:<id:1 group_id:1 addr:"dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:7080" leader:true last_update:1612512898 > > members:<key:2 value:<id:2 group_id:1 addr:"dgraph-alpha-1.dgraph-alpha-headless.production-services.svc.cluster.local:7080" > > tablets:<key:"dgraph.cors" value:<group_id:1 predicate:"dgraph.cors" > > tablets:<key:"dgraph.drop.op" value:<group_id:1 predicate:"dgraph.drop.op" > > tablets:<key:"dgraph.graphql.p_query" value:<group_id:1 predicate:"dgraph.graphql.p_query" > > tablets:<key:"dgraph.graphql.p_sha256hash" value:<group_id:1 predicate:"dgraph.graphql.p_sha256hash" > > tablets:<key:"dgraph.graphql.schema" value:<group_id:1 predicate:"dgraph.graphql.schema" > > tablets:<key:"dgraph.graphql.schema_created_at" value:<group_id:1 predicate:"dgraph.graphql.schema_created_at" > > tablets:<key:"dgraph.graphql.schema_history" value:<group_id:1 predicate:"dgraph.graphql.schema_history" > > tablets:<key:"dgraph.graphql.xid" value:<group_id:1 predicate:"dgraph.graphql.xid" > > tablets:<key:"dgraph.type" value:<group_id:1 predicate:"dgraph.type" > > snapshot_ts:15 checksum:12071872251239819759 > > zeros:<key:1 value:<id:1 addr:"dgraph-zero-0.dgraph-zero-headless.production-services.svc.cluster.local:5080" leader:true > > zeros:<key:2 value:<id:2 addr:"dgraph-zero-1.dgraph-zero-headless.production-services.svc.cluster.local:5080" > > zeros:<key:3 value:<id:3 addr:"dgraph-zero-2.dgraph-zero-headless.production-services.svc.cluster.local:5080" > > maxLeaseId:10000 maxTxnTs:10000 maxRaftId:2 cid:"de31e002-671e-4295-8aa4-69a9563db183" license:<maxNodes:18446744073709551615 expiryTs:1615104462 enabled:true > 
I0205 08:19:12.430822      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:19:17.431183      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:19:22.431744      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:19:27.432083      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:19:32.432319      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:19:37.432577      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:19:42.432865      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:19:47.433971      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:19:52.434203      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:19:57.434426      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:20:02.434687      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:20:07.434941      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:20:12.435155      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:20:17.435435      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:20:22.435657      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:20:27.435994      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:20:32.436485      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:20:37.436895      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:20:42.438007      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:20:47.438133      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:20:52.438269      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.
I0205 08:20:57.438473      17 admin.go:686] Error reading GraphQL schema: Please retry again, server is not ready to accept requests.

Looks like you have configured the nodes incorrectly. @joaquin any ideas on what’s wrong with the config?

I have been looking at this one. I don't have access to Yandex Cloud specifically, but we have tried numerous configurations on other cloud providers' Kubernetes platforms (GKE, AKS, EKS) without any issues like this.

I can try out similar affinity and tolerations on EKS and see if I can reproduce the issue.

I am getting this issue only on specific versions of Dgraph. If I change the version to v20.07.0, for example, the cluster bootstraps fine even with the same version of the Helm chart. So I think the problem is in a specific Dgraph version.

Hi @preved911, can you double check your configuration? This error
lookup dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local: no such host"
indicates there is a configuration issue.

Can you try running the following curl command from one of your alpha pods?

curl dgraph-alpha-0.dgraph-alpha-headless.production-services.svc.cluster.local:8080/state

@preved911

Problem: I was able to reproduce this using similar values. I noticed that in the values.yaml, the value for alpha.replicaCount is set to 2.

This value won’t work: due to the Raft consensus algorithm, the replica count has to be an odd number, and more than half of the replicas must come up for availability.

Solution: I tested this out with alpha.replicaCount=3, and Dgraph starts up successfully with availability.
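For reference, a minimal sketch of the suggested values.yaml change (the zero count is an assumption, matching the three zeros visible in the logs):

```yaml
# values.yaml (sketch): odd replica counts so Raft can form a majority.
alpha:
  replicaCount: 3   # was 2; must be odd for a Raft quorum
zero:
  replicaCount: 3   # three zeros, as seen in the cluster state above
```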

Side question: I was curious about the helm command you used to deploy Dgraph. Usually a release name is required, e.g. helm install my-release dgraph/dgraph, but looking at the logs and screenshot, the Helm release name is not included in the headless-service domain name or pod hostname.

No, that does not resolve the problem. I tried with alpha.replicaCount=3 and the effect is the same: the second replica stays in a not-ready state.
As far as I know, the StatefulSet will not create the third pod until the second one becomes ready.
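For background, that ordered startup is the StatefulSet default (podManagementPolicy: OrderedReady): pod N+1 is not created until pod N reports Ready, so a readiness probe that never passes on the second pod blocks the third forever. A sketch of the underlying Kubernetes field (this is the raw StatefulSet spec, not a documented option of this chart version):

```yaml
# StatefulSet spec fragment (sketch): create all pods without waiting
# for earlier ordinals to become Ready.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dgraph-alpha
spec:
  podManagementPolicy: Parallel   # default is OrderedReady
  serviceName: dgraph-alpha-headless
  replicas: 3
```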

I set an additional fullnameOverride: dgraph value to override the release name.
Disabling it has no effect either.

How else can I check why the Dgraph replica does not become ready?

@preved911 Could you try it without the liveness/readiness probes on alpha?

I have also encountered an issue that happens only with v20.11.0 and v20.11.1, and found that disabling the readiness probe works around it for the moment.

As an alternative, you can also use the path /probe/graphql for the health check.

That helped. The cluster initialized successfully without the probes. I then re-enabled the probes; all pods passed the checks and went into the ready state.

I opened an issue against the Dgraph service itself in addition to this Helm chart issue, as this happens specifically on v20.11.0 and v20.11.1 and not on earlier versions. We are looking into it.

For alternative health probes, you can do this as a workaround:

alpha:
  livenessProbe:
    path: /probe/graphql
  readinessProbe:
    path: /probe/graphql