Bulk loader in HA Kubernetes deployment

Matthias_Baetens · February 19, 2021, 7:20pm

I Want to Do

I want to load a large amount of data into Dgraph using the Bulk Loader.

What I Did

I have a 3-node Kubernetes cluster running the HA K8s deployment with the init containers on the Alphas (so the Alpha pods are still pending). I launched a Pod inside the cluster with a PV attached that contains the data and tried to run the following command:

dgraph bulk -f filename.rdf.gz -s schema/dgraph.schema --reduce_shards=1 --zero=dgraph-zero.default.svc.cluster.local:5080

I get back the following errors: Error communicating with dgraph zero, retrying: rpc error: code = Unknown desc = Assigning IDs is only allowed on leader.

I tried scaling back the zero statefulset to 1 node, to no avail. The logs on that side show:
Got error: Assigning IDs is only allowed on leader. while leasing timestamps: val:1

Dgraph Metadata

Dgraph version   : v20.11.1
Dgraph codename  : tchalla-1
Dgraph SHA-256   : cefdcc880c0607a92a1d8d3ba0beb015459ebe216e79fdad613eb0d00d09f134
Commit SHA-1     : 7153d13fe
Commit timestamp : 2021-01-28 15:59:35 +0530
Branch           : HEAD
Go version       : go1.15.5
jemalloc enabled : true

MichelDiz · February 19, 2021, 8:18pm

Can you share the Zero logs (from all of them if you have multiple)?

Matthias_Baetens · February 20, 2021, 9:11am

Sure. I reduced to just 1 Zero, reasoning that the bulk loader, going through the Service in front of the Pods, might be hitting a Zero other than the leader, and that having just one would solve that problem.

Some of the logs:

2021-02-20 09:08:21.393 GMT "Unable to send message to peer: 0x3. Error: Unhealthy connection"
2021-02-20 09:08:22.392 GMT "[0x1] Read index context timed out"
2021-02-20 09:08:24.393 GMT "[0x1] Read index context timed out"
2021-02-20 09:08:25.393 GMT "Unable to send message to peer: 0x2. Error: Unhealthy connection"
2021-02-20 09:10:04.608 GMT "Got error: Assigning IDs is only allowed on leader. while leasing timestamps: val:1"
2021-02-20 09:10:05.400 GMT "Unable to send message to peer: 0x2. Error: Unhealthy connection"

MichelDiz · February 20, 2021, 1:40pm

Don’t scale it down that way. You have to remove the Zeros from the quorum, or start with 1 Zero and then add more nodes.

Use https://dgraph.io/docs/deploy/dgraph-zero/#endpoints to remove zeros.
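For readers following along, here is a minimal sketch of how a call to Zero's /removeNode endpoint (described on the page linked above) could be put together. It assumes Zero's HTTP port is the default 6080 and that it is reachable as localhost:6080, for example through a port-forward; the helper names are hypothetical. The unreachable peers 0x2 and 0x3 in the logs above would presumably correspond to ids 2 and 3, but check /state before removing anything.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def remove_node_url(zero_http: str, node_id: int, group: int = 0) -> str:
    """Build the /removeNode URL for a Zero HTTP address.

    group=0 targets Zero members; a non-zero group targets an Alpha group.
    """
    query = urlencode({"id": node_id, "group": group})
    return f"http://{zero_http}/removeNode?{query}"


def remove_node(zero_http: str, node_id: int, group: int = 0) -> str:
    """Send the request to Zero and return the response body as text."""
    with urlopen(remove_node_url(zero_http, node_id, group), timeout=10) as resp:
        return resp.read().decode()


# Assumption: Zero's HTTP port is reachable on localhost:6080.
# Only the URLs are printed here; nothing is sent to the cluster.
for dead_id in (2, 3):
    print(remove_node_url("localhost:6080", dead_id))
```

Calling remove_node instead of remove_node_url would actually send the request, so it is worth printing and checking the URLs first.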

Matthias_Baetens · February 21, 2021, 11:05am

Thanks Michel - that’s a very helpful page. I queried the /state endpoint and found out which Zero is the leader that way. Pointing the bulk loader at that Pod did the trick. Loading now - it looks promising.
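A rough sketch of the leader lookup described above, in case it helps someone else. It assumes the /state response contains a "zeros" map whose entries carry "addr" and "leader" fields, and that Zero's HTTP port is the default 6080; the function names and the sample addresses are made up for illustration and are not taken from this cluster.

```python
import json
from urllib.request import urlopen


def fetch_state(zero_http: str) -> dict:
    """Fetch and parse the /state document from a Zero HTTP address."""
    with urlopen(f"http://{zero_http}/state", timeout=10) as resp:
        return json.load(resp)


def find_zero_leader(state: dict):
    """Return the address of the Zero marked as leader, or None if there is none."""
    for member in state.get("zeros", {}).values():
        if member.get("leader"):
            return member.get("addr")
    return None


# Hypothetical, trimmed-down /state payload for illustration only.
sample_state = {
    "zeros": {
        "1": {"id": "1", "addr": "zero-0.example:5080"},
        "2": {"id": "2", "addr": "zero-1.example:5080", "leader": True},
        "3": {"id": "3", "addr": "zero-2.example:5080"},
    }
}

print(find_zero_leader(sample_state))
```

Against a live cluster, find_zero_leader(fetch_state("localhost:6080")) would give the address to pass to the bulk loader's --zero flag, assuming that address is resolvable from the Pod running the loader.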
