Bulk Load data into Replicated Kubernetes Cluster

Hello,

I have a Kubernetes cluster consisting of 1 master and 3 nodes, and I would like to do an initial bulk load of my data.

The documentation indicates that to do bulk load, only the zero server needs to be running.

The documentation provides instructions on how to start the zero, the 3 servers and ratel with a single command (using dgraph-multi.yaml).

Can anybody share what commands to run in order to start only the zero server, then load the data, and then how to start the 3 servers?

I would also appreciate any help with where to copy the RDF and schema files before running the bulk load command.

Thanks in advance

To run zero:
dgraph zero

For server:
dgraph server --lru_mb 2048 --zero localhost:5080

For reference you can visit:

https://docs.dgraph.io/get-started#from-installed-binary

I meant the commands to run the components in the kubernetes cluster.

The command kubectl create -f https://raw.githubusercontent.com/dgraph-io/dgraph/master/contrib/config/kubernetes/dgraph-multi.yaml starts all the components at once, but I need to start the zero server first, then run the bulk data load, and only then start the servers and ratel.

Thanks

I have split the dgraph-multi.yaml file into 2 files. The first creates the zero server (dgraph-zero.yaml) and the second creates the servers and ratel (dgraph-server-ratel.yaml).
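The two-stage startup described here could be scripted roughly as follows. This is only a sketch: it assumes the split files are named dgraph-zero.yaml and dgraph-server-ratel.yaml and sit in the current directory, and that the StatefulSets keep the names dgraph-zero and dgraph-server. The STAGE and DRY_RUN variables and the function names are made up for illustration. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: bring the cluster up in two stages using the split manifests.
# Prints the commands by default; set DRY_RUN=0 to actually run them.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"
STAGE="${STAGE:-zero}"   # hypothetical switch: "zero" or "servers"

# Echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Stage 1: zero only, the one component the bulk loader needs.
start_zero() {
  run kubectl create -f ./dgraph-zero.yaml
  run kubectl rollout status statefulset/dgraph-zero
}

# Stage 2: servers and ratel, once the bulk load has finished.
start_servers_and_ratel() {
  run kubectl create -f ./dgraph-server-ratel.yaml
  run kubectl rollout status statefulset/dgraph-server
}

case "$STAGE" in
  zero)    start_zero ;;
  servers) start_servers_and_ratel ;;
  *)       echo "STAGE must be 'zero' or 'servers'" >&2; exit 1 ;;
esac
```

Run it once with STAGE=zero, do the bulk load, then run it again with STAGE=servers.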

After running kubectl create -f ./dgraph-zero.yaml to create the zero container, I copied the data to the zero container using the following commands:

kubectl cp ./data.rdf.gz dgraph-zero-0:/dgraph
kubectl cp ./schema.rdf.gz dgraph-zero-0:/dgraph

Then I ran the dgraph bulk command:
kubectl exec dgraph-zero-0 -- dgraph bulk -r data.rdf.gz -s schema.rdf.gz --map_shards 1 --reduce_shards 1 --zero localhost:5080
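For reference, the copy and bulk load steps above can be collected into one small script. This is a sketch under a few assumptions: the pod is called dgraph-zero-0, the volume is mounted at /dgraph as in the manifest, and the input files are in the current directory. The DRY_RUN, ZERO_POD and DATA_DIR variables are hypothetical conveniences. Note that the separator before the container command is two plain hyphens.

```shell
#!/usr/bin/env bash
# Sketch: copy the input files into the zero pod and run the bulk loader there.
# Prints the commands by default; set DRY_RUN=0 to actually run them.
set -euo pipefail

ZERO_POD="${ZERO_POD:-dgraph-zero-0}"   # pod created by the dgraph-zero StatefulSet
DATA_DIR="${DATA_DIR:-/dgraph}"         # mountPath of the datadir volume
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

bulk_load_on_zero() {
  # Copy the data and schema files into the zero pod's volume.
  run kubectl cp ./data.rdf.gz "${ZERO_POD}:${DATA_DIR}/data.rdf.gz"
  run kubectl cp ./schema.rdf.gz "${ZERO_POD}:${DATA_DIR}/schema.rdf.gz"
  # "--" separates kubectl's own flags from the command run in the container.
  run kubectl exec "${ZERO_POD}" -- dgraph bulk \
    -r "${DATA_DIR}/data.rdf.gz" -s "${DATA_DIR}/schema.rdf.gz" \
    --map_shards 1 --reduce_shards 1 --zero localhost:5080
}

bulk_load_on_zero
```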

After this step, I can see that the out folder is created in the container for the zero server.

At this point I assume I can create the containers for the servers and ratel, but first I presumably have to copy the data from the zero server to the persistent volumes that the server containers will use.

I need help with instructions on where to copy the data from the zero server.
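One possible approach, offered only as an untested sketch, is to pull the generated directory out of the zero pod and push a copy into each server pod. The assumptions here are significant: that the bulk loader wrote the single reduce shard to out/0/p under /dgraph, that each server reads its data from a p directory under /dgraph, that the server pods are named dgraph-server-0 to dgraph-server-2, and that those pods already exist without the server process having created its own p directory yet. How to arrange that last condition is not covered in this thread. The variable and function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: hand the bulk loader output from the zero pod to every server pod.
# Prints the commands by default; set DRY_RUN=0 to actually run them.
set -euo pipefail

ZERO_POD="${ZERO_POD:-dgraph-zero-0}"
SERVER_PODS="${SERVER_PODS:-dgraph-server-0 dgraph-server-1 dgraph-server-2}"
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

distribute_bulk_output() {
  # Assumed output location for --reduce_shards 1: out/0/p.
  run kubectl cp "${ZERO_POD}:/dgraph/out/0/p" ./p
  # With --replicas 3 all servers form one group, so each gets the same copy.
  # If /dgraph/p already exists in a pod, this would nest the copy inside it.
  for pod in $SERVER_PODS; do
    run kubectl cp ./p "${pod}:/dgraph/p"
  done
}

distribute_bulk_output
```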

The contents of the 2 YAML files I am using are as follows:

dgraph-zero.yaml

apiVersion: v1
kind: Service
metadata:
  name: dgraph-zero-public
  labels:
    app: dgraph-zero
spec:
  type: LoadBalancer
  ports:
  - port: 5080
    targetPort: 5080
    name: zero-grpc
  - port: 6080
    targetPort: 6080
    name: zero-http
  selector:
    app: dgraph-zero
---
apiVersion: v1
kind: Service
metadata:
  name: dgraph-zero
  labels:
    app: dgraph-zero
spec:
  ports:
  - port: 5080
    targetPort: 5080
    name: zero-grpc
  clusterIP: None
  selector:
    app: dgraph-zero
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: dgraph-zero
spec:
  serviceName: "dgraph-zero"
  replicas: 1
  selector:
    matchLabels:
      app: dgraph-zero
  template:
    metadata:
      labels:
        app: dgraph-zero
    spec:
      containers:
      - name: zero
        image: dgraph/dgraph:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 5080
          name: zero-grpc
        - containerPort: 6080
          name: zero-http
        volumeMounts:
        - name: datadir
          mountPath: /dgraph
        command:
          - bash
          - "-c"
          - |
            set -ex
            dgraph zero --replicas 3 --my=$(hostname -f):5080
      terminationGracePeriodSeconds: 60
      volumes:
      - name: datadir
        persistentVolumeClaim:
          claimName: datadir
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:

dgraph-server-ratel.yaml

apiVersion: v1
kind: Service
metadata:
  name: dgraph-server-public
  labels:
    app: dgraph-server
spec:
  type: LoadBalancer
  ports:
  - port: 8080
    targetPort: 8080
    name: server-http
  - port: 9080
    targetPort: 9080
    name: server-grpc
  selector:
    app: dgraph-server
---
apiVersion: v1
kind: Service
metadata:
  name: dgraph-server-0-http-public
  labels:
    app: dgraph-server
spec:
  type: LoadBalancer
  ports:
---
apiVersion: v1
kind: Service
metadata:
  name: dgraph-ratel-public
  labels:
    app: dgraph-ratel
spec:
  type: LoadBalancer
  ports:
  - port: 8000
    targetPort: 8000
    name: ratel-http
  selector:
    app: dgraph-ratel
---
apiVersion: v1
kind: Service
metadata:
  name: dgraph-server
  labels:
    app: dgraph-server
spec:
  ports:
  - port: 7080
    targetPort: 7080
    name: server-grpc-int
  clusterIP: None
  selector:
    app: dgraph-server
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: dgraph-server
spec:
  serviceName: "dgraph-server"
  replicas: 3
  selector:
    matchLabels:
      app: dgraph-server
  template:
    metadata:
      labels:
        app: dgraph-server
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - dgraph-server
              topologyKey: kubernetes.io/hostname
      containers:
      - name: server
        image: dgraph/dgraph:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 7080
          name: server-grpc-int
        - containerPort: 8080
          name: server-http
        - containerPort: 9080
          name: server-grpc
        volumeMounts:
        - name: datadir
          mountPath: /dgraph
        command:
          - bash
          - "-c"
          - |
            set -ex
            dgraph server --my=$(hostname -f):7080 --lru_mb 2048 --zero dgraph-zero-0.dgraph-zero.default.svc.cluster.local:5080
      terminationGracePeriodSeconds: 60
      volumes:
      - name: datadir
        persistentVolumeClaim:
          claimName: datadir
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dgraph-ratel
  labels:
    app: dgraph-ratel
spec:
  selector:
    matchLabels:
      app: dgraph-ratel
  template:
    metadata:
      labels:
        app: dgraph-ratel
    spec:
      containers:
      - name: ratel
        image: dgraph/dgraph:latest
        ports:
        - containerPort: 8000
        command:
          - dgraph-ratel

Thanks
