So we have a file of about 166Gi that needs to be loaded into Dgraph. We had successfully figured out how to load the data with a 6-node cluster: three deployment files for the Zero nodes and three for the Alpha nodes. A sample file looks like this:
Alpha deployment file:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    io.kompose.service: alpha
    environment: production
  name: alpha
  namespace: ourspacename
spec:
  replicas: 1
  selector:
    matchLabels:
      io.kompose.service: alpha
  template:
    metadata:
      labels:
        io.kompose.service: alpha
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/hostname
                    operator: In
                    values:
                      - thenodename
      initContainers:
        - name: init-alpha
          image: dgraph/dgraph:latest
          command:
            - bash
            - "-c"
            - |
              trap "exit" SIGINT SIGTERM
              echo "Write to /dgraph/doneinit when ready."
              until [ -f /dgraph/doneinit ]; do sleep 2; done
          volumeMounts:
            - name: alpha-claim0
              mountPath: /dgraph
      containers:
        - args:
            - dgraph
            - alpha
            - --my=alpha:7080
            - --zero=zero:5080,zero-1:5081,zero-2:5082
          image: dgraph/dgraph:latest
          name: alpha
          resources:
            limits:
              cpu: 2000m
              memory: "50Gi"
            requests:
              cpu: 1000m
              memory: "30Gi"
          ports:
            - containerPort: 8080
            - containerPort: 9080
          volumeMounts:
            - mountPath: /dgraph
              name: alpha-claim0
      volumes:
        - name: alpha-claim0
          persistentVolumeClaim:
            claimName: theclaimname
```
And the zero deployment file:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    environment: production
    io.kompose.service: zero
  name: zero
  namespace: ournamespace
spec:
  replicas: 1
  selector:
    matchLabels:
      io.kompose.service: zero
  template:
    metadata:
      labels:
        io.kompose.service: zero
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/hostname
                    operator: In
                    values:
                      - thenodename
      containers:
        - args:
            - dgraph
            - zero
            - --my=zero:5080
            - --replicas
            - "3"
            - --idx
            - "1"
          image: dgraph/dgraph:latest
          name: zero
          resources:
            limits:
              cpu: 20000m
              memory: "350Gi"
            requests:
              cpu: 16000m
              memory: "300Gi"
          ports:
            - containerPort: 5080
            - containerPort: 6080
          volumeMounts:
            - mountPath: /dgraph
              name: zero-claim0
      volumes:
        - name: zero-claim0
          persistentVolumeClaim:
            claimName: theclaimname
```
Our process used to be as follows:
- First bring up all 3 replicas of the Zero pods.
- Copy our file into the zero pod to which we allocated the most memory and CPU.
- After copying the data into the zero pod, run the following command to do the bulk load:
```shell
dgraph bulk -f dataconngraph.rdf -s finalschema.rdf --map_shards=1 --reduce_shards=1 --http localhost:8000 --zero=localhost:5080 > check.log &
```
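The copy-and-run steps above can be sketched with `kubectl` roughly like this (the pod name `zero-0-pod` and the namespace are placeholders; substitute your actual values):

```
# Copy the data and schema files into the zero pod's /dgraph volume (hypothetical pod name)
kubectl -n ournamespace cp dataconngraph.rdf zero-0-pod:/dgraph/
kubectl -n ournamespace cp finalschema.rdf zero-0-pod:/dgraph/

# Run the bulk load from inside that pod, in the background, logging to check.log
kubectl -n ournamespace exec zero-0-pod -- bash -c \
  'cd /dgraph && dgraph bulk -f dataconngraph.rdf -s finalschema.rdf \
     --map_shards=1 --reduce_shards=1 --http localhost:8000 \
     --zero=localhost:5080 > check.log &'
```

This is only a sketch of the manual steps, not a script we run verbatim.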
After that we would wait about 13 hours for the process to complete. Once it finished we would check the logs with `tail -f check.log`, which looked as shown below:
After the whole process is over, an `out` folder of about 446G remains. Then we would copy `out/0/p/` into each of the alpha pods, and the data would be populated.
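The copy into the alphas can be sketched as below (pod names are illustrative). Touching `/dgraph/doneinit` at the end is what releases the `init-alpha` init container, which polls for that file before the alpha starts:

```
# Hypothetical alpha pod names; substitute your own
for pod in alpha-pod-0 alpha-pod-1 alpha-pod-2; do
  # Copy the reduced posting directory into the alpha's /dgraph volume
  kubectl -n ourspacename cp out/0/p "$pod":/dgraph/p -c init-alpha
  # Signal the init container that the data is in place
  kubectl -n ourspacename exec "$pod" -c init-alpha -- touch /dgraph/doneinit
done
```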
Yesterday we had a production release, and we were using the image tag dgraph/dgraph:latest. It seems Dgraph also released a new version, 20.07.2, yesterday. In dev we had version 20.07.1.
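In hindsight, one way to keep dev and prod on the same release would be to pin the image tag in the deployment files instead of using the floating `latest` tag (the version below is just an example):

```yaml
# Pinned release instead of dgraph/dgraph:latest
image: dgraph/dgraph:v20.07.1
```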
We followed all the steps mentioned above. In production, though, the `out` folder from the bulk load is only 81G, and there is no error in check.log either. I have attached my production bulk-load log below:
Is this meant to happen? Was the output size supposed to decrease because of the new release? Hope someone can guide us on this.