This is a how-to on setting up an automated backup/export system for a multi-node Dgraph deployment on Kubernetes.
The setup looks like this:
- A Google Cloud Storage bucket stores the backup files.
- The GCS bucket is mounted into the Dgraph server pods via gcsfuse.
- A Kubernetes CronJob sends an HTTP request to a Dgraph server to trigger the export.
Custom Docker image
We need to include gcsfuse in the Dgraph server Docker image, so create a new Dockerfile, build the image, and push it to a registry your cluster can pull from.
FROM dgraph/dgraph:v1.0.9

ENV GCSFUSE_REPO gcsfuse-bionic

RUN apt-get update && apt-get install -y --no-install-recommends \
      ca-certificates \
      curl \
      gnupg \
    && echo "deb http://packages.cloud.google.com/apt $GCSFUSE_REPO main" \
      | tee /etc/apt/sources.list.d/gcsfuse.list \
    && curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - \
    && apt-get update \
    && apt-get install -y gcsfuse \
    && apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
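Building and pushing the image looks roughly like this; the gcr.io path and tag are examples, so substitute your own registry and naming.

```shell
# Build the custom image from the Dockerfile above.
# The registry path below is a placeholder -- use your own.
docker build -t gcr.io/my-project/dgraph-gcsfuse:v1.0.9 .

# Push it to a registry your cluster can pull from.
docker push gcr.io/my-project/dgraph-gcsfuse:v1.0.9
```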
Create Google service account and secret
In your Google Cloud project, create a service account that has access to the storage bucket, and save its key as a JSON file named credentials.json.
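If you prefer the command line, a sketch with the gcloud and gsutil CLIs follows; the service account name, project, and bucket are placeholders.

```shell
# Create a service account for the backups (names are examples).
gcloud iam service-accounts create dgraph-backup \
  --display-name "Dgraph backup"

# Grant it object read/write access on the export bucket.
gsutil iam ch \
  serviceAccount:dgraph-backup@my-project.iam.gserviceaccount.com:roles/storage.objectAdmin \
  gs://dgraph_exports

# Download a JSON key for the account.
gcloud iam service-accounts keys create credentials.json \
  --iam-account dgraph-backup@my-project.iam.gserviceaccount.com
```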
Create the secret in your cluster.
kubectl create secret generic dgraph-storage-creds --from-file=credentials.json
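You can confirm the secret was created correctly (key name and byte count, without printing the credentials):

```shell
# Shows the data keys and their sizes; credentials.json should be listed.
kubectl describe secret dgraph-storage-creds
```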
Modify the Dgraph server StatefulSet
Make the following modifications to your Dgraph server StatefulSet workload.
Under volumes: add
- name: googleservice
  secret:
    defaultMode: 420
    secretName: dgraph-storage-creds
Under volumeMounts: add
- mountPath: /etc/google
  name: googleservice
  readOnly: true
And under the container's env: add

- name: GOOGLE_APPLICATION_CREDENTIALS
  value: /etc/google/credentials.json
Add lifecycle hooks so the bucket is mounted when the container starts and unmounted when it stops. Note that mounting with gcsfuse inside a container generally requires FUSE support, typically the SYS_ADMIN capability in the container's securityContext.

lifecycle:
  postStart:
    exec:
      command:
      - gcsfuse
      - -o
      - nonempty
      - dgraph_exports
      - /dgraph/export
  preStop:
    exec:
      command:
      - fusermount
      - -u
      - /dgraph/export
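Once the pods roll over to the new spec, it's worth checking that the bucket actually mounted; the pod name below is an example for the first StatefulSet replica.

```shell
# Verify the gcsfuse mount is present and browsable
# (dgraph-server-0 is an example pod name).
kubectl exec dgraph-server-0 -- df -h /dgraph/export
kubectl exec dgraph-server-0 -- ls /dgraph/export
```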
Remember to change the image to your custom Docker image.
And add a whitelist argument to the dgraph server command so the admin endpoints accept requests from the CronJob's pod; change the IP range to match your cluster's pod network, for example:

--whitelist 10.0.0.0/8
Create a new CronJob based on the following YAML.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: dgraph-backup
spec:
  schedule: "15 02 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: dgraph-backup
            image: artooro/curl
            args:
            - /bin/sh
            - -c
            - curl http://dgraph-server.default.svc.cluster.local:8080/admin/export
          restartPolicy: OnFailure
Deploy it via
kubectl create -f backup_cron.yaml
Test it, verify that the export files show up in the storage bucket, and you’re set.
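Rather than waiting for the 02:15 schedule, you can trigger a one-off run from the CronJob and check the result; the job name is arbitrary.

```shell
# Run the backup job immediately (kubectl 1.10+).
kubectl create job --from=cronjob/dgraph-backup backup-test
kubectl logs job/backup-test

# Confirm the export landed in the bucket.
gsutil ls gs://dgraph_exports/
```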
This post assumes your Dgraph server Service is named dgraph-server.
This is working for us, and I thought I’d share in case it’s a help to others.