Dgraph Helm Data Migration

I just realized that I dropped these awesome instructions in a private thread, so I'm pasting them below in case others have similar questions on this topic.


Generally speaking, when deploying Dgraph with Helm, you would use an init container to populate the p directory before the Dgraph Alpha pods start.

NOTE: The process below assumes a single shard (group 1) cluster, using the default helm chart values that create alpha-0, alpha-1, alpha-2.

Overview of Helm Chart is here:

Process

Do this process for every alpha pod, e.g. alpha-0, alpha-1, alpha-2:

  1. Deploy Dgraph with initContainers enabled (the other init container default settings are unchanged):
    helm repo add dgraph https://charts.dgraph.io
    helm install "my-release" \
      --set alpha.initContainers.init.enabled=true dgraph/dgraph
    
  2. Optionally, copy files to the Dgraph Alpha pod’s init container with kubectl cp if needed, e.g.
    kubectl cp /path/to/files <name-of-pod-for-alpha>:/dgraph/ -c <name-of-init-container>
    
  3. Log in to the desired Dgraph Alpha pod’s init container:
    kubectl exec -ti <name-of-pod-for-alpha> -c <name-of-init-container> -- bash
    
  4. Inside the Dgraph Alpha pod’s init container, if you didn’t copy required files into the container already, you can do that now or optionally curl down needed files. Once ready, run dgraph bulk and then move the resulting directory to the appropriate path, e.g. mv /path/to/p /dgraph.
  5. Inside the Dgraph Alpha pod’s init container, run touch /dgraph/doneinit to signal that the init container is no longer needed and the Dgraph Alpha container can start.
  6. Repeat this process for all other alpha pods within this group. You may have to wait a few seconds for the next pod’s init container to become available.
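
Steps 4 and 5 can be sketched as a small shell function to run inside the init container. This is only a rough sketch under a few assumptions that are not part of the instructions above: the function name load_and_signal is made up, the data file, schema file and Zero address are values you supply yourself, and the bulk loader is assumed to write to its default location (out/0/p, relative to the current directory) when run with a single reduce shard.

```shell
# Hypothetical helper for steps 4 and 5 (run inside the init container).
#   $1 = data file, $2 = schema file, $3 = Dgraph Zero address (host:port)
#   $4 = target directory, defaults to /dgraph
load_and_signal() {
  target_dir="${4:-/dgraph}"

  # Avoid nesting a new p directory inside an existing one.
  if [ -e "$target_dir/p" ]; then
    echo "refusing to overwrite existing $target_dir/p" >&2
    return 1
  fi

  # Single shard (group 1), so one map shard and one reduce shard.
  dgraph bulk -f "$1" -s "$2" \
    --map_shards=1 --reduce_shards=1 --zero="$3" || return 1

  # Assumed default output layout: out/<shard>/p under the current directory.
  mv out/0/p "$target_dir/" || return 1

  # Signal that the init container is done (step 5).
  touch "$target_dir/doneinit"
}
```

You would call it with something like load_and_signal /dgraph/data.rdf.gz /dgraph/data.schema <zero-address>:5080, where the file names are placeholders. The doneinit file is only created if both the bulk load and the move succeed, so a failed load leaves the init container running for another attempt.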

Notes

The pod names follow the format below, based on the release name (my-release in this example):

  • my-release-dgraph-alpha-0
  • my-release-dgraph-alpha-1
  • my-release-dgraph-alpha-2

You can always list the pod names with kubectl get pods. For the init container name, it will also be based on the release name, e.g. my-release:

  • my-release-dgraph-alpha-init

Thus putting these together you could exec into the init container on alpha-0 with:

kubectl exec -ti my-release-dgraph-alpha-0 -c my-release-dgraph-alpha-init -- bash
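
If you repeat this for several pods, the naming pattern can be wrapped in a few small shell helpers. This is just a sketch; the function names are made up for illustration, and it assumes the default chart naming shown above.

```shell
# Hypothetical helpers that build names from the release name.
#   alpha_pod <release> <index>       -> <release>-dgraph-alpha-<index>
#   alpha_init_container <release>    -> <release>-dgraph-alpha-init
alpha_pod() { printf '%s-dgraph-alpha-%s\n' "$1" "$2"; }
alpha_init_container() { printf '%s-dgraph-alpha-init\n' "$1"; }

# Open a shell in the init container of the given alpha pod.
exec_into_init() {
  kubectl exec -ti "$(alpha_pod "$1" "$2")" \
    -c "$(alpha_init_container "$1")" -- bash
}

# Usage (alpha-0 of release my-release):
#   exec_into_init my-release 0
```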

You can always get the names of the containers and init containers in a pod with kubectl describe pod <name-of-pod>.
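
If you only want the names rather than the full description, a jsonpath query against the pod spec is one alternative. A minimal sketch, with made-up function names:

```shell
# Hypothetical helpers: print container names from the pod spec.
#   $1 = pod name, e.g. my-release-dgraph-alpha-0
list_init_containers() {
  kubectl get pod "$1" -o jsonpath='{.spec.initContainers[*].name}'
}

list_containers() {
  kubectl get pod "$1" -o jsonpath='{.spec.containers[*].name}'
}

# Usage:
#   list_init_containers my-release-dgraph-alpha-0
#   list_containers my-release-dgraph-alpha-0
```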

Let me know if you need further help.
