Hi Everyone,
I am new to Dgraph and am trying to use the bulk loader in Kubernetes. I found this YAML file on GitHub: https://raw.githubusercontent.com/dgraph-io/dgraph/master/contrib/config/kubernetes/dgraph-ha/dgraph-ha.yaml. The comments in this file mention that, to do a bulk load, we can create an init container and copy the data from our local directory into the pod’s dgraph directory. So I downloaded the 1million.rdf.gz file from GitHub and tried loading it into Dgraph, but I am unable to do so. I had to change the StatefulSet to a Deployment. My alpha deployment looks like this:
After deploying with the file mentioned above, I use the following commands to copy my data file. I have created a p directory and placed my 100million.rdf.gz file in it.
But I am unable to see my data in the Ratel UI. Am I missing something? If this is a silly question, please forgive me. I am trying Kubernetes and Dgraph for the first time in my life and I need to deliver this ASAP. I did all the research I could before posting here. If anyone could help me with this, that would be really great.
@chewxy
Thank you for saying there are no silly questions. So I had literally just put the 100million.rdf.gz file in my p directory. On further research I learned that I need to run some bulk loader commands
and this will create the out/0/p directory, which I then need to copy into the alpha … If this is the right process, where do I need to run the above command? In my zero?
Um, I think you need the live loader, not the bulk loader. The bulk loader is for before you have the cluster up and running. Once it’s up and running, you should use the live loader.
Yeah, the live loader should be the chosen one for this task. But you can also use the bulk loader; in that case, you have to start the Alphas after the bulk load has finished. You can’t run a bulk load with Alphas running, only with the Zero. Also, you have to preserve the Zero instance (never delete the Zero volume).
I believe that if you are running inside a pod, you should use zero:5080 or something similar coming from the SVC.
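To make the bulk loader route concrete, here is a minimal sketch that only prints the command so it can be reviewed before running it in a pod that can reach Zero. The zero:5080 address is the one suggested above; the file paths and the schema file name are assumptions, not something taken from this thread.

```shell
#!/bin/sh
# Sketch only: prints the bulk loader command instead of running it.
# Assumptions: data and schema files live under /dgraph, and Zero is
# reachable as zero:5080 (replace with your actual Service name).
ZERO_ADDR="zero:5080"
DATA_FILE="/dgraph/1million.rdf.gz"
SCHEMA_FILE="/dgraph/1million.schema"

bulk_cmd() {
  # One map shard and one reduce shard give a single output directory,
  # out/0/p, which becomes the p directory of the first Alpha group.
  echo "dgraph bulk -f ${DATA_FILE} -s ${SCHEMA_FILE} --map_shards=1 --reduce_shards=1 --zero=${ZERO_ADDR}"
}

bulk_cmd
```

Only Zero should be running at this point. Once the load has finished, the resulting out/0/p directory would be copied into the Alpha's data directory as p, and the Alphas started afterwards.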
One more thing: if you are going to use the live loader, you have to check your provider. If it is AWS, GCP or similar, you have to expose the Alpha and Zero gRPC ports, unless you run the live loader inside a pod.
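Running the live loader inside a pod could look roughly like the sketch below, which prints the two commands rather than executing them. The pod name dgraph-alpha-0 is hypothetical (check `kubectl get pods` for yours), and the `--alpha` / `--zero` flag names are those of recent Dgraph releases, so treat them as an assumption for older versions.

```shell
#!/bin/sh
# Sketch only: prints the commands for an in-pod live load.
ALPHA_POD="dgraph-alpha-0"   # hypothetical pod name
DATA_FILE="1million.rdf.gz"  # local file to load
ZERO_ADDR="zero:5080"        # Zero gRPC address as seen from inside the pod

live_in_pod_plan() {
  cat <<EOF
kubectl cp ${DATA_FILE} ${ALPHA_POD}:/tmp/${DATA_FILE}
kubectl exec ${ALPHA_POD} -- dgraph live -f /tmp/${DATA_FILE} --alpha localhost:9080 --zero ${ZERO_ADDR}
EOF
}

live_in_pod_plan
```

Because the loader talks to localhost:9080 from inside the Alpha pod, no gRPC port has to be exposed outside the cluster.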
I wouldn’t recommend using a Deployment controller, as its pods are not sticky and get randomly generated names. StatefulSets are the better fit for stateful apps like Zero and Alpha.
A quick way to get started with Kubernetes would be to use the Helm chart and then the live loader:
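A rough sketch of that route, printed as a plan rather than executed. The release name my-release is hypothetical, the pod names assume the chart's default naming, and `kubectl port-forward` is used here as one possible way to reach the gRPC ports from a local machine.

```shell
#!/bin/sh
# Sketch only: prints a Helm install followed by a live load from a local machine.
RELEASE="my-release"   # hypothetical Helm release name

helm_live_plan() {
  # Wait until the pods are Running before starting the port-forwards.
  cat <<EOF
helm repo add dgraph https://charts.dgraph.io
helm install ${RELEASE} dgraph/dgraph
kubectl port-forward pod/${RELEASE}-dgraph-alpha-0 9080:9080 &
kubectl port-forward pod/${RELEASE}-dgraph-zero-0 5080:5080 &
dgraph live -f 1million.rdf.gz --alpha localhost:9080 --zero localhost:5080
EOF
}

helm_live_plan
```

The last command needs the dgraph binary installed locally; loading from inside a pod avoids that and the port-forwards.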