I’m looking for experience reports on running dgraph nodes in a typical cloud environment, to figure out whether dgraph is the right tool for me. Granted, dgraph calls itself distributed, open source and production ready. However, I’d like to run it on an AWS cluster, preferably in combination with k8s, with a dataset of at least a few billion nodes potentially distributed across the cluster.
Has anyone already set up something similar (maybe for testing purposes) on EC2 or EKS, or tackled problems like backups or scaling up and down? If it is true that dgraph scales better than other open source solutions, I would expect some papers, tutorials or experience reports to exist that could give me some hints. I’m already lost choosing the right EC2 instance type and size, an appropriate number of instances, and the best way to set up a large, resilient cluster. The question is: is it worth the effort at all?
What is the largest cluster you’ve run in a distributed environment, and what kind of pitfalls (presumably when running it in a cloud environment) came up? Is there any documentation available somewhere?
Any thoughts might be helpful. Thanks in advance.