Scaling Dgraph across a cluster

We’re sorting out how to scale our Dgraph instance. We have a very large graph that receives frequent writes and serves some heavy queries that depend on indexes.

A couple of questions:

  1. Which AWS instance type would be best for deploying Dgraph? We’re currently using the i3 series. Should we be focusing on systems that have good memory capacity, good disk-write speed, or a combination of both?
  2. When scaling Dgraph out across several instances (let’s say 3), we’d like to have one instance running zero + dgraph and two further instances running dgraph. Should all the instances be the same size, e.g. i3.xlarge? We’re looking for the smallest possible cluster cost to support our structure, so this is important.
  3. If we’re using Docker, will dgraph be self-managed when it comes to uptime? We’re currently using tmux in development as we build out our scaling strategy. The problem is that when dgraph goes OOM or crashes, it doesn’t restart itself. Will the Docker container manage the instance by restarting it and maintaining uptime?



I’d encourage you to think about whether you want to add replicas or shard the graph. You can read more about the Dgraph cluster at

A combination of both; SSDs are best suited. On top of that, pick the best machine within your budget that has good memory and CPU.

Yeah, Zero doesn’t take much CPU or memory, so I’d think all the instances can be of similar size.

Orchestration can be handled in multiple ways. You could use Kubernetes, which seems to be what most people are using nowadays; it can make sure that the containers running Dgraph are always up. A simpler but more manual way would be to use a docker-compose.yml. See the file at

With a restart policy configured, the Docker engine will restart the dgraph server or zero containers if they go down. This won’t protect you against the virtual machine itself going down, though, which is what Kubernetes protects against.
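
For the restart behaviour, here is a minimal sketch of what such a Compose file could look like. Treat it as an illustration and not as the file linked above: the image tag, the service names (`zero`, `server`), the ports and the exact dgraph flags are assumptions that vary between Dgraph versions (newer releases use `dgraph alpha` in place of `dgraph server`), so check them against the docs for the version you run. The part that matters for your question is the `restart:` key.

```yaml
# Sketch only: service names, ports and dgraph flags are assumptions,
# not taken from the thread. Verify them for your Dgraph version.
version: "3.2"
services:
  zero:
    image: dgraph/dgraph:latest
    ports:
      - "5080:5080"
      - "6080:6080"
    # Restart the container whenever the process exits with a non-zero
    # status, which includes being killed by the kernel OOM killer.
    restart: on-failure
    command: dgraph zero --my=zero:5080
  server:
    image: dgraph/dgraph:latest
    ports:
      - "8080:8080"
      - "9080:9080"
    restart: on-failure
    # Newer releases call this "dgraph alpha"; --lru_mb applies to older versions.
    command: dgraph server --my=server:7080 --lru_mb=2048 --zero=zero:5080
```

`restart: on-failure` only covers the process crashing inside a running Docker host. If you want the containers to come back after the daemon or the machine restarts as well, `restart: unless-stopped` or `restart: always` are the usual choices, provided the Docker service itself is started on boot.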