Help with cluster design


(James Hartig) #1

We have 5 regions across 3 continents. What’s the best way to set that up as a cluster? I assume it would be best to put 1-2 Zero instances on each continent and then 2-3 Alphas in every region. If I used --replicas=5, how could I ensure that at least 1 replica is in each region? Alternatively, would it be better to run 3 Alphas in each region and have each region be its own cluster? If a whole region went down I’d lose that “group”, but it might be better for write latency.

From what I can tell, unless I do a best-effort, read-only query, it’ll need to hit the Zero leader. So if I set up Zeros across continents, I’ll incur the cost of a cross-region lookup no matter what, even for reads. Is it better to set up a cluster in a single region only, so that I take the latency hit just once when sending the query (instead of potentially multiple times as the Alpha node contacts the Zero or Alpha leaders)?


(Daniel Mai) #2

Dgraph is highly available and distributed, but that doesn’t necessarily mean a single cluster is meant to span continents. The recommended HA setup is to run all Dgraph instances within the same region, spread across different data centers. In AWS parlance, that would be the same region in different availability zones.

We are looking into ways to have multiple Dgraph clusters running across different regions in an eventually-consistent manner. One approach would be using Kafka to synchronize writes across two independent clusters located in different regions.

Best-effort queries may not see the latest write: they use the latest timestamp known to the Alpha serving the query instead of retrieving the latest timestamp from Zero (the latest timestamp of the cluster).
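
To make the trade-off concrete, here is a toy Python model of the two ways a read timestamp can be chosen. It is only a sketch and not Dgraph code: the `Zero` and `Alpha` classes, the `sync` method, and the `zero_round_trips` counter are hypothetical names invented for illustration, and real timestamp assignment is more involved.

```python
from dataclasses import dataclass


@dataclass
class Zero:
    """Toy stand-in for the Zero leader (hypothetical, not the real API)."""
    latest_ts: int = 0  # latest timestamp of the cluster

    def commit(self) -> int:
        # Each committed write advances the cluster-wide timestamp.
        self.latest_ts += 1
        return self.latest_ts


@dataclass
class Alpha:
    """Toy stand-in for an Alpha that remembers the newest timestamp it has seen."""
    zero: Zero
    local_ts: int = 0
    zero_round_trips: int = 0  # how often a read had to contact Zero

    def sync(self) -> None:
        # Assumption for this sketch: the Alpha catches up with Zero at some point.
        self.local_ts = self.zero.latest_ts

    def read_ts(self, best_effort: bool) -> int:
        if best_effort:
            # Best-effort: use the Alpha's own latest timestamp, no trip to Zero.
            return self.local_ts
        # Regular read: fetch the cluster's latest timestamp from Zero.
        self.zero_round_trips += 1
        return self.zero.latest_ts


zero = Zero()
alpha = Alpha(zero)

zero.commit()   # write 1
alpha.sync()    # the Alpha has now seen timestamp 1
zero.commit()   # write 2, not yet seen by the Alpha

print(alpha.read_ts(best_effort=True))   # 1: may miss the latest write
print(alpha.read_ts(best_effort=False))  # 2: latest, at the cost of a Zero round trip
print(alpha.zero_round_trips)            # 1
```

In this model the best-effort read never leaves the Alpha, which is why it can return a slightly stale view. The regular read pays one round trip to Zero, and that hop is the one that becomes expensive if Zero sits on another continent.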