Dgraph db cluster in 3 data centers

What I want to do

I’m going to launch Dgraph in our cloud across 3 data centers. Could you recommend the right setup for this deployment?

Is my plan right?
9 Zeros: 3 per DC
3 Alphas: 1 per DC

So in the end I’d get 3 replicas and 3 shards in 3 groups.
Should every group sit in just one DC, or be spread among different DCs?

Here is a prototype of the config:
docker-compose.yml (6.2 KB)

-o 1 --my=zero2:5080

This should be -o 1 --my=zero2:5081, since the -o offset shifts the node’s default ports, and the ports in the Docker context need the same change. The Alphas should then use --zero=zero1:5080,zero2:5081,zero3:5082. The same applies to the Alphas’ own addresses, e.g. alpha3 should use --my=alpha3:7082.
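
For reference, the three Zero commands would then end up roughly like this (a sketch assembled from the snippets above, assuming the hostnames zero1–zero3 from the attached compose file):

dgraph zero --my=zero1:5080 --replicas 3 --raft="idx=1"
dgraph zero -o 1 --my=zero2:5081 --replicas 3 --raft="idx=2" --peer zero1:5080
dgraph zero -o 2 --my=zero3:5082 --replicas 3 --raft="idx=3" --peer zero1:5080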

This whole Docker Compose file looks wrong… where did you get it?

I’m not sure what that means.
Based on your yml, you have 3 Zero nodes and 9 Alpha nodes, and you have set replicas to 3, which means 3 shard groups. What is DC? Dgraph Cluster?

Different Dgraph clusters? I’m really not sure what you mean.

Try this.
P.S. Just change my name to yours in the volume paths.

version: "3.2"

networks:
  dg_net:
    driver: bridge
    ipam:
      config:
        - subnet: 10.5.0.0/16
          gateway: 10.5.0.1

services:
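 # 3 Zeros form a single Raft group; with --replicas 3, the 9 Alphas below form 3 shard groups of 3 nodes each.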
 zero1:
   image: dgraph/dgraph:v22.0.2
   volumes:
     - /home/micheldiz/dgraph/zero1:/dgraph
   ports:
     - 5081:5080
     - 6081:6080
   restart: on-failure
   networks:
     dg_net:
       ipv4_address: 10.5.0.21
   healthcheck:
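     # Zero exposes cluster state at /state on its HTTP port (6080); this check verifies this node's entry shows amDead:false.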
     test: curl -sS http://localhost:6080/state | grep -o '10.5.0.21.*?*forceGroupId' | grep -c 'amDead":false' > /dev/null
     interval: 10s
     start_period: 10s
     timeout: 5s
     retries: 7
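   # --raft idx pins a fixed Raft ID for this Zero; --replicas 3 puts 3 Alphas in every shard group.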
   command: dgraph zero --my=10.5.0.21:5080 --replicas 3 --raft="idx=1"
 zero2:
   image: dgraph/dgraph:v22.0.2
   volumes:
     - /home/micheldiz/dgraph/zero2:/dgraph
   ports:
     - 5082:5080
     - 6082:6080
   restart: on-failure
   networks:
     dg_net:
       ipv4_address: 10.5.0.22
   healthcheck:
     test: curl -sS http://localhost:6080/state | grep -o '10.5.0.22.*?*forceGroupId' | grep -c 'amDead":false' > /dev/null
     interval: 10s
     start_period: 10s
     timeout: 5s
     retries: 7
   command: dgraph zero --my=10.5.0.22:5080 --replicas 3 --raft="idx=2" --peer 10.5.0.21:5080
 zero3:
   image: dgraph/dgraph:v22.0.2
   volumes:
     - /home/micheldiz/dgraph/zero3:/dgraph
   ports:
     - 5083:5080
     - 6083:6080
   restart: on-failure
   networks:
     dg_net:
       ipv4_address: 10.5.0.23
   healthcheck:
     test: curl -sS http://localhost:6080/state | grep -o '10.5.0.23.*?*forceGroupId' | grep -c 'amDead":false' > /dev/null
     interval: 10s
     start_period: 10s
     timeout: 5s
     retries: 7
   command: dgraph zero --my=10.5.0.23:5080 --replicas 3 --raft="idx=3" --peer 10.5.0.21:5080

 alpha1:
   image: dgraph/dgraph:v22.0.2
   volumes:
     - /home/micheldiz/dgraph/alpha1:/dgraph
   ports:
     - 8081:8080
     - 9081:9080
   restart: on-failure
   networks:
     dg_net:
       ipv4_address: 10.5.0.11
   healthcheck:
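     # Alpha exposes /health on its HTTP port (8080); the response contains "healthy" once the node is serving.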
     test: curl -sS http://localhost:8080/health | grep -c 'healthy' > /dev/null
     interval: 10s
     start_period: 10s
     timeout: 5s
     retries: 7
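   # Each Alpha advertises its internal gRPC address and lists all three Zeros; Zero assigns it to a group.
   # whitelist=0.0.0.0/0 opens the HTTP admin endpoints to any IP: fine for a local test, not for production.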
   command: dgraph alpha --my=10.5.0.11:7080 --zero=10.5.0.21:5080,10.5.0.22:5080,10.5.0.23:5080
     --security "whitelist=0.0.0.0/0"

 alpha2:
   image: dgraph/dgraph:v22.0.2
   volumes:
     - /home/micheldiz/dgraph/alpha2:/dgraph
   ports:
     - 8082:8080
     - 9082:9080
   restart: on-failure
   networks:
     dg_net:
       ipv4_address: 10.5.0.12
   healthcheck:
     test: curl -sS http://localhost:8080/health | grep -c 'healthy' > /dev/null
     interval: 10s
     start_period: 10s
     timeout: 5s
     retries: 7
   command: dgraph alpha --my=10.5.0.12:7080 --zero=10.5.0.21:5080,10.5.0.22:5080,10.5.0.23:5080
     --security "whitelist=0.0.0.0/0"

 alpha3:
   image: dgraph/dgraph:v22.0.2
   volumes:
     - /home/micheldiz/dgraph/alpha3:/dgraph
   ports:
     - 8083:8080
     - 9083:9080
   restart: on-failure
   networks:
     dg_net:
       ipv4_address: 10.5.0.13
   healthcheck:
     test: curl -sS http://localhost:8080/health | grep -c 'healthy' > /dev/null
     interval: 10s
     start_period: 10s
     timeout: 5s
     retries: 7
   command: dgraph alpha --my=10.5.0.13:7080 --zero=10.5.0.21:5080,10.5.0.22:5080,10.5.0.23:5080
     --security "whitelist=0.0.0.0/0"

 alpha4:
   image: dgraph/dgraph:v22.0.2
   volumes:
     - /home/micheldiz/dgraph/alpha4:/dgraph
   ports:
     - 8084:8080
     - 9084:9080
   restart: on-failure
   networks:
     dg_net:
       ipv4_address: 10.5.0.14
   healthcheck:
     test: curl -sS http://localhost:8080/health | grep -c 'healthy' > /dev/null
     interval: 10s
     start_period: 10s
     timeout: 5s
     retries: 7
   command: dgraph alpha --my=10.5.0.14:7080 --zero=10.5.0.21:5080,10.5.0.22:5080,10.5.0.23:5080
     --security "whitelist=0.0.0.0/0"

 alpha5:
   image: dgraph/dgraph:v22.0.2
   volumes:
     - /home/micheldiz/dgraph/alpha5:/dgraph
   ports:
     - 8085:8080
     - 9085:9080
   restart: on-failure
   networks:
     dg_net:
       ipv4_address: 10.5.0.15
   healthcheck:
     test: curl -sS http://localhost:8080/health | grep -c 'healthy' > /dev/null
     interval: 10s
     start_period: 10s
     timeout: 5s
     retries: 7
   command: dgraph alpha --my=10.5.0.15:7080 --zero=10.5.0.21:5080,10.5.0.22:5080,10.5.0.23:5080
     --security "whitelist=0.0.0.0/0"

 alpha6:
   image: dgraph/dgraph:v22.0.2
   volumes:
     - /home/micheldiz/dgraph/alpha6:/dgraph
   ports:
     - 8086:8080
     - 9086:9080
   restart: on-failure
   networks:
     dg_net:
       ipv4_address: 10.5.0.16
   healthcheck:
     test: curl -sS http://localhost:8080/health | grep -c 'healthy' > /dev/null
     interval: 10s
     start_period: 10s
     timeout: 5s
     retries: 7
   command: dgraph alpha --my=10.5.0.16:7080 --zero=10.5.0.21:5080,10.5.0.22:5080,10.5.0.23:5080
     --security "whitelist=0.0.0.0/0"

 alpha7:
   image: dgraph/dgraph:v22.0.2
   volumes:
     - /home/micheldiz/dgraph/alpha7:/dgraph
   ports:
     - 8087:8080
     - 9087:9080
   restart: on-failure
   networks:
     dg_net:
       ipv4_address: 10.5.0.17
   healthcheck:
     test: curl -sS http://localhost:8080/health | grep -c 'healthy' > /dev/null
     interval: 10s
     start_period: 10s
     timeout: 5s
     retries: 7
   command: dgraph alpha --my=10.5.0.17:7080 --zero=10.5.0.21:5080,10.5.0.22:5080,10.5.0.23:5080
     --security "whitelist=0.0.0.0/0"

 alpha8:
   image: dgraph/dgraph:v22.0.2
   volumes:
     - /home/micheldiz/dgraph/alpha8:/dgraph
   ports:
     - 8088:8080
     - 9088:9080
   restart: on-failure
   networks:
     dg_net:
       ipv4_address: 10.5.0.18
   healthcheck:
     test: curl -sS http://localhost:8080/health | grep -c 'healthy' > /dev/null
     interval: 10s
     start_period: 10s
     timeout: 5s
     retries: 7
   command: dgraph alpha --my=10.5.0.18:7080 --zero=10.5.0.21:5080,10.5.0.22:5080,10.5.0.23:5080
     --security "whitelist=0.0.0.0/0"

 alpha9:
   image: dgraph/dgraph:v22.0.2
   volumes:
     - /home/micheldiz/dgraph/alpha9:/dgraph
   ports:
     - 8089:8080
     - 9089:9080
   restart: on-failure
   networks:
     dg_net:
       ipv4_address: 10.5.0.19
   healthcheck:
     test: curl -sS http://localhost:8080/health | grep -c 'healthy' > /dev/null
     interval: 10s
     start_period: 10s
     timeout: 5s
     retries: 7
   command: dgraph alpha --my=10.5.0.19:7080 --zero=10.5.0.21:5080,10.5.0.22:5080,10.5.0.23:5080
     --security "whitelist=0.0.0.0/0"

 ratel:
   image: dgraph/ratel:latest
   ports:
     - 8000:8000
   networks:
     dg_net:
       ipv4_address: 10.5.0.20
   command: dgraph-ratel
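
To bring this up and sanity-check the cluster, run it and then query Zero’s state endpoint (zero1’s HTTP port is mapped to 6081 here); you should see three groups, each with three Alphas, and no members marked as dead:

docker compose up -d
curl -s localhost:6081/state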

Sorry for misleading you: DC means data center. I planned to spread the Dgraph cluster among 3 data centers, so that all data is stored in 3 replicas and sharded across the 3 data centers.

The final aim is availability of the data in Dgraph in case we lose any one data center.

My docker-compose.yml was just for understanding how to correctly control the number of groups, shards, and replicas.

Got it.

Does that mean 9 Zeros? That confused me.

I’d put a replica in each data center.
So, group 1
Alpha-0 goes to Datacenter 1
Alpha-1 goes to Datacenter 2
Alpha-2 goes to Datacenter 3

group 2
Alpha-0 goes to Datacenter 1
Alpha-1 goes to Datacenter 2
Alpha-2 goes to Datacenter 3

group 3
Alpha-0 goes to Datacenter 1
Alpha-1 goes to Datacenter 2
Alpha-2 goes to Datacenter 3

So you have replication across your data centers. You can set this flag:

group=; Provides an optional Raft Group ID that this Alpha would indicate to Zero to join.

to force an Alpha into a specific group.
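
For example (a sketch, assuming a Dgraph version where group is an option of the Alpha --raft superflag, mirroring the --raft="idx=..." syntax used for the Zeros above), an Alpha meant for group 1 would start as:

   command: dgraph alpha --my=10.5.0.11:7080 --raft="group=1"
     --zero=10.5.0.21:5080,10.5.0.22:5080,10.5.0.23:5080
     --security "whitelist=0.0.0.0/0"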

But the Zero group is different. It needs to be very “close” to the Alphas in terms of latency. I mean the leader, specifically.

Does that mean 9 Zeros? That confused me.

Yes, I assumed (apparently wrongly) that for high availability and fault tolerance it would need 3 Zero instances per data center and 1 Alpha per data center. In our cloud we have tens of thousands of instances, so it is not a problem to allocate a few extra instances if it helps overall performance and reliability.

The Zero group will always be a single group; only the Alphas, when sharding, form additional groups in the Raft logic. You don’t need 9 Zeros; that is too much. You can have 1 Zero per data center, as long as the total is an odd number. Pay attention that all your Alphas talk only to the Zero leader, so it should be reasonably close to all of them. You can’t have the Zero leader in China, for example, with the Alphas trying to reach it from us-east.
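
Concretely, for the 3-Zero / 9-Alpha setup above, the placement could look like this:

Data center 1: zero1 (idx=1), plus one Alpha from each of groups 1, 2, and 3
Data center 2: zero2 (idx=2), plus one Alpha from each of groups 1, 2, and 3
Data center 3: zero3 (idx=3), plus one Alpha from each of groups 1, 2, and 3

That way, losing any one data center leaves every Alpha group and the Zero group with 2 of 3 members, which keeps Raft quorum.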

Thanks a lot. I finally understand the scheme.

P.S. I mixed up Zeros with Alphas. I meant 3 Alphas for every data center.