Issue Connecting Dgraph Alpha Instance to Zero Leader Across Different Servers

What I want to do
I have set up Dgraph Zero and Dgraph Alpha on one instance (xxx.xxx.x.27), and another Dgraph Alpha instance on a different server (xxx.xxx.x.28). I want to connect the Alpha instance on the second server (xxx.xxx.x.28) to the Zero service running on the first server (xxx.xxx.x.27:5080).

What I did

1. Set up Dgraph Zero and Alpha on the instance xxx.xxx.x.27.
2. Configured Zero to listen on xxx.xxx.x.27:5080.
3. Configured that Alpha to connect to the Zero instance at xxx.xxx.x.27:5080.
4. Set up another Dgraph Alpha instance on a separate server (xxx.xxx.x.28), configured to connect to the Zero instance at xxx.xxx.x.27:5080.

Zero Service Configuration (xxx.xxx.x.27):

ExecStart=/usr/local/bin/dgraph zero --my=xxx.xxx.x.27:5080 --replicas=1 --wal /var/lib/dgraph/zw

Alpha Service Configuration on another VM (xxx.xxx.x.28):

ExecStart=/usr/local/bin/dgraph alpha --my=xxx.xxx.x.28:7080 --zero=xxx.xxx.x.27:5080 --logtostderr -v=2 -p /var/lib/dgraph/p -w /var/lib/dgraph/w --port_offset=8180

Error:
When trying to connect the Alpha instance from xxx.xxx.x.28 to Zero at xxx.xxx.x.27:5080, I get the following error:

Oct 14 11:10:54 AI-ML18 dgraph[536169]: I1014 11:10:54.738624 536182 groups.go:750] Found connection to leader: localhost:5080
Oct 14 11:10:54 AI-ML18 dgraph[536169]: I1014 11:10:54.739088 536182 groups.go:704] No healthy Zero leader found. Trying to find a Zero leader...
Oct 14 11:10:54 AI-ML18 dgraph[536169]: E1014 11:10:54.752493 536182 groups.go:1229] Error during SubscribeForUpdates for prefix "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x15dgraph.graphql.schema\x00": unable to find any servers for group: 1. closer err:
Oct 14 11:10:54 AI-ML18 dgraph[536169]: I1014 11:10:54.840109 536182 run.go:786] Caught Ctrl-C. Terminating now (this may take a few seconds)...
Oct 14 11:10:54 AI-ML18 dgraph[536169]: I1014 11:10:54.851876 536182 run.go:791] Stopped before initialization completed
Oct 14 11:10:54 AI-ML18 dgraph[536169]: I1014 11:10:54.854678 536182 groups.go:750] Found connection to leader: localhost:5080
Oct 14 11:10:54 AI-ML18 dgraph[536169]: I1014 11:10:54.854742 536182 groups.go:704] No healthy Zero leader found. Trying to find a Zero leader...

What could be causing the Alpha instance on xxx.xxx.x.28 to fail to find a healthy Zero leader, even though it appears to connect? Are there any specific configurations or settings I should adjust to ensure a proper connection and healthy leader discovery between the Alpha on xxx.xxx.x.28 and the Zero on xxx.xxx.x.27? Do I need to configure anything on the Zero instance to support Alphas connecting from multiple servers?

Can anyone help me resolve this issue?

Your port_offset looks a little strange. It specifies an offset (a relative shift added to the default ports), not the port number itself.

For example, when a user runs Dgraph Alpha with --port_offset 2, the Alpha node binds to ports 7082 (gRPC, internal/private), 8082 (HTTP, external/public) and 9082 (gRPC, external/public).
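If I read your unit file right, --port_offset=8180 would shift those defaults to 15260, 16260 and 17260, while --my=xxx.xxx.x.28:7080 still advertises port 7080, so the advertised address would not match the port the Alpha actually listens on. A rough illustration (the "..." stands for the rest of your flags):

# With --port_offset=8180, the Alpha binds 7080+8180, 8080+8180 and 9080+8180,
# so --my would have to advertise the shifted internal port:
dgraph alpha --my=xxx.xxx.x.28:15260 --zero=xxx.xxx.x.27:5080 --port_offset=8180 ...
# Dropping the offset keeps the well-known defaults 7080/8080/9080:
dgraph alpha --my=xxx.xxx.x.28:7080 --zero=xxx.xxx.x.27:5080 ...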

However, which node are the logs you provided from? If they are from the Alpha node, there may be a problem with its command arguments, because it is trying to reach Zero via "localhost:5080". It wouldn't get any response from localhost, because the Zero node runs on another host, as you explained.
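Independent of that, it may help to confirm from xxx.xxx.x.28 that the Zero on xxx.xxx.x.27 is actually reachable (assuming Zero's default HTTP port 6080 is unchanged), for example:

# Run on xxx.xxx.x.28 to rule out basic network/firewall issues:
nc -zv xxx.xxx.x.27 5080             # Zero's gRPC port, the one --zero points at
curl http://xxx.xxx.x.27:6080/state  # Zero's HTTP endpoint; should return the cluster state as JSON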

Hi @sivak, we've also provided a resolution for this on the GH issue you created, as there is indeed a problem with the port_offset configuration, as noted above. More importantly, there should be no need to use port_offset at all when the second Alpha runs on a different host, since there is no other service competing for ports 7080, 8080 or 9080 there.

Additionally, as @mike42 noted above, you may want to inspect the Alpha startup command: the logs show the Zero address as localhost:5080 instead of xxx.xxx.x.27:5080 (unless the Zero and the second Alpha are actually running on the same host).
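For reference, a minimal sketch of the Alpha unit on xxx.xxx.x.28 without the offset, assuming the default ports 7080/8080/9080 are free on that host and your data directories stay as they are:

ExecStart=/usr/local/bin/dgraph alpha --my=xxx.xxx.x.28:7080 --zero=xxx.xxx.x.27:5080 --logtostderr -v=2 -p /var/lib/dgraph/p -w /var/lib/dgraph/w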

We've closed the issue for now, since this is a configuration issue rather than a bug.
HTH!