I am using Dgraph 1.0.16 and have set up a cluster with replication=3, 1 Zero and 3 Alpha servers (alpha1, alpha2, alpha3). When I stopped the other 2 Alpha servers (alpha2, alpha3), I got the error below on the main Alpha server (alpha1), which is expected.
E0717 13:21:47.690749 21973 groups.go:853] While proposing delta with MaxAssigned: 33296 and num txns: 0. Error=Server overloaded with pending proposals. Please retry later. Retrying...
W0717 13:21:50.543436 21973 node.go:419] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0717 13:21:55.503609 21973 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
But the problem is that when I connect to the main Alpha server (alpha1) from Ratel to search for some data, the loader keeps showing the text “Fetching result…” and nothing happens.
I only get the search results once I start one of the other Alpha servers (alpha2 or alpha3), which is quite confusing.
Shouldn't the main Alpha server (alpha1) still serve the search even when the other replicas are down?
Am I missing anything?
Please help.
Hi,
Could you share the logs from the alpha1 server with us?
Thanks
Hey @nshah14285,
Welcome to the channel.
Since you have set up multiple replicas, a majority of the group must be up in order to serve requests.
Hence 2 of 3 alphas should be up to be able to serve requests.
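To make the arithmetic concrete, here is a minimal sketch of the majority rule described above. The function names are hypothetical and this is not Dgraph code; it only shows why 1 of 3 replicas is not enough while 2 of 3 is.

```python
def majority(replicas: int) -> int:
    """Smallest number of nodes that forms a strict majority of the group."""
    return replicas // 2 + 1


def has_quorum(alphas_up: int, replicas: int) -> bool:
    """True when enough replicas are up for the group to serve requests."""
    return alphas_up >= majority(replicas)


# Walk through every possible state of a 3-replica group.
for up in range(4):
    print(f"{up} of 3 Alphas up -> quorum: {has_quorum(up, 3)}")
```

With replication=3 the majority is 2, so a group with only alpha1 running falls below it.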
@amanmangal I will share the logs tomorrow.
@hackintoshrao Out of the 3 Alphas, only the main Alpha is up and the other 2 are down.
So I expect to get search results from the main Alpha while the other 2 are down.
@amanmangal
Below are the alpha1 server logs.
Dgraph Zero log
I0718 13:08:37.019255 7410 zero.go:396] Got connection request: cluster_info_only:true
I0718 13:08:37.019477 7410 zero.go:414] Connected: cluster_info_only:true
W0718 13:15:36.983632 7410 pool.go:226] Connection lost with 192.168.0.120:7082. Error: rpc error: code = Unavailable desc = transport is closing
W0718 13:15:41.731425 7410 pool.go:226] Connection lost with 192.168.0.119:7081. Error: rpc error: code = Unavailable desc = transport is closing
Logs when 2 Alphas were down
W0718 13:15:36.983681 7465 pool.go:226] Connection lost with 192.168.0.120:7082. Error: rpc error: code = Unavailable desc = transport is closing
W0718 13:15:36.998260 7465 node.go:419] Unable to send message to peer: 0x3. Error: EOF
W0718 13:15:38.018537 7465 node.go:419] Unable to send message to peer: 0x3. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 192.168.0.120:7082: connect: connection refused"
W0718 13:15:41.731781 7465 pool.go:226] Connection lost with 192.168.0.119:7081. Error: rpc error: code = Unavailable desc = transport is closing
W0718 13:15:41.738476 7465 node.go:419] Unable to send message to peer: 0x2. Error: EOF
W0718 13:15:42.758646 7465 node.go:419] Unable to send message to peer: 0x2. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 192.168.0.119:7081: connect: connection refused"
W0718 13:15:48.038569 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:15:52.778602 7465 node.go:419] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0718 13:15:58.058537 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:16:02.798560 7465 node.go:419] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0718 13:16:08.078488 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:16:12.818665 7465 node.go:419] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0718 13:16:18.098548 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:16:22.838698 7465 node.go:419] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0718 13:16:28.118613 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:16:32.858608 7465 node.go:419] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0718 13:16:38.138652 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:16:42.878591 7465 node.go:419] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0718 13:16:48.158601 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:16:52.898695 7465 node.go:419] Unable to send message to peer: 0x2. Error: Unhealthy connection
E0718 13:20:55.110526 7465 groups.go:853] While proposing delta with MaxAssigned: 10009 and num txns: 0. Error=Server overloaded with pending proposals. Please retry later. Retrying...
W0718 13:20:58.618655 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:21:03.378642 7465 node.go:419] Unable to send message to peer: 0x2. Error: Unhealthy connection
W0718 13:21:08.638367 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
Logs when alpha1 and alpha2 were up and alpha3 was down
W0718 13:23:28.918459 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:23:38.938668 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:23:48.958617 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:23:58.978619 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:24:08.998654 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:24:19.018619 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
W0718 13:24:29.038595 7465 node.go:419] Unable to send message to peer: 0x3. Error: Unhealthy connection
Makes sense. As @hackintoshrao said, queries may fail unless a majority of Alphas are up. You could try best effort queries. Best effort queries do not require a majority of replicas to be up.
@amanmangal
So it means at least 2 replicas have to be up before any data can be fetched, right?
“Best effort queries do not require a majority of replicas to be up”
Do you mean the optimized queries?
Best effort queries are optimized to run faster and do not necessarily return the latest result. Given that you only have 3 nodes (and replication is set to 3 too), every Alpha holds a full copy of the data, so the queries should return a response.
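As a rough illustration of what a best effort query could look like over HTTP, here is a minimal sketch. It assumes the `be=true` query parameter on the Alpha's `/query` endpoint and the `application/graphql+-` content type as documented for later Dgraph releases; whether 1.0.16 accepts them is not confirmed in this thread. The address `localhost:8080`, the helper name, and the `name` predicate are placeholders.

```python
import urllib.parse
import urllib.request


def best_effort_request(alpha_http_addr: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /query with be=true.

    Assumption: the Alpha accepts `be=true` to mark the query as best effort.
    """
    params = urllib.parse.urlencode({"be": "true"})
    url = f"http://{alpha_http_addr}/query?{params}"
    return urllib.request.Request(
        url,
        data=query.encode("utf-8"),
        headers={"Content-Type": "application/graphql+-"},
        method="POST",
    )


# Hypothetical query: `name` stands in for a predicate from your own schema.
query = "{ people(func: has(name), first: 5) { uid name } }"
req = best_effort_request("localhost:8080", query)
print(req.get_method(), req.full_url)

# To actually send it against a running Alpha:
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(resp.read().decode("utf-8"))
```

The request is only constructed here, so the snippet runs without a cluster; uncomment the last lines to try it against alpha1 while the other replicas are stopped.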