It works well when I run docker-compose up, but when I kill any one of the nodes, I can’t query data from the rest. The error is "dispatchTaskOverNetwork: while retrieving connection. error: Unhealthy connection".
Is there any configuration I missed? Can anyone tell?
Predicate data is split up between the 3 dgraph instances. When you shut one of them down, the remaining instances no longer hold all the data needed to complete queries.
You can get around this by using replication. E.g. use a single dgraphzero, and 3 dgraph instances. When you start dgraphzero, set --replicas=3. That way, each instance will serve the same set of predicates and you will be able to survive one of the nodes going down.
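To make the effect of the replica setting concrete, here is a minimal Python sketch of how nodes might end up grouped. It is a toy model, not Dgraph's actual code: `assign_groups` is a hypothetical helper, and it assumes nodes simply fill groups in join order until each group has `replicas` members.

```python
def assign_groups(num_nodes, replicas):
    """Toy model: place nodes into groups of size `replicas`, in join order.

    Assumption for illustration only; the real assignment logic may differ.
    """
    groups = {}
    for node in range(1, num_nodes + 1):
        # Nodes 1..replicas land in group 1, the next batch in group 2, etc.
        group_id = (node - 1) // replicas + 1
        groups.setdefault(group_id, []).append(node)
    return groups


# One replica per group: 3 nodes form 3 groups, so predicates are split 3 ways.
print(assign_groups(3, 1))  # {1: [1], 2: [2], 3: [3]}

# Three replicas: all 3 nodes share one group and hold the same predicates.
print(assign_groups(3, 3))  # {1: [1, 2, 3]}
```

In the first case losing any node loses a whole group's predicates; in the second, every predicate still has copies on the surviving nodes.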
Dgraph is ready to use in production, and great if you want scalability and performance.
Note that we are planning to make a new v0.9 release soon, which will have some big changes to the way clients interact with dgraph. Take a look over at Major changes in v0.9.
I'm also a little confused about the concepts of replication and groups.
1. Is data only replicated between nodes that belong to the same group, or can it replicate across groups?
2. How should I configure the nodes of a cluster to keep it working when some of the nodes fail, where the failed nodes are:
   - part of one group
   - all of one group
I have seen a passage on the official website that says:
Replication and Server Failure
Each group should typically be served by at least 3 servers, if available. In the case of a machine failure, other servers serving the same group can still handle the load in that case.
Each predicate will belong to exactly 1 group. Data replication is only between nodes in the same group. There is no cross-group replication.
E.g. you could have 4 nodes, split across 2 groups (2 nodes in each group). Replication would occur within each group, so that each edge in the graph is duplicated between 2 nodes.
The passage on the website is really talking about situations where you need HA (high availability). By setting up servers in groups of 3 (i.e. 3 per group), one server going down would leave 2 remaining servers. Those 2 remaining servers could likely still handle normal load.
If HA is critical for you, then you should run at least 3 dgraphzeros, and 3 dgraphs. If you are able to have more nodes, then you could run 3 dgraphzeros and 6 dgraphs split between 2 groups.
If all nodes in one group fail, then you will experience downtime until the nodes are brought back online.
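The two failure cases asked about (part of a group failing vs. all of a group failing) can be sketched as follows. This is only an illustrative model under stated assumptions: `group_status` and the node names are hypothetical, and the sketch deliberately ignores any quorum or leader-election requirements a real cluster may impose.

```python
def group_status(groups, failed):
    """Classify each group given the set of failed node names.

    Toy model only: a group is "down" when every node in it has failed,
    "degraded" when some have failed, and "healthy" otherwise.
    """
    status = {}
    for group_id, nodes in groups.items():
        alive = [n for n in nodes if n not in failed]
        if len(alive) == len(nodes):
            status[group_id] = "healthy"
        elif alive:
            status[group_id] = "degraded"
        else:
            status[group_id] = "down"
    return status


# 6 dgraph nodes split between 2 groups, as in the layout suggested above.
groups = {1: ["a", "b", "c"], 2: ["d", "e", "f"]}

# Part of one group fails: the group is degraded but its data is still held.
print(group_status(groups, {"a"}))  # {1: 'degraded', 2: 'healthy'}

# All of one group fails: that group's predicates are unavailable.
print(group_status(groups, {"d", "e", "f"}))  # {1: 'healthy', 2: 'down'}
```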
@daviddhc20120601, I haven’t been able to reproduce the leadership issue so I’ve closed as “can’t reproduce”. If you can reproduce the issue and have some logs, I can re-open the ticket and do some further investigation.