What happened to the server?
Is this normal for a server's log?
All servers in the cluster have this kind of log.
And most importantly, the servers are pretty slow.
I guess something happened. I need your help.
@mrjn
It was this issue: "During replication server consumes all RAM and crashes" (Issue #2424, dgraph-io/dgraph on GitHub).
It got fixed in master and will be in v1.0.6.
One of the servers cannot start again after being stopped. It seems the commit is out of range. How can I fix it so that it starts?
By the way, when will v1.0.6 be released?
Aiming for it this week. Maybe run Dgraph with the --replicas=1 flag until then.
I set up three servers in the cluster. With --replicas=1, there are too many log lines like: xx predicate is removed xxxx.
You might have to clear out both Zero and Server directories, so the old state is gone.
That means I need to clear both the Zero and Server directories in the cluster, and all data would be gone.
That's not acceptable.
Actually, you could remove nodes from Zero by using the removeNode endpoint. You can achieve this without removing the directories.
--replicas is the option that controls the replication factor.
If --replicas=1, then each server node will serve a different group.
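To make the effect of --replicas concrete, here is a minimal Python sketch of how servers might be grouped by replication factor. It is a simplified model, not Dgraph's code: the function name assign_groups is hypothetical, and it assumes servers fill one group completely, in join order, before the next group is opened.

```python
def assign_groups(num_servers, replicas):
    """Map each server (numbered from 1 in join order) to a group id.

    Simplified model (assumption): a group is filled with `replicas`
    servers before the next group is opened.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return {server: (server - 1) // replicas + 1
            for server in range(1, num_servers + 1)}


if __name__ == "__main__":
    # Three servers, --replicas=1: every server ends up in its own group.
    print(assign_groups(3, 1))
    # Three servers, --replicas=3: all three servers replicate one group.
    print(assign_groups(3, 3))
```

Under this model, three servers with --replicas=1 form three groups holding different data, while --replicas=3 gives a single group with three copies of the same data.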
If the replica goes down and cannot be recovered, you can clear out both the Zero and Server directories to remove the old state and add a new node to the quorum.
The removeNode endpoint can be used to remove a Zero or Dgraph server node.
For example, start the Zero servers:
./dgraph zero --replicas 3 --idx 1
# Stop the leader first. Assuming the node with idx 1 is the leader:
curl "localhost:6081/removeNode?group=0&id=1"
# Restart the node with a peer:
./dgraph zero --replicas 3 --idx 1 --peer localhost:5081
Then start a Dgraph server pointing to the restarted node.
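If you want to script the removeNode call instead of typing the curl command, a small helper can build the URL. This is only a sketch: remove_node_url is a hypothetical helper name, and the address, group and id values are the ones from the curl example above, not values checked against your cluster.

```python
from urllib.parse import urlencode


def remove_node_url(zero_http_addr, group, node_id):
    """Build the removeNode URL for a Zero HTTP address.

    The curl example in this thread uses group=0 when removing a Zero node.
    """
    query = urlencode({"group": group, "id": node_id})
    return f"http://{zero_http_addr}/removeNode?{query}"


if __name__ == "__main__":
    # Same request as the curl command in the thread; this only prints
    # the URL and does not contact any server.
    print(remove_node_url("localhost:6081", 0, 1))
```

The helper only builds the string. Sending it (for example with urllib.request.urlopen) is left out on purpose, so nothing is removed by accident while experimenting.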
For the new one, you should not use the same idx as that of a node that was removed.
How should I understand this sentence? Can you give me more information about it?
Predicate is being moved: this is to make sure that all the servers have equal data stored in them.
You can control this with the following flag:
--rebalance_interval duration Interval for trying a predicate move. (default 8m0s)
You can change the duration to some xx hours to avoid predicate moves.
You can check this for more information: https://docs.dgraph.io/deploy#more-about-dgraph-zero
--replicas=1 then each server node will serve different group.
But the bulk loader took several hours, and I don't know how long to set rebalance_interval. Does this matter?
And about: --replicas=1 then each server node will serve different group.
First: I don't understand why, if one server dies, the Dgraph cluster cannot serve, if all the servers have equal data stored in them.
Second: if all the servers have equal data stored in them, why does the server need to move rather than just copy the predicate? What's the difference?
How should I understand the operation "Predicate is moved"?
What really happened to the servers?
Many thanks
If replicas=1, then there's only one copy of each piece of data. And data is sharded across groups.
So, if a server dies, then that data is not being serviced at all. So, queries which need those predicates won’t be able to function.
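A small Python sketch may help picture this. It is a toy model with made-up predicate and server names, not Dgraph's implementation, and it deliberately ignores quorum rules: a group is treated as lost only when every one of its servers is down.

```python
def unavailable_predicates(shards, groups, dead_servers):
    """Return the predicates whose group has no live server left.

    shards: predicate name -> group id
    groups: group id -> set of server names in that group
    Toy model (assumption): quorum requirements are ignored.
    """
    dead = set(dead_servers)
    lost_groups = {gid for gid, members in groups.items()
                   if not (set(members) - dead)}
    return sorted(p for p, gid in shards.items() if gid in lost_groups)


if __name__ == "__main__":
    # replicas=1: three groups, one server each, each holding different data.
    shards = {"name": 1, "friend": 2, "age": 3}
    groups = {1: {"s1"}, 2: {"s2"}, 3: {"s3"}}
    print(unavailable_predicates(shards, groups, ["s2"]))

    # replicas=3: one group, three copies of everything.
    shards = {"name": 1, "friend": 1, "age": 1}
    groups = {1: {"s1", "s2", "s3"}}
    print(unavailable_predicates(shards, groups, ["s2"]))
```

In the first case the predicate held by the dead server's group cannot be served; in the second, the remaining copies still cover every predicate.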
The reason move is happening is that the servers don’t have equal data stored in them. Zero is trying to adjust the data between groups so that each group would contain roughly the same amount of data, by moving some predicates from one group to another.
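To illustrate why this is a move and not a copy, here is a rough Python sketch of one rebalancing attempt, the kind of step that would be tried once per --rebalance_interval. The selection rule (move the predicate that best evens out the largest and smallest group) is an assumption made for illustration, not Dgraph's actual algorithm, and the predicate names and sizes are invented.

```python
def rebalance_step(groups):
    """Move one predicate from the largest group to the smallest one.

    groups: group id -> {predicate name: size in bytes}
    Returns (predicate, source, destination), or None if no move helps.
    """
    def total(gid):
        return sum(groups[gid].values())

    src = max(groups, key=total)
    dst = min(groups, key=total)
    gap = total(src) - total(dst)
    # Only predicates smaller than the gap reduce the imbalance.
    candidates = {p: s for p, s in groups[src].items() if 0 < s < gap}
    if not candidates:
        return None
    # Pick the predicate that brings the two groups closest together.
    pred = min(candidates, key=lambda p: abs(gap - 2 * candidates[p]))
    # A move, not a copy: the predicate leaves the source group.
    groups[dst][pred] = groups[src].pop(pred)
    return pred, src, dst


if __name__ == "__main__":
    groups = {1: {"name": 50, "friend": 30, "age": 20}, 2: {"city": 10}, 3: {}}
    while (move := rebalance_step(groups)) is not None:
        print("moved", move)
    print(groups)
```

Each predicate still lives in exactly one group after the step, which is why the logs talk about a predicate being moved and then removed from the old group.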