Hi there,
So we have a situation where the health check on our Dgraph cluster (running on GKE) gives:
"ongoing": [
"opPredMove"
]
I’m wondering what exactly this means and how long the cluster will stay in this state. Also, would this affect Dgraph operations? We have been facing several errors while writing data to Dgraph.
Since we are facing issues in production, any suggestion to fix this ASAP would be really helpful. Thanks!
Dgraph Zero tries to rebalance the cluster based on the disk usage in each group. If Zero detects an imbalance, it will try to move a predicate along with its indices to a group that has lower disk usage. This can make the predicate temporarily read-only. Queries for the predicate will still be serviced, but any mutations for the predicate will be rejected and should be retried after the move is finished.
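Since rejected mutations are meant to be retried once the move completes, a client-side retry with exponential backoff is one way to handle this. The following is only a minimal sketch: `PredicateMoveInProgress` and `retry_mutation` are hypothetical names, not part of any Dgraph client, and you would need to map whatever error your client actually raises for a rejected mutation onto that exception type.

```python
import random
import time


class PredicateMoveInProgress(Exception):
    """Hypothetical error: raise this when a mutation is rejected
    because its predicate is currently being moved."""


def retry_mutation(mutate, max_attempts=5, base_delay=0.5,
                   max_delay=30.0, sleep=time.sleep):
    """Call mutate() until it succeeds or max_attempts is reached.

    mutate is any zero-argument callable that performs the write and
    raises PredicateMoveInProgress when the write is rejected.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return mutate()
        except PredicateMoveInProgress:
            if attempt == max_attempts:
                # Out of attempts: surface the error to the caller.
                raise
            # Exponential backoff, capped at max_delay, plus a little
            # jitter so many writers do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 10))
```

A caller would wrap its own write in a function and pass it in, for example `retry_mutation(lambda: write_batch(batch))`, where `write_batch` is your own (here hypothetical) function. A move of a large predicate may take longer than a few retries, so the attempt count and delays are values to tune, not recommendations.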
Zero continuously tries to keep the amount of data on each server even, typically running this check every 10 minutes. Thus, each additional Dgraph Alpha instance allows Zero to further split the predicates from groups and move them to the new node.
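To tell whether a move is still in progress, you can look for `opPredMove` in the `ongoing` list of the health check output. The sketch below only parses a JSON string that you have already fetched. It assumes the payload is either a single object or a list of objects, each optionally carrying an `ongoing` list as in the snippet from the question; the function names are made up for illustration.

```python
import json


def ongoing_operations(health_json):
    """Collect every operation name listed under "ongoing".

    Assumes the payload is one JSON object or a list of objects,
    each of which may have an "ongoing" list of strings.
    """
    payload = json.loads(health_json)
    entries = payload if isinstance(payload, list) else [payload]
    ops = set()
    for entry in entries:
        # A missing or null "ongoing" field means nothing is running.
        ops.update(entry.get("ongoing") or [])
    return ops


def predicate_move_in_progress(health_json):
    """Return True if any entry reports an ongoing predicate move."""
    return "opPredMove" in ongoing_operations(health_json)
```

For example, `predicate_move_in_progress('{"ongoing": ["opPredMove"]}')` returns `True`, so a writer could pause or slow down until the check returns `False` again.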
Is there a way to set this to a different frequency, or to stop predicate moves altogether and make it a manual process?