Dgraph Zero not managing index puts heavy burden on users

Experience Report for Feature Request

The current design around removing a Zero node from a group (docs) puts a significant burden on users: the operator or their automation has to manage state outside of Dgraph, such as which system (container or pod) is paired with which index.

What you wanted to do

Dgraph Zero would manage the idx state itself, so that users are not burdened with manual operator remediation or complex external automation.

What you actually did

Manual Remediation (actually happened)

In a scenario where we have zero0, zero1, and zero2, and zero1 has to be removed and replaced, the user has to do the following to remediate:

  • In Kubernetes in particular, edit the live deployed StatefulSet such that if idx==2, set idx=4. (This actually had to be done for one customer.)
  • On non-Kubernetes, something similar would have to take place.

Required Automation by Operator for Remediation (not-yet-invented)

For an automated solution, the user would have to build a mechanism outside of Dgraph and outside of the orchestration platform to maintain the state (e.g. in Consul or etcd), such that, given the initial state:

  • zero0=1,
  • zero1=2,
  • zero2=3,

and afterward, upon the removal and replacement of zero1, the state would update to:

  • zero0=1
  • zero1=4
  • zero2=3
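
To make the bookkeeping concrete, here is a minimal sketch of the state such an external mechanism would have to track. It is illustrative only: the `ZeroIndexRegistry` class and its methods are hypothetical names, not part of Dgraph, and it assumes (as the example above implies) that a removed idx cannot be reused, so a replacement always gets the next never-issued index. A real solution would also need to persist this in something like Consul or etcd rather than in memory.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ZeroIndexRegistry:
    """Hypothetical external registry pairing each Zero host with its idx."""

    assignments: Dict[str, int] = field(default_factory=dict)
    highest_issued: int = 0  # highest idx ever handed out, never decremented

    def register(self, host: str) -> int:
        """Give a host the next never-used idx and record the pairing."""
        self.highest_issued += 1
        self.assignments[host] = self.highest_issued
        return self.highest_issued

    def replace(self, host: str) -> int:
        """Retire the host's current idx and issue a fresh one."""
        if host not in self.assignments:
            raise KeyError(f"unknown zero host: {host}")
        # Assumption: the old idx is never reused after removal.
        return self.register(host)


registry = ZeroIndexRegistry()
for name in ("zero0", "zero1", "zero2"):
    registry.register(name)

new_idx = registry.replace("zero1")
print(registry.assignments)
```

Even this toy version shows the problem: the pairing of host to idx and the highest index ever issued both have to survive restarts and be kept consistent with what Zero itself believes, which is exactly the state Dgraph could own.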

Why that wasn’t great, with examples

Users should not need to build complex distributed state management solutions to supplement Dgraph, nor should they be required to apply custom hacks to their deployed infrastructure on top of their orchestration platform to help Zero.

Any external references to support your case

The customer was directly affected by this.
