Hi. I have set up Dgraph on Kubernetes (GKE 1.18.6-gke.4801) using the Helm chart. The deployment had no issues: all pods and services are running and healthy, and there are no errors in the logs.
But when I port-forward the alpha service and try to update the schema or drop all data by hitting either the /alter or /admin/schema endpoint, the results are random (since the K8s service is load balancing across different alpha pods underneath).
For example, hitting /admin/schema three times without changing the payload or URL, I get different results:
Dropping data using /alter four times without changing the payload or URL, I also get different results:
Is this a bug, or am I doing something wrong? I would expect a consistent result even if the service is load balancing between pods.
Also, I am not sure what messages like "Only leader can decide to commit or abort" mean in this context, since I am just hitting the service and the service decides which pod to hit.
I also get this in the logs: "Error while retrieving timestamps: rpc error: code = Unknown desc = Assigning IDs is only allowed on leader. with delay: 10ms. Will retry..."
It then recovers on its own.
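For anyone trying to reproduce the inconsistent responses described above, here is a minimal sketch that repeats the same request and tallies the distinct response bodies. It is not part of the original setup: the helper names `tally_responses` and `post_json`, and the address `http://localhost:8080` (assumed to be where the alpha HTTP port is forwarded), are illustrative assumptions. Note that a drop-all request to /alter is destructive, so only run it against a cluster where losing data is acceptable.

```python
import collections
import urllib.error
import urllib.request


def tally_responses(fetch, attempts):
    """Call fetch() `attempts` times and count each distinct response body."""
    counts = collections.Counter()
    for _ in range(attempts):
        counts[fetch()] += 1
    return counts


def post_json(url, payload):
    """POST raw bytes to `url` and return the response body as text.

    HTTP error responses are returned as text too, so they get tallied
    instead of aborting the run.
    """
    req = urllib.request.Request(url, data=payload, method="POST")
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.read().decode("utf-8", "replace")


if __name__ == "__main__":
    # Assumption: the alpha HTTP port is forwarded to localhost:8080.
    # WARNING: {"drop_all": true} deletes all data and the schema.
    url = "http://localhost:8080/alter"
    payload = b'{"drop_all": true}'
    for body, count in tally_responses(lambda: post_json(url, payload), 4).items():
        print(count, body)
```

If the tally shows more than one distinct body for identical requests, running the same script against each alpha pod individually (port-forwarding to one pod at a time) may help show whether the difference follows the pod that handled the request.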
I also noticed these issues which might be because of this as well:
I don’t know if this info is relevant, but I am running Dgraph on a cluster with Linkerd as the service mesh and a restricted PSP setup.
This is how my values file looks:
image: &image
registry: docker.io
repository: dgraph/dgraph
tag: v20.07.1
pullPolicy: IfNotPresent
# pullSecrets:
# - myRegistryKeySecretName
debug: false
name: zero
enabled: true
monitorLabel: zero-dgraph-io
## StatefulSet controller supports automated updates. There are two valid update strategies: RollingUpdate and OnDelete
## ref: https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#updating-statefulsets
updateStrategy: RollingUpdate
## Partition update strategy
## https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#partitions
# rollingUpdatePartition:
## StatefulSet controller supports relax its ordering guarantees while preserving its uniqueness and identity guarantees. There are two valid pod management policies: OrderedReady and Parallel
## ref: https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#pod-management-policy
podManagementPolicy: OrderedReady
## Number of dgraph zero pods
replicaCount: 3
## Max number of replicas per data shard.
## i.e., the max number of Dgraph Alpha instances per group (shard).
shardReplicaCount: 5
## zero server pod termination grace period
terminationGracePeriodSeconds: 60
## Hard means that by default pods will only be scheduled if there are enough nodes for them
## and that they will never end up on the same node. Setting this to soft will do this "best effort"
antiAffinity: soft
## By default this will make sure two pods don't end up on the same node
## Changing this to a region would allow you to spread pods across regions
podAntiAffinitytopologyKey: "kubernetes.io/hostname"
## This is the node affinity settings as defined in
## https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity-beta-feature
nodeAffinity: {}
## Extra environment variables which will be appended to the env: definition for the container.
extraEnvs: []
## Configuration file for dgraph zero used as an alternative to command-line options
## Ref: https://dgraph.io/docs/deploy/#config
config.toml: |
whitelist = ',,'
lru_mb = 2048
## Kubernetes configuration
## For minikube, set this to NodePort, elsewhere use LoadBalancer
type: ClusterIP
annotations: {}
## StatefulSet pods will need to have addresses published in order to
## communicate to each other in order to enter a ready state.
publishNotReadyAddresses: true
## dgraph Pod Security Context
enabled: true
fsGroup: 1001
runAsUser: 1001
enabled: true
storageClass: "csi-cephfs"
- ReadWriteMany
size: 32Gi
## Node labels and tolerations for pod assignment
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector
## ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
nodeSelector: {}
tolerations: []
## Configure resource requests
## ref: http://kubernetes.io/docs/user-guide/compute-resources/
memory: 100Mi
## Custom liveness and readiness probes
customStartupProbe: {}
customLivenessProbe: {}
customReadinessProbe: {}
name: alpha
enabled: true
monitorLabel: alpha-dgraph-io
updateStrategy: RollingUpdate
podManagementPolicy: OrderedReady
## Number of dgraph nodes
replicaCount: 3
## zero server pod termination grace period
terminationGracePeriodSeconds: 600
## Hard means that by default pods will only be scheduled if there are enough nodes for them
## and that they will never end up on the same node. Setting this to soft will do this "best effort"
antiAffinity: soft
## By default this will make sure two pods don't end up on the same node
## Changing this to a region would allow you to spread pods across regions
podAntiAffinitytopologyKey: "kubernetes.io/hostname"
## This is the node affinity settings as defined in
## https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity-beta-feature
nodeAffinity: {}
## Extra environment variables which will be appended to the env: definition for the container.
extraEnvs: []
config.toml: |
whitelist = ',,'
lru_mb = 2048
## Kubernetes configuration
## For minikube, set this to NodePort, elsewhere use LoadBalancer
type: ClusterIP
annotations: {}
## StatefulSet pods will need to have addresses published in order to
## communicate to each other in order to enter a ready state.
publishNotReadyAddresses: true
## alpha ingress resource configuration
## This requires an ingress controller to be installed into your k8s cluster
enabled: false
# hostname: ""
# annotations: {}
# tls: {}
## dgraph Pod Security Context
enabled: true
fsGroup: 1001
runAsUser: 1001
enabled: false
files: {}
enabled: false
enabled: false
enabled: true
storageClass: "csi-cephfs"
- ReadWriteMany
size: 100Gi
annotations: {}
## Custom liveness and readiness probes
customStartupProbe: {}
customLivenessProbe: {}
customReadinessProbe: {}
name: ratel
## Enable Ratel service
enabled: true
## Number of dgraph nodes
replicaCount: 1
# Extra environment variables which will be appended to the env: definition for the container.
extraEnvs: []
## Kubernetes configuration
## For minikube, set this to NodePort, elsewhere use ClusterIP or LoadBalancer
type: ClusterIP
annotations: {}
## ratel ingress resource configuration
## This requires an ingress controller to be installed into your k8s cluster
enabled: false
## dgraph Pod Security Context
enabled: true
fsGroup: 1001
runAsUser: 1001
## Configure resource requests
## ref: http://kubernetes.io/docs/user-guide/compute-resources/
## resources:
## requests:
## memory: 256Mi
## cpu: 250m
## Configure extra options for liveness and readiness probes
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#configure-probes)
## Custom liveness and readiness probes
customLivenessProbe: {}
customReadinessProbe: {}
## Combined ingress resource for alpha and ratel services
## This will override existing ingress configurations under alpha and ratel
## This requires an ingress controller to be installed into your k8s cluster
enabled: false
annotations: {}
tls: {}
ratel_hostname: ""
alpha_hostname: ""