[Devops / K8s / Docs] Production checklist

diggy · December 8, 2019, 7:16am

Moved from GitHub dgraph/4379

Posted by hackintoshrao:

This requirement came up while writing the tutorial and blog on running Dgraph on K8s.

We need a document containing a production checklist, with suggestions for running Dgraph in production:

Recommended hardware for alphas and zeros. This helps us suggest the resources to allocate for running an alpha or a zero in minimal mode and for production workloads.
Recommended topology, with recommendations for the number of alphas and zeros to run: running 3 alphas vs. 5, running 3 zeros vs. 1, and why one should run an odd number of alphas or zeros.
Recommended drives for VMs and K8s.
Recommendations on scheduling the database processes. This answers questions like: should I run an alpha or a zero on dedicated nodes?
Optimizing for resiliency: running more instances with modest hardware resources vs. running fewer instances with more resources.
Security best practices: TLS for clients, secure cluster-to-cluster communication.
Query and mutation best practices. For instance, running a has() query is costly. What performance implications does one need to be aware of when running different queries and mutations?
Configuring the database cache: how much should be set?
Recommended file system for production.
Monitoring and alerting.
When does the throughput saturate? This helps decide when to scale up vertically with more hardware resources (more CPUs, memory) vs. adding a new node.
Load balancing practices across the various alphas, including using the readiness endpoint so that requests are not sent to clusters that are unhealthy.
Running clusters across zones.
Client best practices. Is there something you need to be aware of when using the clients, like connection pooling?
Setting the open file descriptors limit to a reasonable baseline.
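
On the odd-number question in the list above: Dgraph zeros and alpha groups replicate through Raft, which needs a majority of the group's members to agree before it can make progress. A minimal sketch of that arithmetic follows. The function names are illustrative only and are not part of Dgraph or any client library.

```python
def quorum(members: int) -> int:
    """Smallest majority of a Raft group with `members` replicas."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Replicas that can be lost while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    # Compare group sizes 1 through 5.
    for n in range(1, 6):
        print(f"members={n} quorum={quorum(n)} tolerates={tolerated_failures(n)}")
```

Going from 3 to 4 members raises the quorum from 2 to 3 but still tolerates only one failure, so an even-sized group adds a replica without adding fault tolerance. That is the usual argument for running 3 or 5 rather than 4.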

diggy · January 7, 2020, 5:55pm

hackintoshrao commented :

Hey @danielmai,

Did you publish the production checklist that you recently prepared?
Are there any plans to add it to the docs?

diggy · April 10, 2020, 8:01am

Sceat commented :

Very interested in this one, just started using Dgraph on Kubernetes with 1 zero and 1 alpha so far!

diggy · April 10, 2020, 1:08pm

dmitryyankowski commented :

Personally, I’m going to wait until @slotlocker2 comes out with a GKE Terraform config for Dgraph! (Terraform modules for Kubernetes - AWS EKS by slotlocker2 · Pull Request #5092 · dgraph-io/dgraph · GitHub) I’d like to have a production-quality example to go off of, one that even includes things like TLS.
