Create and launch an AWS Network Load Balancer (either private or public) for the Alpha and Zero services instead of the Classic Load Balancer, using helm install with a customized values YAML file that sets the service type to LoadBalancer with the proper annotations.
What I did
Create an EKS cluster.
Helm install Dgraph on the EKS cluster using the Dgraph Helm chart, with a customized values.yaml file as below:
Dgraph runs well, and the Classic Load Balancer is launched as expected. The AWS CLI shows that the Classic Load Balancer belongs to the previous generation and suggests migrating to a Network Load Balancer. My question is whether Dgraph supports the AWS Network Load Balancer. If yes, what is the correct way to launch one that serves the Alpha and Zero services in the same way? If not, will the community consider supporting it in the future?
The default driver in a vanilla EKS cluster only supports the classic ELB. However, if you install the ALB ingress controller, now called aws-load-balancer-controller, you get an ingress controller that uses ALB and that also lets you use NLB when you deploy a Service object of type LoadBalancer.
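As a rough sketch of what the service settings might look like once aws-load-balancer-controller is installed: the snippet below assumes the Dgraph chart exposes `alpha.service.type` and `alpha.service.annotations` (and the same keys under `zero`), and that the installed controller version recognises the `external` load balancer type. The scheme and target type shown are illustrative choices, not values confirmed in this thread.

```yaml
# values.yaml (sketch; key layout assumed from the Dgraph Helm chart)
alpha:
  service:
    type: LoadBalancer
    annotations:
      # Hand the Service to aws-load-balancer-controller rather than the in-tree driver
      service.beta.kubernetes.io/aws-load-balancer-type: "external"
      # Register pod IPs as NLB targets; "instance" is the alternative
      service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
      # "internal" for a private NLB, "internet-facing" for a public one
      service.beta.kubernetes.io/aws-load-balancer-scheme: "internal"
zero:
  service:
    type: LoadBalancer
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "external"
      service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
      service.beta.kubernetes.io/aws-load-balancer-scheme: "internal"
```

If the external IP stays pending after applying something like this, the events on the Service (kubectl describe service) and the controller's own logs are usually the first places to look.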
I have successfully installed aws-load-balancer-controller on my EKS cluster and re-deployed the Dgraph Alpha and Zero services with type LoadBalancer as before. The difference is that no Classic Load Balancer is created for Alpha and Zero this time. Out of curiosity, I changed the annotations for Alpha and Zero to the code below and re-deployed Dgraph to see what would happen, but it seems nothing happened.
I’d expect an NLB to be created by this change. However, when I checked the services, the EXTERNAL-IP for the LoadBalancer services was always pending, and still no load balancer was created:
NAME                                   TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                         AGE
dgraph-release-dgraph-alpha            LoadBalancer   172.20.159.68   <pending>     8080:30657/TCP,9080:32060/TCP   32s
dgraph-release-dgraph-alpha-headless   ClusterIP      None            <none>        7080/TCP                        32s
dgraph-release-dgraph-zero             LoadBalancer   172.20.143.23   <pending>     5080:31144/TCP,6080:30307/TCP   32s
dgraph-release-dgraph-zero-headless    ClusterIP      None            <none>        5080/TCP                        32s
I wonder what I should do next to make at least Alpha accessible externally on port 8080 via the AWS ALB.
It’s a great article, thorough and practical. I’m most interested in managing EBS volume growth, because my project has billions of data records to store in Dgraph and the amount is continuously growing. Automatically scaling up the EBS volume attached to the node is critical.
The EBS driver supports this. One needs to enable it when creating the storage class. There’s even a snapshot feature, but I have not played with this, and so don’t know how easy or practical this is.
After expanding a disk, you can peek at the driver doing its magic with kubectl events in the same namespace.
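A minimal sketch of a storage class with expansion enabled, assuming the Amazon EBS CSI driver is installed. The class name and the `gp3` volume type are placeholders chosen for illustration.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3-expandable        # hypothetical name
provisioner: ebs.csi.aws.com      # Amazon EBS CSI driver
allowVolumeExpansion: true        # lets existing PVCs be resized in place
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
parameters:
  type: gp3                       # assumed volume type
```

With a class like this, a volume is grown by raising `spec.resources.requests.storage` on the PersistentVolumeClaim. Note that this is a manual resize; the storage class alone does not grow volumes automatically as they fill up.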
I had the Amazon EBS CSI Driver installed as one of my EKS cluster add-ons and also had a storage class specified. I didn’t notice the snapshot feature you mentioned, but I’ll search for it and play around with it later. kubectl events is a quick and easy way for developers to peek at the disk usage, but I’m looking forward to bringing in a more powerful monitoring system such as Prometheus to watch it continuously and alert me once a threshold is hit, which I think should be discussed in another topic later. Thanks very much for your guidance.
If you list the containers, you’ll see one of them is a snapshotter, but there’s no documentation about it from Amazon. There’s a Helm chart path that installed only the EBS driver without the snapshotter the last time I tried it. Its docs are here:
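For the threshold alert idea, here is a sketch of what a rule could look like, assuming the Prometheus Operator is running (for example via kube-prometheus-stack) and that kubelet volume metrics are being scraped. The rule name, namespace, `release` label, 20% threshold, and the `datadir-` PVC name prefix are all assumptions to adapt.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: dgraph-volume-usage          # hypothetical name
  namespace: default                 # assumed namespace
  labels:
    release: kube-prometheus-stack   # assumed; must match the operator's rule selector
spec:
  groups:
    - name: dgraph-storage
      rules:
        - alert: DgraphVolumeAlmostFull
          # Fraction of free space left on each matching PVC
          expr: |
            kubelet_volume_stats_available_bytes{persistentvolumeclaim=~"datadir-.*"}
              / kubelet_volume_stats_capacity_bytes{persistentvolumeclaim=~"datadir-.*"}
              < 0.20
          for: 10m                   # only fire if the condition persists
          labels:
            severity: warning
          annotations:
            summary: "PVC {{ $labels.persistentvolumeclaim }} has less than 20% free space"
```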
For Dgraph backups, I am able to get by with binary backups, but snapshotting would be nice for that extra layer of disaster recovery.
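A sketch of how a volume snapshot might be requested through the Kubernetes snapshot API, assuming the snapshot CRDs and snapshot controller are installed alongside the EBS CSI driver. The object names are hypothetical, and the PVC name is only a guess based on a release called `dgraph-release`.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-snapshots                # hypothetical name
driver: ebs.csi.aws.com
deletionPolicy: Delete               # remove the EBS snapshot when the object is deleted
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: dgraph-alpha-0-snapshot      # hypothetical name
spec:
  volumeSnapshotClassName: ebs-snapshots
  source:
    # Assumed PVC name; check the real one with: kubectl get pvc
    persistentVolumeClaimName: datadir-dgraph-release-dgraph-alpha-0
```

A snapshot taken this way captures the disk as it is at that moment, so it is probably best treated as a complement to application-level backups or exports rather than a replacement.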
Oh, Prometheus is something I wanted to explore further. I wrote up some material as contributions to the Dgraph open source project, but it is obviously dated by now and is something I want to refresh. Since then, I found a way to automatically install Grafana panels, but have yet to follow up on anything comprehensive. This could make a good blog + docs.
With service meshes, I wanted to avoid their canned solutions and instead use something compatible with Jaeger, general o11y needs, and the service mesh.
When getting into this level of depth, I will have to switch to a template system to customize the Helm config values. For that, it will probably be either helmfile or Terraform, or both with a helmfile provider. Helmfile lets you integrate kustomize as well for any ad-hoc patching needed.
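As an illustration of the helmfile approach, a minimal helmfile.yaml might look like the sketch below. The release name and namespace are assumptions, and `values.yaml` stands for whatever customized values file is already in use.

```yaml
# helmfile.yaml (sketch)
repositories:
  - name: dgraph
    url: https://charts.dgraph.io

releases:
  - name: dgraph-release        # assumed release name
    namespace: default          # assumed namespace
    chart: dgraph/dgraph
    values:
      - values.yaml             # existing customized values file
```

Running `helmfile apply` against a file like this installs or upgrades the release, which keeps the chart source and values file together in version control.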
I’m new to Prometheus, so I’m really looking forward to an article sharing your exploration of it and your insights. Initially I tried eksctl but eventually decided to switch to Terraform, because Terraform logs the changes each time you modify, add, or delete anything in the infrastructure, which is very helpful for integrating or decoupling microservices during both development and release. Nowadays, many tools provide plugins for using other tools. For example, Terraform provides a plugin (you can see it as a wrapper) to manage Helm charts, so we don’t need to set up all the infrastructure with Terraform code first and then switch to Helm to install, upgrade, or uninstall an application or service; Terraform handles all of those operations. I haven’t used helmfile before, but I believe many other tools are adopting the same idea as Terraform. It is a double-edged sword, since it also makes it hard for developers to choose at the beginning. Only after trying them all can we find the one that best satisfies our own needs.
BTW, I really want to upgrade to the latest version [v23.0.1]; however, the issue "How to update lambda functions for Dgraph v23.0.1" stops me from moving forward. I don’t know who on the team is taking care of Lambda, but I hope it’s noted.
Lastly, although I have not played with these features yet, helmfile supports the helm-secrets plugin, so you can keep secrets encrypted, for example with AWS Secrets Manager. You can use Terraform to randomly generate secrets and store them in AWS Secrets Manager, then pull them down from the same source when deploying Kubernetes apps.
For keeping things encrypted at rest inside Kubernetes, there is either the KMS driver for Kubernetes secrets or the External Secrets Operator. I have yet to play with either of these.
Hey Joaquin, I’m currently exploring data backup solutions. Binary backups seem to be an enterprise feature that is not available for self-hosted community Dgraph… Are there any other options at the moment, besides Kubernetes volume snapshots, that are open to community users?
As for Prometheus, I installed kube-prometheus-stack and figured out how to use the resource you provided, Dgraph Prometheus Contrib (outdated): dgraph/contrib/config/monitoring/prometheus at main · dgraph-io/dgraph · GitHub, to launch the dashboard in Grafana for my Dgraph cluster. Though outdated, much of the content is still useful. Thanks to that, I’m able to monitor some of the major activities.