Dgraph Cloud Bring Your Own K8s (BYOK) - Setup and Access Requirements

Dgraph Cloud can be hosted in your own private Kubernetes (K8s) cluster (BYOK = Bring Your Own K8s). You may launch and maintain databases using the web portal at https://cloud.dgraph.io, which will create and update resources in your Kubernetes cluster.

This model lets us launch Dgraph instances in a cloud VPC that you own and operate, or even in an on-premises Kubernetes cluster.

What are the minimum requirements?

Dgraph Cloud can run in any Kubernetes cluster that is capable of the following:

  • StatefulSet support bound to persistent volumes
  • Kubernetes LoadBalancer resource
  • Dedicated Kubernetes worker nodes to run Dgraph instances (recommended)
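
One way to check the first two capabilities on a candidate cluster is to apply a throwaway StatefulSet with a volume claim plus a LoadBalancer Service and confirm that both become ready. The sketch below only builds the manifests as JSON, which kubectl accepts. The resource names, the busybox image, the ports, and the storage size are placeholders chosen for illustration; none of this is part of Dgraph Cloud itself.

```python
import json


def smoke_test_manifests(name="byok-smoke-test", storage="1Gi"):
    """Build a StatefulSet with a persistent volume claim and a LoadBalancer Service.

    All names and sizes are hypothetical; adjust them for your cluster.
    """
    labels = {"app": name}
    stateful_set = {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "probe",
                        "image": "busybox",  # placeholder image, not Dgraph
                        "command": ["sleep", "3600"],
                        "volumeMounts": [{"name": "data", "mountPath": "/data"}],
                    }]
                },
            },
            # The claim template is what exercises persistent volume provisioning.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name + "-lb"},
        "spec": {
            "type": "LoadBalancer",
            "selector": labels,
            # Nothing listens on 8080 in the probe pod; the point is only to
            # see whether the cluster assigns an external address.
            "ports": [{"port": 80, "targetPort": 8080}],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [stateful_set, service]}


if __name__ == "__main__":
    print(json.dumps(smoke_test_manifests(), indent=2))
```

The output can be piped to `kubectl apply -f -` in a scratch namespace and deleted once the volume claim is bound and the Service has an external address.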

We support Kubernetes v1.17 and later, including the managed Kubernetes offerings from all major cloud providers (such as Amazon EKS, Google GKE, and Azure AKS).
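
The version requirement can be checked programmatically. The following is a minimal sketch that compares a server version string, as reported by `kubectl version`, against the v1.17 minimum. The function name is ours, and the provider-suffixed version strings in the usage note are illustrative examples only.

```python
import re

MINIMUM = (1, 17)  # minimum supported Kubernetes (major, minor)


def meets_minimum(git_version, minimum=MINIMUM):
    """Return True if a version string such as 'v1.21.5-gke.1302' is >= minimum."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version)
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {git_version!r}")
    # Compare as integer tuples so that 1.9 sorts below 1.17.
    return (int(match.group(1)), int(match.group(2))) >= minimum
```

For example, `meets_minimum("v1.16.15")` is false, while `meets_minimum("v1.21.5-gke.1302")` is true because the provider suffix is ignored.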

Further, the Kubernetes cluster must allow certain HTTPS requests into and out of the system for provisioning and maintaining databases. The Kubernetes control plane must also be accessible from Terraform Cloud.

Architecture

A Dgraph Cloud cluster runs multiple Dgraph databases and the management components in a single Kubernetes cluster. Below are some of the relevant management components:

  • Slash Ingress - We have built a custom ingress controller on top of Caddy. This component is responsible for verifying API keys and routing requests to the correct database. This lives in the ‘cluster-ingress’ namespace.
  • Cluster Manager - Cluster Manager is responsible for launching and upgrading the various databases. This lives in the ‘cluster-manager’ namespace.
  • Cluster Usage - Cluster Usage records the usage and scale of various databases. This is primarily for billing purposes.
  • Proxy - Proxy is responsible for taking timely backups and facilitating a few administrative tasks.
  • Various monitoring services such as Prometheus client and Velero, which exist in the ‘monitoring’ namespace.

What data enters and leaves the network?

Under the BYOK model, all data and backups remain within your VPC.

Our control network makes HTTPS requests to Cluster Manager in order to perform the following operations, each at the customer’s request and with the customer’s acknowledgment:

  • Launch, scale up, and destroy databases
  • Upgrade Dgraph versions and update other images

The Kubernetes cluster must also be accessible to Terraform Cloud, which is used to update images of the cluster manager and other parts of the management services.

We also make the following API requests from Slash Ingress:

  • Verifying API Keys (which are also cached locally)
  • Updating usage metrics for billing purposes
  • Exporting monitoring metrics and logs for debugging purposes and for display in the web interface
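
To illustrate why locally cached API key verification reduces outbound traffic, here is a small time-to-live cache wrapped around a remote check. This is a sketch of the general technique, not the Slash Ingress implementation: the class name, the `verify_remote` callback, and the five-minute default lifetime are all assumptions.

```python
import hashlib
import time


class CachedKeyVerifier:
    """Cache the result of a remote API key check for a limited time."""

    def __init__(self, verify_remote, ttl_seconds=300.0, clock=time.monotonic):
        self._verify_remote = verify_remote  # hypothetical callable: key -> bool
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # key digest -> (is_valid, expires_at)

    def is_valid(self, api_key):
        # Store a digest rather than the raw key in the cache.
        digest = hashlib.sha256(api_key.encode("utf-8")).hexdigest()
        now = self._clock()
        entry = self._cache.get(digest)
        if entry is not None and entry[1] > now:
            return entry[0]  # fresh cache hit, no outbound request
        result = bool(self._verify_remote(api_key))
        self._cache[digest] = (result, now + self._ttl)
        return result
```

With a cache like this, only the first request for a given key within each lifetime window leaves the network. The trade-off is that a revoked key may continue to be accepted until its cache entry expires.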

What access is required?

Dgraph will only require access at a Kubernetes level, and will not require access to the machines themselves.

Below is a brief description of the ways in which access is required. We have also provided a Kubernetes config that can create the appropriate roles.

  • The Dgraph SRE and support team will need access to the cluster for maintenance and debugging. This access may be secured via a jump box, a VPN, or any other system.
  • Dgraph Cloud manager will require access to update the management components.
  • Dgraph Cluster manager will need access to manage namespaces, StatefulSets, secrets, and other resources required to manage the databases.
  • Read and write access to a storage bucket for storing backups.
  • Various components will need read-only access for monitoring, alerting and billing.
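
As an illustration of what read-only access for monitoring might look like at the Kubernetes level, the sketch below builds a ClusterRole limited to the `get`, `list`, and `watch` verbs. It is not the config that Dgraph provides; the role name and the selection of resources are assumptions made for this example.

```python
import json

READ_ONLY_VERBS = ["get", "list", "watch"]


def read_only_cluster_role(name="dgraph-monitoring-readonly"):
    """Build a ClusterRole granting read-only access (hypothetical name and scope)."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": [
            {
                "apiGroups": [""],  # core API group
                "resources": ["pods", "pods/log", "services",
                              "namespaces", "persistentvolumeclaims"],
                "verbs": list(READ_ONLY_VERBS),
            },
            {
                "apiGroups": ["apps"],
                "resources": ["statefulsets", "deployments"],
                "verbs": list(READ_ONLY_VERBS),
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(read_only_cluster_role(), indent=2))
```

Secrets are deliberately left out of this example role, since read access to secrets is rarely needed for monitoring, alerting, or billing.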

Backend management by the Dgraph team includes:

  • Access to the Dgraph instance logs.
  • Access to Dgraph debug and management endpoints, including metrics, profiling, distributed tracing, and cluster information.
  • Ability to take database backups/restores to/from your backup object storage (e.g., your AWS S3 bucket).
  • Ability to perform instance patches and version upgrades.
  • Ability to run queries only when needed for resolving support tickets. Running a query would require an authenticated JWT provided by you, or one generated on your behalf with your permission. The creation of JWTs and all queries (including those run by the Dgraph team) are recorded in audit logs.

Dgraph version upgrades

Major Dgraph version upgrades will be done by Dgraph Labs at a mutually agreed upon time.

Which components are Dgraph Labs responsible for?

Dgraph Labs will be responsible for maintaining all software related to the management components, and will also have a shared responsibility for maintaining the database health.

Dgraph Labs will not be responsible for the management of the Kubernetes cluster itself, including the following:

  • Load Balancer for data ingress
  • Kubernetes control plane
  • Node availability and sufficient capacity
  • Network failures
  • Hardware failures
  • Disk failures
  • Failures in long-term block storage

Security features

Dgraph Cloud includes encryption at rest, encrypted backups, access control lists, and audit logs to ensure the security and integrity of your data.

Encryption at rest (transparent data encryption) encrypts the database files on disk using your own encryption key. Dgraph currently supports HashiCorp Vault for storing encryption keys.

Encrypted backups provide full and incremental encrypted backups of data stored in Dgraph directly to your cloud storage (such as AWS S3). Backups are complete cluster-wide snapshots of your data that can be restored on top of your existing cluster or restored to create a separate clone of your backend.

Access control lists let you require users to log in before they can access Dgraph. You can set further access control at the predicate level, specifying which groups can query data, mutate data, or alter the schema.
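
The predicate-level model can be pictured as a per-group bitmask check. The sketch below assumes the UNIX-style convention described in Dgraph's ACL documentation (4 for read, 2 for write, 1 for modify) and assumes that a user's effective permission is the union of the permissions of their groups. The function, data layout, and group names are hypothetical and are not Dgraph's internal implementation.

```python
READ, WRITE, MODIFY = 4, 2, 1  # query data, mutate data, alter schema


def allowed(group_rules, groups, predicate, needed):
    """Check whether any combination of the user's groups grants `needed`.

    group_rules maps group name -> {predicate: permission bits}.
    """
    granted = 0
    for group in groups:
        # Union the bits granted by each group the user belongs to.
        granted |= group_rules.get(group, {}).get(predicate, 0)
    return (granted & needed) == needed


rules = {
    "dev": {"name": READ | WRITE},
    "ops": {"name": MODIFY},
}
```

Under these example rules, a user in only the `dev` group can query and mutate the `name` predicate but cannot alter its schema, while a user in both groups can do all three.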

Audit logs give you visibility into who logged in, when, and what data they accessed. They provide a trail of the activity in your database.
