Setting up CDC with AWS MSK in Dgraph Cloud

This document covers the steps to set up AWS MSK in your own AWS account to receive change data capture (CDC) events from your dedicated Dgraph Cloud backend.

This document is a tailored and abridged version of the AWS Big Data Blog post "How Goldman Sachs builds cross-account connectivity to their Amazon MSK clusters with AWS PrivateLink".

MSK setup in your AWS account

Because MSK runs in your own AWS account, you perform the steps in this section yourself.

High-level steps

These steps set up MSK and Endpoints in your AWS account:

  1. Set up MSK
  2. Set up NLB with VPC Endpoint Service
  3. Share details with Dgraph
  4. Accept PrivateLink connection request from Dgraph Cloud

Set up MSK

  1. AWS PrivateLink requires the Service Consumer and Service Provider VPCs to be in the same region.
  2. Set up MSK in the same AWS region as your Dgraph Cloud backend, with SASL/SCRAM authentication enabled through AWS Secrets Manager. To work with MSK, AWS documentation requires that the secret name start with ‘AmazonMSK_’.
    For this document, we assume the username and password are:
    {
      "username": "kafkauser",
      "password": "kafka-secret-password"
    }
    
  3. Create a topic named dgraph-cdc. Dgraph CDC will push to this topic.
  4. In the MSK security group, allow traffic from the entire VPC CIDR range. This allows any NLB IP address to reach MSK.
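
The secret naming rule and payload shape from step 2 can be checked before the secret is created. The following is a minimal Python sketch, not part of any Dgraph or AWS tooling: the helper name build_msk_secret and the secret name AmazonMSK_dgraph_cdc are hypothetical, and the function only validates and serializes values without calling AWS.

```python
import json
from typing import Tuple

# Prefix that the secret name must carry to be usable with MSK SASL/SCRAM.
MSK_SECRET_PREFIX = "AmazonMSK_"


def build_msk_secret(name: str, username: str, password: str) -> Tuple[str, str]:
    """Return (secret_name, secret_string) for an MSK SASL/SCRAM secret.

    Hypothetical helper: validates the name prefix and serializes the
    credentials as JSON. It does not create anything in AWS.
    """
    if not name.startswith(MSK_SECRET_PREFIX):
        raise ValueError(
            f"secret name must start with {MSK_SECRET_PREFIX!r}, got {name!r}"
        )
    if not username or not password:
        raise ValueError("username and password must be non-empty")
    return name, json.dumps({"username": username, "password": password})


if __name__ == "__main__":
    # Example credentials from this document; the secret name is a placeholder.
    secret_name, secret_string = build_msk_secret(
        "AmazonMSK_dgraph_cdc", "kafkauser", "kafka-secret-password"
    )
    print(secret_name)
    print(secret_string)
```

The returned string has the same two keys as the JSON shown in step 2, so it can be pasted as the secret value.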

Set up NLB with VPC Endpoint Service

  1. Create an NLB that listens on port 9096, with the IP addresses of the brokers as targets. Note: The NLB must have a listener on port 9096. We have seen issues when the NLB doesn’t have the 9096 listener port open.
  2. In the NLB, go to the Integrated Services tab and click create on VPC Endpoint Services (AWS PrivateLink). Set Acceptance required to Yes.
  3. Using kafka-console-consumer.sh and kafka-console-producer.sh, verify that you can produce and consume through the NLB. To set up the Kafka environment, refer to Set up Kafka env.
  4. In EC2, go to Endpoint Services and select the endpoint service you just created. Go to Allow-listed principals and click Add principal to allow list. Add the following principal for Dgraph Cloud:

    arn:aws:iam::636790907320:root
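
For the verification in step 3, the Kafka console tools need a client properties file with the SASL/SCRAM settings. The following Python sketch renders one; it assumes the SCRAM-SHA-512 mechanism over TLS, which is what MSK uses on port 9096, and the function name scram_client_properties is hypothetical.

```python
def scram_client_properties(username: str, password: str) -> str:
    """Render Kafka client properties for SASL/SCRAM over TLS.

    Hypothetical helper. Assumes the credentials contain no double quotes,
    since they are embedded in a quoted JAAS string.
    """
    if '"' in username or '"' in password:
        raise ValueError("credentials must not contain double quotes")
    jaas = (
        "org.apache.kafka.common.security.scram.ScramLoginModule required "
        f'username="{username}" password="{password}";'
    )
    lines = [
        "security.protocol=SASL_SSL",
        "sasl.mechanism=SCRAM-SHA-512",
        f"sasl.jaas.config={jaas}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example credentials from this document.
    print(scram_client_properties("kafkauser", "kafka-secret-password"), end="")
```

Saved as, for example, client.properties, the output can be passed to kafka-console-producer.sh with --producer.config and to kafka-console-consumer.sh with --consumer.config, with the bootstrap server pointing at port 9096 and the topic set to dgraph-cdc.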

Share details with Dgraph

Please share the following details with Dgraph:

  1. Service Endpoint name
  2. Kafka SASL credentials (username and password)
  3. AWS region where MSK is set up
  4. Broker DNS names. You can find these under View client information for your cluster in the MSK service in the AWS Console.

Once you have shared these details, Dgraph Cloud can proceed with the next steps.
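
Before sending these details, it can help to check that they are complete and mutually consistent. The sketch below is one possible way to do that in Python. The class name CdcHandoff is hypothetical, the endpoint service name pattern (com.amazonaws.vpce.<region>.vpce-svc-<id>) is an assumption about the usual AWS naming, and all values in the example are placeholders.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed shape of an endpoint service name:
#   com.amazonaws.vpce.<region>.vpce-svc-<hex id>
SERVICE_NAME_RE = re.compile(
    r"^com\.amazonaws\.vpce\.([a-z0-9-]+)\.vpce-svc-[0-9a-f]+$"
)


@dataclass
class CdcHandoff:
    """The four items to share with Dgraph (hypothetical container)."""

    service_endpoint_name: str
    kafka_username: str
    kafka_password: str
    aws_region: str
    broker_dns_names: List[str]

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means none found."""
        issues = []
        match = SERVICE_NAME_RE.match(self.service_endpoint_name)
        if not match:
            issues.append("service endpoint name has an unexpected format")
        elif match.group(1) != self.aws_region:
            issues.append("service endpoint region differs from the MSK region")
        if not self.kafka_username or not self.kafka_password:
            issues.append("Kafka SASL credentials are incomplete")
        if not self.broker_dns_names:
            issues.append("no broker DNS names listed")
        return issues


if __name__ == "__main__":
    handoff = CdcHandoff(
        service_endpoint_name="com.amazonaws.vpce.us-east-1.vpce-svc-0123456789abcdef0",
        kafka_username="kafkauser",
        kafka_password="kafka-secret-password",
        aws_region="us-east-1",
        broker_dns_names=["b-1.example.kafka.us-east-1.amazonaws.com"],
    )
    print(handoff.problems())
```

The region comparison reflects the same-region requirement stated in the MSK setup steps.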

Initial Setup for VPC Endpoint

Dgraph will set up a VPC Endpoint and initiate a connection. Once Dgraph confirms that the request has been made, go to the VPC Endpoint Service that is integrated with the NLB. Go to Endpoint Connections and click Accept endpoint connection request for the pending connection.
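
The same acceptance can also be scripted instead of done in the console. The following is a rough Python sketch, assuming boto3 is installed and AWS credentials are configured. The function names are hypothetical, pagination of the describe call is not handled, and the owner check uses the Dgraph Cloud account ID from the allow-listed principal above.

```python
from typing import Dict, List

# Account ID taken from the allow-listed Dgraph Cloud principal in this document.
DGRAPH_ACCOUNT_ID = "636790907320"


def pending_endpoint_ids(
    connections: List[Dict], service_id: str, owner: str = DGRAPH_ACCOUNT_ID
) -> List[str]:
    """Pick endpoint IDs that await acceptance for one endpoint service.

    `connections` is expected to look like the VpcEndpointConnections list
    returned by EC2 DescribeVpcEndpointConnections.
    """
    return [
        c["VpcEndpointId"]
        for c in connections
        if c.get("ServiceId") == service_id
        and c.get("VpcEndpointState") == "pendingAcceptance"
        and c.get("VpcEndpointOwner") == owner
    ]


def accept_pending(service_id: str, region: str) -> List[str]:
    """Accept pending connections from Dgraph Cloud; returns the accepted IDs."""
    import boto3  # imported lazily so the filter above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    resp = ec2.describe_vpc_endpoint_connections(
        Filters=[{"Name": "service-id", "Values": [service_id]}]
    )
    ids = pending_endpoint_ids(resp["VpcEndpointConnections"], service_id)
    if ids:
        ec2.accept_vpc_endpoint_connections(ServiceId=service_id, VpcEndpointIds=ids)
    return ids
```

Filtering on the owner account means a pending request from any other account is left untouched for manual review.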

Once this is done, CDC can be configured on your Dgraph Cloud backend.

References