At present, the cluster is running successfully. Now I'm using Go to write data to Dgraph, following the official tutorial at https://github.com/dgraph-io/dgo: conn, err := grpc.Dial("localhost:9080", grpc.WithInsecure())
When I write data to Dgraph, I must specify an address and port. Doesn't that mean all the data is written to the same node? How do I ensure that the other three nodes in the cluster also receive data?
If you're not using replicas, the other Alpha instances will remain idle until the Dgraph cluster decides it needs them, which depends on disk usage. However, you can use the API to manually move predicates to other Alpha instances, which will give you the result you're after.
When you have replicas or when you manually partition predicates by moving them to other Alphas, the Dgraph Cluster takes care of properly distributing the data. There’s no need to worry about which Alpha to send the mutation to.
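For the manual predicate move mentioned above, here is a minimal sketch. It assumes Zero's HTTP admin port is the default 6080 and uses a hypothetical predicate name ("name") and target group (2); substitute your own values and check the docs for the exact endpoint in your Dgraph version.

```go
package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
)

func main() {
	// Ask Zero (HTTP admin port, 6080 by default) to move the tablet for the
	// "name" predicate to group 2. Predicate and group are placeholders.
	resp, err := http.Get("http://localhost:6080/moveTablet?tablet=name&group=2")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body)) // Zero replies with a short status message
}
```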
Nonetheless, I would recommend implementing round-robin scheduling across your Alphas (see the sketch below). It's worth doing even when starting small, in case your application scales up later. Round-robin spreads the load so that a single Alpha instance isn't overwhelmed by receiving every mutation, which is why load balancing matters here. This is the approach we take in Liveloader.
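As a rough illustration of that idea (not an official recipe), here is a sketch assuming the dgo v1 import paths from the tutorial linked above and placeholder Alpha addresses. dgo accepts several connections and picks one of them for each request, so dialing multiple Alphas is usually enough to spread the load.

```go
package main

import (
	"log"

	"github.com/dgraph-io/dgo"
	"github.com/dgraph-io/dgo/protos/api"
	"google.golang.org/grpc"
)

// newClient dials several Alpha instances and hands all of the connections
// to dgo, which chooses one of them for each request, spreading the load.
func newClient(addrs []string) (*dgo.Dgraph, error) {
	var clients []api.DgraphClient
	for _, addr := range addrs {
		conn, err := grpc.Dial(addr, grpc.WithInsecure())
		if err != nil {
			return nil, err
		}
		clients = append(clients, api.NewDgraphClient(conn))
	}
	return dgo.NewDgraphClient(clients...), nil
}

func main() {
	// Placeholder addresses: replace with the gRPC ports of your own Alphas.
	dg, err := newClient([]string{"alpha1:9080", "alpha2:9080", "alpha3:9080"})
	if err != nil {
		log.Fatal(err)
	}
	_ = dg // use dg.NewTxn() for queries and mutations as usual
}
```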
Does that mean the data is guaranteed to be replicated on every node? And is all the pressure then on the Zero node? If I adopt the round-robin scheduling method you mentioned, the program needs to pick a different Dgraph client connection each time it writes data. Is there an officially recommended way to do this?
Not exactly. Zero has just a few jobs, such as issuing timestamps, leasing UIDs, and balancing data, so it won't be under much pressure.
Not that I remember, but this is just an optimization. In general, queries are distributed by nature, yet the deserialization of results happens on the single node that received the request. If you have an app with many clients and many requests, that node could become a hot spot. So it's better to distribute requests to lessen the deserialization burden.
Thanks a lot! I now have a much more comprehensive understanding of Dgraph. One more question: do you have official channels in Asia? As far as I know, there is great demand for graph databases in Asia, but the time difference between us makes communication inconvenient.