Why is mutation slow when many nodes link to one

yeahvip · August 3, 2021, 3:24am

In my scenario, many nodes must link to a single node, but mutation speed drops very sharply. How can I accelerate mutation speed in this situation?
Here is my experiment:
First I generate 10,000,000 nodes that all link to node 1:

    with open("a.rdf", "w") as f:
        for i in range(2, 10000000):
            a = f"_:{i} _:1 . \n"
            f.write(a)

I use dgraph live and the mutation time is 1m37.731695169s

When fewer nodes link to any one node, the mutation is much faster:

    with open("b.rdf", "w") as f:
        for i in range(1, 5000):
            for j in range(60000, 62000):
                a = f"_:{i} _:{j} . \n"
                f.write(a)

I use dgraph live and the mutation time is 54.934653541s
In my scenario I have to make many nodes link to one. How can I accelerate mutation speed?

MichelDiz · August 3, 2021, 3:58am

Curious, this looks slower than the bigger one, right? It takes almost a minute, while the 10 million load takes a minute and 37.

Are you sending all this in a single transaction?

In general, NVMe and load balancing.

yeahvip · August 3, 2021, 5:54am

This is a simple experiment to show the situation. In our project, if more than one million nodes link to one node, the mutation time increases sharply. In our project we send almost 10,000 RDF triples in a single transaction. In the experiment above we use dgraph live:

    dgraph live -r a.rdf
    dgraph live -r b.rdf

But the result does not change: the speed is still slow. How can I accelerate mutation speed in this situation? In our deployment environment we only have HDDs.

MichelDiz · August 3, 2021, 3:51pm

Horizontal scaling (more Alphas across machines), NVMe and load balancing. If you can’t have SSD or NVMe, try horizontal scaling.

Also, send batches of 5k max per Alpha.
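The batching advice above could be sketched roughly as follows. This is plain Python with no Dgraph dependency: `batched_nquads`, `load_in_batches` and `send_batch` are hypothetical names (not part of Dgraph or any client library), the `<link>` predicate in the usage example is a placeholder, and 5,000 is simply the figure suggested in this thread. The `send_batch` callback is where a real client call would go, presumably one transaction per batch.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched_nquads(lines: Iterable[str], batch_size: int = 5000) -> Iterator[List[str]]:
    """Yield lists of at most `batch_size` non-empty, stripped N-Quad lines."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    stripped = (line.strip() for line in lines)
    non_empty = (line for line in stripped if line)
    while True:
        batch = list(islice(non_empty, batch_size))
        if not batch:
            return
        yield batch


def load_in_batches(
    lines: Iterable[str],
    send_batch: Callable[[str], None],
    batch_size: int = 5000,
) -> int:
    """Call `send_batch` once per batch and return the number of batches sent.

    `send_batch` is a hypothetical hook: it receives the batch as one
    newline-joined string and is expected to run a single mutation with it.
    """
    sent = 0
    for batch in batched_nquads(lines, batch_size):
        send_batch("\n".join(batch))
        sent += 1
    return sent


if __name__ == "__main__":
    # Stand-in sender that only reports batch sizes; `<link>` is a placeholder predicate.
    def report(nquads: str) -> None:
        print(f"would send {len(nquads.splitlines())} triples")

    sample = (f"_:{i} <link> _:1 ." for i in range(2, 12002))
    print(load_in_batches(sample, report), "batches")
```

Because the helper consumes any iterable of lines, an open file object can be passed directly, so a large file is never held in memory all at once.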

purist180 · August 4, 2021, 2:04am

The difference between “a.rdf” and “b.rdf”:

In a.rdf there are 10,000,000 nodes and they all link to node 1.

The time taken to write these two files differs several-fold.

I think the question is: if a node has a large number of edges (like 10,000,000), will that slow down the writing speed?

Assuming that the speed of B is normal, why is the speed of A much slower than that of B? Is it caused by a large number of edges connected to a node in A?
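To make the structural difference between the two files concrete, here is a small sketch that counts incoming edges per target node for the same loops used to generate a.rdf and b.rdf. The function names are made up for illustration, and the sketch only quantifies fan-in; it does not show how Dgraph stores or updates those edges.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

Edge = Tuple[int, int]  # (source node, target node)


def edges_a(stop: int = 10_000_000) -> Iterator[Edge]:
    """Edges as in a.rdf: every node in range(2, stop) links to node 1."""
    return ((i, 1) for i in range(2, stop))


def edges_b(src_stop: int = 5000, tgt_start: int = 60000, tgt_stop: int = 62000) -> Iterator[Edge]:
    """Edges as in b.rdf: each source links to every target in the range."""
    return ((i, j) for i in range(1, src_stop) for j in range(tgt_start, tgt_stop))


def fan_in_summary(edges: Iterable[Edge]) -> Tuple[int, int, int]:
    """Return (total edges, distinct targets, largest fan-in of any target)."""
    counts = Counter(target for _source, target in edges)
    total = sum(counts.values())
    return total, len(counts), max(counts.values(), default=0)


if __name__ == "__main__":
    # Scaled-down sizes so the demo runs instantly; the shape is the same.
    print("a (small):", fan_in_summary(edges_a(1000)))
    print("b (small):", fan_in_summary(edges_b(50, 600, 620)))
```

At the sizes used in the thread, the same arithmetic gives 9,999,998 edges into a single target for a.rdf, versus 4,999 × 2,000 = 9,998,000 edges spread over 2,000 targets (4,999 each) for b.rdf. The total edge counts are nearly identical, so the main difference between the two loads is the fan-in of the target nodes.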
