How dgraph handles situations where more than 10 million points point to the same label

yeahvip · February 18, 2022, 3:19am

In our scenario we need to label a large number of points and search for nodes through those labels. But when storing label-to-node relationships, mutation speed is extremely slow, possibly because of the fanout of Badger. How can I design the schema, or configure Dgraph, so that all 10 million points can be found through one label?

I used dgraph live to run the mutation; it took 1h12m27.907689547s. The correct count should be 10000000, but Ratel returned 19000.
When I reverse the relation, e.g. <Alice-1> <label> <USA> ., the same file took only 3m0.41087837s to load, and count(uid) is 10000000. Should Dgraph be able to handle this situation?

Below is the RDF we use:

<USA> <country> "America" .
<Alice-0> <name> "Alice-0" .
<USA> <label> <Alice-0> .
<Alice-1> <name> "Alice-1" .
<USA> <label> <Alice-1> .
<Alice-2> <name> "Alice-2" .
<USA> <label> <Alice-2> .
<Alice-3> <name> "Alice-3" .
…… and so on, up to Alice-10000000
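
For anyone who wants to reproduce this, a file of the shape above can be generated with a short script. This is only a minimal sketch, not the script we actually used: the names `rdf_lines` and `write_rdf`, the `reverse` flag, and the output file name are assumptions made for illustration. With `reverse=True` it emits the reversed edge direction (<Alice-i> <label> <USA> .) instead.

```python
def rdf_lines(n, reverse=False):
    """Yield RDF lines for one <USA> node and n <Alice-i> nodes.

    reverse=False: <USA> <label> <Alice-i> .  (one node fans out to n nodes)
    reverse=True:  <Alice-i> <label> <USA> .  (n nodes each point to one node)
    """
    yield '<USA> <country> "America" .'
    for i in range(n):
        yield f'<Alice-{i}> <name> "Alice-{i}" .'
        if reverse:
            yield f'<Alice-{i}> <label> <USA> .'
        else:
            yield f'<USA> <label> <Alice-{i}> .'


def write_rdf(path, n, reverse=False):
    """Write the generated lines to path, one triple per line."""
    with open(path, "w", encoding="utf-8") as f:
        for line in rdf_lines(n, reverse):
            f.write(line + "\n")


# Hypothetical usage (file name is an assumption):
# write_rdf("labels.rdf", 10_000_000)
```

The generator streams lines instead of building a list, so memory use stays flat even for 10 million nodes.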

Schema:
<country>: string @index(hash) .
<name>: string @index(hash) .
<label>: [uid] @count .

The mutation is too slow. This is the query I use to count the names under a country:

{
  node(func: eq(<country>, "America")) {
    uid
    <label> {
      count(uid)
    }
  }
}

The result returned is:
```
{
  "node": [
    {
      "uid": "0x8ac7230489e80001",
      "label": [
        {
          "count": 19000
        }
      ]
    }
  ]
}
```
Hope to get your answer!

yeahvip · February 18, 2022, 7:18am

To reproduce this problem, we queried count(uid) in real time during the mutation process. I found that once one node was linked to two million nodes, count(uid) restarted from 0 as the mutation continued. I believe this may be caused by the fanout of the LSM tree, but how can I fix this problem?
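
The real-time check can be sketched roughly as follows. This is an illustration under assumptions rather than our exact tooling: `extract_count`, `watch_count`, and the `run_query` callable are hypothetical names, and `run_query` stands for whatever client call sends the query text to Dgraph and returns the response body as a JSON string in the shape shown in the result above.

```python
import json
import time

# Same query as in the first post.
QUERY = """
{
  node(func: eq(<country>, "America")) {
    uid
    <label> {
      count(uid)
    }
  }
}
"""


def extract_count(payload):
    """Return the label count from a JSON response body, or 0 if it is absent."""
    data = json.loads(payload)
    nodes = data.get("node", [])
    if not nodes:
        return 0
    label = nodes[0].get("label", [])
    return label[0]["count"] if label else 0


def watch_count(run_query, polls, interval_s=1.0):
    """Poll the count `polls` times and collect every (before, after) drop.

    run_query is an assumed callable: it takes the query text and returns
    the response body as a JSON string.
    """
    drops = []
    previous = None
    for _ in range(polls):
        current = extract_count(run_query(QUERY))
        if previous is not None and current < previous:
            drops.append((previous, current))
        previous = current
        time.sleep(interval_s)
    return drops
```

During a load that only adds edges the count should never go down, so any pair returned by `watch_count` marks a moment like the restart from 0 described above.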

In many scenarios I need to locate data by a specific label. Can Dgraph handle this situation? If not, how can I fix it?
