How many edges can each node have at most?

bqf9979 · January 8, 2020, 8:52am

When I insert 10,000 edges, the insert takes 46 seconds, and later inserts are even slower. Is this a performance bottleneck of the database?
Here is the result of the last insert:

{
  "data": {
    "code": "Success",
    "message": "Done",
    "uids": {}
  },
  "extensions": {
    "server_latency": {
      "parsing_ns": 4867594,
      "processing_ns": 78759405205
    },
    "txn": {
      "start_ts": 5050055,
      "commit_ts": 5050063,
      "preds": [
        "1-keyword_uids"
      ]
    }
  }
}
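To make the latency figures in the response above easier to read, here is a small sketch that converts the reported nanosecond values to seconds. The two numbers are copied from the response; the helper name `ns_to_seconds` is only illustrative.

```python
# Values copied from the "server_latency" block in the response above.
server_latency = {"parsing_ns": 4867594, "processing_ns": 78759405205}


def ns_to_seconds(ns: int) -> float:
    """Convert a duration in nanoseconds to seconds."""
    return ns / 1_000_000_000


# Map each latency field to its value in seconds.
latency_s = {name: ns_to_seconds(ns) for name, ns in server_latency.items()}

for name, seconds in latency_s.items():
    print(f"{name}: {seconds:.3f} s")
```

By this conversion the processing time of that last insert is roughly 78.8 seconds, while parsing takes only a few milliseconds.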

bqf9979 · January 8, 2020, 8:54am

Here is the mutation I run:

{
  "set": {
    "uid": "0x231861",
    "keyword_uids": [{"uid": "0x7dd"}, {"uid": "0x1ee3"}]
  }
}

Each node may have more than 100,000 edges.
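For readers who want to reproduce this kind of mutation with many targets, the following is a minimal sketch that generates JSON payloads of the same shape as the one above. The function name `build_edge_mutations` and the `chunk_size` parameter are hypothetical, and nothing in this thread establishes that splitting the payload changes the slowdown; the sketch only shows how the payload is structured.

```python
import json
from typing import Iterator, List


def build_edge_mutations(
    source_uid: str, target_uids: List[str], chunk_size: int = 1000
) -> Iterator[str]:
    """Yield JSON "set" mutations linking source_uid to target_uids.

    Each payload carries at most chunk_size edges on the keyword_uids
    predicate (the predicate name is taken from the mutation above).
    """
    for start in range(0, len(target_uids), chunk_size):
        chunk = target_uids[start:start + chunk_size]
        payload = {
            "set": {
                "uid": source_uid,
                "keyword_uids": [{"uid": target} for target in chunk],
            }
        }
        yield json.dumps(payload)


# Example with the two target uids from the post, one edge per payload.
payloads = list(build_edge_mutations("0x231861", ["0x7dd", "0x1ee3"], chunk_size=1))
```

Each resulting string could then be sent as the body of a mutation request; how it is submitted (client library, HTTP, transaction handling) is left out here.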

bqf9979 · January 8, 2020, 10:13am

If there are 2 million nodes and each node has tens of thousands of edges, is there a good scheme or any suggestion? Thank you!

bqf9979 · January 8, 2020, 10:15am

By the way, this is the schema I created

bqf9979 · January 8, 2020, 1:24pm

I have also tested other insertion methods, including inserting list data into predicates. In every case, as the data volume grows, each insertion takes longer than the previous one. So if a list holds hundreds of thousands of items, the cost of inserting data becomes huge. I think this is unacceptable for users with large lists. Is there a solution?

amanmangal · January 9, 2020, 2:22am

Could you share details such as the configuration of the machine you are using? I would expect insertion to get a little slower over time given that you have a count index. If it is getting too slow, that is something we should look into and figure out. Could you also provide an example dataset so that we can run the live loader on it?

bqf9979 · January 9, 2020, 3:17am

The machine has 48 cores and 32 GB of memory, and plenty of its resources are still free. If this is an index problem, then in theory the insert speed should not be affected when count and reverse are not added. I will test this and report back with the results.

bqf9979 · January 9, 2020, 3:29am

I tested it: without count and reverse, inserts do not slow down, and inserting 10,000 edges is also very fast, about 1 s. It seems this is the only way for now. Is there a better approach? Any suggestions?

amanmangal · January 9, 2020, 2:58pm

Yeah, that makes sense. We are looking into how we can improve the performance of updating the count index.

I think it would be faster to create the count index once the initial ingestion is complete. Note, though, that the cluster will not be available for writes while the indexes are being built. I am working on improving the speed of building the index, as well as on making the cluster available for writes while indexes are being built.
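The "ingest first, index later" approach could be sketched as two schema definitions for the same predicate. The original schema was not included in the thread, so the `[uid]` type below is an assumption inferred from the mutation; the helper name `keyword_schema` is hypothetical, and applying the strings to a cluster (for example through Dgraph's alter operation) is deliberately left out.

```python
def keyword_schema(with_indexes: bool) -> str:
    """Return a Dgraph schema line for the keyword_uids predicate.

    Assumes keyword_uids is a list of uid edges. The @count and @reverse
    directives are the ones discussed in this thread.
    """
    directives = " @count @reverse" if with_indexes else ""
    return f"keyword_uids: [uid]{directives} ."


# Phase 1: schema used during the initial bulk ingestion (no indexes).
ingest_schema = keyword_schema(with_indexes=False)

# Phase 2: schema applied once ingestion is complete, which triggers
# the index build described above.
final_schema = keyword_schema(with_indexes=True)

print(ingest_schema)
print(final_schema)
```

The trade-off is the one noted above: the second schema change makes the cluster unavailable for writes until the indexes have been built.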
