CPU underutilised by Dgraph alpha

What I did

I am running Dgraph alpha on a c5ad.8xlarge AWS instance with 16 physical cores / 32 vCPUs, 64 GB RAM and 20 Gbit networking. I am hammering it with concurrent queries to see how many concurrent clients the node can handle before it saturates. I would have expected CPU usage to hit 100% as I increase the number of concurrent clients, but I have found that I can only ever get it to use 22 CPUs.

The same happens on a c5ad.16xlarge instance with 32 physical cores / 64 vCPUs and 128 GB RAM: again, only 22 CPUs are ever used.

Question: Can you explain what prevents Dgraph alpha from using all available CPU cores?

The network is not the bottleneck: it has an order of magnitude more capacity than what is transferred over the wire. gRPC response messages are in the MBs, and I have perf-tested the link; it can sustain 20 Gbit per second.

The local SSD that stores the dataset is also not the bottleneck, as the entire dataset fits into the OS file cache; I measured no disk I/O.

Dgraph metadata

dgraph version

Dgraph version : v21.03.0
Dgraph codename : rocket
Dgraph SHA-256 : b4e4c77011e2938e9da197395dbce91d0c6ebb83d383b190f5b70201836a773f
Commit SHA-1 : a77bbe8ae
Commit timestamp : 2021-04-07 21:36:38 +0530
Branch : HEAD
Go version : go1.16.2
jemalloc enabled : true

For Dgraph official documentation, visit https://dgraph.io/docs.
For discussions about Dgraph, visit https://discuss.dgraph.io.
For fully-managed Dgraph Cloud, visit https://dgraph.io/cloud.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2021 Dgraph Labs, Inc.

Not sure; try adding more Alphas on the same machine. If that doesn't work, maybe there is some limit somewhere in the core code.

Right, running more Alphas on a node would be a workaround to utilize all CPUs.

Still, might be interesting for core code devs to look into.

Just a tip: BadgerDB (the underlying embedded database) has a good number of tunables, accessible via the `--badger` superflag of dgraph (check `dgraph alpha --help` for the options your version supports). These include compactor and goroutine counts, and they may help you use all of your CPUs more effectively.


That sounds like a promising route to go; I'll give it a try.

Yep, those tunables deserve deeper docs and scenario tests.

@Sagar_Choudhary this might be relevant to your High memory utilization on alpha node (use of memory cache) post, as your nodes have 32 vCPUs. In my experience with v21.03, at most 22 of those will ever be used by Dgraph. Have you made similar observations?

@EnricoMi You might want to try increasing the value of GOMAXPROCS here. We're explicitly setting GOMAXPROCS to 128, which means the Go runtime will use at most 128 cores. The default value of this setting is equal to the number of cores available.

See the runtime package documentation on pkg.go.dev.

Thanks for the hint. How does the 128 explain that only 22 CPUs are used? Shouldn't we then see a limit at 128 CPUs?

@EnricoMi I am not really sure, but you could unset that value and see if that changes the CPU usage.