High memory utilization on alpha node (use of memory cache)

Sagar_Choudhary · November 25, 2021, 12:10pm

We have set up Dgraph on EKS. We have loaded ~220 GB of data onto alpha and use an r5a.8xlarge EC2 instance, i.e. 256 GB of RAM.
Grafana shows 210 GB of cache memory, while only 4.9 GB out of 256 GB is actually used.
What is this cache memory? I have also checked the documentation, and as far as I can tell Dgraph does not support caching internally. Link: (Cached Results - GraphQL)

Each alpha is on a different EC2 server. We have a total of 5 alphas and face the same issue on every instance.

Below are the screenshots.

[Screenshot: Grafana (EC2 server stats)]

[Screenshot: Alpha memory stats]

[Screenshot: AWS console]

Sagar_Choudhary · November 27, 2021, 2:45am

Waiting for a response.

EnricoMi · November 27, 2021, 11:20am

What exactly do you mean by “we have loaded ~220GB of data on alpha”? The graph data is stored on disk, while the memory that you refer to here is RAM. The cache memory looks like the Unix filesystem cache, which caches recently read files. This looks like a general Unix question, not a specific Dgraph question.

What is the “problem” here? What is your expectation?

Sagar_Choudhary · November 29, 2021, 10:16am

We have set up Dgraph on EKS, and by data load I mean the size of the data in the graph. We use 256 GB instances for the alpha nodes. The last screenshot shows the AWS EC2 stats, according to which the instance consumes only 4.9 GB of memory. But when I check the same instance in Grafana, it shows 210 GB of cached memory and 215 GB consumed out of 246 GB.

iluminae · November 29, 2021, 1:29pm

Maybe it would help to share which stats from what sources you are talking about. Like, is it Prometheus my_metric_name from the default host agent? Is it wired memory you are talking about? Does it line up with values in top?

EnricoMi · December 2, 2021, 8:24pm

So I presume you want to understand what that 210 GB cached memory is. Is that your question?

I reckon this is the filesystem cache of the Linux kernel.

To validate that, can you ssh into that machine and

run top?
run sudo bash -c "sync; echo 1 > /proc/sys/vm/drop_caches"?

The latter will flush the filesystem cache and you should see a significant drop in your metrics. This would show the fraction of the cache memory that is used by the OS filesystem cache.
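To compare the numbers before and after dropping the cache, the kernel's own counters in /proc/meminfo can be read directly. The following is only a minimal sketch, not a Dgraph tool: the function names `parse_meminfo` and `memory_breakdown` are made up for illustration, and "used" is approximated here as MemTotal minus MemFree, Buffers, Cached and SReclaimable, which may differ slightly from what `top`, Grafana or the AWS console report.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_breakdown(info):
    """Split total memory into filesystem cache, application use and free memory.

    Assumption: cache = Cached + Buffers + SReclaimable. This is an
    approximation of how common tools account for reclaimable memory.
    """
    cache = info.get("Cached", 0) + info.get("Buffers", 0) + info.get("SReclaimable", 0)
    total = info["MemTotal"]
    free = info["MemFree"]
    return {
        "total_kb": total,
        "cache_kb": cache,
        "used_kb": total - free - cache,
        "free_kb": free,
    }


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        breakdown = memory_breakdown(parse_meminfo(f.read()))
    for name, kb in breakdown.items():
        print(f"{name}: {kb / 1024 / 1024:.1f} GiB")
```

If the 210 GB is filesystem cache, `cache_kb` should fall sharply right after the `drop_caches` command while `used_kb` stays roughly the same.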

When you start your instance, how do you get the data onto the alpha node? Do you mount a network share or do you copy / bulk load / live load the data?

Also interesting to know: which Dgraph version do you use?

Sagar_Choudhary · February 14, 2022, 2:51am

I upload data through the live loader.
Actually, if I use a 128 GB machine, Dgraph consumes 122-124 GB of memory.
Now I have switched to a smaller machine with 16 CPUs and 64 GB of memory. Dgraph consumes 62 GB out of 64 GB, and its response time gradually becomes very high.

I have tried lru_mb = 30gb but still didn’t get any improvement.

EnricoMi · February 14, 2022, 7:24pm

This is expected, as outlined above. Linux uses all otherwise unused RAM for the filesystem cache.
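The figures reported earlier in the thread are consistent with this reading. As a rough sketch, using only the numbers from the original posts (215 GB consumed out of 246 GB with 210 GB cached in Grafana, 4.9 GB in the AWS console), the two views differ mainly in whether the cache is counted as used. The helper names below are hypothetical.

```python
def used_excluding_cache(used_including_cache_gb, cached_gb):
    """Memory held by processes once the filesystem cache is subtracted."""
    return used_including_cache_gb - cached_gb


def percent_of_total(amount_gb, total_gb):
    """Express an amount of memory as a percentage of total RAM."""
    return 100.0 * amount_gb / total_gb


# Figures quoted in the thread (Grafana view of one alpha node).
TOTAL_GB = 246.0
USED_INCLUDING_CACHE_GB = 215.0
CACHED_GB = 210.0

app_gb = used_excluding_cache(USED_INCLUDING_CACHE_GB, CACHED_GB)
print(f"used, cache counted:     {percent_of_total(USED_INCLUDING_CACHE_GB, TOTAL_GB):.1f}%")
print(f"used, cache not counted: {percent_of_total(app_gb, TOTAL_GB):.1f}% ({app_gb:.1f} GB)")
```

Subtracting the cache leaves about 5 GB, which is close to the 4.9 GB shown in the AWS console.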

Sagar_Choudhary · February 16, 2022, 8:53am

But what about the Dgraph response time? How can I fix that?
Also, I am getting memory-full alerts on the alpha nodes.
