Continuous increase in memory consumption on self-hosted Dgraph pods

I am running Dgraph on a Kubernetes cluster. I am able to successfully upload a large dataset
(140k quads) using the dgraph live command.

Chart version = dgraph-0.0.19
App version = v21.12.0
Kubernetes pod memory limit = 200Mi

When I try to delete the uploaded data using the query below, the Dgraph Alpha pods run out of memory and go into a crash loop. On each restart, after some time, the following error is displayed:

The node was low on resource: memory. Container test-dgraph-dgraph-alpha was using 3172816Ki, which exceeds its request of 200Mi.

The reported usage keeps increasing on subsequent restarts, and the pod's memory consumption grows far beyond its 200Mi limit.

The schema has one Project type, which is connected to all other types (around 15 of them) through a reverse project edge.
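
A simplified sketch of the relevant part of the schema (the predicate and type names here are illustrative, not our exact schema): every child type points back to its Project through a project edge declared with @reverse, so ~project can be traversed from the Project node.

# project edge from every child node back to its Project, with a reverse index
project: uid @reverse .
name: string .

type Project {
  name
}

type Task {    # one of the ~15 child types (hypothetical name)
  name
  project
}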

Delete query:

upsert {
  query {
    var(func: uid(0x67d6b3)) { # here the uid is of the Project node
      p_uid as uid
      ~project {
        s_uid as uid
      }
    }
  }
  mutation {
    delete {
      uid(p_uid) * * .
      uid(s_uid) * * .
    }
  }
}

We also tried upgrading the chart version to 0.0.20 and the App version to v22.0.2, which gave the following error:

Error: cannot patch "test-dgraph-dgraph-ratel" with kind Deployment: Deployment.apps "test-dgraph-dgraph-ratel" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app":"dgraph", "chart":"dgraph-0.0.20", "component":"ratel", "release":"test-dgraph"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
&& cannot patch "test-dgraph-dgraph-alpha" with kind StatefulSet: StatefulSet.apps "test-dgraph-dgraph-alpha" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden
&& cannot patch "test-dgraph-dgraph-zero" with kind StatefulSet: StatefulSet.apps "test-dgraph-dgraph-zero" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden

Can you help me understand what could be causing the increasing memory usage, or suggest improvements to the delete query?
And how can I resolve the upgrade error?


You must be doing something very wrong; perhaps you have insufficient memory and processing power. If there is a bottleneck, it will take longer to release resources. The ideal range for loading 21 million RDFs is between 19GB and 30GB of RAM. If it is failing with 140k, then there is something seriously wrong with your stack.

Regarding the query, if it is too broad, it will need to fill memory with all the content to be deleted. I cannot recall how optimized the upsert mutation is, but I believe it expands in memory. If you don’t use it carefully and don’t have enough resources to handle such a broad query, doing it this way is not recommended.
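
If you still want to delete via upsert, one option is to remove the children in smaller batches instead of expanding the whole subtree at once. A rough, untested sketch (the uid and the batch size are placeholders): run it repeatedly until ~project returns nothing, and only then delete the Project node itself.

upsert {
  query {
    var(func: uid(0x67d6b3)) {       # the Project node
      ~project (first: 10000) {      # only a batch of children per run
        s_uid as uid
      }
    }
  }
  mutation {
    delete {
      uid(s_uid) * * .
    }
  }
}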

As for Ratel, there have been some changes in its deployment, and you need to carefully analyze the changes. I noticed something wrong with “persistentVolumeClaimRetentionPolicy.” Ratel doesn’t need a volume, so please examine your YAML file carefully.

If you are deleting everything, just delete the whole deployment.

Cheers.

We did try increasing the memory to 5GB and were still facing the same issue.
In the error below, the request was approximately 200Mi, while the container was using approximately 3GB:

The node was low on resource: memory. Container test-dgraph-dgraph-alpha was using 3172816Ki, which exceeds its request of 200Mi.

So we upgraded the pod to 5Gi of memory, but as I mentioned, the usage kept increasing on subsequent restarts, and even after the upgrade we got the following error:

The node was low on resource: memory. Container test-dgraph-dgraph-alpha was using 8102548Ki, which exceeds its request of 5Gi.

Now it requires approximately 8GB of memory.
This memory issue occurs after running the delete query given in the question, and the data is not actually deleted.

@MichelDiz any inputs on this?

Sorry, I don’t get what kind of inputs you need. Can you elaborate?