Extreme memory usage when constantly querying and mutating data


(Igor Miletic) #1

It seems that when constantly running queries and mutations, the memory used by the Alpha nodes is extremely high.
(There is some related discussion in Can Dgraph do 10 Billion Nodes? as well.)

In our case, handling about 6M nodes requires 30GB of memory on each of the 3 Alphas running in the cluster. This is too much, and it increases as the amount of data grows.

As shown in the picture below, the extreme value occurred with about 6M nodes in the database. Memory dropped when I dropped the data from the database. This shows that memory keeps climbing as more and more data is stored in the database.

All details and tests can be found here:

Is there a possibility that someone could take a look at this and do memory profiling or testing on your test environment?

This is a showstopper for us, as we will have far more data than 6M nodes, and needing 200GB of RAM to handle only 20M nodes would be too expensive and makes no sense.


(Pawan Rawal) #2

Thanks for sharing the well-documented test setup @pjolep. We are investigating this and will have an update for you soon.


(Igor Miletic) #3

In addition, this might be useful: I realized that mutations are consuming memory. In the picture you will see point A, where memory dropped when I turned off mutations (so, same amount of data, same number of queries).

Point B happened when I performed a rolling restart of all Alpha nodes.


(Daniel Mai) #4

Can you try setting the environment variable GODEBUG=madvdontneed=1 when running the Dgraph binaries?
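For reference, the variable only needs to be set in the environment of each Dgraph process. A minimal sketch (the `command -v` guard and the flag-less `dgraph alpha` invocation are illustrative; use your normal startup flags):

```shell
# Run Alpha with the Go runtime releasing memory via madvise(MADV_DONTNEED),
# which is immediately reflected in RSS, instead of MADV_FREE.
# Guarded so the snippet is safe to paste where dgraph isn't installed.
if command -v dgraph >/dev/null 2>&1; then
  GODEBUG=madvdontneed=1 dgraph alpha
fi

# The variable is per-process; a quick check that it propagates:
GODEBUG=madvdontneed=1 sh -c 'echo "GODEBUG=$GODEBUG"'
```

In Kubernetes the same thing can be done by adding the variable to the container's `env` section.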

I asked around in the Gophers #performance Slack channel and was pointed to this open issue report about memory not being released by the Go runtime:

Go is releasing memory to the OS, but that isn’t reflected in the resident set size calculations. You can check the estimated memory counted as LazyFree by looking at /proc/<pid>/smaps. The Linux documentation for LazyFree memory says this:

The memory isn’t freed immediately with madvise(). It’s freed in memory
pressure if the memory is clean.
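To total the LazyFree figure across all of a process’s mappings, something like this works (a sketch; the `pgrep` pattern assumes the binary is named `dgraph`):

```shell
# Sum the LazyFree lines from every mapping of the Alpha process.
pid=$(pgrep -xo dgraph)
awk '/^LazyFree:/ { sum += $2 }
     END { printf "LazyFree total: %d kB\n", sum }' "/proc/${pid}/smaps"
```

A large total here means the Go runtime has handed the memory back, and the kernel just hasn’t reclaimed it yet.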

Below, you’ll see the memory charts for the same workload to a regular dgraph alpha (blue line) and a GODEBUG=madvdontneed=1 dgraph alpha (orange line). The process memory of the orange line goes down.


(Igor Miletic) #5

After a day of running with GODEBUG=madvdontneed=1, it looks like nothing has changed: the memory that Kubernetes reports as used is still about 6-7 GB higher than what the Alpha nodes actually use.
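One thing worth checking is which number each side is reporting: `kubectl top` shows the cgroup working-set figure, which also counts filesystem page cache charged to the container, while the kernel’s per-process RSS does not include that cache. A sketch for comparing the two from inside the pod (the cgroup v2 path is an assumption; on cgroup v1 the file is `memory.usage_in_bytes`):

```shell
# RSS the kernel attributes to a process (in the pod, use the Alpha PID,
# often 1; here the shell's own PID so the snippet runs anywhere).
pid=$$
awk '/^VmRSS:/ { printf "process RSS: %d kB\n", $2 }' "/proc/${pid}/status"

# Cgroup-level memory, the figure Kubernetes accounting is based on.
cat /sys/fs/cgroup/memory.current 2>/dev/null || true
```

If the cgroup figure is consistently 6-7 GB above the summed process RSS, the gap is likely page cache rather than memory the Alpha process is holding.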