Seems to be a memory leak

Report a Dgraph Bug

What version of Dgraph are you using?

20.07.1-rc1

Have you tried reproducing the issue with the latest release?

It's already the latest release.

What is the hardware spec (RAM, OS)?

40 GB of RAM on Debian 9

Steps to reproduce the issue (command/config used to run Dgraph).

I have a 3-node cluster on 3 machines (a zero + an alpha on each node), running with docker-compose.
I try to add some data via the GraphQL API, in my case a User model with a friendship relation to itself.
The schema of the User model is something like this:

type User {
    id: String! @id
    username: String @search(by: [hash])
    friends: [User]
}

Expected behaviour and actual result.

Each addUser mutation increases alpha's memory usage significantly, so that after fewer than 1000 queries, the alpha container crashes with an “out of memory” error.

@mbande, could you provide details on how you are adding the users to the cluster? Is it curl, a dgraph client, or something else?

Also, as I understand, this is a 3 container setup, with each container having 1 zero and 1 alpha.
Please also share the docker-compose or equivalent you used to bring up the cluster. I am looking to see if we can reproduce this on our end.

@Paras I’m using python requests like this:

import requests

def run_graphql(query, variables=None):
    request_body = {'query': query}
    if variables:
        request_body['variables'] = variables
    result = requests.post(
        url=dgraph_alpha_url,
        headers={'Content-type': 'application/json'},
        json=request_body
    )
    result.raise_for_status()
    return result.json()
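For reference, the payload I send through this helper looks roughly like the following. The mutation text, input shape, and the `build_request_body` helper are my reconstruction for illustration (only `run_graphql` above is the real code):

```python
# Hypothetical addUser mutation matching the User type above; the exact
# query text and AddUserInput shape are assumptions, not the original code.
ADD_USER = """
mutation addUser($input: [AddUserInput!]!) {
  addUser(input: $input) {
    user { id username }
  }
}
"""

def build_request_body(query, variables=None):
    # Mirrors run_graphql's payload construction, without the HTTP call.
    body = {'query': query}
    if variables:
        body['variables'] = variables
    return body

# One user with 500 friend edges in a single mutation:
variables = {'input': [{
    'id': 'u1',
    'username': 'alice',
    'friends': [{'id': 'f%d' % i} for i in range(500)],
}]}
body = build_request_body(ADD_USER, variables)
print(len(body['variables']['input'][0]['friends']))  # 500
```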

and this is docker-compose.yml:

version: "3.2"
services:
  zero:
    image: dgraph/dgraph:v20.07.1-rc1
    volumes:
      - /data/dgraph/v20.07:/dgraph
    ports:
      - 5080:5080
      - 6080:6080
    command: dgraph zero --my=dgraph-2:5080 --replicas 1 --peer dgraph-1 --idx=2
  alpha:
    image: dgraph/dgraph:v20.07.1-rc1
    volumes:
      - /data/dgraph/v20.07:/dgraph
    ports:
      - 7080:7080
      - 8080:8080
      - 9080:9080
    depends_on:
      - zero
    command: dgraph alpha --my=dgraph-2:7080 --lru_mb=8192 --zero=dgraph-1:5080,dgraph-2:5080,dgraph-3:5080

It seems that memory consumption is directly related to input size.
When I add a user with all of their friends (e.g. 500 friends) in one query, memory usage grows endlessly.
But when I add the user's friends chunk by chunk (e.g. in chunks of size 10), memory usage stops growing at some point.
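The chunk-by-chunk approach can be sketched like this (the `chunked` helper and the sizes are illustrative, not my exact code):

```python
def chunked(items, size):
    # Yield successive chunks of at most `size` items.
    for i in range(0, len(items), size):
        yield items[i:i + size]

friends = [{'id': 'f%d' % i} for i in range(500)]
chunks = list(chunked(friends, 10))
print(len(chunks))     # 50 small mutations instead of 1 big one
print(len(chunks[0]))  # 10 friends per mutation
# Each chunk would then go out as its own update mutation via
# run_graphql, keeping every individual transaction small.
```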

@Paras @martinmr Can you guys replicate this? If you can, @ibrahim and team can look at it in tomorrow morning’s session.

This appears similar to this issue: https://dgraph.atlassian.net/browse/DGRAPH-1898

In that issue, the write-ahead log grows with the mutations. Maybe bigger mutations have a bigger impact on the size of the write-ahead log, or make it harder to clean up. I’ll try to reproduce the issue.

Yeah, it appears to be that. I see similar behavior in 20.07.1-rc1 with each mutation having 5000 unique nodes. But at the head of the release/v20.07 branch, memory consumption is more stable and does not seem to increase linearly with the number of mutations. In that branch we load the write-ahead log in mmap mode by default.

@martinmr I added --badger.vlog=mmap to the alpha config, but the problem still exists. Even setting --badger.vlog=disk does not help.

That’s setting the value for the p directory. The latest commit on that branch has a way to set the value for the w directory.

I checked out the release/v20.07 branch and built it, then bind-mounted the generated dgraph binary into the docker container.
The new container is much, much better in memory usage. Thank you all, and keep up the good work :slight_smile:

Memory usage history after about 2 hours of continuous mutations:
[Screenshot from 2020-09-16 14-36-30]
As you can see, the memory leak would prevent alpha from running for a long time; it will crash with an OOM error.
@martinmr is there any flag that can improve this, e.g. --badger.vlog=disk or something like that?
Or should I just let alpha crash and then restart it?

I checked both --badger.vlog=disk and --badger.tables=disk as alpha flags, without any sign of improvement.

Did you try the default --badger.tables=mmap,mmap?

Can you provide us with a memory profile? See instructions at https://dgraph.io/docs/howto/retrieving-debug-information/#memory-profile
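For completeness, retrieving the profile boils down to an HTTP GET against the pprof endpoint. A minimal sketch, assuming alpha's HTTP port is 8080 (as in the compose file above) and zero's is 6080; the helper names and output path are mine:

```python
from urllib.request import urlopen

def heap_profile_url(host='localhost', port=8080):
    # Alpha serves pprof on its HTTP port (8080 in the compose file
    # above); zero serves it on 6080.
    return 'http://%s:%d/debug/pprof/heap' % (host, port)

def fetch_heap_profile(url, out_path='heap.pb.gz'):
    # Save the profile for later analysis with `go tool pprof`.
    with urlopen(url, timeout=30) as resp, open(out_path, 'wb') as f:
        f.write(resp.read())
    return out_path

print(heap_profile_url())  # http://localhost:8080/debug/pprof/heap
```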

Would be good to run this on master, because we have all the jemalloc optimizations there.

@mbande It would be great if you could try this in master. We have made a lot of memory optimizations including manual allocations since golang’s memory management is not aggressive enough.

Once this PR (https://github.com/dgraph-io/dgraph/pull/6463) is merged, running make install will download jemalloc and compile Dgraph with it.

I’ll try to do a longer test on my own. I didn’t see a big memory increase yesterday but I also didn’t run the cluster for a long period of time.

Ran the cluster on master using more mutations and I can reproduce the behavior. It looks like the issue is with zero. Below is the output of the memory profile:

go tool pprof -source_path /home/martinmr/go/src/github.com/dgraph-io -trim_path / http://localhost:6180/debug/pprof/heap

Fetching profile over HTTP from http://localhost:6180/debug/pprof/heap
Saved profile in /home/martinmr/pprof/pprof.dgraph.alloc_objects.alloc_space.inuse_objects.inuse_space.032.pb.gz
File: dgraph
Build ID: df33deaa2cbf50e81c6b75cafb82e0beea5e4429
Type: inuse_space
Time: Sep 16, 2020 at 1:02pm (PDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 2014.61MB, 99.54% of 2023.94MB total
Dropped 82 nodes (cum <= 10.12MB)
Showing top 10 nodes out of 18
      flat  flat%   sum%        cum   cum%
 1931.40MB 95.43% 95.43%  1931.40MB 95.43%  github.com/dgraph-io/dgraph/dgraph/cmd/zero.(*Oracle).commit
   83.20MB  4.11% 99.54%    83.20MB  4.11%  github.com/dgraph-io/badger/v2/skl.newArena
         0     0% 99.54%    83.20MB  4.11%  github.com/dgraph-io/badger/v2.Open
         0     0% 99.54%    83.20MB  4.11%  github.com/dgraph-io/badger/v2/skl.NewSkiplist
         0     0% 99.54%    83.20MB  4.11%  github.com/dgraph-io/dgraph/dgraph/cmd.Execute
         0     0% 99.54%  1935.45MB 95.63%  github.com/dgraph-io/dgraph/dgraph/cmd/zero.(*Server).CommitOrAbort
         0     0% 99.54%  1935.45MB 95.63%  github.com/dgraph-io/dgraph/dgraph/cmd/zero.(*Server).commit
         0     0% 99.54%    83.20MB  4.11%  github.com/dgraph-io/dgraph/dgraph/cmd/zero.init.0.func1
         0     0% 99.54%    83.20MB  4.11%  github.com/dgraph-io/dgraph/dgraph/cmd/zero.run
         0     0% 99.54%  1935.45MB 95.63%  github.com/dgraph-io/dgraph/protos/pb._Zero_CommitOrAbort_Handler
(pprof) list commit
Total: 1.98GB
ROUTINE ======================== github.com/dgraph-io/dgraph/dgraph/cmd/zero.(*Oracle).commit in dgraph/dgraph/cmd/zero/oracle.go
    1.89GB     1.89GB (flat, cum) 95.43% of Total
         .          .    128:           ki, err := strconv.ParseUint(k, 36, 64)
         .          .    129:           if err != nil {
         .          .    130:                   glog.Errorf("Got error while parsing conflict key %q: %v\n", k, err)
         .          .    131:                   continue
         .          .    132:           }
    1.89GB     1.89GB    133:           o.keyCommit[ki] = src.CommitTs // CommitTs is handed out before calling this func.
         .          .    134:   }
         .          .    135:   return nil
         .          .    136:}
         .          .    137:
         .          .    138:func (o *Oracle) currentState() *pb.OracleDelta {

The keyCommit map seems to grow without bound. @mrjn I believe you already noticed this issue before. Any ideas on how to deal with it?

@mbande Feel free to try to reproduce the issue on your own but I think we have enough info on our side. I believe this is an issue regardless of the size of the transactions but smaller transactions might just mask this issue better.

So I looked into how this map is populated. It’s populated with each transaction and it’s purged when a snapshot happens. Snapshots are controlled via the --snapshot_after flag which takes the number of transactions at which to take a snapshot. When your mutations are very big, this map fills up faster and takes more time to be purged, which is what @mbande observed.

So there are two workarounds:

  • Have smaller mutations. Smaller mutations are recommended anyway, as they reduce the chances of conflicts.
  • Change the value of --snapshot_after to a smaller value so that the map is cleared more often.
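The interaction between these two knobs can be shown with a toy model of the keyCommit map. This is purely illustrative arithmetic, not Dgraph's actual internals: assume the map gains one entry per conflict key per committed transaction and is fully purged at every snapshot.

```python
def peak_keycommit_size(num_txns, keys_per_txn, snapshot_after):
    # Toy model: the keyCommit map gains keys_per_txn entries per
    # committed transaction and is cleared at every snapshot, which
    # happens every snapshot_after transactions.
    size = 0
    peak = 0
    for txn in range(1, num_txns + 1):
        size += keys_per_txn
        peak = max(peak, size)
        if txn % snapshot_after == 0:
            size = 0  # snapshot purges the map
    return peak

# Big mutations with the default snapshot interval vs. the workarounds:
print(peak_keycommit_size(10000, 5000, 5000))  # 25000000 entries at peak
print(peak_keycommit_size(10000, 10, 5000))    # 50000 (smaller mutations)
print(peak_keycommit_size(10000, 5000, 500))   # 2500000 (smaller snapshot_after)
```

Either workaround caps the peak at roughly keys_per_txn × snapshot_after, which is why both help.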

@martinmr The documented default value for --snapshot_after is 5000 transactions, but I have done many more GraphQL mutations than that with no decrease in memory usage. Is this expected?

We could implement an mmap-based balanced binary search tree instead of using a Go map. I’m doing something similar for xidmap, implementing a trie in my branch mrjn/trie.

@martinmr: Would be a fun project for you to pick up – brush up on your CS fundamentals.

I upgraded to the master version and started it with the default options, with no substantial change in the memory usage pattern:
[Screenshot from 2020-09-21 10-03-21]