Report a Dgraph Bug
What version of Dgraph are you using?
We are using v20.03.4 in production, but we have also tried the master-branch Docker image as well as the newer v20.03 releases.
Have you tried reproducing the issue with the latest release?
Yes
What is the hardware spec (RAM, OS)?
In production, where the issue was observed, we were running 3 Dgraph alpha nodes and 1 zero node, each with 256 GB of memory, on CentOS 7.
Steps to reproduce the issue (command/config used to run Dgraph).
dgraph version information:
Dgraph version : v2.0.0-rc1-726-g0c7f9a426
Dgraph codename :
Dgraph SHA-256 : be551d773bca16181fb117444c2153ac1651bf0460577b4b321f0f5f70b40867
Commit SHA-1 : 0c7f9a426
Commit timestamp : 2020-09-17 11:27:36 -0700
Branch : master
Go version : go1.14.1
jemalloc enabled : false
docker-compose.yml file:
version: "3.2"
services:
zero:
image: dgraph/dgraph:${DGRAPH_VERSION}
container_name: dg_zero
logging:
options:
max-size: ${DOCKER_MAX_LOG_SIZE}
volumes:
- type: bind
source: /mnt/tmp/dgraph
target: /dgraph
ports:
- 5080:5080
- 6080:6080
networks:
- net3
restart: on-failure
command: dgraph zero --my=zero:5080 --rebalance_interval ${DGRAPH_REBALANCE_INTERVAL} --v ${DGRAPH_LOG_LEVEL}
alpha1:
image: dgraph/dgraph:${DGRAPH_VERSION}
container_name: dg_alpha1
logging:
options:
max-size: ${DOCKER_MAX_LOG_SIZE}
volumes:
- type: bind
source: /mnt/tmp/dgraph
target: /dgraph
ports:
- 8080:8080
- 9080:9080
networks:
- net3
restart: on-failure
command: dgraph alpha --my=alpha1:7080 --lru_mb=${DGRAPH_LRU_MB} --zero=zero:5080 --idx=1 -p p1 -w w1 --v ${DGRAPH_LOG_LEVEL} --whitelist 0.0.0.0/0
alpha2:
image: dgraph/dgraph:${DGRAPH_VERSION}
container_name: dg_alpha2
logging:
options:
max-size: ${DOCKER_MAX_LOG_SIZE}
volumes:
- type: bind
source: ${DGRAPH_DATA_PATH}
target: /dgraph
ports:
- 8081:8081
- 9081:9081
networks:
- net3
restart: on-failure
command: dgraph alpha --my=alpha2:7081 --lru_mb=${DGRAPH_LRU_MB} --zero=zero:5080 -o 1 --idx=2 -p p2 -w w2 --v ${DGRAPH_LOG_LEVEL} --whitelist 0.0.0.0/0
alpha3:
image: dgraph/dgraph:${DGRAPH_VERSION}
container_name: dg_alpha3
logging:
options:
max-size: ${DOCKER_MAX_LOG_SIZE}
volumes:
- type: bind
source: ${DGRAPH_DATA_PATH}
target: /dgraph
ports:
- 8082:8082
- 9082:9082
networks:
- net3
restart: on-failure
command: dgraph alpha --my=alpha3:7082 --lru_mb=${DGRAPH_LRU_MB} --zero=zero:5080 -o 2 --idx=3 -p p3 -w w3 --v ${DGRAPH_LOG_LEVEL} --whitelist 0.0.0.0/0
ratel:
image: dgraph/dgraph:${DGRAPH_VERSION}
container_name: dg_ratel
logging:
options:
max-size: ${DOCKER_MAX_LOG_SIZE}
volumes:
- type: bind
source: ${DGRAPH_DATA_PATH}
target: /dgraph
ports:
- 18000:8000
networks:
- net3
command: dgraph-ratel
networks:
net3:
.env file:
DOCKER_MAX_LOG_SIZE=50m
DGRAPH_VERSION=master
DGRAPH_LRU_MB=2048
DGRAPH_REBALANCE_INTERVAL=8m
DGRAPH_LOG_LEVEL=2
DGRAPH_DATA_PATH=/mnt/tmp/dgraph
Expected behavior and actual result.
We expected to be able to generate a large graph and then continually mutate and query it afterwards. In production the graph was created using the bulk loader, and 3 alpha nodes were initialized from this data. Once the alpha nodes are started we expected to be able to mutate and query the graph without issue. We created a stress test to reproduce the memory issues we were observing in production.
Using the docker-compose.yml file and the associated .env file above, along with our dgraph-stress-test code (https://github.com/JoelWesleyReed/dgraph-stress-test, which tests Dgraph scalability by loading synthetic data), generating a large graph eventually results in one of the alpha nodes being killed by the OS due to excessive memory usage. We were able to capture a pprof heap profile shortly before the OS killed the process; it is included below. If the stress test is stopped and Dgraph is restarted, alpha memory utilization is initially low, but as queries are executed via Ratel it rises again until Dgraph becomes unresponsive and an alpha is eventually killed by the OS.
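For reference, the heap profile mentioned above was grabbed from the alpha's HTTP port shortly before the OOM kill. A minimal sketch of the capture, assuming the default /debug/pprof/heap endpoint on port 8080 from the compose file above (the output filename is just an illustrative placeholder):
package main

import (
	"io"
	"log"
	"net/http"
	"os"
)

func main() {
	// Fetch a heap profile from alpha1's embedded pprof handler
	// (port 8080 per the docker-compose file above).
	resp, err := http.Get("http://localhost:8080/debug/pprof/heap")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Write the profile to disk so it can be inspected later with `go tool pprof`.
	out, err := os.Create("alpha1.heap.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	if _, err := io.Copy(out, resp.Body); err != nil {
		log.Fatal(err)
	}
	log.Println("wrote alpha1.heap.pprof")
}
The pprof top20 and list output at the bottom of this report comes from loading a profile captured this way into go tool pprof.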
In our production stack each alpha node has 256 GB of memory available, and Dgraph seems to be consuming all of it even though the heap profile never reports more than ~100 GB of usage. From our Prometheus / Grafana monitoring we can see that go_memstats_heap_sys_bytes reports ~214 GB right before the alpha node crashes. Observing memory usage through htop and the process_resident_memory_bytes metric, the RES memory does not align with the heap profile: it is much higher (~221 GB) and consumes most of the system's resources. We have been unable to determine what is causing so much memory to be used during mutations. Each mutation has ~1000 quads, many of which reference an upsert query included in the transaction.
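For context, here is a simplified sketch of the shape of the transactions the stress test issues, using the dgo client: one upsert query plus roughly 1000 N-Quads that reference the uid it binds. The predicate names (xid, linked_to), values, and quad layout are illustrative placeholders, not our real schema.
package main

import (
	"context"
	"fmt"
	"log"
	"strings"

	"github.com/dgraph-io/dgo/v2"
	"github.com/dgraph-io/dgo/v2/protos/api"
	"google.golang.org/grpc"
)

func main() {
	conn, err := grpc.Dial("localhost:9080", grpc.WithInsecure())
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	dg := dgo.NewDgraphClient(api.NewDgraphClient(conn))

	// Upsert query: bind the uid of an existing node by its external id.
	// (Assumes the xid predicate has an index suitable for eq().)
	query := `query { parent as var(func: eq(xid, "node-0")) }`

	// Build ~1000 N-Quads; half of them reference the uid bound by the query above.
	var nquads strings.Builder
	for i := 1; i <= 500; i++ {
		fmt.Fprintf(&nquads, "_:n%d <xid> %q .\n", i, fmt.Sprintf("node-%d", i))
		fmt.Fprintf(&nquads, "uid(parent) <linked_to> _:n%d .\n", i)
	}

	req := &api.Request{
		Query:     query,
		Mutations: []*api.Mutation{{SetNquads: []byte(nquads.String())}},
		CommitNow: true,
	}
	if _, err := dg.NewTxn().Do(context.Background(), req); err != nil {
		log.Fatal(err)
	}
}
This is not the exact code from dgraph-stress-test, only an approximation of the transaction shape described above.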
pprof top20 and list:
(pprof) top20
Showing nodes accounting for 78801.12MB, 99.20% of 79436.92MB total
Dropped 170 nodes (cum <= 397.18MB)
Showing top 20 nodes out of 24
flat flat% sum% cum cum%
75612.27MB 95.19% 95.19% 75612.27MB 95.19% github.com/dgraph-io/dgraph/posting.(*List).Uids.func1
1431.72MB 1.80% 96.99% 2181.75MB 2.75% github.com/dgraph-io/badger/v2/pb.(*TableIndex).Unmarshal
750.03MB 0.94% 97.93% 750.03MB 0.94% github.com/dgraph-io/badger/v2/pb.(*BlockOffset).Unmarshal
733.22MB 0.92% 98.85% 733.22MB 0.92% github.com/dgraph-io/dgraph/protos/pb.(*UidBlock).Unmarshal
240.38MB 0.3% 99.16% 973.60MB 1.23% github.com/dgraph-io/dgraph/protos/pb.(*UidPack).Unmarshal
32.50MB 0.041% 99.20% 1452.66MB 1.83% github.com/dgraph-io/dgraph/posting.(*pIterator).next
0.50MB 0.00063% 99.20% 974.10MB 1.23% github.com/dgraph-io/dgraph/protos/pb.(*PostingList).Unmarshal
0.50MB 0.00063% 99.20% 2182.79MB 2.75% github.com/dgraph-io/badger/v2/table.OpenTable
0 0% 99.20% 974.10MB 1.23% github.com/dgraph-io/badger/v2.(*Item).Value
0 0% 99.20% 2182.79MB 2.75% github.com/dgraph-io/badger/v2.newLevelsController.func1
0 0% 99.20% 2181.75MB 2.75% github.com/dgraph-io/badger/v2/pb.(*TableIndex).XXX_Unmarshal
0 0% 99.20% 2181.75MB 2.75% github.com/dgraph-io/badger/v2/table.(*Table).initBiggestAndSmallest
0 0% 99.20% 2181.75MB 2.75% github.com/dgraph-io/badger/v2/table.(*Table).initIndex
0 0% 99.20% 2181.75MB 2.75% github.com/dgraph-io/badger/v2/table.(*Table).readTableIndex
0 0% 99.20% 77064.92MB 97.01% github.com/dgraph-io/dgraph/posting.(*List).Uids
0 0% 99.20% 77064.92MB 97.01% github.com/dgraph-io/dgraph/posting.(*List).iterate
0 0% 99.20% 974.60MB 1.23% github.com/dgraph-io/dgraph/posting.(*List).readListPart
0 0% 99.20% 989.64MB 1.25% github.com/dgraph-io/dgraph/posting.(*pIterator).moveToNextPart
0 0% 99.20% 989.64MB 1.25% github.com/dgraph-io/dgraph/posting.(*pIterator).moveToNextValidPart
0 0% 99.20% 974.10MB 1.23% github.com/dgraph-io/dgraph/posting.unmarshalOrCopy
(pprof) list Uids.func1
Total: 77.58GB
ROUTINE ======================== github.com/dgraph-io/dgraph/posting.(*List).Uids.func1 in dgraph/posting/list.go
73.84GB 73.84GB (flat, cum) 95.19% of Total
. . 1108: return out, nil
. . 1109: }
. . 1110:
. . 1111: err := l.iterate(opt.ReadTs, opt.AfterUid, func(p *pb.Posting) error {
. . 1112: if p.PostingType == pb.Posting_REF {
73.84GB 73.84GB 1113: res = append(res, p.Uid)
. . 1114: }
. . 1115: return nil
. . 1116: })
. . 1117: l.RUnlock()
. . 1118: if err != nil {
(pprof) quit