Panic: concurrent map read+write on v20.11.2

Report a Dgraph Bug

What version of Dgraph are you using?

Dgraph Version
$ dgraph version
[Decoder]: Using assembly version of decoder
Page Size: 4096

Dgraph version   : v20.11.2
Dgraph codename  : tchalla-2
Dgraph SHA-256   : 0153cb8d3941ad5ad107e395b347e8d930a0b4ead6f4524521f7a525a9699167
Commit SHA-1     : 94f3a0430
Commit timestamp : 2021-02-23 13:07:17 +0530
Branch           : HEAD
Go version       : go1.15.5
jemalloc enabled : true

Have you tried reproducing the issue with the latest release?

It is the latest stable release, but I have not tried it on master.

What is the hardware spec (RAM, OS)?

Kubernetes (GKE): 3 alphas in 3 groups, with 20 cores and 30 GB RAM each, and 512Gi SSDs.

Steps to reproduce the issue (command/config used to run Dgraph).

Normal Helm chart install.

I found this during normal operation and thought I should report it. I am seeing this panic on my testing system, on 2 of my 3 alphas (each alpha is in its own group of 1).

fatal error: concurrent map iteration and map write

goroutine 4068994 [running]:
runtime.throw(0x1db1ab4, 0x26)
        /usr/local/go/src/runtime/panic.go:1116 +0x72 fp=0xc065a49930 sp=0xc065a49900 pc=0xa60d32
        /usr/local/go/src/runtime/map.go:853 +0x554 fp=0xc065a499b0 sp=0xc065a49930 pc=0xa3a0b4
runtime.mapiterinit(0x1bbaf20, 0xc06a2a31a0, 0xc065a49a88)
        /usr/local/go/src/runtime/map.go:843 +0x1c5 fp=0xc065a499d0 sp=0xc065a499b0 pc=0xa39a65, 0x18, 0x18, 0xc000101800, 0xffffffffffffffff, 0x0, 0x0, 0x0)
        /ext-go/1/src/ +0x1fc fp=0xc065a49b18 sp=0xc065a499d0 pc=0x16227fc
        /ext-go/1/src/*incrRollupi).rollUpKey(0x2bb6c40, 0xc06ff52920, 0xc06a2c24a0, 0x18, 0x18, 0x0, 0x0)
        /ext-go/1/src/ +0x6f fp=0xc065a49be0 sp=0xc065a49b18 pc=0x162042f
*incrRollupi).Process.func1(0xc06cf18460, 0x1)
        /ext-go/1/src/ +0x24e fp=0xc065a49cc8 sp=0xc065a49be0 pc=0x1628c8e
*incrRollupi).Process(0x2bb6c40, 0xc05d273c20)
        /ext-go/1/src/ +0x4f1 fp=0xc065a49fd0 sp=0xc065a49cc8 pc=0x1620db1
        /usr/local/go/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc065a49fd8 sp=0xc065a49fd0 pc=0xa9a721
created by *node).startTask
        /ext-go/1/src/ +0x4ee

Forgot to mention: I was experimenting with the cache settings. Here are the values that were in use, in case they matter:

    cache_mb: 6144
    cache_percentage: 30,50,20,0

So wait, you were starting Dgraph using k8s, and it just… failed?

Well, I assume it failed in response to being used. I am inserting data into it using dgo.

Seems like a bug. @hardik

This should be resolved by Pull Request #7632 in dgraph-io/dgraph by jarifibrahim: "fix(postingList): Acquire lock before reading the cached posting list".

Awesome, thanks!