Subgraph: Segfault in GetUids()

Moved from GitHub dgraph/5472

Posted by jarifibrahim:

What version of Dgraph are you using?

v20.03.2 and cc9795734359d816606fe454ba997b7800a6bdfd

Have you tried reproducing the issue with the latest release?


What is the hardware spec (RAM, OS)?


Steps to reproduce the issue (command/config used to run Dgraph).

Run flock (dgraph-io/flock: Twitter on Dgraph); the issue shows up after some time.
The issue might not be reproducible easily.

Update: This issue always shows up within 10 minutes of running flock.

Expected behaviour and actual result.

I expected no crash. Instead, Dgraph crashed with a segfault:

I0519 11:22:52.637364      20 mvcc.go:80] Rolled up 136000 keys
I0519 11:23:01.269323      20 mvcc.go:80] Rolled up 137000 keys
I0519 11:23:10.957573      20 mvcc.go:80] Rolled up 138000 keys
I0519 11:23:17.025191      20 draft.go:1424] Num pending txns: 4
I0519 11:23:17.093727      20 draft.go:892] [0x1] Set Raft progress to index: 135829.
I0519 11:23:25.601400      20 mvcc.go:80] Rolled up 139000 keys
I0519 11:23:39.138569      20 mvcc.go:80] Rolled up 140000 keys
I0519 11:23:45.018406      20 mvcc.go:80] Rolled up 141000 keys
I0519 11:23:50.043304      20 mvcc.go:80] Rolled up 142000 keys
I0519 11:24:00.137204      20 mvcc.go:80] Rolled up 143000 keys
I0519 11:24:05.373182      20 mvcc.go:80] Rolled up 144000 keys
I0519 11:24:17.025183      20 draft.go:1424] Num pending txns: 4
I0519 11:24:17.064221      20 draft.go:892] [0x1] Set Raft progress to index: 140582.
I0519 11:24:21.175987      20 mvcc.go:80] Rolled up 145000 keys
unexpected fault address 0x0
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x15eefab]

goroutine 862934 [running]:
runtime.throw(0x1b8660b, 0x5)
        /usr/local/go/src/runtime/panic.go:774 +0x72 fp=0xc02d602b38 sp=0xc02d602b08 pc=0xa3cff2
        /usr/local/go/src/runtime/signal_unix.go:401 +0x3de fp=0xc02d602b68 sp=0xc02d602b38 pc=0xa534ee*SubGraph).formResult(0xc07e463b00, 0xc049e17c80, 0xc02c090340, 0x0, 0x0)
        /home/ibrahim/Projects/go/src/ +0x18b fp=0xc02d602cc0 sp=0xc02d602b68 pc=0x15eefab*SubGraph).processGroupBy(0xc07e463b00, 0xc03d69e8a0, 0xc094616530, 0x1, 0x2, 0x7f02bd06b940, 0xc02d602d98)
        /home/ibrahim/Projects/go/src/ +0x78 fp=0xc02d602d30 sp=0xc02d602cc0 pc=0x15f0078*SubGraph).valueVarAggregation(0xc07e463b00, 0xc03d69e8a0, 0xc094616530, 0x1, 0x2, 0xc07e463800, 0x0, 0x0)
        /home/ibrahim/Projects/go/src/ +0xe42 fp=0xc02d602fc0 sp=0xc02d602d30 pc=0x16077a2*SubGraph).populatePostAggregation(0xc07e463b00, 0xc03d69e8a0, 0xc094616530, 0x2, 0x2, 0xc07e463800, 0x0, 0x1)
        /home/ibrahim/Projects/go/src/ +0x1a9 fp=0xc02d603020 sp=0xc02d602fc0 pc=0x1607a09*SubGraph).populatePostAggregation(0xc07e463800, 0xc03d69e8a0, 0xc0944e8900, 0x1, 0x1, 0x0, 0x0, 0x1)
        /home/ibrahim/Projects/go/src/ +0xcd fp=0xc02d603080 sp=0xc02d603020 pc=0x160792d*Request).ProcessQuery(0xc02d603530, 0x1de1840, 0xc0624a2810, 0xc049634000, 0x7f02f11ecb28)
        /home/ibrahim/Projects/go/src/ +0xd61 fp=0xc02d603240 sp=0xc02d603080 pc=0x1612cc1*Request).Process(0xc02d603530, 0x1de1840, 0xc0624a2810, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /home/ibrahim/Projects/go/src/ +0x7c fp=0xc02d6032d0 sp=0xc02d603240 pc=0x161323c, 0xc0624a2810, 0xc02ce46320, 0x0, 0x0, 0x14)
        /home/ibrahim/Projects/go/src/ +0x241 fp=0xc02d6037a8 sp=0xc02d6032d0 pc=0x162b431*Server).doQuery(0x2946028, 0x1de1840, 0xc0624a2810, 0xc034b0c540, 0x0, 0x0, 0x0, 0x0)
        /home/ibrahim/Projects/go/src/ +0x4e9 fp=0xc02d6039d0 sp=0xc02d6037a8 pc=0x162a929*Server).Query(0x2946028, 0x1de1840, 0xc0624a2690, 0xc034b0c540, 0x2946028, 0xc0624a2690, 0xc08b1d4a80)
        /home/ibrahim/Projects/go/src/ +0xa6 fp=0xc02d603a20 sp=0xc02d6039d0 pc=0x162a386, 0x2946028, 0x1de1840, 0xc0624a2690, 0xc02ce3d440, 0x0, 0x1de1840, 0xc0624a2690, 0xc05d810480, 0x166)
        /home/ibrahim/Projects/go/pkg/mod/ +0x217 fp=0xc02d603a90 sp=0xc02d603a20 pc=0xf519b7*Server).processUnaryRPC(0xc000484160, 0x1df08e0, 0xc04bd3fb00, 0xc0384de000, 0xc03ee42960, 0x272ad38, 0x0, 0x0, 0x0)
        /home/ibrahim/Projects/go/pkg/mod/ +0x460 fp=0xc02d603e18 sp=0xc02d603a90 pc=0xf344c0*Server).handleStream(0xc000484160, 0x1df08e0, 0xc04bd3fb00, 0xc0384de000, 0x0)
        /home/ibrahim/Projects/go/pkg/mod/ +0xd97 fp=0xc02d603f48 sp=0xc02d603e18 pc=0xf38427*Server).serveStreams.func1.1(0xc04c65ed50, 0xc000484160, 0x1df08e0, 0xc04bd3fb00, 0xc0384de000)
        /home/ibrahim/Projects/go/pkg/mod/ +0xbb fp=0xc02d603fb8 sp=0xc02d603f48 pc=0xf4538b
        /usr/local/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc02d603fc0 sp=0xc02d603fb8 pc=0xa6dc81
created by*Server).serveStreams.func1
        /home/ibrahim/Projects/go/pkg/mod/ +0xa1

jarifibrahim commented :

Similar failure

I0519 11:43:17.029347      20 draft.go:892] [0x2] Set Raft progress to index: 149153.
I0519 11:43:28.537112      20 mvcc.go:80] Rolled up 221000 keys
I0519 11:43:32.778882      20 mvcc.go:80] Rolled up 222000 keys
I0519 11:43:34.646335      20 mvcc.go:80] Rolled up 223000 keys
unexpected fault address 0x0
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x15eef88]

goroutine 1089670 [running]:
runtime.throw(0x1b8660b, 0x5)
        /usr/local/go/src/runtime/panic.go:774 +0x72 fp=0xc001358b38 sp=0xc001358b08 pc=0xa3cff2
        /usr/local/go/src/runtime/signal_unix.go:401 +0x3de fp=0xc001358b68 sp=0xc001358b38 pc=0xa534ee*List).GetUids(...)
        /home/ibrahim/Projects/go/src/*SubGraph).formResult(0xc050825200, 0xc06d3aaec0, 0xc05442d3e0, 0x0, 0x0)
        /home/ibrahim/Projects/go/src/ +0x168 fp=0xc001358cc0 sp=0xc001358b68 pc=0x15eef88*SubGraph).processGroupBy(0xc050825200, 0xc03a10de30, 0xc052989cd0, 0x1, 0x2, 0x7f1f001248e0, 0xc001358d98)
        /home/ibrahim/Projects/go/src/ +0x78 fp=0xc001358d30 sp=0xc001358cc0 pc=0x15f0078*SubGraph).valueVarAggregation(0xc050825200, 0xc03a10de30, 0xc052989cd0, 0x1, 0x2, 0xc050824f00, 0x0, 0x0)
        /home/ibrahim/Projects/go/src/ +0xe42 fp=0xc001358fc0 sp=0xc001358d30 pc=0x16077a2*SubGraph).populatePostAggregation(0xc050825200, 0xc03a10de30, 0xc052989cd0, 0x2, 0x2, 0xc050824f00, 0x0, 0x1)
        /home/ibrahim/Projects/go/src/ +0x1a9 fp=0xc001359020 sp=0xc001358fc0 pc=0x1607a09*SubGraph).populatePostAggregation(0xc050824f00, 0xc03a10de30, 0xc0529ee9f0, 0x1, 0x1, 0x0, 0x0, 0x1)
        /home/ibrahim/Projects/go/src/ +0xcd fp=0xc001359080 sp=0xc001359020 pc=0x160792d*Request).ProcessQuery(0xc001359530, 0x1de1840, 0xc0a6fc61e0, 0xc054d78000, 0x7f1f9f4d0108)
        /home/ibrahim/Projects/go/src/ +0xd61 fp=0xc001359240 sp=0xc001359080 pc=0x1612cc1*Request).Process(0xc001359530, 0x1de1840, 0xc0a6fc61e0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /home/ibrahim/Projects/go/src/ +0x7c fp=0xc0013592d0 sp=0xc001359240 pc=0x161323c, 0xc0a6fc61e0, 0xc00130e280, 0x0, 0x0, 0x14)
        /home/ibrahim/Projects/go/src/ +0x241 fp=0xc0013597a8 sp=0xc0013592d0 pc=0x162b431*Server).doQuery(0x2946028, 0x1de1840, 0xc0a6fc61e0, 0xc07f834150, 0x0, 0x0, 0x0, 0x0)
        /home/ibrahim/Projects/go/src/ +0x4e9 fp=0xc0013599d0 sp=0xc0013597a8 pc=0x162a929*Server).Query(0x2946028, 0x1de1840, 0xc0a9b0ff50, 0xc07f834150, 0x2946028, 0xc0a9b0ff50, 0xc007f2ca80)
        /home/ibrahim/Projects/go/src/ +0xa6 fp=0xc001359a20 sp=0xc0013599d0 pc=0x162a386, 0x2946028, 0x1de1840, 0xc0a9b0ff50, 0xc02793e420, 0x0, 0x1de1840, 0xc0a9b0ff50, 0xc080a0c780, 0x166)
        /home/ibrahim/Projects/go/pkg/mod/ +0x217 fp=0xc001359a90 sp=0xc001359a20 pc=0xf519b7*Server).processUnaryRPC(0xc0003f0f20, 0x1df08e0, 0xc047611980, 0xc041df2d00, 0xc007b6df20, 0x272ad38, 0x0, 0x0, 0x0)
        /home/ibrahim/Projects/go/pkg/mod/ +0x460 fp=0xc001359e18 sp=0xc001359a90 pc=0xf344c0*Server).handleStream(0xc0003f0f20, 0x1df08e0, 0xc047611980, 0xc041df2d00, 0x0)
        /home/ibrahim/Projects/go/pkg/mod/ +0xd97 fp=0xc001359f48 sp=0xc001359e18 pc=0xf38427*Server).serveStreams.func1.1(0xc04800c6b0, 0xc0003f0f20, 0x1df08e0, 0xc047611980, 0xc041df2d00)
        /home/ibrahim/Projects/go/pkg/mod/ +0xbb fp=0xc001359fb8 sp=0xc001359f48 pc=0xf4538b
        /usr/local/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc001359fc0 sp=0xc001359fb8 pc=0xa6dc81
created by*Server).serveStreams.func1
        /home/ibrahim/Projects/go/pkg/mod/ +0xa1
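
For context on why a fault inside GetUids() is surprising: the sketch below assumes List is a protobuf-generated message type (the package path is truncated in the trace above, so this is an assumption). Getters generated by protoc for Go check for a nil receiver, so calling GetUids() on a nil *List should simply return nil. Also, an ordinary nil dereference in Go normally surfaces as a "nil pointer dereference" panic, not as the "unexpected fault address" fatal error seen in these logs. The List type and countUids helper here are hypothetical stand-ins written for illustration, not Dgraph's actual code.

```go
package main

import "fmt"

// List is a hypothetical stand-in for a protobuf-generated message
// that carries a slice of UIDs. It is not copied from Dgraph.
type List struct {
	Uids []uint64
}

// GetUids mirrors the shape of a protoc-generated Go getter:
// it tolerates a nil receiver and returns nil in that case.
func (m *List) GetUids() []uint64 {
	if m != nil {
		return m.Uids
	}
	return nil
}

// countUids is a hypothetical helper showing that callers can use
// the getter without checking the pointer first.
func countUids(l *List) int {
	return len(l.GetUids())
}

func main() {
	var nilList *List
	fmt.Println(countUids(nilList))                     // prints 0
	fmt.Println(countUids(&List{Uids: []uint64{1, 2, 3}})) // prints 3
}
```

If that assumption holds, a nil List alone would not explain the crash, which would point toward corrupted memory rather than a missing nil check.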


jarifibrahim commented :

I saw this crash again while running flock. Alpha logs: "alpha subgraph panic" (GitHub).

jarifibrahim commented :

I ran Dgraph with the race detector, but it didn't detect any races.

jarifibrahim commented :

I ran Dgraph with --badger.tables=disk --badger.vlog=disk and with the Badger cache disabled (in Alpha and in the reindexing code). Alpha still crashed. This seems to be a Dgraph issue.

jarifibrahim commented :

I ran multiple experiments today and have concluded that this is an issue with my local setup. I tried reproducing the crash on 4 different computers, and it occurs on only one of them.
On the faulty machine, the crash also occurs on old commits such as dgraph-io/dgraph@60d1c13, "Rename z package to testutil. (#3730)".

Since this looks like an issue with my local setup and not with Dgraph, I'm going to lower the priority.