Index out of range on goroutine fanout for UID handling

Moved from GitHub dgraph/4207

Posted by kkessler:

What version of Dgraph are you using?

v1.1.0

Have you tried reproducing the issue with the latest release?

No

What is the hardware spec (RAM, OS)?

Kubernetes on GCP - large nodes, but that should not matter.

Steps to reproduce the issue (command/config used to run Dgraph).

Default setup from the Helm chart in contrib/ - no overrides used.
Run a query like:

{
  query(func: eq(projectid,"blah")) @filter(has(myfield) AND match(name,"MyCluster1",8)) {
    expand(_all_)
  }
}
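
For reference, here is a minimal sketch of how the same query could be run programmatically against a test alpha using the dgo v2 client. The schema (index choices on projectid and name), the predicate types, and the localhost:9080 address are assumptions for illustration, not the exact production setup; match() needs a trigram index and eq() needs some string index.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/dgraph-io/dgo/v2"
	"github.com/dgraph-io/dgo/v2/protos/api"
	"google.golang.org/grpc"
)

func main() {
	// Assumes an alpha reachable on localhost:9080; adjust for the k8s service.
	conn, err := grpc.Dial("localhost:9080", grpc.WithInsecure())
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	dg := dgo.NewDgraphClient(api.NewDgraphClient(conn))

	ctx := context.Background()
	// Hypothetical minimal schema: match() requires a trigram index,
	// eq() requires some string index (hash used here as a guess).
	op := &api.Operation{Schema: `
		projectid: string @index(hash) .
		name:      string @index(trigram) .
		myfield:   string .
	`}
	if err := dg.Alter(ctx, op); err != nil {
		log.Fatal(err)
	}

	q := `{
	  query(func: eq(projectid, "blah")) @filter(has(myfield) AND match(name, "MyCluster1", 8)) {
	    expand(_all_)
	  }
	}`
	txn := dg.NewTxn()
	defer txn.Discard(ctx)
	resp, err := txn.Query(ctx, q)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(resp.Json))
}
```

Note that in v1.1 expand(_all_) depends on dgraph.type being set on the nodes, so the data loaded for a reproduction would need types as well.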

Expected behaviour and actual result.

expected: returns nodes that match the query and filter
actual:

panic: runtime error: index out of range

goroutine 1744 [running]:
github.com/dgraph-io/dgraph/worker.(*queryState).handleUidPostings.func1(0x0, 0x1b, 0xc011f67b00, 0x9d9dc6)
	/tmp/go/src/github.com/dgraph-io/dgraph/worker/task.go:614 +0x15cc
github.com/dgraph-io/dgraph/worker.(*queryState).handleUidPostings.func2(0xc011ba13e0, 0xc011bc8500, 0x0, 0x1b)
	/tmp/go/src/github.com/dgraph-io/dgraph/worker/task.go:691 +0x3a
created by github.com/dgraph-io/dgraph/worker.(*queryState).handleUidPostings
	/tmp/go/src/github.com/dgraph-io/dgraph/worker/task.go:690 +0x3b3

One interesting thing to note is that I only get this to blow up roughly 1/x times, which probably means it happens when the API call hits one specific alpha server. It's possible it's environmental to the state of a server, but I figured it's a bug no matter what, since we probably should not be panicking.

martinmr commented:

Can you share the exact commit that was used to build your version of Dgraph? It should be printed when you start an alpha. Otherwise it's hard to debug this from the stack trace you provided.

kkessler commented:

Certainly: it's just the v1.1.0 Docker container downloaded from docker.io. Here is the output:

[Decoder]: Using assembly version of decoder
I1028 04:00:26.228713       1 init.go:98]

Dgraph version   : v1.1.0
Dgraph SHA-256   : 7d4294a80f74692695467e2cf17f74648c18087ed7057d798f40e1d3a31d2095
Commit SHA-1     : ef7cdb28
Commit timestamp : 2019-09-04 00:12:51 -0700
Branch           : HEAD
Go version       : go1.12.7

martinmr commented:

It looks like the n function argument is not being calculated properly, which leads to the panic.
The proper fix is to decouple the multiple uses of the n argument so the logic is clearer, and then find where it is not being computed correctly.
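
To illustrate the class of bug (this is only a sketch of the general fan-out pattern, not the actual code in worker/task.go): when one count is reused both to size the chunks and to bound the slices handed to each goroutine, the last chunk can index past the end of the UID list unless it is clamped.

```go
package main

import (
	"fmt"
	"sync"
)

// processChunks fans work out over fixed-size chunks of uids. The bug class
// described above appears when the final chunk is not clamped to len(uids):
// the goroutine then slices past the end and panics with index out of range.
func processChunks(uids []uint64, width int) {
	var wg sync.WaitGroup
	for start := 0; start < len(uids); start += width {
		end := start + width
		if end > len(uids) { // clamp: without this, the last chunk can panic
			end = len(uids)
		}
		wg.Add(1)
		go func(chunk []uint64) {
			defer wg.Done()
			for _, uid := range chunk {
				_ = uid // placeholder for per-UID posting work
			}
		}(uids[start:end])
	}
	wg.Wait()
}

func main() {
	uids := []uint64{1, 2, 3, 4, 5, 6, 7}
	processChunks(uids, 3) // 7 is not a multiple of 3, so clamping matters
	fmt.Println("done")
}
```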

@kkessler Is it possible to get a minimal dataset from you that reproduces this issue? We'll try from our end, but having the data that triggers the error would be of great use. Thanks.