E0929 12:34:58.529130 1 groups.go:322] Error while proposing node removal: Node 0x4 not part of group
github.com/dgraph-io/dgraph/conn.(*Node).ProposePeerRemoval
/tmp/go/src/github.com/dgraph-io/dgraph/conn/node.go:594
github.com/dgraph-io/dgraph/worker.(*groupi).applyState.func1
/tmp/go/src/github.com/dgraph-io/dgraph/worker/groups.go:320
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1337
Hm, this was reproducible by running a heavy query that uses all of the node's memory.
We learned to be careful with memory that Alpha nodes use.
At the moment, I can only suggest that you try something like this:
run alpha nodes with lower memory (e.g. 4GB)
write a query (e.g. one with @recurse) that will use all available memory
In these cases, our Kubernetes was not able to properly recover Dgraph.
This issue came up while we were testing with our real-time data. We then realized that Dgraph is not stable when it hits memory limits for any reason. Now we are trying to avoid that.
Try this. It is the best help I can offer at the moment, since we tore down the environment where we had this issue.
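For anyone trying the steps above, here is a minimal Go sketch of building such a query and the HTTP request for it. Everything specific in it is an assumption, not something from the original environment: the predicate name `friend`, the depth of 50, the Alpha address `http://localhost:8080`, and the `application/graphql+-` content type (newer Dgraph releases may expect a different one, so check the docs for your version). The helper names `recurseQuery` and `newQueryRequest` are made up for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// recurseQuery builds a query that follows one predicate recursively.
// The predicate and depth are placeholders; pick values that make the
// traversal large on your own data set.
func recurseQuery(pred string, depth int) string {
	return fmt.Sprintf(
		"{ q(func: has(%s)) @recurse(depth: %d, loop: true) { uid %s } }",
		pred, depth, pred)
}

// newQueryRequest prepares (but does not send) a POST to the Alpha's
// /query endpoint. The Content-Type value is an assumption that matches
// older Dgraph releases.
func newQueryRequest(alphaURL, query string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, alphaURL+"/query", strings.NewReader(query))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/graphql+-")
	return req, nil
}

func main() {
	req, err := newQueryRequest("http://localhost:8080", recurseQuery("friend", 50))
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	// Only print what would be sent; pass req to http.DefaultClient.Do
	// to actually run it against a test cluster.
	fmt.Println(req.Method, req.URL.String())
}
```

The sketch deliberately stops short of sending the request, so it is safe to run anywhere. Only point it at a throwaway cluster.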
@jarifibrahim I am not even sure if this stack trace was copied in its entirety.
Segmentation faults always print the stack. There are two segmentation faults but only one stack trace. Also, for each element in the stack trace, the method and line are printed in that order. But for the last element, there’s only the method name. So it doesn’t look like this stack trace is complete.
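The completeness check described above (every method name should be followed by a file:line entry) can be sketched in Go roughly as follows. This is only an illustration under assumptions: it expects one frame element per line, treats any token of the form `path/with/slash:number` as a location, and the helper names `isLocation` and `incompleteFrames` are hypothetical. The sample trace in `main` is a made-up truncation that reuses frame names from the log earlier in this thread.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isLocation reports whether a line looks like "some/path/file.go:123",
// optionally followed by extra fields such as "+0x1d".
func isLocation(line string) bool {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return false
	}
	loc := fields[0]
	i := strings.LastIndex(loc, ":")
	if i <= 0 || !strings.Contains(loc[:i], "/") {
		return false
	}
	_, err := strconv.Atoi(loc[i+1:])
	return err == nil
}

// incompleteFrames returns the method names that are not immediately
// followed by a location line, which hints at a truncated trace.
func incompleteFrames(trace string) []string {
	var missing []string
	lines := strings.Split(trace, "\n")
	for i := 0; i < len(lines); i++ {
		fn := strings.TrimSpace(lines[i])
		if fn == "" || isLocation(fn) {
			continue // blank line or a location without a method
		}
		if i+1 < len(lines) && isLocation(lines[i+1]) {
			i++ // frame is complete; skip its location line
			continue
		}
		missing = append(missing, fn)
	}
	return missing
}

func main() {
	// Illustrative input only: the last frame has no file:line entry.
	truncated := "github.com/dgraph-io/dgraph/worker.(*groupi).applyState.func1\n" +
		"/tmp/go/src/github.com/dgraph-io/dgraph/worker/groups.go:320\n" +
		"runtime.goexit"
	fmt.Println(incompleteFrames(truncated))
}
```

An empty result means every method had a location; any names returned are frames where the trace was probably cut off.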
@igormiletic Would you happen to have kept the entire logs for this alpha? Thanks.