gRPC error with many simultaneous connections

Moved from GitHub dgraph-js/109

Posted by paulrostorp:

Hi all,
I’m getting Error: 14 UNAVAILABLE: Received RST_STREAM with error code 7 when running a large number of parallel queries using the dgraph-js client.
I get no errors with a few simultaneous queries, but past a certain threshold (somewhere between roughly 600 and 1200 simultaneous queries) the transactions return the above error.
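For anyone trying to reproduce this, here is a minimal sketch of the load pattern described above: fire N queries at once and tally which ones fail, grouped by error message. The names `fireParallel` and `runQuery` are hypothetical and not taken from the original report; `runQuery` stands in for whatever function issues a single dgraph-js query.

```typescript
// Hypothetical stand-in for one query issued through the client.
// Not part of the original report; plug in your own function here.
type QueryFn = (i: number) => Promise<unknown>;

interface LoadResult {
  ok: number;
  failed: number;
  // Failure counts keyed by error message, so gRPC status strings stand out.
  errors: Record<string, number>;
}

// Start `count` queries at the same time and tally successes and failures.
async function fireParallel(count: number, runQuery: QueryFn): Promise<LoadResult> {
  // Each query resolves to null on success or to its error message on failure,
  // so one failing query does not hide the outcome of the others.
  const outcomes = await Promise.all(
    Array.from({ length: count }, (_, i) =>
      runQuery(i).then(
        () => null,
        (err: unknown) => (err instanceof Error ? err.message : String(err)),
      ),
    ),
  );

  const result: LoadResult = { ok: 0, failed: 0, errors: {} };
  for (const message of outcomes) {
    if (message === null) {
      result.ok += 1;
    } else {
      result.failed += 1;
      result.errors[message] = (result.errors[message] || 0) + 1;
    }
  }
  return result;
}
```

Calling this with increasing values of `count` (for example 100, 500, 1000) should show roughly where the failures begin, assuming the threshold behaves as described in this post.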

My Dgraph cluster is running at very low resource utilization (i.e., it is not even close to being overloaded), and there is nothing useful in the Alpha’s logs…

I’m using dgraph-js@2.1.0, and my Dgraph cluster is v2.0.0-rc1.

Do you have any ideas on how to debug this?

paulrostorp commented :

I set up another cluster with dgraph:v1.2.1 and the same error still occurs. I also set up tracing, and there are no errors in Jaeger…

Still nothing in the Alpha logs though…

paulrostorp commented :

Further experimenting led me to discover that if I run my Node program as multiple parallel processes, each running 500 simultaneous queries, everything works perfectly.
I literally just did this:
yarn run dev & yarn run dev & yarn run dev & yarn run dev & yarn run dev & yarn run dev

So this leads me to believe that this has something to do with a Node limit for gRPC…
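If the limit really is per process, one thing that might be worth trying is capping the number of in-flight queries inside a single process instead of spawning several. Below is a minimal sketch of such a cap. The helper name `runWithLimit` is hypothetical, and nothing in this thread confirms that this avoids the error; it only mirrors the "500 per process" observation above.

```typescript
// Run `tasks` with at most `limit` of them in flight at any time.
// Results are returned in the same order as the tasks.
// If any task rejects, the returned promise rejects with that error.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<T[]> {
  if (limit < 1) {
    throw new Error("limit must be at least 1");
  }

  const results: T[] = new Array(tasks.length);
  let next = 0; // index of the next task that has not been started yet

  // Each worker repeatedly claims the next unstarted task until none remain.
  // Claiming is safe without locks because JavaScript runs this on one thread.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With this, each query would be wrapped as a function returning a promise and passed in with a limit such as 500, the per-process figure that worked in the experiment above.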

paulrostorp commented :

Tracing for the error shows:

paulrostorp commented :

I enabled gRPC logs on the server (GRPC_GO_LOG_SEVERITY_LEVEL=info, GRPC_GO_LOG_VERBOSITY_LEVEL=2) and now I see errors:

INFO: 2020/03/06 12:40:16 transport: returning. connection error: desc = "transport is closing"

shynome commented :

From grpc/grpc-node issue #1158 (Error: 8 RESOURCE_EXHAUSTED: Bandwidth exhausted):

After moving from node 12 to node 13, we no longer have the issue.

@paulrostorp maybe you can give that a try.