Error while writing: write tcp 172.18.0.2:8080->172.27.65.10:58088: write: broken pipe

Error while writing: write tcp 172.18.0.2:8080->172.27.65.10:58088: write: broken pipe

Error while writing: write tcp 172.18.0.2:8080->172.27.65.10:51716: write: connection timed out

Has anyone run into this problem? The installation method is Docker.

Yes, I encounter this error too. No idea what is causing it, but it leads to up to 1 minute of downtime for our whole production infrastructure.

Can anyone help, please?

Hi @FaxBoy @maaft
Have you solved it? I have this problem now too.

Do you run deep Lambda queries? In other words, do your queries trigger Lambda field resolvers which then query other Lambda fields using GraphQL?

We are facing a similar issue when a user or an application runs a huge query. The leader alpha service restarts very frequently. When we checked our error log, we found the error messages below just before the service failure:

Dec 13 11:43:30 dgraph: E1213 11:43:30.390352 x.go:354] Error while writing: write tcp 192.168.x.x:xxxx->192.168.x.x:xx965: write: broken pipe

Can anyone help to fix this issue?

I’m going to be a little vague, due to lack of context. I don’t know what you’re using: local Dgraph (Docker? K8s? a local binary?) or in the cloud? Are you using lambdas (how?)? Is that our py client? What are you doing? Context always helps.

It looks like you are encountering a network error while trying to write data to a TCP connection. This error can be caused by a variety of factors, including network congestion, a faulty network connection, or an issue with the destination server running Dgraph. Maybe a container failing?

In general, a “broken pipe” error means the process tried to write to a connection that the other end had already closed, so the data could not be delivered.
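To make that concrete, here is a minimal sketch that provokes the same class of error on a loopback connection. It has nothing Dgraph-specific in it: the function name `provoke_write_error` and the loopback setup are just assumptions for illustration. One side closes its socket, and the other side keeps writing until the operating system reports the failure.

```python
import socket
import time


def provoke_write_error(max_attempts: int = 50) -> str:
    """Write to a peer that has already closed; return the exception name."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    client.close()  # the peer goes away, like a client that gave up waiting

    try:
        for _ in range(max_attempts):
            try:
                # An early write can still succeed because it only fills the
                # local send buffer; a later write fails once the peer's
                # reset has been received.
                server_side.sendall(b"x" * 1024)
            except ConnectionError as exc:
                return type(exc).__name__
            time.sleep(0.05)
        return "no error observed"
    finally:
        server_side.close()
        listener.close()


if __name__ == "__main__":
    print(provoke_write_error())
```

On Linux this typically reports `BrokenPipeError`, but the exact exception can differ by platform and timing, which is why the sketch catches the broader `ConnectionError`.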

One thing to try is increasing the timeout values involved, so that a long-running response is not cut off before it has been fully written. You could also try a different network connection to rule out the network itself.

If the problem persists, it may help to gather more information about the circumstances under which the error occurs. This could include checking the network logs and monitoring the performance of your server (bare metal or not) to see whether any patterns or trends point to the root cause.
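As one way to look for such patterns, the sketch below counts write errors per remote address and reason, based only on the log line shape quoted in this thread. The regular expression and the function name `summarize_write_errors` are assumptions for illustration, and the pattern only handles IPv4-style `host:port` pairs.

```python
import re
from collections import Counter
from typing import Iterable

# Matches lines such as:
#   Error while writing: write tcp 172.18.0.2:8080->172.27.65.10:58088: write: broken pipe
WRITE_ERROR = re.compile(
    r"write tcp (?P<local>\S+)->(?P<remote_ip>[^\s:]+):(?P<remote_port>[^\s:]+): "
    r"write: (?P<reason>.+?)\s*$"
)


def summarize_write_errors(lines: Iterable[str]) -> Counter:
    """Count write errors grouped by (remote IP, reason)."""
    counts: Counter = Counter()
    for line in lines:
        match = WRITE_ERROR.search(line)
        if match:
            counts[(match["remote_ip"], match["reason"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Error while writing: write tcp 172.18.0.2:8080->172.27.65.10:58088: write: broken pipe",
        "Error while writing: write tcp 172.18.0.2:8080->172.27.65.10:51716: write: connection timed out",
    ]
    for (remote_ip, reason), count in summarize_write_errors(sample).items():
        print(remote_ip, reason, count)
```

If most errors come from a single remote address, that points at one client or proxy rather than at the server as a whole.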

If you are still unable to resolve the issue, it may help to consult a network or system administrator who has experience troubleshooting similar problems. They will be able to give more targeted advice based on the details of your setup.

If your leader alpha service restarts frequently when a user or application runs a large query, you may not have enough resources available to handle that query. For example, a query that is too large or touches too many indexes may exceed the available resources and cause the cluster to fail. In that situation, shutting down the cluster and restarting it recovers from the failure. It is also important to make sure the cluster is sized and configured for the workload, to avoid future failures.

I have a detailed reproduction scenario here: Load Data using Dgraph w/ Kubernetes - #9 by Ann_Zhang

This happened while I was using Live Loader to load data into Dgraph.

I uninstalled the Dgraph cluster, cleaned all the data bound to the pods, and re-installed the cluster on my EKS, but the TCP error still occurs in an infinite loop. However, applying the schema with a curl command and then mutating data (insert or update) through GraphQL works without problems. So the problem seems to be specific to importing data with Live Loader.

I notice the same problem with other software like Grafana.

So I would get more or less the same error message:

Jan 29 13:07:38 ubuntu-server grafana[12120]: logger=context xxxxx t=2025-01-29T13:07:38.456408785+01:00 level=error msg="Error writing to response" err="write tcp 127.0.0.1:3002->127.0.0.1:37254: write: broken pipe"

Port 3002 is the HTTP port of Grafana. Port 8080 in your case is the HTTP port of Dgraph.

The issue is most likely a connection problem between the service (Dgraph) and a reverse proxy such as Nginx or HAProxy.

You can check the connectivity via:

netstat -tnp | grep 8080
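As a complementary check, a small script can confirm whether the port accepts TCP connections at all. This is only a sketch: the function name `can_connect` and the default timeout are assumptions, and a successful handshake says nothing about whether long responses survive the proxy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection performs the TCP handshake; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print(can_connect("localhost", 8080))
```

Running it from the proxy host and from the client host can help narrow down which hop is failing.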

An error like broken pipe in this case suggests that either the client (browser) or the upstream server unexpectedly closed the connection while the proxy was busy proxying the data.

The root cause is hard to pin down. It could be an issue in Nginx or HAProxy, or the client could be crashing. It could also be the server restarting its network, or sysctl kernel settings being reloaded. There are many reasons why such a network pipe can break while data is being transmitted.