Mutation failed because Dgraph execution failed because : context deadline exceeded

Report a GraphQL Bug

What edition and version of Dgraph are you using?

Edition:

  • SlashGraphQL
  • Dgraph (community edition/Dgraph Cloud)

If you are using the community edition or enterprise edition of Dgraph, please list the version:

Dgraph Version
$ dgraph version
 
PASTE YOUR RESULTS HERE

Have you tried reproducing the issue with the latest release?

Steps to reproduce the issue (paste the query/schema if possible)

Ran a simple mutation:

mutation setHomepage {
  addUserSetting(input:[{
    name:"someName"
    value:"someValue"
    user:{username:"myUserName"}
  }]) {
    numUids
  }
}

Expected behaviour and actual result.

Expected mutation to run instead got the response:

mutation addUserSetting failed because Dgraph execution failed because : context deadline exceeded

Waited just a little bit and ran the same thing again with no problems. :man_shrugging:

After upgrading to an HA shared instance, I expected it to be more reliable than a standalone instance.

This "context deadline exceeded" error makes no sense to me. Can anybody explain what it even means?

I’m also getting this error in mutations, particularly if triggering it multiple times over a small period of time. The weird thing is Apollo isn’t even capable of catching the error, so my React app just crashes. I’m wondering if this has to do with the number of concurrent mutations performed on a specific node id or if it just can’t handle many mutations at the same time (which is worrisome since I plan on serving my app to multiple users). I’m using a shared instance by the way.

This error occurs if the db client does not get an answer from the database in the expected time frame.
See here for what a context is.

The described error is very common on my local instance when I initially (live) load a Dgraph database with a large number of mutations. I found that it usually happens when Dgraph is doing log compaction and thus cannot react to client requests.

@dmai what is your take on this? Is the problem due to high load, when I and another user here have not even launched our apps publicly and are already seeing this?

Is it because of the load by others on our shared instance?

Are there any throttling recommendations, or a load level that is too high for a shared cloud setup?

It is hard to know how high the load even is when the metrics on the cloud UI show very little detail, and to get more metrics one has to pay big bucks for dedicated HA.

If it’s helpful for anyone: I ran into this same error when testing querying an Interface entity (without also requesting the related Type that uses that Interface). Only the first query returned an error. The second query was fine.

interface TestShape {
    id: ID!
    shape: String!
}

interface TestColor {
    id: ID!
    color: String!
}

type TestFigure implements TestShape & TestColor {
    id: ID!
    shape: String!
    color: String!
    size: Int!
}

# Triggered The Error
# Querying an Interface, not a Type
# And also not returning the optional TestFigure type (like query below) within the Interface query
query GetTestColor {
  queryTestColor {
    __typename
    color
    id
  }
}
# Did not trigger the error
# Querying the Interface while also requesting the TestFigure Type
query GetTestColor {
  queryTestColor {
    __typename
    color
    id
    ... on TestFigure {
      __typename
      id
      color
      shape
      size
    }
  }
}

I don’t know what is going on behind the scenes, but I can imagine that, in this specific scenario, since I’m not querying a specific Type directly, Dgraph is having to do some computations/lookups/something through the entire database to make the connections needed for this Interface query - and that there is a timeout…

…but only the first time. Second time = relationships are stored and the query is fine and quick.

So did you ever find a solution for this?
What is the solution, and what’s really going on?
I’m getting this every now and then, after I just jumped into the shared instance plan, and we are about to go live with our application.

Nope. Still a current issue.

@amaster507 do you know if you can do Read Only and Best Effort via GraphQL, or if it is already being done?

GraphQL depends upon transaction consistency. If you disable all mutations, you might be able to configure read only under the hood.

Hmm, that’s a mutation issue, so there’s no relation to what I was thinking.

In that case maybe we should have a “retry” method. But there’s a problem, it asks for a “numUids” response. It would not be possible to do this just in time. Not sure…

That really feels like a resource issue. Since GraphQL requires an immediate response, the request stays pending; it’s not “async”. Did you check whether the mutation went through, i.e. whether the data was written?

Possible Causes:

  • Network Latency
  • Firewall Rules / Cloud Security Rules
  • Resource Contention
  • Slow I/O

Source: “Why am I seeing `context deadline exceeded` errors” (HashiCorp Help Center). I agree with this text.

And with an add mutation, the state has to be queried after the mutation in order to pass the rules.