Running Concurrent Update Mutations to the same node(@id) causes RPC error and transaction to be aborted. Doesn't Auto Retry

Moved from GitHub dgraph/5341

Posted by guhan-v:

What version of Dgraph are you using?

v20.03.1

Have you tried reproducing the issue with the latest release?

Yes

What is the hardware spec (RAM, OS)?

High Availability K8s cluster as built with (https://github.com/dgraph-io/dgraph/blob/master/contrib/config/kubernetes/dgraph-ha/dgraph-ha.yaml)

Steps to reproduce the issue (command/config used to run Dgraph).

Schema:

type Workspace {
    workspaceId: String! @search(by: [hash]) @id
}

interface Id {
    key: String! @search(by: [hash]) @id
    onWorkspace:  [Workspace]!
    hasTraits: [Traits]
    hasGroupTraits: [GroupTraits]
}

type AnonymousId implements Id {
    email: [Email] @hasInverse(field: anonymousId)
    userId: [UserId] @hasInverse(field: anonymousId)
}

Mutations:

mutation {
  upd1: updateAnonymousId(input: {
    filter: { key: { eq: "id2" } },
    set: { onWorkspace: [{ workspaceId: "${workspaceId}" }] }
  }) {
    anonymousid {
      key
    }
  }
}

mutation {
  upd2: updateAnonymousId(input: {
    filter: { key: { eq: "id2" } },
    set: { onWorkspace: [{ workspaceId: "${workspaceId2}" }] }
  }) {
    anonymousid {
      key
    }
  }
}

If you run these two mutations concurrently (you might need to run many of them at the same time), they collide on updating the same node and produce the following error:

mutation updateAnonymousId failed because Dgraph mutation failed because rpc error: code = Aborted desc = Transaction has been aborted. Please retry
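To make the collision easier to trigger, something like the following could fire many copies of the update at once and count how many come back aborted. This is only a rough sketch: the endpoint URL, the use of a GraphQL variable for the workspace ID, and the helper names `post_mutation` and `count_aborts` are assumptions for illustration, not part of the original report.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

# Assumption: a Dgraph Alpha serving GraphQL at this address; adjust for your cluster.
GRAPHQL_URL = "http://localhost:8080/graphql"

# Same update as in the report, with the workspace ID passed as a variable.
MUTATION = """
mutation ($workspaceId: String!) {
  updateAnonymousId(input: {
    filter: { key: { eq: "id2" } },
    set: { onWorkspace: [{ workspaceId: $workspaceId }] }
  }) { anonymousid { key } }
}
"""


def post_mutation(workspace_id: str) -> Dict[str, Any]:
    """Send one update mutation over HTTP and return the decoded JSON response."""
    body = json.dumps(
        {"query": MUTATION, "variables": {"workspaceId": workspace_id}}
    ).encode()
    req = urllib.request.Request(
        GRAPHQL_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


def count_aborts(
    workspace_ids: List[str],
    send: Callable[[str], Dict[str, Any]] = post_mutation,
    workers: int = 16,
) -> int:
    """Run one mutation per workspace ID in parallel; count aborted responses."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        responses = list(pool.map(send, workspace_ids))
    return sum(
        1
        for r in responses
        if any(
            "Transaction has been aborted" in e.get("message", "")
            for e in r.get("errors") or []
        )
    )
```

Calling `count_aborts([f"ws{i}" for i in range(100)])` against a test cluster would then show roughly how often the updates to node "id2" collide.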

Expected behaviour and actual result.

Expected Behaviour: Multiple mutations trying to update the same node are retried automatically or otherwise handled.

Actual Result: The update returns an error and the entire transaction fails. Error: mutation updateAnonymousId failed because Dgraph mutation failed because rpc error: code = Aborted desc = Transaction has been aborted. Please retry

With the plain gRPC GraphQL± (GQL±) API, you are expected to retry an aborted transaction yourself with a backoff, but the GraphQL API seems to be a layer of abstraction in front of that. As such, it feels like clients shouldn't need to retry the mutations; the server side should retry them instead.
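Until something like that exists server side, a client can approximate it by retrying whenever the response carries the abort message. The following is a minimal sketch of that workaround, assuming the GraphQL response is a decoded JSON dict with an `errors` list; `run_with_retry`, `is_aborted`, and the default attempt count and delay are hypothetical names and values chosen for illustration.

```python
import random
import time
from typing import Any, Callable, Dict

# Substring of the error message quoted in this report.
ABORT_MARKER = "Transaction has been aborted"


def is_aborted(response: Dict[str, Any]) -> bool:
    """Return True if any GraphQL error mentions an aborted transaction."""
    errors = response.get("errors") or []
    return any(ABORT_MARKER in str(err.get("message", "")) for err in errors)


def run_with_retry(
    send: Callable[[], Dict[str, Any]],
    max_attempts: int = 5,
    base_delay: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call send() until it is not aborted or the attempts run out.

    Returns the last response either way, so the caller still sees the
    abort error if every attempt collided.
    """
    response: Dict[str, Any] = {}
    for attempt in range(max_attempts):
        response = send()
        if not is_aborted(response):
            return response
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter so colliding clients spread out.
            sleep(base_delay * (2 ** attempt) * (1 + random.random()))
    return response
```

Here `send` would be any zero-argument function that posts the mutation and returns the parsed response. Errors other than the abort are returned immediately rather than retried, since only the abort is known to be safe to repeat.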

MichaelJCompton commented :

Accepting this. This means it’ll be picked up soon and we’ll scope out how we might enable retry, maybe as an optional argument in the mutation. At this stage it’s unclear, but it looks like a good feature.

MichaelJCompton commented :

Related RFC: Transactions in GraphQL