We’re using a shared Slash instance running v21.03.0-46-g7346eac24, and we suddenly started getting several strange errors from it. Here’s a sampling from our API server’s logs:
2 UNKNOWN: hash mismatch the claimed startTs|namespace
2 UNKNOWN: : context deadline exceeded
14 UNAVAILABLE
2 UNKNOWN: Please retry again, server is not ready to accept requests
I can access this Dgraph instance via Ratel, where queries seem to work fine.
I’ve narrowed the problem down to the transaction type. Read-only queries are working from both our API server and Ratel with no problems. Mutations via Ratel are working, but all mutation attempts from our API server are failing with:
2 UNKNOWN: hash mismatch the claimed startTs|namespace
We’re using the JavaScript gRPC client, and my working theory is that the error occurs when we try to start a new write transaction on our DgraphClient instance:
client.newTxn({ readOnly: false })
Is it expected that a client can get into this state? Should we be attempting to catch these errors and instantiate a new DgraphClient instance?
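As a rough sketch of the catch-and-retry idea raised above, the snippet below retries only errors that look transient and rethrows everything else. The helper names (isRetryable, withRetry), the retry count, and the backoff delays are assumptions made for illustration and are not part of dgraph-js. The status codes and message fragments come from the log sample above. The hash mismatch error is deliberately treated as non-retryable, since it failed on every mutation attempt rather than intermittently.

```javascript
// gRPC status codes seen in the logs above (numeric values from the gRPC spec).
const GRPC_UNKNOWN = 2;
const GRPC_UNAVAILABLE = 14;

// Hypothetical classifier: returns true only for errors that look transient.
// Assumes the error carries a numeric `code` and a string `message`.
function isRetryable(err) {
  if (!err) return false;
  if (err.code === GRPC_UNAVAILABLE) return true;
  const msg = String(err.message || "");
  return (
    err.code === GRPC_UNKNOWN &&
    (msg.includes("context deadline exceeded") ||
      msg.includes("server is not ready"))
  );
}

// Hypothetical wrapper: runs `operation`, retrying transient failures with
// exponential backoff. Non-retryable errors are rethrown immediately.
async function withRetry(operation, { retries = 3, baseDelayMs = 100 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt >= retries || !isRetryable(err)) throw err;
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Possible usage with an existing dgraph-js client (not executed here):
//
// await withRetry(async () => {
//   const txn = client.newTxn();
//   try {
//     const mu = new dgraph.Mutation();
//     mu.setSetJson({ name: "example" });
//     await txn.mutate(mu);
//     await txn.commit();
//   } finally {
//     await txn.discard();
//   }
// });
```

Each attempt creates a fresh transaction inside the callback, so a failed transaction is never reused. Whether replacing the whole DgraphClient instance would help is a separate question that this sketch does not address.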
@tron Are you using the latest Dgraph client? Based on your example code, it looks like you’re using either the dgraph-js or the dgraph-js-http client. Both of these clients are currently at v21.03.1.
@dmai Thanks for your suggestion. Upgrading the dgraph-js client to v21.03.1 resolved the issue for us.
Now a different question: Can we control precisely when our Slash instance is upgraded? Naturally we’d like to avoid surprises like this in the future.