Queries regarding export/schema update

Hi team,

I will be running Dgraph in a cluster of 3 Alpha instances, 1 Ratel, and 1 Zero. Dgraph will receive read requests through RPC and write requests through events from a queue. I have two questions:

  1. While these reads and writes are happening, if I take an export of the data, will read requests fail and writes stop until the export is completed?
  2. Similarly, if I update the schema while reads and writes are happening, will read requests fail and writes stop until the schema update is done?

I personally ran tests in which I tried to replicate the scenarios above. During export, no errors were found, but during the schema update, a small percentage of reads failed. I wanted to hear from the Dgraph team what should happen in these two cases.

@praneelrathore Dgraph supports snapshot isolation, and all operations are performed on a snapshot. Each export request is associated with a timestamp (readTs), and the export will contain only the data that was committed before the readTs. The readTs is the latest timestamp at the moment the export request is received.

So when an export starts, we’re exporting the data as it existed at the given readTs. If you insert new data while an export is running, the export won’t contain that data. Any data with a timestamp greater than the readTs is invisible to the export request.
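The snapshot-read behavior described above can be sketched with a toy multi-version store. This is a simplified illustration of the idea, not Dgraph’s actual implementation: every write is tagged with a commit timestamp, and a reader at a given readTs only sees versions committed at or before that timestamp.

```python
class ToyMVCCStore:
    """A minimal multi-version store illustrating snapshot reads."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value)
        self._clock = 0      # monotonically increasing timestamp source

    def write(self, key, value):
        """Record a new version of `key` at a fresh commit timestamp."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def latest_ts(self):
        return self._clock

    def read(self, key, read_ts):
        # Versions with commit_ts > read_ts are invisible to this reader,
        # just as data written after an export's readTs is excluded.
        visible = [(ts, v) for ts, v in self._versions.get(key, []) if ts <= read_ts]
        return max(visible)[1] if visible else None


store = ToyMVCCStore()
store.write("name", "alice")
read_ts = store.latest_ts()   # an "export" starts here, pinned to read_ts
store.write("name", "bob")    # a concurrent write after the export began

print(store.read("name", read_ts))            # alice (snapshot view)
print(store.read("name", store.latest_ts()))  # bob (latest view)
```

The export sees "alice" even though "bob" was written while it was in progress, which is why concurrent writes don’t cause errors during an export; they are simply not visible to it.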

> During export, no errors were found, but during schema update, some small percentage of reads failed.

A schema update would drop some data (such as index data being rebuilt), and your reads would wait for that schema update to complete. What was the error that you saw?


@ibrahim Thank you so much for the quick response, as always. Thanks for clarifying the export part.

During the schema update, I got “connection refused” errors while dialing the Alpha through gRPC, but I am not sure whether they were caused only by the schema update; in my environment, they could very well be due to exceeding the system’s open-file limit. I am trying a few more things on this part, and I will post here if I find evidence that the reads indeed failed due to the schema update.

Hi @praneelrathore, can you please elaborate on your schema update operation? If you are facing an error persistently, give us some details to reproduce it.

Hi @praneelrathore,

I tried to replicate the scenario you are describing in a script. I am able to export data or mutate the schema while reading and writing concurrently. The script I wrote for this is here.
If you are able to replicate the issue you are facing locally, it would be helpful to share the error message and any error logs from the Alpha.
Hope this helps.


Hey @Rahul @minhaj

I did some testing around this scenario, and I was also able to export data and mutate the schema while reads and writes were happening concurrently. I guess the “connection refused” error was just due to the open-file limit. However, I grepped the Alpha logs for “error” strings after these operations and found these:

Applying proposal. Error: Pending transactions found. Please retry operation. Proposal: "mutations:<group_id:1 start_ts:41519 schema:<predicate:\"entity_type\" value_type:STRING > schema:<predicate:\"entity_value\" value_type:STRING directive:INDEX tokenizer:\"exact\" tokenizer:\"fulltext\" count:true > 
TryAbort 1 txns with start ts. Error: <nil>
TryAbort: No aborts found. Quitting.
tryAbortTransactions for 1 txns. Error: <nil>

When do these logs get published? Is it something I need to worry about?

I didn’t get your question: the logs get published where? I don’t think you should worry about it.

This seems to be a scenario in which there was a pending write that may have been aborted for some reason. Dgraph will retry that aborted transaction before applying any new transactions.
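Since Dgraph reports this condition to clients as a retriable error (“Please retry operation”), a common client-side pattern is to wrap commits in a small retry loop with backoff. Below is a hedged sketch: `RetriableError` and `flaky_commit` are hypothetical stand-ins for your client’s error type and mutation call, not part of any Dgraph API.

```python
import random
import time


class RetriableError(Exception):
    """Stand-in for errors like 'Pending transactions found. Please retry operation.'"""


def with_retries(operation, max_attempts=5, base_delay=0.1):
    """Run `operation`, retrying retriable failures with jittered exponential backoff.

    `operation` is any zero-argument callable, e.g. a closure that opens a
    transaction, applies a mutation, and commits it.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RetriableError:
            if attempt == max_attempts:
                raise
            # Exponential backoff with jitter to avoid retry storms.
            time.sleep(base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.5))


# Example: a commit that fails twice before succeeding.
attempts = {"n": 0}

def flaky_commit():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RetriableError("Pending transactions found. Please retry operation.")
    return "committed"

print(with_retries(flaky_commit))  # prints "committed" after two retries
```

The same wrapper also covers transaction aborts caused by write-write conflicts under snapshot isolation, where retrying the whole transaction is the expected recovery path.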

Hope this helps.


@Rahul The logs were on the Alpha server, as stated in the previous reply.

Anyway, thanks for the clarifications. Really appreciate the team’s quick replies 🙂
