Backups: Large Backups Hold the Session Open for a Long Time without Feedback

Experience Report for Feature Request

This was suggested by a customer:

Is there an option to have the command return a 200 - “Backup has started, job-id-####” - and then poll periodically for completion of that backup job? I’m not sure keeping an HTTP connection alive for more than a few seconds, without any data transfer, makes much sense here.

What you wanted to do

Large backups of 50+ GB can hold a connection open for a long time, until the backup is finished. During this time there is no feedback: you don’t know what is happening or whether something went wrong.

Could there be an interface where the backup endpoint immediately returns 200 with a message that the backup has started, along with a job ID? The customer could then periodically poll for status using that job_id, which would show how far along the backup is and whether it has completed (see the sketch below).
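To make the request concrete, here is a minimal client-side sketch of the proposed start-then-poll flow. Everything in it is hypothetical: the `/admin/backup/start` and `/admin/backup/status` endpoints, the JSON shape, and the status values do not exist in Dgraph today and are only meant to illustrate the pattern.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// backupJob mirrors the hypothetical response shape: a job ID plus a
// coarse status such as "running", "completed", or "failed".
type backupJob struct {
	JobID  string `json:"job_id"`
	Status string `json:"status"`
}

func main() {
	// Hypothetical: the server answers 200 immediately with a job ID
	// instead of holding the connection open for the whole backup.
	resp, err := http.Post("http://localhost:8080/admin/backup/start", "application/json", nil)
	if err != nil {
		panic(err)
	}
	var job backupJob
	if err := json.NewDecoder(resp.Body).Decode(&job); err != nil {
		panic(err)
	}
	resp.Body.Close()
	fmt.Println("backup started, job id:", job.JobID)

	// Poll periodically until the job finishes, so no long-lived
	// HTTP connection sits idle with zero data transfer.
	for job.Status == "" || job.Status == "running" {
		time.Sleep(30 * time.Second)
		resp, err := http.Get("http://localhost:8080/admin/backup/status?id=" + job.JobID)
		if err != nil {
			panic(err)
		}
		if err := json.NewDecoder(resp.Body).Decode(&job); err != nil {
			panic(err)
		}
		resp.Body.Close()
		fmt.Println("backup status:", job.Status)
	}
}
```

Each poll is a short, stateless request, so a dropped connection or a proxy timeout costs nothing: the client just asks again with the same job ID.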

What you actually did

The customer cannot do anything at this point except wait and pray.

Why that wasn’t great, with examples

As above, the customer has to wait and hope that the backup succeeds and that the connection does not get dropped.

Any external references to support your case

(references upon request)

I also found this issue. When backing up a small amount of data, it works fine. But when the database is large, around 50 GB, the backup appears stuck forever and there is no way to track the problem. I suspect there is a timeout problem.

Restore, on the other hand, works brilliantly. Restore is asynchronous: once you call the API, it returns a response quickly with a restore_id, and you can track the restore status using that ID. If there were a way to track backups the same way, Dgraph would be perfect! (A sketch of that contrast is below.)
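For contrast with the blocking backup call, here is a rough sketch of the existing asynchronous restore flow. It assumes Dgraph’s admin GraphQL endpoint and its `restore` / `restoreStatus` operations; the exact field names and status values are from memory and may differ across versions.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// post sends a GraphQL document to the /admin endpoint and decodes the reply.
func post(query string, out interface{}) error {
	body, _ := json.Marshal(map[string]string{"query": query})
	resp, err := http.Post("http://localhost:8080/admin", "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return json.NewDecoder(resp.Body).Decode(out)
}

func main() {
	// The restore mutation returns almost immediately with a restoreId,
	// even though the actual restore keeps running in the background.
	var started struct {
		Data struct {
			Restore struct {
				Code      string `json:"code"`
				Message   string `json:"message"`
				RestoreId int    `json:"restoreId"`
			} `json:"restore"`
		} `json:"data"`
	}
	mutation := `mutation { restore(input: {location: "/backups"}) { code message restoreId } }`
	if err := post(mutation, &started); err != nil {
		panic(err)
	}
	id := started.Data.Restore.RestoreId
	fmt.Println("restore started, id:", id)

	// The restoreId can then be polled until the operation completes.
	// The feature request is simply the ability to track backups this way.
	for {
		time.Sleep(10 * time.Second)
		var st struct {
			Data struct {
				RestoreStatus struct {
					Status string   `json:"status"`
					Errors []string `json:"errors"`
				} `json:"restoreStatus"`
			} `json:"data"`
		}
		q := fmt.Sprintf(`query { restoreStatus(restoreId: %d) { status errors } }`, id)
		if err := post(q, &st); err != nil {
			panic(err)
		}
		fmt.Println("restore status:", st.Data.RestoreStatus.Status)
		if st.Data.RestoreStatus.Status != "IN_PROGRESS" { // status value assumed
			break
		}
	}
}
```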
