Going to production soon - need help with common tasks

Hi. We’re about to go live, and I have some questions I would very much appreciate some light on:

  1. How to properly do a “hot” backup (with the server running and users accessing it)?

  2. How to properly do a “normal” backup, with the server down?

  3. How to restore from a backup?

  4. How to properly shut down the whole thing?

  5. What is the minimum amount of RAM recommended?

  6. And CPUs? (So we can size a reasonable Amazon or similar server.)

  7. Any other advice about failover plans or the like?

Thanks!


You can do a complete export of your data and schema while the cluster is running. The exported data is in RDF N-Quad format and would need to be loaded back into Dgraph.
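For example, here’s a minimal Go sketch that kicks off an export over HTTP. It assumes the Alpha’s default HTTP port 8080 and the /admin/export endpoint; the exact endpoint depends on your Dgraph version, so check the docs for yours. The export files are written to the Alpha’s export directory on the server, not returned to the caller.

```go
// export.go: ask a running Alpha to export its data and schema.
// Assumes the default Alpha HTTP port 8080 and the /admin/export endpoint
// (version-dependent; newer releases move this to a GraphQL admin mutation).
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	resp, err := http.Get("http://localhost:8080/admin/export")
	if err != nil {
		log.Fatalf("export request failed: %v", err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatalf("reading response failed: %v", err)
	}
	fmt.Printf("status: %s\nbody: %s\n", resp.Status, body)
}
```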

Binary backups and restores will be available as an enterprise feature.

Data exports are not an option if the cluster isn’t running. You could copy the p and w directories for Alphas and zw directories for Zeros as a backup.
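If you go that route, here’s a rough Go sketch of such a “cold” backup, assuming the Dgraph processes are already stopped. The directory names come from the post above (p and w for an Alpha, zw for a Zero); the paths are placeholders for wherever those directories live in your deployment.

```go
// coldbackup.go: copy the p, w, and zw directories into a timestamped folder.
// Only safe while the Dgraph processes are stopped.
package main

import (
	"io"
	"log"
	"os"
	"path/filepath"
	"time"
)

// copyTree recursively copies src into dst, preserving the directory layout.
func copyTree(src, dst string) error {
	return filepath.Walk(src, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(src, path)
		if err != nil {
			return err
		}
		target := filepath.Join(dst, rel)
		if info.IsDir() {
			return os.MkdirAll(target, info.Mode())
		}
		in, err := os.Open(path)
		if err != nil {
			return err
		}
		defer in.Close()
		out, err := os.Create(target)
		if err != nil {
			return err
		}
		defer out.Close()
		_, err = io.Copy(out, in)
		return err
	})
}

func main() {
	backupDir := "backup-" + time.Now().Format("20060102-150405")
	for _, dir := range []string{"p", "w", "zw"} { // placeholder paths
		if err := copyTree(dir, filepath.Join(backupDir, dir)); err != nil {
			log.Fatalf("copying %s: %v", dir, err)
		}
	}
	log.Printf("cold backup written to %s", backupDir)
}
```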

If you have an export, you can load the data into Dgraph using the Live Loader or the Bulk Loader.
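For example, a sketch of invoking the Live Loader from Go. The flag names and default ports below are assumptions based on recent releases, so check `dgraph live --help` for your version; the file paths are placeholders for your exported data and schema.

```go
// liveload.go: shell out to the Live Loader to load an export back into a
// running cluster. Flag names and ports are assumptions; verify against
// `dgraph live --help` for your Dgraph version.
package main

import (
	"log"
	"os"
	"os/exec"
)

func main() {
	cmd := exec.Command("dgraph", "live",
		"--files", "export/g01.rdf.gz", // exported data (placeholder path)
		"--schema", "export/g01.schema.gz", // exported schema (placeholder path)
		"--alpha", "localhost:9080", // Alpha gRPC endpoint (default port)
		"--zero", "localhost:5080", // Zero gRPC endpoint (default port)
	)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		log.Fatalf("live loader failed: %v", err)
	}
}
```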

You can send a SIGTERM (Ctrl-C) to each Dgraph instance, which will perform a proper shutdown. You could also call /admin/shutdown on an Alpha.
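For example, a small Go sketch that sends SIGTERM to a Dgraph process by PID (Unix-only), which works for Alphas and Zeros alike; the PID is passed on the command line.

```go
// shutdown.go: send SIGTERM to a Dgraph process so it flushes and exits
// cleanly. Works for both Alphas and Zeros; Unix-only. Alternatively, an
// Alpha can be stopped by calling /admin/shutdown on its HTTP port.
package main

import (
	"log"
	"os"
	"strconv"
	"syscall"
)

func main() {
	if len(os.Args) != 2 {
		log.Fatalf("usage: %s <dgraph-pid>", os.Args[0])
	}
	pid, err := strconv.Atoi(os.Args[1])
	if err != nil {
		log.Fatalf("bad pid %q: %v", os.Args[1], err)
	}
	if err := syscall.Kill(pid, syscall.SIGTERM); err != nil {
		log.Fatalf("sending SIGTERM to %d: %v", pid, err)
	}
	log.Printf("sent SIGTERM to %d", pid)
}
```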

We recommend 8GB minimum. Dgraph can service queries and mutations concurrently and will utilize all the available cores.

Dgraph is highly available; the cluster will be able to service requests as long as the majority of each group in the cluster is up. In the event that a member goes down and cannot be recovered (e.g., the machine is unrecoverable and/or the data on that node is corrupted), you can use the /removeNode API to remove the bad member from the cluster and start up a new member in its place.
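For example, a minimal Go sketch of calling /removeNode. It assumes Zero’s default HTTP port 6080 and the id/group query parameters; the values below are placeholders for the dead member’s Raft ID and group.

```go
// removenode.go: remove a dead member from the cluster via Zero's
// /removeNode endpoint. Port and query parameters are assumptions based on
// the docs; id=3 and group=1 are placeholder values.
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	url := "http://localhost:6080/removeNode?id=3&group=1"
	resp, err := http.Get(url)
	if err != nil {
		log.Fatalf("removeNode request failed: %v", err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Printf("status: %s\nbody: %s\n", resp.Status, body)
}
```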

Thanks for your reply!


Which hosting providers do you recommend? 8GB seems very high for personal projects (that would cost $40/mo on DigitalOcean). Right now I’m running my personal project on a 1GB RAM droplet ($5/mo on DO), and the company project on 4GB, but I mostly fetch the same data, so I can easily cache it. Still, I’d like to think there must be better options available. Speaking of which, have you considered running your own “Cloud Dgraph Database”?

I would never worry about the data I store on Datastore disappearing, but I unfortunately can’t say the same about Dgraph (since I have to run it myself, which isn’t my strong suit; I don’t really enjoy hosting applications, I just enjoy developing them 🙂)… so I often end up storing the important data twice (Dgraph for my application + Datastore for backup), which isn’t really ideal.

This will only shut down the Alpha, right? What about the Zeros?

On 8GB machines (a cluster of 3 t2.large Amazon instances) we get regular OOM errors when running both Zero and Alpha on the same instance. Sometimes it happens at hundreds of thousands of items, sometimes at a million.
On a 16GB instance (m5.xlarge) it seems to run fine for millions of items (we haven’t tested tens of millions yet).