How do data size vs. query/mutation rate vs. complexity of queries/mutations affect RAM?

The crux of Dgraph’s RAM usage is still a mystery to me. Trying to plan which Slash pricing plan works best for us is hard without knowing how we will scale once we go live and start to add new users exponentially.

When we imported our initial data set we had about 8 million N-Quads, and the RDF file is just under 0.5 GB. I am not sure how this relates to on-disk size once you account for our boatload of schema indexes. With just the two of us developing against this data, we are already bumping against the 4 GB RAM stat. How will this scale?
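For what it’s worth, on a self-hosted cluster (Slash does not expose the filesystem) you can at least measure the data-to-disk inflation directly by comparing the Alpha’s Badger data directory against the source RDF. A minimal Go sketch; the `./p` and `./initial.rdf` paths are assumptions for a local setup:

```go
// Compare the on-disk size of a self-hosted Alpha's Badger data
// directory ("p") against the source RDF file, to see how much the
// schema indexes inflate the raw data. Paths are placeholders.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// dirSize sums the sizes of all regular files under root.
func dirSize(root string) (int64, error) {
	var total int64
	err := filepath.Walk(root, func(_ string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if info.Mode().IsRegular() {
			total += info.Size()
		}
		return nil
	})
	return total, err
}

func main() {
	pDir := "./p"          // assumed Alpha data directory
	rdf := "./initial.rdf" // assumed RDF file used for the initial import

	pSize, err := dirSize(pDir)
	if err != nil {
		panic(err)
	}
	rdfInfo, err := os.Stat(rdf)
	if err != nil {
		panic(err)
	}
	fmt.Printf("p directory: %.1f MB\n", float64(pSize)/(1<<20))
	fmt.Printf("RDF source:  %.1f MB\n", float64(rdfInfo.Size())/(1<<20))
	fmt.Printf("ratio: %.1fx\n", float64(pSize)/float64(rdfInfo.Size()))
}
```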

What we are expecting:

  • To increase query/mutation frequency as we onboard our 500 users
  • The query complexity to remain fairly consistent, since it is controlled by the UI
  • To increase data size gradually. It has taken ~2 years to accrue the 500 MB of data we currently have, but as we release new features we foresee users using our data in new ways and giving us more data to store and link with our existing data (a back-of-envelope projection follows this list)
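Since we can at least project raw data growth, here is the back-of-envelope arithmetic we are working from. Every rate below is a made-up placeholder, not a measurement; the ~62 bytes per N-Quad figure is derived from the ~0.5 GB / 8 M N-Quads above:

```go
// Back-of-envelope growth projection. The per-user write rate is a
// hypothetical assumption to illustrate the arithmetic; substitute
// real usage numbers once you have them.
package main

import "fmt"

func main() {
	const (
		currentMB       = 500.0 // data accrued so far (~2 years)
		users           = 500.0 // users being onboarded
		quadsPerUserDay = 50.0  // hypothetical writes per user per day
		bytesPerQuad    = 62.0  // from ~0.5 GB / 8M N-Quads, pre-index
	)

	perDayMB := users * quadsPerUserDay * bytesPerQuad / (1 << 20)
	for _, months := range []int{6, 12, 24} {
		projected := currentMB + perDayMB*30*float64(months)
		fmt.Printf("after %2d months: ~%.0f MB raw, before indexes\n", months, projected)
	}
}
```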

What we do not know is how this translates into RAM usage. I can understand the Storage metric and the Data Transfer metric, but we have no way to plan for RAM consumption at scale. We are looking hard at the dedicated backends, as we believe they will be more cost effective in the long run than counting credits; we are currently burning about 1K credits a day between just the two of us.

And to confound this a little more (as if that were needed): can anyone better explain RAM usage on the Alpha vs. the Zero? Does Zero stay at a roughly consistent 25% of the Alpha’s usage, or does that ratio vary with load? Do more indexes increase RAM on the Alpha only, or on the Zero as well?
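In case it helps, one way I could measure this myself on a self-hosted cluster: both the Alpha and the Zero expose Prometheus metrics over HTTP (at /debug/prometheus_metrics in recent Dgraph versions), so the two processes can be watched side by side under real load. A rough sketch, assuming the default local ports (Alpha 8080, Zero 6080):

```go
// Fetch the Prometheus metrics endpoints of a local Alpha and Zero and
// print every non-comment metric line mentioning "memory", so the two
// processes can be compared side by side. Ports are the defaults for a
// self-hosted cluster.
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

func memoryMetrics(name, url string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.Contains(line, "memory") && !strings.HasPrefix(line, "#") {
			fmt.Printf("[%s] %s\n", name, line)
		}
	}
	return scanner.Err()
}

func main() {
	endpoints := map[string]string{
		"alpha": "http://localhost:8080/debug/prometheus_metrics",
		"zero":  "http://localhost:6080/debug/prometheus_metrics",
	}
	for name, url := range endpoints {
		if err := memoryMetrics(name, url); err != nil {
			fmt.Printf("[%s] error: %v\n", name, err)
		}
	}
}
```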

I am also concerned about high availability and horizontal scaling:

  • How does this pan out price-wise? I have seen it stated that HA means 6 Alphas, so do I need to multiply these prices by 6 for HA?
  • If I double the number of Alphas, does RAM usage per Alpha drop by half under the same load? (See the /state sketch after this list.)
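From what I have read in the docs, Dgraph shards by predicate into groups, and Alphas added as replicas of an existing group hold a full copy of that group’s data (that is what buys HA), so I suspect extra replicas alone would not halve per-Alpha RAM; only Alphas that form a new group take over predicates. On a self-hosted cluster one can check which predicates (tablets) each group serves via Zero’s /state endpoint. A minimal sketch, assuming a Zero on the default HTTP port 6080 and guessing at only the relevant slice of the /state JSON:

```go
// Fetch Zero's /state endpoint and print which predicates (tablets)
// each group is serving. Only the fields we need are decoded; the rest
// of the /state payload is ignored.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	resp, err := http.Get("http://localhost:6080/state")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var state struct {
		Groups map[string]struct {
			Tablets map[string]json.RawMessage `json:"tablets"`
		} `json:"groups"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&state); err != nil {
		panic(err)
	}
	for group, info := range state.Groups {
		fmt.Printf("group %s serves %d predicates:\n", group, len(info.Tablets))
		for pred := range info.Tablets {
			fmt.Printf("  %s\n", pred)
		}
	}
}
```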

Note: This is tagged as Slash, but it would be an appropriate question for Dgraph in general.


Small correction to your post: an HA cluster is 6 nodes (3 Alphas and 3 Zeros). In the Slash pricing, we offer the Alphas and Zeros together, so you only need to multiply by 3x.

I’m moving this to Dgraph for the remaining questions.
