How do data size vs. query/mutation rate vs. complexity of queries/mutations affect RAM?

The crux of Dgraph’s RAM usage is still a mystery to me. Trying to plan which Slash pricing plan works best for us is hard without knowing how we will scale once we go live and start to add new users exponentially.

When we imported our initial data set we had about 8 million N-Quads, and the RDF file is just under 0.5 GB. I am not sure how this relates to on-disk size once you account for our boatload of schema indexes. With just the two of us developing against this data, we are already bumping against the 4 GB RAM stat. How will this scale?
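For what it’s worth, on a self-hosted cluster (Slash does not expose the filesystem) you can at least measure the data-to-disk inflation directly by comparing the Alpha’s Badger data directory against the source RDF. A minimal Go sketch; the `./p` and `./initial.rdf` paths are assumptions for a local setup:

```go
// Compare the on-disk size of a self-hosted Alpha's Badger data
// directory ("p") against the source RDF file, to see how much the
// schema indexes inflate the raw data. Paths are placeholders.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// dirSize sums the sizes of all regular files under root.
func dirSize(root string) (int64, error) {
	var total int64
	err := filepath.Walk(root, func(_ string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if info.Mode().IsRegular() {
			total += info.Size()
		}
		return nil
	})
	return total, err
}

func main() {
	pDir := "./p"          // assumed Alpha data directory
	rdf := "./initial.rdf" // assumed RDF file used for the initial import

	pSize, err := dirSize(pDir)
	if err != nil {
		panic(err)
	}
	rdfInfo, err := os.Stat(rdf)
	if err != nil {
		panic(err)
	}
	fmt.Printf("p directory: %.1f MB\n", float64(pSize)/(1<<20))
	fmt.Printf("RDF source:  %.1f MB\n", float64(rdfInfo.Size())/(1<<20))
	fmt.Printf("ratio: %.1fx\n", float64(pSize)/float64(rdfInfo.Size()))
}
```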

What we are expecting:

  • To increase query/mutation frequency as we onboard our 500 users
  • The query complexity to remain fairly consistent, since it is controlled by the UI
  • To increase data size gradually. It has taken ~2 years to accrue the 500 MB of data we currently have, but as we release new features we foresee users using our data in new ways and giving us more data to store and link with our existing data (a back-of-envelope projection follows this list)
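Since we can at least project raw data growth, here is the back-of-envelope arithmetic we are working from. Every rate below is a made-up placeholder, not a measurement; the ~62 bytes per N-Quad figure is derived from the ~0.5 GB / 8 M N-Quads above:

```go
// Back-of-envelope growth projection. The per-user write rate is a
// hypothetical assumption to illustrate the arithmetic; substitute
// real usage numbers once you have them.
package main

import "fmt"

func main() {
	const (
		currentMB       = 500.0 // data accrued so far (~2 years)
		users           = 500.0 // users being onboarded
		quadsPerUserDay = 50.0  // hypothetical writes per user per day
		bytesPerQuad    = 62.0  // from ~0.5 GB / 8M N-Quads, pre-index
	)

	perDayMB := users * quadsPerUserDay * bytesPerQuad / (1 << 20)
	for _, months := range []int{6, 12, 24} {
		projected := currentMB + perDayMB*30*float64(months)
		fmt.Printf("after %2d months: ~%.0f MB raw, before indexes\n", months, projected)
	}
}
```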

What we do not know is how this translates into RAM usage. I can understand the Storage metric and the Data Transfer metric, but we have no way to plan for RAM consumption at scale. We are looking hard at the dedicated backends, as we believe they will be more cost effective in the long run than counting credits; we are currently burning about 1K credits a day between just the two of us.

And to confound this a little more (as if that were needed): can anyone better explain RAM usage on the Alpha vs. the Zero? Does Zero stay at a roughly consistent 25% of the Alpha’s usage, or does that ratio vary with load? Do more indexes increase RAM on the Alpha only, or on the Zero as well?
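In case it helps, one way I could measure this myself on a self-hosted cluster: both the Alpha and the Zero expose Prometheus metrics over HTTP (at /debug/prometheus_metrics in recent Dgraph versions), so the two processes can be watched side by side under real load. A rough sketch, assuming the default local ports (Alpha 8080, Zero 6080):

```go
// Fetch the Prometheus metrics endpoints of a local Alpha and Zero and
// print every non-comment metric line mentioning "memory", so the two
// processes can be compared side by side. Ports are the defaults for a
// self-hosted cluster.
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

func memoryMetrics(name, url string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.Contains(line, "memory") && !strings.HasPrefix(line, "#") {
			fmt.Printf("[%s] %s\n", name, line)
		}
	}
	return scanner.Err()
}

func main() {
	endpoints := map[string]string{
		"alpha": "http://localhost:8080/debug/prometheus_metrics",
		"zero":  "http://localhost:6080/debug/prometheus_metrics",
	}
	for name, url := range endpoints {
		if err := memoryMetrics(name, url); err != nil {
			fmt.Printf("[%s] error: %v\n", name, err)
		}
	}
}
```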

I am also concerned about high availability and horizontal scaling:

  • How does this pan out price-wise? I have seen it stated that HA means 6 Alphas, so do I need to multiply these prices by 6 for HA?
  • If I double the number of Alphas, does RAM usage per Alpha drop by half under the same load? (See the /state sketch after this list.)
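From what I have read in the docs, Dgraph shards by predicate into groups, and Alphas added as replicas of an existing group hold a full copy of that group’s data (that is what buys HA), so I suspect extra replicas alone would not halve per-Alpha RAM; only Alphas that form a new group take over predicates. On a self-hosted cluster one can check which predicates (tablets) each group serves via Zero’s /state endpoint. A minimal sketch, assuming a Zero on the default HTTP port 6080 and guessing at only the relevant slice of the /state JSON:

```go
// Fetch Zero's /state endpoint and print which predicates (tablets)
// each group is serving. Only the fields we need are decoded; the rest
// of the /state payload is ignored.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	resp, err := http.Get("http://localhost:6080/state")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var state struct {
		Groups map[string]struct {
			Tablets map[string]json.RawMessage `json:"tablets"`
		} `json:"groups"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&state); err != nil {
		panic(err)
	}
	for group, info := range state.Groups {
		fmt.Printf("group %s serves %d predicates:\n", group, len(info.Tablets))
		for pred := range info.Tablets {
			fmt.Printf("  %s\n", pred)
		}
	}
}
```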

Note: This is tagged as Slash, but it would be an appropriate question for Dgraph in general.


Small correction to your post: an HA cluster is 6 nodes (3 Alphas and 3 Zeros). In the Slash pricing, we offer the Alphas and Zeros together, so you only need to multiply by 3x.

I’m moving this to Dgraph for the remaining questions.
