What are the simplest ways to improve write throughput in a live Dgraph cluster? I’m looking for some rules of thumb to follow as the write volume we must accommodate increases. For instance, which of these might make a difference?
- Avoiding certain types of indexes that take a long time to build
- Splitting large mutations into chunks
- Increasing the memory allocated to each server
- Having many relatively small predicates (would this help the cluster to execute multi-predicate writes in parallel?)
- Increasing the number of servers in the cluster (when would more help and when would the returns diminish?)
In general, indexes slow down writes. Regex (trigram), term, and fulltext indexes in particular can generate many additional index edges per mutation, so define only the indexes your queries actually need.
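As an illustration, a hypothetical schema might keep a cheap `hash` index where equality lookups are needed and avoid `trigram`/`fulltext` tokenizers unless queries require them (the predicate names here are made up):

```
# Cheap to maintain on writes: a single hash edge per value
email: string @index(hash) .

# Expensive: trigram (needed for regex) and fulltext tokenizers
# fan out into many extra index edges per mutation.
# bio: string @index(trigram, fulltext) .
bio: string .
```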
It’s better to batch your mutations; `dgraph live` uses a batch size of 1,000 by default.
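As a minimal sketch, splitting a large write into client-side batches might look like the following (the `chunked` helper and the record shape are illustrative, not part of any Dgraph client API):

```python
def chunked(records, size=1000):
    """Yield successive batches of at most `size` records."""
    for i in range(0, len(records), size):
        yield records[i:i + size]

# Each batch would then be committed as its own mutation,
# e.g. with a Dgraph client: one transaction per batch.
records = [{"uid": f"_:n{i}", "name": f"node{i}"} for i in range(2500)]
batches = list(chunked(records))
# 2,500 records split at the default live-loader size of 1,000
# yields batches of 1000, 1000, and 500.
```

Committing each batch separately keeps individual transactions small, which avoids long-running transactions that are more likely to abort under contention.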
Increasing the memory allocated to each server should also make a difference.
It would help to shard your data into groups, because writes can then be distributed across the nodes in your cluster instead of all executing on the same instance. Note that even on a single instance, mutations are executed in parallel.
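A minimal sketch of the idea, assuming a hypothetical `send_mutation` function standing in for a client call routed to whichever group serves a predicate; when predicates are sharded across groups, these calls land on different nodes and can proceed concurrently:

```python
from concurrent.futures import ThreadPoolExecutor

def send_mutation(predicate, triples):
    # Hypothetical stand-in for a client call that is routed to
    # the group serving `predicate`; returns the number written.
    return len(triples)

# Mutations grouped by predicate; with each predicate served by a
# different group, dispatching them in parallel spreads the load.
mutations = {
    "name": [("_:a", "name", "Alice"), ("_:b", "name", "Bob")],
    "age": [("_:a", "age", 30)],
}

with ThreadPoolExecutor() as pool:
    written = dict(zip(mutations, pool.map(
        lambda p: send_mutation(p, mutations[p]), mutations)))
```

If all the predicates in a mutation live in the same group, the parallelism collapses back to a single node, which is why many small predicates spread across groups can help multi-predicate writes.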
If those additional servers are just replicas, they would decrease write throughput; but if they are shards serving different groups, they should increase it.