Moved from GitHub dgraph/4813
Posted by marvin-hansen:
TL’DR:
Add in-memory mode to boost performance and use in-memory (predicate) compression to pack as much data as possible in the given memory. That smokes a lot of benchmarks and helps a lot with graph OLAP.
Experience Report
Right now, we get about a 6 - 10 ms latency with a 5 node Dgraph cluster, and that’s nice, but only the case because we load balance across all alphas and the current graph size is small enough that it fits completely in memory. Specifically, we run a count query against all predicates and fetch them all in a quiet hour so they are in memory when needed. That works very well, latency is very low and it’s relatively easy to script.
What you wanted to do
For our unified graph, we actually wanted one single DB for operations and analytics in one cluster.
But we had some troubles with DGraph and some hard questions about scalability that are still not answered so we moved forward.
Why that wasn’t great, with examples
We simply could not figure out if DGraph is going to sustain performance beyond a few billion predicates. There are no case studies of very large deployments, and that indicates there are most likely none.
What you actually did
We decided to factor out analytics and instead use an in-memory graph that goes all the way up to 500 billion ops per second and, in fact, have to integrate a secondary graph for analytics because of the alternative already ships with all important graph analytic algorithms so it makes sense.
What would be the greatest possible solution?
Please consider adding an persistent in-memory mode to DGraph.
Persistent in-memory means, the entire graph is in memory AND synced on disk so you get both, the speed from querying everything from memory AND the safety of having all data on disk in case a node goes down or needs a restart, say for an update.
Preserve “correctness” and transactional safety but improve performance with an option to hold the entire graph in memory and sync memory to disk. A bit like Redis-graph, but actually useful.
I know, one USP of DGraph is exactly that you can manage graphs way bigger than available memory because it only retains the keys in memory but stores the actual values on disk. Sometimes, you either don’t have a graph that big, or equally possible, you have a lot of memory so you can hold the entire graph in memory. Because memory prices are falling already for years, you get so much cheap memory on GCP, that in-memory analytics becomes a no-brainer.
We are actually moving all analytics to a persistent in-memory graph, because, it’s actually 30% cheaper than an equivalent GPU accelerated column-based storage but equivalent in performance on most tasks. In terms of system complexity, in-memory is way simpler to handle and scales much either because allocating memory or adding more high-mem nodes can be done with auto-scaling in Kubernetes.
As a rule of thumb, you need twice the memory relative to the graph size plus ~20% for the system. If a graph is about 150 GB in size, you need at least 300 GB memory plus some 50GB reserved for the system, so in total, you need 350GB. With predicate compression, however, it is certainly feasible to pack twice the graph size in the same memory and that’s a huge additional relative to the total gain in performance. Obviously, memory usage is high but we are good with that.
Any external references to support your case
Trillion Triple Benchmark:
https://dzone.com/articles/got-a-minute-check-out-anzographs-record-shatterin