Discussion on pluggable architecture for Dgraph's storage engine

Reference to earlier discussion - https://github.com/dgraph-io/dgraph/pull/127

I suppose we are deliberating two related but different questions.

Should Dgraph have pluggable architecture for its storage backend?

This question has more to do with the philosophy behind the software and what we want it to be than with the actual design.

There are many popular and successful open source projects both with and without pluggable architectures. I will use the example of two of the most successful open source databases: MySQL and Postgres.

  • MySQL does have pluggable storage engines, which means that when a person picks up MySQL, they also have to make a choice about which engine fits their requirements. MySQL’s philosophy is “the user knows best: give them options and let them choose”. It works well, but this flexibility has costs.

  • Usability is a feature. The user must be knowledgeable enough to make the right decision for their needs, and the number of combinations grows quickly with the number of pluggable options: even 2 or 3 pluggable components, each with two choices, gives 4 to 8 possible combinations. A non-expert user would find this overwhelming.

  • The performance of MySQL as a database, and hence its reputation, is limited by the quality of its pluggable components. That is not a comfortable position to be in if MySQL has no direct control over those components.

  • The product is bound by the common subset of its dependencies’ functionality. If pluggable component A has 3 features and pluggable component B has 5, the product, in order to remain extensible and flexible across all pluggable components, can only rely on the 3 overlapping features.

  • Postgres, on the other hand, chose not to make its components configurable, which means that when a person picks up Postgres, they have no flexibility over the internals. Postgres’ philosophy is “we know best: let us model and worry about the internal architecture as best as we can”. While this has the downside of being less flexible, that lack of flexibility has some advantages.

  • The user can just pick up Postgres, know SQL, and run with it.

  • It also allows Postgres to have complete control over the storage engine and tweak it as they see fit.
    In Dgraph’s context, tomorrow we can get rid of the third-party engine and write our own, if there is value in doing so (e.g. performance, additional features, etc.)

  • It arguably makes understanding and contributing to Postgres easier, as there are fewer moving pieces.

I am not trying to convince you that a graph database with a pluggable storage engine is not a valid approach; it absolutely is. Dgraph is simply choosing another approach that is also valid, historically for other successful projects and hopefully for us as well.

Which storage backend to use – RocksDB vs. Bolt vs Custom vs xyz … ?

I do not know enough about Rocks, Bolt, Go, Cgo, or even graphs for that matter to have a qualified opinion, so I’ll refrain from passing judgement on it one way or the other. I do, however, trust that @mrjn and the rest of the Dgraph team @core-devs spent enough brain cycles to arrive at a robust decision. I’ll let them do the talking on it, and convince you, if they choose to do so. I would, however, mention that in my experience, at the early stage of a startup/project, a few good-faith decisions have to be made based on hunch, past experience, theory, and common sense.

Thanks @mohitranka for summarizing the discussion along with relevant data points about what others are doing.

I think Cayley, TitanDB, etc. have already gone down the path of multiple backends, so there are already plenty of options to choose from, if that’s what one wants to achieve.

With Dgraph, we want to achieve a tighter coupling between the system and the disk. What that means might not be immediately clear given the current code base: today there is an obvious divide between Dgraph and the underlying storage on disk. But we want to reserve the right to make changes that blur this division.

You can think of it that way, but more importantly, RocksDB performs. It’s already being used at Google, Facebook, and other database companies (e.g. CockroachDB). BoltDB has design issues that I consider deal breakers for what we’re trying to achieve with Dgraph: a distributed, low-latency, high-throughput graph database. And I don’t want to have to qualify that last point with “high throughput only if you don’t do writes.”

BoltDB has no other advantage over RocksDB except that it avoids Cgo. With RocksDB, one has to copy values from C into Go. Initially, I’d expected that avoiding that value copy would be one of BoltDB’s advantages, given that it lives in Go space. But that’s not the case: BoltDB also requires a copy to use values safely. https://github.com/boltdb/bolt#using-keyvalue-pairs
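To make the copy requirement concrete, here is a minimal Go sketch. The `get` and `safeGet` helpers are hypothetical stand-ins, not BoltDB or RocksDB APIs; they just illustrate why a value slice that aliases the store’s internal buffer must be copied before it outlives the transaction:

```go
package main

import "fmt"

// get simulates a KV lookup where the returned slice aliases the store's
// internal buffer. As with BoltDB values (or Cgo-backed RocksDB reads before
// copying), the bytes are only valid while the transaction is open.
func get(backing []byte) []byte {
	return backing // no copy: aliases the store's memory
}

// safeGet returns a defensive copy that stays valid after the transaction.
func safeGet(backing []byte) []byte {
	return append([]byte(nil), backing...)
}

func main() {
	store := []byte("value-1")

	aliased := get(store)
	copied := safeGet(store)

	// Simulate the store reusing its buffer once the transaction ends.
	copy(store, "garbage")

	fmt.Println(string(aliased)) // the aliased slice now sees "garbage"
	fmt.Println(string(copied))  // the copy still reads "value-1"
}
```

Either way, a copy happens; BoltDB just moves it into user code instead of hiding it behind the Cgo boundary.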

We’re just cutting a release for v0.4, which will provide binaries that make it easier to run Dgraph. Embedding Dgraph in the client application is not recommended as of now. The Dgraph binary clients should provide good enough performance and avoid bringing in all the dependencies of Dgraph in your application.

Quick point about async: Every write to Dgraph first gets written to a commit log and flushed out to disk. Then it gets applied to the posting list. Periodically, posting lists get merged back to RocksDB. The writes are async. This is okay because even if the machine crashes, the commit logs would still have those writes, and when the posting list is initialized again, it would pick up those writes and apply them back. So, there’s no data loss.
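The write path described above (log first, apply asynchronously, replay on restart) can be sketched roughly as follows. The types and names here are illustrative assumptions, not Dgraph’s actual code:

```go
package main

import "fmt"

// CommitLog durably records every mutation before it is applied.
type CommitLog struct{ entries []string }

func (l *CommitLog) Append(e string) { l.entries = append(l.entries, e) }

// PostingList holds applied writes; in Dgraph, posting lists are
// periodically merged back to the underlying store (RocksDB).
type PostingList struct{ applied []string }

func (p *PostingList) Apply(e string) { p.applied = append(p.applied, e) }

// Replay rebuilds a posting list from the commit log, as on restart after
// a crash: anything that was logged is recovered, even if never applied.
func Replay(l *CommitLog) *PostingList {
	p := &PostingList{}
	for _, e := range l.entries {
		p.Apply(e)
	}
	return p
}

func main() {
	log := &CommitLog{}
	pl := &PostingList{}

	// Write path: log first, then apply (asynchronously, in reality).
	for _, m := range []string{"set <alice> <follows> <bob>", "set <bob> <follows> <carol>"} {
		log.Append(m)
		pl.Apply(m)
	}

	// Simulate a crash wiping the in-memory posting list.
	pl = nil
	_ = pl

	// On restart, replay the commit log: no writes are lost.
	recovered := Replay(log)
	fmt.Println(len(recovered.applied)) // 2
}
```

The key property is that durability comes from the append to the commit log, so the apply-and-merge steps can safely run asynchronously.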

