Dgraph and RocksDB

Dgraph uses RocksDB, a library that makes it easy to store key-value pairs on disk. At a very basic level, it’s no different from any file library that lets you write bytes to disk and read them back later. The name makes RocksDB sound like a database, but it isn’t one. It’s just an advanced library.
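To make the "file library" comparison concrete, here is a minimal sketch of a single-machine key-value store built on one append-only file. Everything in it (the `LocalKV` class, the record layout, the in-memory index) is a hypothetical illustration. It is not RocksDB’s API and says nothing about how RocksDB works internally.

```python
import os
import struct
import tempfile


class LocalKV:
    """Toy key-value store backed by one append-only file on local disk.

    Hypothetical illustration only; this is not RocksDB's API or design.
    The index lives in memory, so it is not rebuilt if the file is reopened.
    """

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> (offset, length) of the latest value
        open(path, "ab").close()  # create the file if it does not exist

    def put(self, key: bytes, value: bytes) -> None:
        # Record layout: 4-byte key length, 4-byte value length, key, value.
        header = struct.pack(">II", len(key), len(value))
        start = os.path.getsize(self.path)
        with open(self.path, "ab") as f:
            f.write(header + key + value)
        # Remember where the value bytes begin so get() can seek to them.
        self.index[key] = (start + len(header) + len(key), len(value))

    def get(self, key: bytes):
        if key not in self.index:
            return None
        offset, length = self.index[key]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)


def demo():
    path = os.path.join(tempfile.mkdtemp(), "data.log")
    kv = LocalKV(path)
    kv.put(b"alice", b"follows bob")
    return kv.get(b"alice")
```

Note that nothing here knows about other machines. It only writes bytes to a local file and reads them back, which is the role a local storage library plays in this discussion.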

This is a common source of confusion among people comparing Dgraph against Cayley or other data stores that also use RocksDB (CockroachDB, for example). The main difference is that Dgraph itself manipulates and manages the data it writes to RocksDB, just as it would treat a file on disk. All of our advanced features, such as data replication, data sharding, and data movement, are handled at the Dgraph layer. RocksDB only gives us a way to interact with the local disk; it doesn’t do any of the other things one would expect from a distributed, highly available, scalable database.
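The split between a local storage library and the layer above it can be sketched roughly as follows. Each `Node` holds a plain dict standing in for its local on-disk store, and a `Cluster` layer decides which nodes own a key and copies the value to each of them. The class names and the hash-based placement are assumptions chosen for brevity. This is not Dgraph’s actual sharding or replication scheme.

```python
import hashlib


class Node:
    """One server. Its dict stands in for a local-disk key-value library."""

    def __init__(self, name):
        self.name = name
        self.local = {}


class Cluster:
    """Hypothetical distribution layer that sits above the local stores."""

    def __init__(self, nodes, replicas=2):
        if not 1 <= replicas <= len(nodes):
            raise ValueError("replicas must be between 1 and the node count")
        self.nodes = nodes
        self.replicas = replicas

    def owners(self, key: bytes):
        # Sharding: hash the key to pick a first owner, then take the next
        # nodes in order as replicas. (Assumed scheme, for illustration.)
        h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
        first = h % len(self.nodes)
        return [
            self.nodes[(first + i) % len(self.nodes)]
            for i in range(self.replicas)
        ]

    def put(self, key: bytes, value: bytes) -> None:
        # Replication: write the value to every owner's local store.
        for node in self.owners(key):
            node.local[key] = value

    def get(self, key: bytes):
        # Read from the first owner that still has the key.
        for node in self.owners(key):
            if key in node.local:
                return node.local[key]
        return None
```

The local store never learns that replicas or shards exist. All of that logic lives in the layer above it, which is the point being made about where these features belong.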

Cayley doesn’t currently do any of this data management. So if you want to distribute Cayley across different servers, you’ll have to use a backend like Cassandra, not RocksDB. Again, this is because RocksDB is just an interface to the local disk. If you’re on a single machine, using RocksDB with Cayley makes sense, but not if you then want to distribute the data. Cayley wouldn’t help you there.

To the point about HyperDex: Dgraph does not use any database below it. Below Dgraph is only the local disk. That’s how we can control our network calls and query latencies, and stay performant.

Hope that sheds some light on the issue.
