Read, write, space amplifications

Some articles and videos that I find helpful. You may already know them well.


There is a paper on this topic which I shared with @mrjn:

Towards Accurate and Fast Evaluation of Multi-Stage Log-structured Designs, Hyeontaek Lim and David G. Andersen, Carnegie Mellon University; Michael Kaminsky, Intel Labs


Thanks for that paper, @sanjosh. Just the first two sections made it clear why layers matter for write amplification. @ganesh is working on the layering as we speak.
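To put rough numbers on why layering helps, here is a back-of-the-envelope sketch in Go. It is not the model from the paper and not Badger's implementation. The function names, the 256 MiB base level, the 10x growth factor, and the two formulas are all simplifying assumptions chosen for illustration: a single sorted file is assumed to be fully rewritten on every flush, and a leveled design is assumed to rewrite each byte about `growth` times per level transition.

```go
package main

import "fmt"

// levelsNeeded returns how many levels are required so that the largest level
// can hold totalBytes, when level 1 holds baseBytes and each following level is
// growth times larger. Hypothetical helper; returns 0 for invalid input.
func levelsNeeded(totalBytes, baseBytes, growth int64) int {
	if baseBytes <= 0 || growth < 2 {
		return 0
	}
	levels := 1
	capacity := baseBytes
	for capacity < totalBytes {
		capacity *= growth
		levels++
	}
	return levels
}

// leveledWriteAmp is a coarse estimate (an assumption, not a measured value):
// one write for the initial flush, plus about growth rewrites for each move
// into the next level.
func leveledWriteAmp(levels int, growth int64) int64 {
	return 1 + growth*int64(levels-1)
}

// singleLevelWriteAmp models a store with no layering: every flush of
// batchBytes is merged into one sorted file of totalBytes, so the whole file
// is rewritten to absorb one batch.
func singleLevelWriteAmp(totalBytes, batchBytes int64) int64 {
	return totalBytes / batchBytes
}

func main() {
	const (
		total  = int64(100) << 30 // 100 GiB of data (assumed)
		base   = int64(256) << 20 // 256 MiB first level / flush size (assumed)
		growth = int64(10)        // each level 10x the previous (assumed)
	)
	levels := levelsNeeded(total, base, growth)
	fmt.Println("levels:", levels)
	fmt.Println("leveled write amplification (estimate):", leveledWriteAmp(levels, growth))
	fmt.Println("single-level write amplification (estimate):", singleLevelWriteAmp(total, base))
}
```

Under these assumptions the leveled layout needs 4 levels and comes out at roughly 31x, against roughly 400x for a single sorted file. The exact figures depend entirely on the assumed sizes; the point is only that the gap grows with the ratio of data size to flush size.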

If I am not mistaken, CockroachDB seems willing to live with the limitations of using cgo to talk to RocksDB. Is there some compelling reason for you to rewrite RocksDB? How much effort and time is acceptable, given the estimated time it will take to stabilize the new code? If your use case is simpler, are there easier alternatives, such as enhancing an existing Go KV store like BoltDB?

Once we have Badger in place, people will be able to go-get Dgraph. That in itself is a great reason to do this project. On top of that, it lets us profile all the way down to disk. Any work to optimize Dgraph as a system would then be fruitful, as opposed to trying to work around cgo with RocksDB. We could better control memory usage, work towards an embeddable Dgraph, and so on. Keeping a clean, manageable stack has long-term advantages.

Well, I think one strong engineer is sufficient to work on Badger. But, soon enough, if Badger performs as we expect it to, the community can jump in and start helping with bug reports, contributions etc.

BoltDB’s underlying design is practically single-threaded. You can’t optimize a bad design, you can only re-design it (rewrite the whole thing).
