Benchmark request: badger v2 vs mongoDB

Moved from GitHub badger/833

Posted by tegk:

Could not find any benchmark.

How is the insert performance compared to MongoDB?

campoy commented :

MongoDB and Dgraph are wildly different databases, so talking about “insert performance” in general is tricky.
Are we inserting new relationships? New nodes? New fields in a document? Those will have quite different performance characteristics on MongoDB.

Do you have a specific use case for which we might be able to come up with a meaningful benchmark?

tegk commented :

I am inserting a key-value pair, e.g. (“1”, “test@test.com”), a couple million times.
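For reference, a minimal sketch of this workload against Badger, using one transaction per pair (the path, key format, and loop count here are illustrative; the `DefaultOptions(path)` form assumes the v2-style API):

```go
package main

import (
	"fmt"
	"log"

	badger "github.com/dgraph-io/badger/v2"
)

func main() {
	db, err := badger.Open(badger.DefaultOptions("/tmp/badger"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Insert a couple million small key-value pairs,
	// one transaction per pair (the naive approach).
	for i := 0; i < 2000000; i++ {
		key := []byte(fmt.Sprintf("%d", i))
		err := db.Update(func(txn *badger.Txn) error {
			return txn.Set(key, []byte("test@test.com"))
		})
		if err != nil {
			log.Fatal(err)
		}
	}
}
```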

campoy commented :

Hey @tegk,

I just realized that this issue was under the badger repo and not the dgraph one.
This benchmark makes much more sense in that case hehe

We do not have this kind of benchmark at the moment but I do agree it would be an interesting one to perform.
I’ll change the title of this issue and add it to our roadmap, although we could definitely use help with these, so if you’re interested in contributing I’d love to assist you with it.

BTW, even before running the benchmark we suspect Badger to be faster than MongoDB, as we’ve increased the performance of our writers substantially over the last month or so.
A blog post will be coming up soon, but in the meantime you can check out @manishrjain’s twitter thread on it here.

Thanks for the proposal!

tegk commented :

Badger v1.5.3 takes approximately 78% less time than MongoDB to write 1 million key-value pairs for my use case. Waiting for the v2 release and the StreamWriter to test again.

recoilme commented :

You may find some benchmarks in pogreb-bench (Put/Get/Concurrency):

Currently it supports:

  • pogreb Embedded key-value store for read-heavy workloads written in Go
  • goleveldb LevelDB key/value database in Go
  • bolt An embedded key/value database for Go
  • badgerdb Fast key-value DB in Go
  • slowpoke Low-level key/value store in pure Go
  • pudge Fast and simple key/value store written using Go’s standard library

martinmr commented :

Used pogreb-bench to compare badger v1.5 against badger v2:

v1.5

Number of keys: 1000000
Minimum key size: 16, maximum key size: 64
Minimum value size: 128, maximum value size: 512
Concurrency: 2
Running badgerdb benchmark...
Put: 20.065 sec, 49839 ops/sec
Get: 2.813 sec, 355552 ops/sec
Put + Get time: 22.877 sec
File size: 1.94GB

v2

badger 2019/06/20 15:14:11 INFO: All 5 tables opened in 426ms
badger 2019/06/20 15:14:11 INFO: Replaying file id: 4 at offset: 219030900
badger 2019/06/20 15:14:11 INFO: Replay took: 4.796µs
Number of keys: 1000000
Minimum key size: 16, maximum key size: 64
Minimum value size: 128, maximum value size: 512
Concurrency: 2
Running badgerdb benchmark...
Put: 28.065 sec, 35632 ops/sec
Get: 2.860 sec, 349689 ops/sec
Put + Get time: 30.924 sec
File size: 2.41GB

I ran the benchmarks multiple times and the results are consistent: v2 puts are slower. I couldn’t use the StreamWriter because the keys are randomly generated, and the StreamWriter requires that keys be written in sorted order.

martinmr commented :

@recoilme Hi. I opened https://github.com/recoilme/pogreb-bench/pull/1

It contains some minor fixes and adds support for Go modules, which is needed to use version 2 of the badger API.

recoilme commented :

@martinmr lgtm, thank you!

manishrjain commented :

pogreb-bench is alright for naive serial writes that don’t know anything about the usage behavior, but it isn’t a great way to actually benchmark Badger.

It doesn’t use anything in Badger that would make it faster. For example, for serial writes we’d typically use a batch writer. Instead, pogreb-bench uses one txn per write, which is slower because no batching of writes can happen.
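For illustration, a batched version of the same workload using Badger’s WriteBatch API might look like this sketch (v2-style API assumed; exact method signatures may differ slightly between Badger versions):

```go
package main

import (
	"fmt"
	"log"

	badger "github.com/dgraph-io/badger/v2"
)

func main() {
	db, err := badger.Open(badger.DefaultOptions("/tmp/badger"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// One WriteBatch for all entries: Badger groups the writes into
	// large internal transactions instead of committing one txn per key.
	wb := db.NewWriteBatch()
	defer wb.Cancel()

	for i := 0; i < 1000000; i++ {
		key := []byte(fmt.Sprintf("key-%d", i))
		if err := wb.Set(key, []byte("value")); err != nil {
			log.Fatal(err)
		}
	}
	// Flush commits any pending entries before returning.
	if err := wb.Flush(); err != nil {
		log.Fatal(err)
	}
}
```

Unlike the StreamWriter, WriteBatch does not require keys to arrive in sorted order, which is why it fits pogreb-bench’s randomly generated keys.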

Similarly, the values can be colocated with the keys, considering how small they are. So, I’d set ValueThreshold to the max value size (+1, to avoid an off-by-one), so we don’t need to additionally retrieve the values from the value log.
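Concretely, with the v2-style options API that tuning might be sketched as follows (the 512-byte maximum value size comes from the benchmark settings above; the +1 keeps values of exactly the maximum size inline in the LSM tree):

```go
// Config sketch, assuming the badger v2 options API.
opts := badger.DefaultOptions("/tmp/badger").
	WithValueThreshold(513) // max value size (512) + 1: all values stay in the LSM tree
db, err := badger.Open(opts)
```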

@martinmr : We have read/write benchmarks in badger-bench that you can use instead, which have the benefit of aiming for write throughput and read throughput, etc.

martinmr commented :

@recoilme: I opened another PR (https://github.com/recoilme/pogreb-bench/pull/2) to use the WriteBatch in badger instead of using a separate transaction each time. Performance is greatly improved so you might want to rerun your benchmarks and update the results for badger.

@manishrjain I noticed that and tried to change it to use the StreamWriter but it didn’t work because the keys are not in order. I didn’t know about WriteBatch but I’ve changed pogreb-bench to use it. I just thought pogreb-bench was useful to compare badger to other DBs. I’ll use the benchmarks in badger-bench for more accurate testing.

recoilme commented :

Hello, @martinmr ! Thank you for your PR! Please take a look at https://github.com/recoilme/pogreb-bench/pull/2#issuecomment-504889315