Badger Compression Feedback

Congratulations on the Badger 2.0 release!

Being ‘interested’ in data compression, I have a little feedback:

It uses the zstd compression algorithm when Badger is built with Cgo enabled. When built without Cgo enabled, it uses the snappy algorithm.

This seems like an odd choice. IMO compatibility shouldn’t be determined by compilation settings. I would humbly like to point out that there is a pure Go zstd implementation. While its performance only comes close to the cgo version, if you go for the fastest setting there shouldn’t be much of a difference.

I don’t know the reason for choosing Snappy, but LZ4 typically outperforms Snappy.

If you are willing to go with a “non-standard” scheme, I have written S2, which is a Snappy extension that compresses better than Snappy and typically decompresses faster. You can see direct comparisons in the “Block compression” section. S2 can decompress Snappy blocks, but not the other way around.

I assume you have verification of blocks, since Snappy blocks have no integrity check and the DataDog zstd bindings don’t read or write CRC info.
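For illustration, per-block verification can be as small as a trailing CRC32-C, using the Castagnoli polynomial that Snappy’s framing format uses for its checksums. The layout and function names here are hypothetical, not what Badger actually does:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// castagnoli is the CRC32 polynomial Snappy's framing format uses.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// sealBlock appends a 4-byte little-endian CRC32-C to a
// (possibly compressed) block.
func sealBlock(block []byte) []byte {
	var sum [4]byte
	binary.LittleEndian.PutUint32(sum[:], crc32.Checksum(block, castagnoli))
	return append(block, sum[:]...)
}

// openBlock verifies and strips the trailing CRC32-C.
func openBlock(sealed []byte) ([]byte, error) {
	if len(sealed) < 4 {
		return nil, fmt.Errorf("block too short")
	}
	block := sealed[:len(sealed)-4]
	want := binary.LittleEndian.Uint32(sealed[len(sealed)-4:])
	if got := crc32.Checksum(block, castagnoli); got != want {
		return nil, fmt.Errorf("checksum mismatch: got %08x, want %08x", got, want)
	}
	return block, nil
}

func main() {
	sealed := sealBlock([]byte("some block bytes"))
	if block, err := openBlock(sealed); err == nil {
		fmt.Println(string(block))
	}
}
```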

If you have some representative ‘blocks’ I can do a comparison between the different schemes.

Edit: (New users can only post 2 links)… well… here is my post with links: https://gist.github.com/klauspost/8597feb49515b3811da17bee399b0e18

Thanks for the feedback @klauspost. We decided not to use the Go zstd implementation because it’s still in beta status: https://github.com/klauspost/compress/tree/master/zstd#status.

@ibrahim ran some benchmarks that included LZ4 in the past: https://github.com/dgraph-io/badger/pull/1069. The closed PR has some notes, but aside from that I’m not sure if there are any other updates.

You can set the specific compression algorithm you want to use in the db options instead of relying on the defaults.
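Something along these lines (a sketch based on the Badger v2 options API; double-check names against the current godoc, and the path is only an example):

```go
package main

import (
	"log"

	badger "github.com/dgraph-io/badger/v2"
	"github.com/dgraph-io/badger/v2/options"
)

func main() {
	// Explicitly request ZSTD (or options.Snappy / options.None)
	// instead of relying on the build-dependent default.
	opts := badger.DefaultOptions("/tmp/badger").
		WithCompression(options.ZSTD)

	db, err := badger.Open(opts)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```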

Your benchmark should be based on real data; otherwise you are just benchmarking very specific and unrepresentative data.

Basically, whatever you put in your test data will color your output. That is why I requested some “real” data and not just some (honestly, badly) generated stuff.

I have updated the benchmark slightly https://gist.github.com/klauspost/248df4f53a99e68c31c4c2137c27d8e8

		k := fmt.Sprintf("%016x", rng.Uint64())
		v := fmt.Sprintf(`{"value":"%d","another":"%016x","key-%x":%t}`, i, rng.Uint32(), rng.Uint32(), rng.Uint32()&1 == 0)

Is this representative? No, definitely not, but maybe a bit more.

	Name                                   runs      time/op       speed
	/Compression/Snappy-12               300001    4043 ns/op    1008.08 MB/s
	/Compression/S2-12                   157894    7689 ns/op     530.13 MB/s
	/Compression/S2_Better-12             80535   14851 ns/op     274.47 MB/s
	/Compression/LZ4-12                  255319    4661 ns/op     874.52 MB/s
	/Compression/ZSTD_-_Datadog-12        29482   40431 ns/op     100.81 MB/s
	/Compression/ZSTD_-Go-_Fastest-12     42849   25882 ns/op     157.48 MB/s
	/Compression/ZSTD_-Go-_Default-12     34090   34145 ns/op     119.37 MB/s
	/Decompression/Snappy-12            1387281     865 ns/op    4712.13 MB/s
	/Decompression/S2-12                1451026     833 ns/op    4891.96 MB/s
	/Decompression/LZ4-12               2047778     580 ns/op    7025.87 MB/s
	/Decompression/ZSTD_-_Datadog-12     148146    8255 ns/op     493.73 MB/s
	/Decompression/ZSTD_-_Go-12           93022   12589 ns/op     323.79 MB/s

	Compression ratios:
	Snappy               1.7531182795698925
	S2                   1.7531182795698925
	S2 (better?)         1.741880341880342
	LZ4                  1.7861524978089396
	datadog ZSTD         3.0972644376899696
	Go ZSTD (fastest)    3.2195892575039493
	Go ZSTD (default)    3.1474903474903475

You can of course go on and tweak the input until it favors one or the other compressor.

Go zstd being better at encoding and worse at decoding is pretty consistent. Decode speed seems to be limited by the small buffer size and one particular function.

Snappy definitely benefits from its assembly compressor, which is something I should look into for S2 if I get the time. S2 “better” doesn’t like this specific data input.

LZ4 definitely wins the decompression race.

Hey @klauspost! Thank you for your feedback and the benchmarks. I really appreciate the effort you’ve put into benchmarking the compression algorithms. The benchmarks I ran were based on a data block size of 4 KB, which is the default block size in Badger. We perform block-based compression in Badger.

I don’t know the reason for choosing Snappy, but LZ4 typically outperforms Snappy.

We chose to use Snappy instead of LZ4 because in my benchmarks Snappy showed a better compression ratio than LZ4. But in your benchmarks LZ4 has the better compression ratio. I will look into this. Perhaps the change in values affects this?

Your benchmark should be based on real data, otherwise you are just benchmarking very specific and un-representative data.

Since Badger stores data in terms of blocks (usually of 4KB), I ran my benchmarks on that kind of data. I understand that a compression algorithm’s performance depends on the data, but since we were going to compress blocks, I chose to use an actual block from Badger. We wanted to benchmark the compression of blocks, not the compression capability of the algorithms in general.

LZ4 definitely wins the decompression race.

Yes, LZ4 was faster, but its compression ratio was lower than ZSTD’s. That’s why we chose ZSTD as the default compression algorithm (with Cgo) in Badger.

@klauspost if there’s any way we can improve the compression in badger, feel free to send a PR. You definitely have more knowledge about compression algorithms than I do :slight_smile:

Hi!

blocks (usually of 4KB)

The data blocks are still 4KB; that is of course important and relevant to your use case. The important part is what is in the blocks.

Here is another important case: incompressible data.

Value is generated like this:

		v := make([]byte, 200)
		io.ReadFull(rng, v) // fill the value with random, incompressible bytes

	Name                                              runs      time/op       speed
	BenchmarkComp/Compression/Snappy-12             2030412     595 ns/op    6491.33 MB/s
	BenchmarkComp/Compression/S2-12                  666688    1848 ns/op    2089.87 MB/s
	BenchmarkComp/Compression/S2_Better-12           240003    5029 ns/op     767.93 MB/s
	BenchmarkComp/Compression/LZ4-12                 999991    1253 ns/op    3082.17 MB/s
	BenchmarkComp/Compression/ZSTD_-_Datadog-12       77922   15272 ns/op     252.89 MB/s
	BenchmarkComp/Compression/ZSTD_-Go-_Fastest-12   180271    6474 ns/op     596.58 MB/s
	BenchmarkComp/Compression/ZSTD_-Go-_Default-12   123710    8892 ns/op     434.33 MB/s
	BenchmarkComp/Decompression/Snappy-12          18461481    64.4 ns/op   60015.38 MB/s
	BenchmarkComp/Decompression/S2-12              24999999    46.8 ns/op   82521.36 MB/s
	BenchmarkComp/Decompression/LZ4-12                  n/a
	BenchmarkComp/Decompression/ZSTD_-_Datadog-12    799983    1398 ns/op    2763.42 MB/s
	BenchmarkComp/Decompression/ZSTD_-_Go-12        5217450     228 ns/op   16961.09 MB/s

LZ4 refuses to return output that is bigger than the input, while all the others will. So LZ4 will need some indicator that the data is stored uncompressed.

Also saw that snappy/s2 buffers were not the correct size, so I updated those in the gist.

TL;DR: from this simple mucking about, I’d say stick with Snappy for now, but switch zstd from the DataDog bindings to the Go version.

I cannot tell you which should be the default, since they are vastly different in their performance/compression tradeoffs. I guess Snappy is “mostly harmless”, but zstd offers much bigger improvements.

LZ4 is so close to Snappy in these tests that it may not be worth it to add another format.

This topic was automatically closed 41 days after the last reply. New replies are no longer allowed.