Decompression block pool is inefficient

Moved from GitHub badger/1378

Posted by damz:

What version of Go are you using (go version)?

$ go version
go version go1.14.4 linux/amd64

What operating system are you using?

Linux

What version of Badger are you using?

master

Steps to Reproduce the issue

Profile a database that uses Badger and you will likely see a significant number of allocations in both table.decompress and zstd.Decompress.
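
A minimal sketch of one way to capture such an allocation profile; the file name and the use of runtime/pprof (rather than a live net/http/pprof endpoint) are my own choices here, not something from the original report:

```go
package main

import (
	"os"
	"runtime/pprof"
)

func main() {
	// ... open the Badger database and run a read-heavy workload here ...

	// Dump a heap profile and inspect it with `go tool pprof heap.pprof`;
	// with block compression enabled, table.decompress and zstd.Decompress
	// show up among the top allocators.
	f, err := os.Create("heap.pprof")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	if err := pprof.WriteHeapProfile(f); err != nil {
		panic(err)
	}
}
```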

What’s going on?

There are actually three issues:

The block size is hardcoded

The block pool expects blocks to be 4kB, regardless of the Options.BlockSize setting of the database (it actually allocates 5kB instead of 4kB in an effort to avoid spurious allocations during decompression).
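
For reference, a minimal sketch of the pattern being described; the sizes follow the description above, but this is an illustration rather than a copy of Badger's code:

```go
package table

import "sync"

// blockPool always hands out 5 kB buffers (4 kB blocks plus some slack to
// avoid spurious reallocation during decompression), regardless of the
// Options.BlockSize the database was opened with.
var blockPool = &sync.Pool{
	New: func() interface{} {
		b := make([]byte, 5<<10)
		return &b
	},
}
```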

The zstd library expects a slice of the correct length, not just capacity

The table.decompress passes zstd a block that has a capacity of 5kB, but a length that can be anything. On the other hand, zstd expects the block length (not just the capacity) to be large enough, because it passes the length, not the capacity, to the C function.
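
A hedged sketch of the mismatch, assuming the DataDog/zstd-style API (Decompress(dst, src []byte) ([]byte, error)) in which only the length of dst reaches the C function:

```go
package main

import (
	"sync"

	"github.com/DataDog/zstd"
)

var blockPool = &sync.Pool{
	New: func() interface{} {
		b := make([]byte, 5<<10)
		return &b
	},
}

func decompressBlock(compressed []byte) ([]byte, error) {
	dst := blockPool.Get().(*[]byte)
	defer blockPool.Put(dst)

	// If *dst arrives here with an arbitrary (possibly zero) length, the
	// binding sees a destination that is too small and allocates a fresh
	// buffer, defeating the pool. Re-slicing to the full capacity first
	// lets the pooled allocation be reused.
	return zstd.Decompress((*dst)[:cap(*dst)], compressed)
}
```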

The block pool keeps a reference to the block structure alive

In addition (though this is a minor issue), the way the block pool is implemented keeps a reference to the block struct alive:

The &b.data is a pointer into the block struct, so the whole struct is kept alive, including everything that it itself references.
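
A minimal sketch of the retention issue; the field names on the block struct are approximations, but the pattern of putting &b.data into the pool is the one being described:

```go
package table

import "sync"

// Field names are illustrative approximations of Badger's block struct.
type block struct {
	offset       int
	data         []byte
	checksum     []byte
	entryOffsets []uint32
}

var blockPool = &sync.Pool{
	New: func() interface{} {
		b := make([]byte, 5<<10)
		return &b
	},
}

func release(b *block) {
	// &b.data points inside b, so as long as the pool holds this pointer the
	// garbage collector keeps the entire block struct reachable, including
	// b.checksum and b.entryOffsets, not just the byte buffer.
	blockPool.Put(&b.data)
}
```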

jarifibrahim commented:

The block size is hardcoded
The block pool expects blocks to be 4kB, regardless of the Options.BlockSize setting of the database (it actually allocates 5kB instead of 4kB in an effort to avoid spurious allocations during decompression).

Yes, this is correct. The block size is hardcoded, but the assumption is that the pool will eventually be filled with correctly sized blocks. It starts with 4KB, but if you’ve set the block size to 10KB, the blocks picked up from the pool will not be used and the ZSTD library will create new 10KB-sized blocks, which we will insert into the pool. So eventually, the pool will have 10 KB blocks (see the sketch below).
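
A hedged sketch of that convergence argument, again assuming a Decompress(dst, src) API that returns a newly allocated slice when dst is too small; the function name is illustrative:

```go
package main

import (
	"sync"

	"github.com/DataDog/zstd"
)

func decompressAndRecycle(pool *sync.Pool, compressed []byte) ([]byte, error) {
	dst := pool.Get().(*[]byte)
	out, err := zstd.Decompress(*dst, compressed)
	if err != nil {
		return nil, err
	}
	// If the pooled 4 KB buffer was too small for a 10 KB block, the library
	// returns a freshly allocated 10 KB slice; putting that slice back means
	// later Gets hand out correctly sized blocks, so the pool converges on
	// Options.BlockSize-sized buffers over time.
	pool.Put(&out)
	return out, nil
}
```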

The zstd library expects a slice of the correct length, not just capacity
The table.decompress passes zstd a block that has a capacity of 5kB, but a length that can be anything. On the other hand, zstd expects the block length to be large enough.

We’re creating blocks of 5 KB length. The ZSTD library will use these blocks.

Also, see The Go Playground
Am I missing something?

The block pool keeps a reference to the block structure alive
In addition (though this is a minor issue), the way the block pool is implemented keeps a reference to the block struct alive:

Interesting. I thought this would keep only the reference to the data []byte slice. If this keeps the reference to the struct alive, how can we insert just the data into the pool? I intentionally used the pointer to avoid the slice header copies.

damz commented:

Interesting. I thought this would keep only the reference to the data []byte slice. If this keeps the reference to the struct alive, how can we insert just the data into the pool? I intentionally used the pointer to avoid the slice header copies.

I don’t think there is a good way, other than carrying the pointer around everywhere.

With a little bit of unsafe, if we assume that blocks are all the same size, you could store only a pointer to the first element and rebuild the slice header when getting the slice. But it is not very clean either.
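
A minimal sketch of that unsafe idea, assuming every pooled block has the same fixed size. It uses unsafe.Slice, which only exists from Go 1.17 onward (newer than the Go version in the original report), but the approach is the same:

```go
package main

import (
	"sync"
	"unsafe"
)

const blockSize = 5 << 10 // assumption: all pooled blocks share this fixed size

// The pool stores only a pointer to the first byte of the buffer, so it never
// pins an enclosing struct, and Get/Put move a single word around.
var blockPool = &sync.Pool{
	New: func() interface{} {
		b := make([]byte, blockSize)
		return unsafe.Pointer(&b[0])
	},
}

func getBlock() []byte {
	p := blockPool.Get().(unsafe.Pointer)
	// Rebuild the slice header from the element pointer and the known size.
	return unsafe.Slice((*byte)(p), blockSize)
}

func putBlock(b []byte) {
	blockPool.Put(unsafe.Pointer(&b[0]))
}
```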