Moved from GitHub badger/922
Posted by 256dpi:
TL;DR
It would be wonderful if badger would support some sort of optimized way to handle deletion of a range of keys. Inspiration can be taken from other projects like RocksDB which implements a special range delete tombstone: DeleteRange: A New Native RocksDB Operation | RocksDB.
Use Case
I’m currently working on quasar which builds on badger to implement a persistent queue. There is a small runnable example here: quasar/example/main.go at master · 256dpi/quasar · GitHub. The whole thing works pretty well and in a basic setup with one producer and one consumer the steady throughput is about 100’000 messages/second on my computer. I think with badger not being necessarily optimized for that workload it’s pretty neat!
Since the queue has a fixed size, there is a “cleaner” process that periodically deletes processed messages to free space in the queue. This process iterates over the keys and removes them in using one or two transactions: quasar/ledger.go at master · 256dpi/quasar · GitHub.
Unfortunately, this approach is very expensive and is responsible for up to 30% of the CPU time used by the program:
The flame graph incorrectly attributes all badger.(*Iterator).prefetch
calls to the initial badger.(*Iterator).Seek
call. The normal graph does not make that connection and I assume they also come from the subsequent badger.(*Iterator).Next()
calls.
With the support of something similar to range delete tombstones in RocksDB this would be possibly a lot faster. Also I would favor an approach that is not externally as for example db.DropPrefix
is now. In my use case writes and deletes can happen at the same time and thus I would not want to block the database for cleanings.
I imagine an interface like the following:
func (txn *Txn) DeleteRange(from, to []byte) error {}
The same functionality could potentially also be used to delete based on a prefix:
func (txn *Txn) DeletePrefix(prefix []byte) error {
return txn.DeleteRange(prefixStart(prefix), prefixEnd(prefix))
}
I didn’t to any research yet on how this could be implemented in badger. Maybe somebody from the team has already thought about this. Anyway, thanks for considering this!