Feature Request: Allow scanning value log sequentially in file order

Moved from GitHub badger/1368

Posted by eloff:

I’m considering running BadgerDB on HDDs for a use case where I only insert items, and then sometimes scan the entire database. Obviously with limited random access abilities of a HDD, these scans would be many orders of magnitude more efficient if they could walk the value log sequentially in file order.

Am I right that there isn’t currently a way to do this or did I miss something?

Could such an iterator be implemented easily in BadgerDB? I think so from what I know of the architecture, but I’m not familiar with the internals.

Interestingly, if there was a feature like Feature Request: add a hook to customize SST compaction · Issue #1367 · dgraph-io/badger · GitHub one could maybe abuse the callback and the compaction to not only scan the entire value log in order, but to compact it at the same time.

jarifibrahim commented :

Hey @eloff, I’d like to understand this better.

these scans would be many orders of magnitude more efficient if they could walk the value log sequentially in file order.

Badger stores data in sorted form. How would reading a value log file help you? Also, value log is the write ahead log file and sst is where the index and keys are stored.

eloff commented :

Hi @jarifibrahim,

I might have an incorrect mental model here, so please bear with me.

What I’m thinking is the value log contains all key-value pairs, which may also be duplicated in the SST (or just the key and a pointer into the value log). From what I understand the value log is the write-ahead log, so it must have the keys as well.

So if one has a use case where you want to scan all key-value pairs, and the values tend to be stored in the value log and not the SST, then it’s more efficient to walk the value log in file-order then to walk the keys in the SST and look up the value via random access.

The limitations are of course the data won’t be ordered by key and you’d need to handle deleted key-values manually., When you find the key-value pair you don’t know at that moment if you’ll find a delete later on in the value log. If your database is append-only then you don’t have to worry about that at all.

My use case specifically is conducting a brute force search over all items in an append-only database. Scanning the value log sequentially would be the most efficient way to do that and would allow the OS to perform aggressive read-ahead. On a spinning rust type device, the performance difference would be multiple orders of magnitude.

jarifibrahim commented :

@eloff Thanks for the detailed explanation. What you’re looking for is similar to vlog.Iterate. badger/value.go at e013bfd25899ec014d7117fa6e94b17a114d9921 · dgraph-io/badger · GitHub

We should be able to wrap this function in a public API and expose it so that people can iterate the value log file in a sequential manner. This has to be a read-only operation.

1 Like

That looks exactly like what I’m talking about, having a public API around that would be great.

@ibrahim this looks like a good first issue for @Naman or anyone else starting to work on Badger. Do you have an ETA in mind for this?

1 Like

I’ve tagged this as a good first issue.