Iteration over keys added in transaction

What did you do?

Added many new keys (on the order of 10,000-100,000) in a single transaction, then tried to iterate over them multiple times. The keys are small (10-30 bytes) and the values are either completely empty (nil) or similarly small.

What did you expect to see?

Creating an iterator in this case should cost roughly the same as creating the same iterator in a fresh transaction after committing the current one.

What did you see instead?

Very slow iterator creation.

Description

I dug into the code a bit and found that the problem arises because each iterator creation re-sorts all the pending writes from scratch here.
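To illustrate the cost, here is a simplified sketch (not the real Badger code) of the pattern: the transaction's write buffer is an unordered map, so every iterator creation must copy the keys out and sort them again. The names `pendingWrites` and `newIteratorKeys` are stand-ins for illustration only.

```go
package main

import (
	"fmt"
	"sort"
)

// pendingWrites models Badger's per-transaction write buffer
// (map[string]*Entry in the real code; values elided here).
var pendingWrites = map[string]struct{}{
	"k3": {}, "k1": {}, "k2": {},
}

// newIteratorKeys mimics what happens on every iterator creation:
// all pending keys are copied out of the map and sorted from scratch,
// an O(n log n) cost paid per iterator, even if nothing changed.
func newIteratorKeys(pending map[string]struct{}) []string {
	keys := make([]string, 0, len(pending))
	for k := range pending {
		keys = append(keys, k)
	}
	sort.Strings(keys) // repeated on *every* iterator creation
	return keys
}

func main() {
	fmt.Println(newIteratorKeys(pendingWrites)) // [k1 k2 k3]
}
```

With tens of thousands of pending keys, paying this sort on every iterator creation dominates the cost of the iteration itself.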

In my project I perform a number of iterations over keys with a specific prefix, and currently use the Prefix option from the IteratorOptions struct for this. Because of the problem described above, my current workaround is to create a single iterator and reuse it for all iterations via Seek, but this is counterintuitive and makes the code less readable.

A more serious problem with this approach is that it does not allow iterations interleaved with writes (i.e. iterate, write new records, iterate again), because the old iterator will never see the newly added keys.
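The reuse-via-Seek workaround can be sketched like this. To keep the example self-contained I model the iterator as a position into a sorted key slice; the `iter` type and `scanPrefix` helper are hypothetical simplifications, not Badger's actual Iterator API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// iter is a toy stand-in for a snapshot iterator: a cursor
// over an already-sorted list of keys.
type iter struct {
	keys []string
	pos  int
}

func (it *iter) Seek(k string) { it.pos = sort.SearchStrings(it.keys, k) }
func (it *iter) Valid() bool   { return it.pos < len(it.keys) }
func (it *iter) Key() string   { return it.keys[it.pos] }
func (it *iter) Next()         { it.pos++ }

// scanPrefix performs one prefix scan by seeking to the prefix
// and walking forward while keys still match it.
func scanPrefix(it *iter, prefix string) []string {
	var out []string
	for it.Seek(prefix); it.Valid() && strings.HasPrefix(it.Key(), prefix); it.Next() {
		out = append(out, it.Key())
	}
	return out
}

func main() {
	it := &iter{keys: []string{"a/1", "a/2", "b/1", "b/2", "c/1"}}
	// One iterator, several prefix scans via Seek: the sort cost
	// is paid once, but the snapshot never sees later writes.
	fmt.Println(scanPrefix(it, "a/")) // [a/1 a/2]
	fmt.Println(scanPrefix(it, "b/")) // [b/1 b/2]
}
```

The snapshot nature of the cursor is exactly the drawback: keys written after the iterator was created are simply not in its key list.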

I want to make a pull request to Badger to improve performance for such use cases. My idea is to replace the pendingWrites map[string]*Entry from here with a tree-map structure, which would keep the keys in sorted order throughout the lifetime of the transaction, thus eliminating the need to re-sort on each iterator creation.
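The idea can be sketched with stdlib only. Here I use a sorted slice with binary-search insertion as a stand-in for a real tree map (a tree would make Set O(log n) instead of O(n), but the point is the same): keys stay sorted as they are written, so iterator creation needs no sort at all. The `sortedWrites` type is illustrative, not a proposed API.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedWrites keeps pending keys ordered as they are inserted,
// a stand-in for a real tree map (e.g. a red-black tree or B-tree).
type sortedWrites struct {
	keys []string
}

// Set inserts k at its sorted position, or does nothing if the key
// is already present (in the real buffer this would overwrite the value).
func (s *sortedWrites) Set(k string) {
	i := sort.SearchStrings(s.keys, k)
	if i < len(s.keys) && s.keys[i] == k {
		return // key already present
	}
	s.keys = append(s.keys, "")
	copy(s.keys[i+1:], s.keys[i:])
	s.keys[i] = k
}

// Keys is what iterator creation would consume: already sorted,
// so no per-iterator O(n log n) sort is needed.
func (s *sortedWrites) Keys() []string { return s.keys }

func main() {
	var w sortedWrites
	for _, k := range []string{"k3", "k1", "k2", "k1"} {
		w.Set(k)
	}
	fmt.Println(w.Keys()) // [k1 k2 k3]
}
```

The trade-off is that every write pays an ordered-insert cost, but iterator creation becomes cheap, which is the win for workloads that iterate many times per transaction.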

I am considering the following tree-map implementation options (the links are not clickable because the forum forbids new users from posting more than two links):

  • ncw/gotemplate
  • emirpasic/gods

The latter seems to be the more reliable of the two.

What are your thoughts on this?

Please see this commit:
https://github.com/milaboratory/badger/commit/eafc3fdd30caecedac1b88853fccd3da63d93bdf

Basically, I completely eliminated the need for copying by using the Clone() method from Google's BTree library.

Please let me know your thoughts. I would be happy to make any additional modifications to align this change with BadgerDB's style so that it can be merged back into BadgerDB's repo.