Custom comparison function

Moved from GitHub badger/776

Posted by achille-roussel:

Hello,

Today the key comparison function used by the database is bytes.Compare. In cases where we’re dealing with fixed-size integer or ascii keys this makes a lot of sense, but I have a use case for using variable size integers as keys (similar to big.Int), and the order of the keys as defined by bytes.Compare is not the natural order of the integer values.

I’d like to explore modifying the package to make the comparison function configurable at the database level, but I wanted to get feedback from someone more familiar with the implementation in case there were any major concerns.

achille-roussel commented :

@jarifibrahim thanks for labelling the issue.

Do you have any pointers regarding what I should be watching out for to make this change?

jarifibrahim commented :

@achille-roussel I am not very sure but I see that we’re using the comparekeys function everywhere


Try replaying this with your custom comparator and see if it works. I think it would work considering the fact that the underlying representation of keys remains the same.

campoy commented :

Hey @achille-roussel,

Since this would indeed make the database slower for all cases, I want to better understand the reasoning behind the feature itself.

Do you think it’d be possible to somehow convert your keys instead of modifying the way we compare them? What is exactly the use case you’re implementing?

Thanks!

achille-roussel commented :

The use case I have is using variable length integers as keys (big.Int encoded as a little-endian byte sequence). To have the sequence of keys be ordered by the value of the big.Int I need to provide a different comparison function.

Currently the program is using rocksdb which supports configuring a custom comparison function, but it comes with its own set of issues which is why I’m exploring badger as an option to replace it.

campoy commented :

I understand it, would encoding your keys in a different format be too penalizing for your performance, or are there other issues you’re concerned about?

Let us get v2.0 released and then we can reconsider this PR - I think it does have merit but I want to make sure we don’t introduce any performance hit for our existing users.

Do you have a deadline in mind for this project?

achille-roussel commented :

For now we can continue with rocksdb, the switch to badger would be part of research we do for optimization purposes so we don’t have a strict deadline for this part of the work, if it brings improvements we can ship this later.

In the mean time, I’ll dig into the benchmarks that show performance regressions to see if there are ways to optimize the change to make it more acceptable.

campoy commented :

Thanks for understanding, @achille-roussel