(Slightly tangential note here, but something to also consider)
Wanted to chime in with my take on semantic versioning in Badger, which I feel can be applied to other DB libraries as well. Semantic Versioning takes a purist, hard stance on what can be changed via APIs, without consideration for user expectations, industry standards, and databases in particular.
The biggest breakage for DB library users is not a breaking API change. It’s a breaking data layout change on disk. A breaking API change requires a single PR to fix up a user’s code. Annoying, but an hour later (in most cases), you’re done. If that doesn’t work, one can just go back by reverting that PR.
But, a data layout change requires them to take a backup of their entire data set, then re-import it with the new release. They most likely can’t go back, at least not easily.
One could lump every API breakage and data breakage into a major version release, but that would cause the major version to be increased much faster than what DB users expect to see. Case in point, RocksDB and MongoDB after so many years of development are at 6 and 4 respectively. Badger, in its short lifetime, has surely made more than 3 breaking changes to the APIs since the v1.0 release.
One could further question the need for so many breaking changes. However, I think they are inevitable if your guiding principle is to keep your code base clean (instead of introducing many if branches to deal with backwards compatibility) and your project is understaffed. Both have been true for Badger.
So, assuming breaking changes are inevitable and a major version release every few months isn’t expected from a DB, here’s how I think about semantic versioning from a DB standpoint:
- A major version release happens when data layout on disk changes. This is the most painful to the end-user.
- A patch version release happens when there’s a bug fix. This is the simplest upgrade for the end-user.
- Hence, a minor version release should happen on either an API change or a feature release. Everyone likes features. An API change, from my perspective, isn’t as painful as semantic versioning literature makes it out to be, particularly in a compiled language. Of course, it can be painful if, for example, a feature in use was removed. But that’s rare and can just be part of the major release. However, an API shuffle to provide a bug fix (something we did in Badger) can be part of a minor release (imported and fixed relatively easily by the end-user on upgrade).