Badger release process

Updated decision - Badger release process - #16 by ibrahim

We currently merge Badger master into Dgraph before each Dgraph patch and major release. Going forward, we propose to do away with this model.

For Dgraph patch releases, we want to avoid this practice because it introduces a lot of changes that do not belong in a patch release. The expectation from a patch release is to fix problems observed by users, and to do so frequently (vs. a major release) without compromising stability. Minimizing the changes that go into a patch release is a step in this direction.

For Dgraph major releases, we want to benefit from the wide use of Badger in the open source community and avoid using Dgraph as the primary vehicle to try out the changes and uncover issues.

We are proposing to move Badger to the calendar versioning system used by Dgraph. This would let us keep Badger bug fixes that need to go into a Dgraph patch release separate from Badger improvements or new features that need to go into a Dgraph major release. It would also increase the number of Badger releases.

We would target Badger major releases a month before each Dgraph major release. This would allow us to use a stable, released version of Badger with Dgraph. This would start with the Dgraph 20.11 release. Patch releases of 20.11 would use the corresponding patch release of Badger as needed.

We still need to continue our current focus on adding stress, longevity and scale testing so that we do a better job of uncovering bugs before they are released to customers. For Dgraph 20.03 patches and the 20.07 major release, we would use a stop-gap Badger branch containing the bug fixes made in June and July.

cc: @ibrahim, @Paras, @LGalatin

Something to note from the CalVer blog post: Why there would be no Dgraph 2.0: Goodbye Semantic Versioning - Dgraph Blog

For client-versioning, there’s a special exception for the Go client. dgo versions would include an extra digit in the YY number to allow for breaking changes within the year. In a hypothetical scenario, they could be called v200.03.0, v201.07.0, v202.11.0 if there are multiple breaking API changes within the year. This versioning scheme still correlates to Dgraph versions, yet supports Go Modules which mandates using Semantic Versioning.
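
To make the dgo numbering concrete, here is a minimal Go sketch of how such a tag could be composed. The function name `dgoVersion` and its parameters are hypothetical, used only for illustration; the blog post itself only gives the three example tags.

```go
package main

import "fmt"

// dgoVersion is a hypothetical helper: it appends a single-digit counter of
// breaking API changes to the two-digit year, then adds the release month
// and patch number, e.g. (20, 1, 7, 0) -> "v201.07.0".
func dgoVersion(year, breaking, month, patch int) (string, error) {
	if year < 0 || year > 99 || breaking < 0 || breaking > 9 ||
		month < 1 || month > 12 || patch < 0 {
		return "", fmt.Errorf("invalid input: year=%d breaking=%d month=%d patch=%d",
			year, breaking, month, patch)
	}
	return fmt.Sprintf("v%02d%d.%02d.%d", year, breaking, month, patch), nil
}

func main() {
	// The three hypothetical tags quoted above.
	for _, r := range []struct{ breaking, month int }{{0, 3}, {1, 7}, {2, 11}} {
		v, err := dgoVersion(20, r.breaking, r.month, 0)
		if err != nil {
			panic(err)
		}
		fmt.Println(v)
	}
}
```

The MAJOR number changes whenever the breaking-change counter changes, which is what lets the scheme coexist with Semantic Versioning.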

Also, why not start from a 20.07 corresponding release of Badger?

Thanks for jotting this down.

I am in full agreement with this. Giving Badger its own release cadence allows separation of concerns, with Dgraph as a consumer (and not the driver) of the Badger package. Badger should be its own independent repo and Dgraph just one of the many applications that use Badger.

To address both of these, @ibrahim already had a branch dgraph-maintenance (I don’t like the word dgraph in a Badger repo’s branch anyway). We could just rename this to, say, Badger v20.05.0.
  • Dgraph 20.03 points to Badger 20.05.0
  • Dgraph 20.07 points to Badger 20.05.0
  • Dgraph master points to Badger 20.05.x or Badger 20.09.0
  • Dgraph 20.11 (cut from master) points to whatever master points to at that time


I like the Badger release process that @vvbalaji is suggesting. It will be a lot of work for @ibrahim, so he will need some help with the releases. We are already late for doing a Badger release for 20.07, right (if we want to do it one month before)?

I’d say whatever 20.07 is going to depend on, make that the first Badger release under calendar versioning.

Agreed. As of now, Ibrahim has found a stable commit on Badger as of May 2020. Hence, I used the name v20.05.x in my note above.

To bootstrap, I think it should be ok to time-travel to the past this one time and name the Badger release v20.06 (or v20.05).

There was another idea by @Shekar that we should name Badger releases same as Dgraph releases, even if they release a month (or whatever) before. That was a good idea, we thought.

We can. As Paras mentioned above, we already have a branch forked from May. We can release it with the bug fixes instead of just merging it with dgraph.

I am in total agreement with using a released version of Badger in Dgraph, but I see two challenges (I’ve spoken with @vvbalaji and @Paras already about these):

  1. Engineers working on Dgraph need some changes to an API on Badger - Right now, we do a quick change in Badger and then update Badger in Dgraph. Going forward, the Dgraph bug/enhancement will have to wait for the next Badger release (maybe a month later).
  2. Customer bugs - Badger will always have some bugs, and our current approach is to fix them and update Badger when a customer reports a bug. Going with the released version means that the customer issue also has to wait for the next release.

As long as we are all in agreement that Dgraph will use a released version of Badger, and that under no circumstances do we cut corners and do a release just for Dgraph issues, we should be good.

If we are going to do Badger releases with calendar versioning, we need to figure out the end-of-life policy too.

Correction: the dgraph-maintenance branch isn’t stable. I have cut the branch from the Badger commit that Dgraph master currently uses. I could’ve cut a branch from a previous stable point, but then we’d be losing the memory/performance optimizations (which have bugs). If we’re okay with losing optimizations and some fixes, let’s use Badger v2.0.3, which was released in March 2020; that’s the last released Badger version and it’s stable.

We need more people on Badger. The release process does take up time (we’ll also need to test every release). I can’t do everything single-handedly.

Cut betas like Dgraph does until next stable release.

@vvbalaji let’s allocate an engineer for Badger under Ibrahim.

For 20.03.4, are we comfortable with this branch, or do we need to go back to the changes that were published in 20.03.1, which is what we recommend to our customers? Dgraph v20.03.1 release

I am assuming March 2020 cut off would be missing some of the changes that were already in 20.03.1

This stable branch also has all the crashes mentioned in Current state of badger crashes - #5 by vvbalaji. The stable branch doesn’t have new changes, but existing changes (which have bugs) are already in the dgraph-maintenance branch.

I am assuming March 2020 cut off would be missing some of the changes that were already in 20.03.1

Yes, that’s correct. We’ll lose a few performance and memory optimizations.

I had a discussion with @mrjn and @vvbalaji on Friday about the Badger release, and here’s what we decided:

  • We released Badger with the v20.07.0 tag, which will cause issues with go mod. The tag should’ve been v200.07.0 (but we’re not fixing it; read below).
  • Badger needs calendar versioning because Dgraph needs it. But Badger needs to be mindful of other projects using it (Jaeger tracing, IPFS, etc.), which will need these patches too.
  • Using v200.x.x will require us to maintain old v1x and v2x along with these versions.
  • A better idea is to keep badger major versions at v1x and v2x and do minor releases based on dgraph versions.
  • Dgraph v20.07.x will use badger v2.2007.y.
    • Dgraph v20.07.x will use only badger v2.2007.y versions and release/v2.2007 branch gets only patch fixes.
    • Dgraph v20.11.x will use badger v2.2011.x and so on.
  • Badger major version will change only on data format change as mentioned on badger/ at master · dgraph-io/badger · GitHub
  • Badger v1x will get only absolutely-necessary bug fixes. We will maintain only the badger versions that dgraph is using.
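
The Dgraph-to-Badger mapping above can be sketched in Go as follows. This is only an illustration of the naming rule; the helper names `badgerSeries` and `usesSeries` are hypothetical and not part of Badger or Dgraph.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// badgerSeries maps a Dgraph CalVer release such as "v20.07.1" to the Badger
// release series it would consume, e.g. "v2.2007". The Badger major version
// is passed in because it changes only on a data format change.
func badgerSeries(dgraphVersion string, badgerMajor int) (string, error) {
	parts := strings.Split(strings.TrimPrefix(dgraphVersion, "v"), ".")
	if len(parts) != 3 {
		return "", fmt.Errorf("expected vYY.MM.PATCH, got %q", dgraphVersion)
	}
	year, err := strconv.Atoi(parts[0])
	if err != nil || year < 0 || year > 99 {
		return "", fmt.Errorf("bad year in %q", dgraphVersion)
	}
	month, err := strconv.Atoi(parts[1])
	if err != nil || month < 1 || month > 12 {
		return "", fmt.Errorf("bad month in %q", dgraphVersion)
	}
	return fmt.Sprintf("v%d.%02d%02d", badgerMajor, year, month), nil
}

// usesSeries reports whether a Badger tag such as "v2.2007.3" belongs to the
// given series, i.e. differs from it only in the patch number.
func usesSeries(badgerTag, series string) bool {
	return strings.HasPrefix(badgerTag, series+".")
}

func main() {
	for _, d := range []string{"v20.07.0", "v20.07.1", "v20.11.0"} {
		s, err := badgerSeries(d, 2)
		if err != nil {
			panic(err)
		}
		fmt.Printf("Dgraph %s -> Badger %s.y\n", d, s)
	}
	fmt.Println(usesSeries("v2.2007.1", "v2.2007")) // same series, patch release
	fmt.Println(usesSeries("v2.2011.0", "v2.2007")) // different series
}
```

Every Dgraph v20.07.x patch resolves to the same Badger series, so only the Badger patch number moves within a Dgraph release line.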

Action items

  • Ibrahim - Create a new v2.2007 branch from badger release/v20.07 and create a new v2.2007.0 tag.
  • Ibrahim - Cherry-pick the bug fixes that are needed for dgraph v20.07.1 and do a badger v2.2007.1 release.

@mrjn’s notes from the call

# Badger
- 20.07.0 -> Current Badger.
- 20.11.0 -> Breaking change.
Go mod requires Semantic Versioning (as per Manish's understanding)

-> If this is true.
20 -> 200, to allow to use the MAJOR number to indicate breaking changes.
20.11.0 -> 200.11.0 (with a different MAJOR number from 20.07.x)
20.11.0 -> 21.11.0 (perhaps)
Have a MAJOR number, like say 3.

How do we release 20.11.0?
3.20.11 -> patch release as per SemVer. But, has a whole bunch of features as per us.
That has a problem.
We add an extra number to the MAJOR number in CalVer.
20.07.0 -> 200.07.0
20.11.0 -> 200.11.0 or 201.11.0, depending upon breakages.
21.03.0 -> 210.03.0
21.07.0 -> 210.07.0 or 211.07.0
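
As far as I understand Go modules, a tag with major version 2 or higher is only usable when the module path ends in a matching `/vN` suffix, which is why the choice of MAJOR number matters here. The sketch below computes the suffix a given tag would imply; the helper name `requiredSuffix` is hypothetical and the tags are just examples taken from this discussion.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// requiredSuffix returns the module path suffix implied by a tag's major
// version: none for v0 and v1, "/vN" for major version N >= 2.
func requiredSuffix(tag string) (string, error) {
	if !strings.HasPrefix(tag, "v") {
		return "", fmt.Errorf("tag %q does not start with v", tag)
	}
	majorStr := strings.SplitN(tag[1:], ".", 2)[0]
	major, err := strconv.Atoi(majorStr)
	if err != nil || major < 0 {
		return "", fmt.Errorf("tag %q has no numeric major version", tag)
	}
	if major < 2 {
		return "", nil // v0 and v1 use the bare module path
	}
	return "/v" + strconv.Itoa(major), nil
}

func main() {
	for _, tag := range []string{"v1.6.0", "v2.2007.0", "v20.07.0", "v200.11.0"} {
		suffix, err := requiredSuffix(tag)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-10s -> suffix %q\n", tag, suffix)
	}
}
```

Under this rule a v20.07.0 tag implies a `/v20` path and a v200.11.0 tag a `/v200` path, while a v2.2007.0 tag stays on `/v2`.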

## CalVer in Badger [Decided solution]
We need CalVer in Badger, because Dgraph needs it.
Because we make optimizations which are fine for a SemVer patch, but not OK for Dgraph.
We keep our SemVer in Badger, but bring updates to them as and when we can. If we do this frequently enough, then we might be able to get more people testing Badger.
VV Balaji: 2.MINOR.x. MINOR -> Dgraph release. `2.2007.x`, `2.2011.x` 
`X.Dgraph Release.Z` for Badger.