@Naman - Looking through the above pull, I had some thoughts.
Rather than doing just one pass of phase 1 followed by one pass of phase 2, could we minimize the downtime caused by phase 2 by running multiple cycles of phase 1 to bring things up to the present (or almost), and only then doing a final phase 2 that blocks for a few seconds at most?
What I mean is something like this:
- Phase 1 as current
- At the end of phase 1, check whether any commits happened while it was running. If so, restart phase 1 from a later timestamp to bring the checks up to the current (or a very recent) time. This is still non-blocking for writes.
- Repeat the two steps above as many times as necessary to catch up to the present, or nearly so
- Go to phase 2 once the remaining gap is small enough that commits should be blocked for a few seconds at most
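
To make the idea concrete, here is a rough Python sketch of the loop I have in mind. Everything in it is hypothetical and for illustration only: `run_phase1`, `run_phase2`, their timestamp arguments, and the `max_gap_seconds` threshold are names I made up, not anything from the pull. The `max_passes` cap is also my own addition, on the assumption that we would want a bound in case commits arrive faster than phase 1 can process them.

```python
import time
from typing import Callable


def migrate_with_catch_up(
    run_phase1: Callable[[float, float], None],  # hypothetical: non-blocking pass
    run_phase2: Callable[[float], None],         # hypothetical: blocking final pass
    start_ts: float,
    max_gap_seconds: float = 5.0,                # assumed "close enough" threshold
    max_passes: int = 10,                        # assumed cap on catch-up passes
    now: Callable[[], float] = time.time,
) -> int:
    """Repeat phase 1 until nearly caught up, then run phase 2 once.

    Returns the number of phase 1 passes performed.
    """
    checked_up_to = start_ts
    passes = 0
    while passes < max_passes:
        target = now()
        # Non-blocking: handle everything committed in (checked_up_to, target].
        run_phase1(checked_up_to, target)
        checked_up_to = target
        passes += 1
        # Commits that landed while this pass ran are still unprocessed.
        # Stop looping once that leftover window is small enough.
        if now() - checked_up_to <= max_gap_seconds:
            break
    # Blocking: only the short window after checked_up_to remains.
    run_phase2(checked_up_to)
    return passes
```

Each pass only has to cover the commits that arrived during the previous pass, so the window should shrink quickly as long as phase 1 processes commits faster than they arrive. If it does not, the cap just falls through to phase 2 with whatever gap is left.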