Hello everyone, I am currently using https://github.com/mosuka/cete as key-value storage for our configuration. cete relies on badger db to store its data.
We have noticed that the number of vlog files increases at every restart, even if no writes were performed before the restart. The restart might be the result of a forced shutdown (preemptible nodes restarting) or a Kubernetes probe killing the process.
As a result, if the cete/badger process does not come back online fast enough, Kubernetes eventually kills the process and restarts it. At the next restart the database has grown even bigger, which slows the startup down even further. This cycle goes on until no disk space is left.
NOTE: We have RunValueLogGC running periodically to clean up old vlog files, but the garbage collector is not able to keep up with this loop.
My questions would be:
- Why are new vlog files generated at every startup?
- Is it because the badger instance was killed / crashed?
- How can we prevent the database from entering this loop without disabling probes?
As additional information, badger is running with