How to run GC properly

I have a service that keeps many Badger databases open. Each database is about 10 GB, and since I have never run compaction, each could be compacted down to about 4 GB. Usually when I call DB.RunValueLogGC(), it ends with: Value log GC attempt didn’t result in any cleanup. If I repeat the call over and over, it eventually works. I would like to understand what happens so that I can increase the success rate of these calls.

@adwinsky One of our engineers will respond soon. Thanks.

Hey @adwinsky, compaction removes stale data from the LSM tree, while value log GC removes stale data from the value log files. Compaction happens automatically (unless you change the default options).

Value log GC attempts to clean up a single value log file. Here’s how it works:

  1. Pick a value log file (for simplicity, assume it has some logic to find a file)
  2. Once the value log file is picked, we sample it to check that it contains enough stale data (the ratio of stale data to total data in the value log file must be greater than the discard ratio)
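The check in step 2 can be sketched roughly as follows. This is only a simplified illustration of the ratio comparison described above, not Badger's actual sampling code; the function name `shouldRewrite` and the byte counts are hypothetical.

```go
package main

import "fmt"

// shouldRewrite is a hypothetical helper: it reports whether a sampled
// value log file has enough stale data to be worth rewriting, i.e. whether
// stale/total is greater than the discard ratio.
func shouldRewrite(staleBytes, totalBytes int64, discardRatio float64) bool {
	if totalBytes <= 0 {
		return false // nothing sampled, nothing to collect
	}
	return float64(staleBytes)/float64(totalBytes) > discardRatio
}

func main() {
	// Made-up sample sizes, with a discard ratio of 0.5.
	fmt.Println(shouldRewrite(600, 1000, 0.5)) // 60% stale: true
	fmt.Println(shouldRewrite(200, 1000, 0.5)) // 20% stale: false
}
```

With a lower discard ratio, more files pass this check, at the cost of more rewriting work per reclaimed byte.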

If the sampling cannot find enough stale data, we do not perform garbage collection on that value log file. In your case, the initial calls to RunValueLogGC return the ErrNoRewrite error (GC didn’t result in any cleanup) because either you don’t have enough vlog files (you need at least 2 vlog files for GC to work) or the sampling on the value log file couldn’t find enough stale data.
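Since each successful call cleans up at most one value log file, a common pattern is to keep calling RunValueLogGC until it returns an error. Here is a minimal sketch of that loop. To keep it runnable without a database, it is written against a hypothetical `gcRunner` interface (which `*badger.DB` satisfies through its `RunValueLogGC(discardRatio float64) error` method) and exercised with a fake; `fakeDB` and `errNoRewrite` are stand-ins for illustration, not Badger types.

```go
package main

import (
	"errors"
	"fmt"
)

// gcRunner is a hypothetical interface for this sketch. *badger.DB
// satisfies it via RunValueLogGC(discardRatio float64) error.
type gcRunner interface {
	RunValueLogGC(discardRatio float64) error
}

// runGCUntilNoRewrite calls RunValueLogGC repeatedly while it succeeds.
// It returns how many calls succeeded and the error that ended the loop
// (with Badger, typically ErrNoRewrite).
func runGCUntilNoRewrite(db gcRunner, discardRatio float64) (int, error) {
	rewritten := 0
	for {
		if err := db.RunValueLogGC(discardRatio); err != nil {
			return rewritten, err
		}
		rewritten++
	}
}

// errNoRewrite and fakeDB stand in for badger.ErrNoRewrite and a real
// database so the sketch runs on its own.
var errNoRewrite = errors.New("no rewrite")

type fakeDB struct{ collectable int }

func (f *fakeDB) RunValueLogGC(float64) error {
	if f.collectable == 0 {
		return errNoRewrite
	}
	f.collectable--
	return nil
}

func main() {
	n, err := runGCUntilNoRewrite(&fakeDB{collectable: 3}, 0.5)
	fmt.Println(n, err) // 3 no rewrite
}
```

In a real service you would pass your `*badger.DB` instead of `fakeDB`, and probably run this loop periodically (for example from a ticker) rather than once.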