Understanding Garbage Collection

I'm having trouble getting garbage collection to work. Our disk usage keeps growing without bound even though we run GC (each run reports nothing to collect). From what I can tell, GC only does any work when discard stats are available, and discard stats are only calculated during compaction. I only see compactions run on Drop(), Open(), or Close(). Do compactions happen automatically in the background, or is an explicit call to Flatten() needed?


I realize my question was a bit misguided.

It looks like the compactors do run in the background, but a compaction only happens when the LSM tree needs one. Our application writes to the same keys many times, which grows the value log, yet the LSM tree appears to stay below the size that triggers compaction. Is there a way to get discard stats updated even if the LSM tree does not need to be compacted?


You are right: compactions run in the background and are triggered only when the LSM tree needs compaction.

Currently, you will have to flatten the DB to get the benefit of compaction. You can also try increasing the value log threshold so that more values are stored in the LSM tree itself; the tree then grows faster, and compactions are more likely to be triggered.
