Confused on how GC and Compaction are supposed to work

I have been playing with Badger for our solution, and I have added a worker to handle the GC:

db.RunValueLogGC(0.001)

And have this for my options

cfg := badger.DefaultOptions(badgerPath).WithValueLogFileSize(5000000).WithLoggingLevel(badger.DEBUG)

So I changed the value log file size to about 5 MB and set the discard ratio to 0.001, just to try to force more GC to happen.
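For context, here is a minimal sketch of the kind of GC worker described above. It is not the exact code from our application. It assumes the pattern where RunValueLogGC is called repeatedly until it returns an error (Badger returns ErrNoRewrite when a call does not rewrite a file). The valueLogGC interface, gcOnce, runGCWorker, fakeDB and errNoRewrite are hypothetical names used only for illustration; a real *badger.DB can be passed wherever valueLogGC is expected, since it has a method with that signature.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// valueLogGC is the single method the worker needs (hypothetical interface).
// *badger.DB has a RunValueLogGC method with this signature.
type valueLogGC interface {
	RunValueLogGC(discardRatio float64) error
}

// gcOnce keeps calling RunValueLogGC until it returns an error and reports
// how many calls succeeded, i.e. how many value log files were rewritten.
func gcOnce(db valueLogGC, ratio float64) (int, error) {
	rewrites := 0
	for {
		if err := db.RunValueLogGC(ratio); err != nil {
			return rewrites, err
		}
		rewrites++
	}
}

// runGCWorker runs one GC pass per tick until stop is closed.
func runGCWorker(db valueLogGC, ratio float64, every time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			n, err := gcOnce(db, ratio)
			fmt.Printf("value log GC: %d file(s) rewritten, stopped on: %v\n", n, err)
		case <-stop:
			return
		}
	}
}

// errNoRewrite is a stand-in for badger.ErrNoRewrite so the sketch runs
// without the Badger dependency.
var errNoRewrite = errors.New("no rewrite")

// fakeDB is a hypothetical stand-in that "rewrites" a fixed number of files.
type fakeDB struct{ pending int }

func (f *fakeDB) RunValueLogGC(float64) error {
	if f.pending == 0 {
		return errNoRewrite
	}
	f.pending--
	return nil
}

func main() {
	n, err := gcOnce(&fakeDB{pending: 2}, 0.001)
	fmt.Println(n, err)
}
```

In the real application the worker would be started with something like `go runGCWorker(db, 0.001, 5*time.Minute, stop)`; the five-minute interval is an arbitrary choice for the sketch.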

I then send through several items and delete them, but this is what I see in my Badger DB directory:

$ ls -latrh
total 86M
-rw-------    1 sgs      sgs           16 Jul 28 21:28 MANIFEST
-rw-r--r--    1 sgs      sgs            2 Jul 28 21:28 LOCK
-rw-------    1 sgs      sgs           28 Jul 28 21:28 KEYREGISTRY
-rw-r--r--    1 sgs      sgs         1.0M Jul 28 21:28 DISCARD
drwxrwxrwt    1 root     root          33 Jul 28 21:28 ..
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 21:39 000001.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 21:42 000002.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 21:44 000003.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 22:52 000004.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 22:55 000005.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 22:58 000006.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 23:00 000007.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 23:02 000008.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 23:04 000009.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 23:06 000010.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 23:07 000011.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 23:09 000012.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 23:10 000013.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 23:12 000014.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 23:15 000015.vlog
-rw-r--r--    1 sgs      sgs         4.8M Jul 28 23:19 000016.vlog
drwx------    2 sgs      sgs         4.0K Jul 28 23:19 .
-rw-r--r--    1 sgs      sgs         9.5M Jul 28 23:26 000017.vlog
-rw-r--r--    1 sgs      sgs       128.0M Jul 28 23:26 00001.mem

As you can see I am getting several .vlog files. But notice the DISCARD file never gets updated. This makes me feel like maybe compaction is not happening.

And I see this Debug message

DEBUG: No file with discard stats

Which makes sense since I don’t see any updates to the DISCARD file.

Any help explaining what I am missing would be appreciated. Also, I am using Badger v3.

Here are some more observations. It does look like compaction is not getting run. I believe this is because Badger creates the list of tables when it opens the DB.

Each time our application starts, it starts with a fresh Badger DB. This is because we already persist the data in Kafka; we just use Badger as an on-disk cache of the data we need. So when the application starts up, I see this log message:

badger 2021/07/29 20:49:17 INFO: All 0 tables opened in 0s

And so if you look here: badger/levels.go at 0c45f5f130a73c10ecb7072db7b7fe05a5ce41f6 (dgraph-io/badger on GitHub)

The priority for L0 always comes out as 0. I added this log line:

     // Add L0 priority based on the number of tables.
     s.kv.opt.Infof("compact add priority L0 %d %d", s.levels[0].numTables(), s.kv.opt.NumLevelZeroTables)
     addPriority(0, float64(s.levels[0].numTables())/float64(s.kv.opt.NumLevelZeroTables))

And this is all I see in the logs

badger 2021/07/29 20:33:38 INFO: compact add priority L0 0 5
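To make the arithmetic behind that log line concrete, here is a small sketch that reproduces only the expression from the addPriority call quoted above. The function names and the 1.0 threshold are my assumptions for illustration (I am assuming a level is only picked for compaction once its score reaches 1.0); they are not copied from Badger.

```go
package main

import "fmt"

// l0Score mirrors the expression in the quoted addPriority call:
// number of L0 tables divided by NumLevelZeroTables.
func l0Score(numTables, numLevelZeroTables int) float64 {
	return float64(numTables) / float64(numLevelZeroTables)
}

// compactThreshold is an assumed cutoff, not taken from the Badger source.
const compactThreshold = 1.0

// needsCompaction reports whether a score reaches the assumed cutoff.
func needsCompaction(score float64) bool {
	return score >= compactThreshold
}

func main() {
	// 0 tables and NumLevelZeroTables = 5 are the values from the log line.
	for _, n := range []int{0, 3, 5} {
		s := l0Score(n, 5)
		fmt.Printf("L0 tables=%d score=%.1f compact=%v\n", n, s, needsCompaction(s))
	}
}
```

With 0 tables in L0 the score is 0/5 = 0, so under this assumption L0 would never be selected until tables actually land there.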

Any response that can help me understand this would be appreciated

Hey. I have the same problem and have made some attempts to fix it.