Interruptible garbage collection

Moved from GitHub badger/1324

Posted by Stebalien:

What version of Go are you using (go version)?

$ go version
go version go1.14.2 amd64/linux

What operating system are you using?

Linux

What version of Badger are you using?

v1.6.1

Description

RunValueLogGC cannot be interrupted and can potentially run for minutes on a large enough datastore. This can be a problem if garbage collection is running on shutdown.

It would be nice to be able to pass a context to RunValueLogGC so the process can be cancelled early. Alternatively, allowing a concurrent call to Close to interrupt a GC run would fix this. However, at the moment, Close can’t be called concurrently with anything else.

Would a new RunValueLogGCWithContext function be considered if a patch were proposed?

jarifibrahim commented:

This looks like a useful enhancement, since GC can take minutes to complete. The GC has two steps. The first step, sampling, can be exited at any time:
https://github.com/dgraph-io/badger/blob/536fed1846d0f4db9579bcff6797614e134eadfa/value.go#L1628

The second step is where we rewrite the value log file. I’m wondering whether stopping the rewrite while it’s running would have any side effects:
https://github.com/dgraph-io/badger/blob/536fed1846d0f4db9579bcff6797614e134eadfa/value.go#L499