Duplicated Rows?

(Sovereign313) #1

So, I have some code running that updates a k,v pair. When I retrieve the data it’s properly updated, but if I cat the 000000.vlog file, it shows the historical entries as well as the new entry. Is this expected behavior, or is my code screwing up and adding multiple kv’s? An example:


(Manish R Jain) #2

Each write to the value log is an append. Later, value logs can be GCed.
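A minimal conceptual sketch of the append-only behavior described above (this is not Badger’s actual implementation, just the shape of the idea): Set always appends, so stale versions of a key remain in the log file, while Get resolves only the newest one — which is exactly why reads look correct but the vlog still shows history.

```go
package main

import "fmt"

// entry is one append-only log record: a key/value pair.
type entry struct {
	key, val string
}

// vlog mimics an append-only value log: Set never overwrites in
// place, it always appends, so stale versions of a key stay in the
// log until garbage collection rewrites the file.
type vlog struct {
	entries []entry
}

func (l *vlog) Set(k, v string) {
	l.entries = append(l.entries, entry{k, v})
}

// Get scans from the newest entry backwards and returns the latest
// value for the key, which is why reads only ever see the update.
func (l *vlog) Get(k string) (string, bool) {
	for i := len(l.entries) - 1; i >= 0; i-- {
		if l.entries[i].key == k {
			return l.entries[i].val, true
		}
	}
	return "", false
}

func main() {
	l := &vlog{}
	l.Set("user", "v1")
	l.Set("user", "v2")
	v, _ := l.Get("user")
	fmt.Println(v, len(l.entries)) // prints "v2 2": reads see the update, the log keeps both
}
```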

(Sovereign313) #3

Awesome. Thanks, I just wanted to make sure it doesn’t fill my disk before I spend time looking through my code for a bug. Appreciate it… any idea how long before they can be GCed?

(Daniel Mai) #4

Garbage collection is done manually. The recommendation is to do it periodically, ideally during periods of low activity. See the docs on garbage collection: https://github.com/dgraph-io/badger#garbage-collection

(Sovereign313) #5

Awesome. thank you for your help.

(Sovereign313) #6

So, I’ve built an HTTP server for API requests that uses BadgerDB… I have an endpoint to call GC:

Which runs this code:

func handleGC(w http.ResponseWriter, r *http.Request) {
        err := db.RunValueLogGC(0.7)
        if err != nil {
                // Fprint, not Fprintf: the error text is data, not a format string
                fmt.Fprint(w, err.Error())
                return
        }
        fmt.Fprint(w, "success")
}
It responds with:
Value log GC attempt didn’t result in any cleanup.

I have 2 records in the DB with this info:


The vlog is 224K and contains every “Set” update from the start. Why isn’t /gc cleaning up these entries?

(Daniel Mai) #7

GC does not remove the latest value log. If there’s only one vlog, then GC won’t touch it.
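For a small test setup, one way to see GC actually reclaim space (an assumption about intent, not part of the answer above): since GC never touches the newest vlog file, lowering the maximum value-log file size makes Badger rotate to a new file sooner, leaving the older files eligible for collection. A v1-era options sketch (the directory path here is illustrative; newer Badger versions use `badger.DefaultOptions(dir).WithValueLogFileSize(...)` instead):

```go
// v1-style Badger options struct, as used by the README linked above.
opts := badger.DefaultOptions
opts.Dir = "/tmp/badger"      // hypothetical path
opts.ValueDir = "/tmp/badger" // hypothetical path
// Smaller files rotate sooner, so old ones become GC candidates.
opts.ValueLogFileSize = 1 << 20 // 1 MB, the minimum allowed
db, err := badger.Open(opts)
```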