Moved from GitHub badger/1326
Posted by bonedaddy:
What version of Go are you using (`go version`)?
go version go1.14.2 linux/amd64
What operating system are you using?
```
NAME="Ubuntu"
VERSION="18.04.4 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.4 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
```
What version of Badger are you using?
v2.0.3
Does this issue reproduce with the latest master?
Haven’t tried
Steps to Reproduce the issue
- Store a ton of data in your key-value store (in this case 1.7TB)
- Restart badger
- After service startup, iterate over all keys in the key-value store (a minimal sketch of this scan is shown below)
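
For illustration, here is a minimal sketch of the kind of startup scan involved; the function name and `announce` callback are hypothetical, but the iterator usage follows the badger v2 API:

```go
package main

import (
	badger "github.com/dgraph-io/badger/v2"
)

// announceKeys walks every key in the store and hands a copy to announce.
// PrefetchValues is disabled so only keys are read, not value log entries.
func announceKeys(db *badger.DB, announce func(key []byte)) error {
	return db.View(func(txn *badger.Txn) error {
		opts := badger.DefaultIteratorOptions
		opts.PrefetchValues = false // keys only
		it := txn.NewIterator(opts)
		defer it.Close()
		for it.Rewind(); it.Valid(); it.Next() {
			// KeyCopy allocates a fresh slice because the slice from
			// Item().Key() is only valid until the next call to Next().
			announce(it.Item().KeyCopy(nil))
		}
		return nil
	})
}
```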
What Badger options were set?
Default options with the following modifications:
```go
DefaultOptions = Options{
	GcDiscardRatio: 0.2,
	GcInterval:     15 * time.Minute,
	GcSleep:        10 * time.Second,
	Options:        badger.DefaultOptions(""),
}
DefaultOptions.Options.CompactL0OnClose = false
DefaultOptions.Options.Truncate = true
```
I’ve also set the following:
- ValueLogLoadingMode = FileIO
- TableLoadingMode = FileIO
- SyncWrites = false
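
Put together, the equivalent configuration when opening the store directly would look roughly like this (the path is a placeholder):

```go
package main

import (
	badger "github.com/dgraph-io/badger/v2"
	"github.com/dgraph-io/badger/v2/options"
)

func openStore() (*badger.DB, error) {
	opts := badger.DefaultOptions("/path/to/db"). // placeholder path
		WithCompactL0OnClose(false).
		WithTruncate(true).
		WithValueLogLoadingMode(options.FileIO).
		WithTableLoadingMode(options.FileIO).
		WithSyncWrites(false)
	return badger.Open(opts)
}
```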
What did you do?
At the start of my service, the key-value store is iterated over to announce the data it contains to peers. Unfortunately, when the store holds a large amount of data (1.7TB in this case), iterating over it allocates a large amount of memory.
What did you expect to see?
Being able to iterate over the keys without allocating a large amount of memory
What did you see instead?
2GB+ of allocations when iterating over all the keys in a large datastore of 1.7TB
Additional Information
I recorded the following profile which shows what’s responsible for the memory allocations:
```
     flat  flat%   sum%        cum   cum%
2239.12MB 57.90% 57.90%  2239.12MB 57.90%  github.com/RTradeLtd/go-datastores/badger.(*txn).query
 687.09MB 17.77% 75.66%   687.09MB 17.77%  github.com/dgraph-io/badger/v2/table.(*Table).read
 513.05MB 13.27% 88.93%  1139.44MB 29.46%  github.com/RTradeLtd/go-datastores/badger.(*txn).query.func1
  83.20MB  2.15% 91.08%    83.20MB  2.15%  github.com/dgraph-io/badger/v2/skl.newArena
  69.16MB  1.79% 92.87%   109.17MB  2.82%  github.com/dgraph-io/badger/v2/pb.(*TableIndex).Unmarshal
     40MB  1.03% 93.90%       40MB  1.03%  github.com/dgraph-io/badger/v2/pb.(*BlockOffset).Unmarshal
```
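
(For context: this is `go tool pprof` heap output. A profile like this can be captured with the standard library's net/http/pprof handler, along these lines:)

```go
import (
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers
)

func init() {
	// Then inspect the heap with:
	//   go tool pprof http://localhost:6060/debug/pprof/heap
	go http.ListenAndServe("localhost:6060", nil)
}
```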
It looks like this is because I have a function that iterates over all the keys in the key-value store to broadcast them to another peer. I'm not sure why that would result in such a massive amount of memory being allocated, though.
This seems somewhat related to other reported issues such as dgraph-io/badger#1268 (Provide simple option for limiting total memory usage). Using FileIO for the table and value log loading modes decreases memory usage a bit, but it seems the overall process of reading keys and/or values from Badger still requires a lot of memory.