What are the hardware specifications of the machine (RAM, OS, Disk)?
win10 : i5-7500, 16G RAM, SSD
linux : Intel® Xeon® CPU E5-2637 v3 @ 3.50GHz, 64G RAM, SSD
What did you do?
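package main

import (
	"encoding/binary"
	"math/rand"
	"time"

	badger "github.com/dgraph-io/badger/v2" // assumed module path; match your Badger version
)

func main() {
	// Open (or create) a Badger store in ./badger with the default options.
	db, err := badger.Open(badger.DefaultOptions("badger"))
	if err != nil {
		panic(err)
	}
	defer db.Close()
	// Write 100,000 small keys, one transaction (and one commit) per key.
	for i := 0; i < 100_000; i++ {
		bs := make([]byte, 8)
		binary.BigEndian.PutUint64(bs, uint64(time.Now().UnixNano())+rand.Uint64())
		err := db.Update(func(txn *badger.Txn) error {
			return txn.Set(bs, bs)
		})
		if err != nil {
			panic(err)
		}
	}
}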
What did you expect to see?
Performance on win10 is similar to linux
What did you see instead?
It takes 3s on win10 but 30s on linux. The linux machine is a production environment. Can you try this on your win10 and linux machines? Maybe my linux machine has some problem?
I was able to reproduce this as well. The problem seems to be related to the way sync writes are handled in Linux.
Running this on an HDD takes ~6 seconds in Windows 10 + NTFS, and takes a very long time on the same HDD with Ubuntu 19.10 + ext4 (I gave up after 10 minutes).
However, if I change the badger options to include SyncWrites=false then it takes ~6 seconds on Linux. The above linked issue has a lot of performance benchmarking that definitely matches up with this issue.
@campoy @jarifibrahim is this expected badger+Linux behavior or a bug?
@aschmahmann SyncWrites=false would speed up writes because the file is no longer opened in sync mode. Writes in sync mode are expensive. But the issue here seems to be that writes in Linux are much slower on HDD compared to Windows.
I’ll try to verify this against windows and Linux. Thanks for reporting!
@jarifibrahim is it possible that there are semantic sync flag differences between Linux and Windows here? From my cursory look at the Go Windows write code, it looks like the sync flag might be totally ignored. I haven’t had a chance to look at the Linux Go code yet.
Until Golang decides to support Windows CreateFile flags in Open as it does Unix flags, we may need to copy-paste the Open function call and modify it to support O_SYNC.
@tsellers-r7 yes. In general you’re expecting data loss during a crash if you don’t use O_SYNC. If you thought you had O_SYNC, but you didn’t (i.e. on Windows) then you might not be accounting for potential loss as well as you could.
Also worth noting that data loss is always possible depending on how the crash occurs. For example, IIUC some hard drives lie to the OS about when something is on disk and treat the disk’s cache as “on disk”.
Ran into this issue using badger-v2. Originally this looked like an issue with ipfs/go-datastore, because it didn’t cause problems when using ipfs/go-datastore@v0.3.1, and tests were completing in roughly 128 seconds.
However, upgrading to ipfs/go-datastore@v0.4.4 caused this issue to rear its head, and the datastore test suite wouldn’t ever complete. The longest I had it running was 30 minutes, and the tests were still running.
Setting sync writes to false resolved the issue, and the test suite completed in 6 seconds. This seems like a pretty serious issue, as async writes increase the risk of data loss, so disabling sync writes doesn’t seem like a solid solution for all situations.
Version Information:
Badger-v2: v2.0.3
go-datastore: v0.4.4
go: go version go1.14.2 linux/amd64
OS:
@bonedaddy - As far as I understand it the bug here isn’t that Linux is slow, it’s that Windows is fast only because Go doesn’t actually open the files as sync and so it’s using async everywhere.