Why is Badger slower on Linux than on Windows 10?

Moved from GitHub badger/1084

Posted by chengjing1181122598:

What version of Go are you using (go version)?

$ go version
1.13

What version of Badger are you using?

v1.6.0

Does this issue reproduce with the latest master?

yes

What are the hardware specifications of the machine (RAM, OS, Disk)?

win10 : i5-7500, 16G RAM, SSD
linux : Intel® Xeon® CPU E5-2637 v3 @ 3.50GHz, 64G RAM, SSD

What did you do?

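package main

import (
	"encoding/binary"
	"math/rand"
	"time"

	"github.com/dgraph-io/badger"
)

func main() {
	db, err := badger.Open(badger.DefaultOptions("badger"))
	if err != nil {
		panic(err)
	}
	defer db.Close()
	// Insert 100,000 pseudo-random 8-byte keys, one transaction per key.
	for i := 0; i < 100_000; i++ {
		bs := make([]byte, 8)
		binary.BigEndian.PutUint64(bs, uint64(time.Now().UnixNano())+rand.Uint64())
		err := db.Update(func(txn *badger.Txn) error {
			return txn.Set(bs, bs)
		})
		if err != nil {
			panic(err)
		}
	}
}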

What did you expect to see?

Performance on win10 should be similar to Linux.

What did you see instead?

It takes 3s on win10 but 30s on Linux. The Linux machine is a production server. Can you try this on your own win10 and Linux machines? Maybe my Linux machine has some problem?

aschmahmann commented :

I was able to reproduce this as well. The problem seems to be related to the way sync writes are handled in Linux.

Running this on an HDD takes ~6 seconds in Windows 10 + NTFS, and takes a very long time on the same HDD with Ubuntu 19.10 + ext4 (I gave up after 10 minutes).

However, if I change the badger options to include SyncWrites=false then it takes ~6 seconds on Linux. The above linked issue has a lot of performance benchmarking that definitely matches up with this issue.

@campoy @jarifibrahim is this expected badger+Linux behavior or a bug?

jarifibrahim commented :

@aschmahmann SyncWrites=false would speed up writes because the file is no longer opened in sync mode, and writes in sync mode are expensive. But the issue here seems to be that writes on Linux are much slower on HDD compared to Windows.

I’ll try to verify this against windows and Linux. Thanks for reporting!

jarifibrahim commented :

It looks like writes are actually much faster on Windows. The write syscall takes up 26% of the time on Linux while it takes only 9.8% on Windows.


[Image: CPU profile on Windows]

[Image: CPU profile on Linux]

aschmahmann commented :

@jarifibrahim is it possible that there are semantic sync flag differences between Linux and Windows here? From a cursory look through the Go Windows write code, it looks like the sync flag might be totally ignored. I haven’t had a chance to look at the Linux Go code yet.

The Open function in src/syscall/syscall_windows.go doesn’t seem to check for the sync flag.

aschmahmann commented :

It looks like the issue is related to golang/go issue #34681, "proposal: syscall: define Windows O_ALLOW_DELETE for use in os.OpenFile" (see the discussion in that issue for confirmation about O_SYNC).

Until Go supports Windows CreateFile flags in Open as it does Unix flags, we may need to copy the Open function and modify it to support O_SYNC.

What do you think @jarifibrahim ?

tsellers-r7 commented :

Are there data loss concerns related to O_SYNC not working as expected on Windows?

aschmahmann commented :

@tsellers-r7 yes. In general, you should expect possible data loss during a crash if you don’t use O_SYNC. If you thought you had O_SYNC but didn’t (i.e. on Windows), then you might not be accounting for potential loss as well as you could.

Also worth noting that data loss is always possible depending on how the crash occurs. For example, IIUC some hard drives lie to the OS about when something is on disk and treat the disk’s cache as “on disk”.

stale commented :

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

networkimprov commented :

Can someone submit a patch to Go adding O_SYNC support in Windows syscall.Open()?

bonedaddy commented :

Ran into this issue using badger-v2. Originally this looked like an issue with ipfs/go-datastore, because it didn’t cause problems when using ipfs/go-datastore@v0.3.1, and tests were completing in roughly 128 seconds.

However, upgrading to ipfs/go-datastore@v0.4.4 caused this issue to rear its head, and the datastore test suite wouldn’t ever complete. The longest I had it running was 30 minutes, and the tests were still running.

Setting sync writes to false resolved the issue, and the test suite completed in 6 seconds. This seems like a pretty serious issue, as async writes increase the risk of data loss, and disabling sync writes doesn’t seem like a solid solution for all situations.

Version Information:

Badger-v2: v2.0.3
go-datastore: v0.4.4
go: go version go1.14.2 linux/amd64
OS:

NAME="Pop!_OS"
VERSION="19.10"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Pop!_OS 19.10"
VERSION_ID="19.10"
HOME_URL="https://system76.com/pop"
SUPPORT_URL="http://support.system76.com"
BUG_REPORT_URL="https://github.com/pop-os/pop/issues"
PRIVACY_POLICY_URL="https://system76.com/privacy"
VERSION_CODENAME=eoan
UBUNTU_CODENAME=eoan
LOGO=distributor-logo-pop-os

CPU: i7-9750H
Memory: 32GB DDR4
Disk (test was run from this disk): 256GB NVMe SSD

tsellers-r7 commented :

@bonedaddy - As far as I understand it the bug here isn’t that Linux is slow, it’s that Windows is fast only because Go doesn’t actually open the files as sync and so it’s using async everywhere.

jarifibrahim commented :

Hey @bonedaddy, the root cause of the issue is golang/go issue #35358, "os: O_SYNC not utilized in os.OpenFile() on Windows".

bonedaddy commented :

@tsellers-r7 good to know, I’m very unfamiliar with Windows so that clears things up.

@jarifibrahim thanks for the link, good to know the issue exists in the Go issue tracker.