Test suite failure on 32 bit architectures, like armhf or x86

Moved from GitHub badger/1384

Posted by slyon:

What version of Go are you using (go version)?

$ go version
go version go1.14.3 linux/armhf # on Ubuntu
go version go1.14.2 linux/i386 # on Debian

What operating system are you using?

Debian/Ubuntu Linux (32 bit)

What version of Badger are you using?

2.0.3

Does this issue reproduce with the latest master?

(Not yet tested)

Steps to Reproduce the issue

On an Ubuntu Groovy armhf machine (e.g. a Raspberry Pi), download and install http://cdimage.ubuntu.com/ubuntu-server/daily-preinstalled/current/groovy-preinstalled-server-armhf+raspi.img.xz (see https://wiki.ubuntu.com/ARM/RaspberryPi).
Then run the following commands to build the badger 2.0.3 source:

ssh ubuntu@ubuntu #password: ubuntu
sudo apt install golang-go git devscripts
sudo apt install dh-golang golang-any golang-github-cespare-xxhash-dev golang-github-datadog-zstd-dev golang-github-dgraph-io-ristretto-dev golang-github-dgryski-go-farm-dev golang-github-dustin-go-humanize-dev golang-github-pkg-errors-dev golang-github-spf13-cobra-dev golang-github-stretchr-testify-dev golang-golang-x-net-dev golang-golang-x-sys-dev golang-goprotobuf-dev golang-snappy-go-dev
git clone https://salsa.debian.org/go-team/packages/badger.git
wget http://deb.debian.org/debian/pool/main/b/badger/badger_2.0.3.orig.tar.gz
cd badger
debuild

What Badger options were set?

Default test suite options.

What did you do?

Running the test suite while building the Debian/Ubuntu package:

go test -vet=off -v -p 4 github.com/dgraph-io/badger/badger github.com/dgraph-io/badger/badger/cmd github.com/dgraph-io/badger/integration/testgc github.com/dgraph-io/badger/options github.com/dgraph-io/badger/pb github.com/dgraph-io/badger/skl github.com/dgraph-io/badger/table github.com/dgraph-io/badger/trie github.com/dgraph-io/badger/y

What did you expect to see?

A successful run of the badger test suite and thus a successful build of the Debian/Ubuntu package, similar to this Ubuntu Groovy arm64 (64 bit) test & build:
https://launchpadlibrarian.net/477251471/buildlog_ubuntu-groovy-arm64.badger_2.0.3-1_BUILDING.txt.gz

What did you see instead?

A failure of the badger test suite and thus a failed build of the 32 bit Debian/Ubuntu packages. The detailed logs for an Ubuntu Groovy armhf (32 bit) test run can be found here: https://launchpadlibrarian.net/482048892/buildlog_ubuntu-groovy-armhf.badger_2.0.3-1_BUILDING.txt.gz

Looking at the Debian build logs, we can find very similar issues for all 32 bit architectures (e.g. i386, armhf, armel, mipsel, …):
https://buildd.debian.org/status/package.php?p=badger&suite=sid – while it is working on the 64 bit architectures for Debian and Ubuntu.

jarifibrahim commented :

Hey @slyon, badger tests do fail on 32 bit machines. We tried to run the badger tests on a 32 bit machine on Travis in pull request #1216 (Fix int overflow for 32bit, by vardhanapoorv, dgraph-io/badger), but it looks like Travis is using a 64 bit machine.

We’re looking for ways to run 32 bit tests on Travis CI. If you have any suggestions, that would be great. Fixing the tests for 32 bit is easy; the difficult part is ensuring we don’t break the build again.

slyon commented :

Hey @jarifibrahim, is this “easy fix” already available somewhere? Or what would I need to do in order to fix the 32 bit tests for Debian & Ubuntu? It seems like the code changes from pull request #1216 (Fix int overflow for 32bit, by vardhanapoorv, dgraph-io/badger) are already part of the 2.0.3 release, which is failing to build here…

TravisCI seems to only support 64 bit machines, indeed. But you might be able to run a 32 bit image or container on a 64 bit host machine, e.g. 32 bit Ubuntu Groovy armhf on travis arm64 linux machine or 32 bit Debian testing i386 on travis amd64 linux machine. I never did that myself, though…
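As a lighter-weight variant of the container idea, the Go toolchain can target a 32 bit architecture directly by setting GOARCH. The sketch below only builds the command and prints it; the helper name crossTestCmd is hypothetical and not part of badger. Whether the resulting test binaries actually run depends on assumptions not confirmed in this thread: the host kernel must be able to execute 32 bit binaries, and any cgo dependencies need a matching 32 bit C toolchain.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// crossTestCmd (hypothetical helper) builds, but does not run, a
// `go test` invocation with GOARCH set to the requested architecture.
func crossTestCmd(arch string, pkgs ...string) *exec.Cmd {
	args := append([]string{"test"}, pkgs...)
	cmd := exec.Command("go", args...)
	// os/exec uses the last value for duplicate keys, so appending
	// overrides any GOARCH already present in the environment.
	cmd.Env = append(os.Environ(), "GOARCH="+arch)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	cmd := crossTestCmd("386", "./...")
	// Print the command instead of running it; call cmd.Run() to execute.
	fmt.Println("GOARCH=386", strings.Join(cmd.Args, " "))
}
```

On a host that cannot execute 32 bit binaries, a 32 bit container or chroot as described above remains the alternative.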

jarifibrahim commented :

Most of the crashes I see in the build log for badger (2.0.3-1) on i386 are “invalid filename” errors, which could be caused by an int overflowing on a 32 bit system. I also see a “cannot allocate memory” error, which could be because the CI system doesn’t have enough memory.

We don’t have a fix for the failing tests yet.

I see delve runs 32 bit tests in a container, and badger could do the same; see delve/.travis.yml at master in chainhelen/delve on GitHub.

slyon commented :

> Most of the crashes I see in the build log for badger (2.0.3-1) on i386 are “invalid filename” errors, which could be caused by an int overflowing on a 32 bit system. I also see a “cannot allocate memory” error, which could be because the CI system doesn’t have enough memory.

That’s what I thought at first… But as it is failing on all 32 bit architectures across different projects, I have the impression that maybe the 2GB memory limit of 32 bit arches might be exceeded during the test, which would fail with a similar error. Especially as the Debian & Ubuntu autopkgtest-cloud uses very similar VMs for the different architectures, so if there is enough memory to run it on arm64, there should also be enough memory to run it on armhf.

> We don’t have a fix for the failing tests yet.

Okay, let’s see if I can find some time to dig into it.

> I see delve runs 32 bit tests in a container, and badger could do the same; see delve/.travis.yml at master in chainhelen/delve on GitHub.

Yes! It would be great if 32 bit compatibility could be checked regularly. :slight_smile:

jarifibrahim commented :

> Yes! It would be great if 32 bit compatibility could be checked regularly. :slight_smile:

I’ve created issue #1385 (Support 32 bit builds on travis, dgraph-io/badger) for it.

> That’s what I thought at first… But as it is failing on all 32 bit architectures across different projects, I have the impression that maybe the 2GB memory limit of 32 bit arches might be exceeded during the test, which would fail with a similar error. Especially as the Debian & Ubuntu autopkgtest-cloud uses very similar VMs for the different architectures, so if there is enough memory to run it on arm64, there should also be enough memory to run it on armhf.

Some tests in badger might be running in purely in-memory mode (for instance the write batch in-memory tests), and they could take up more than 2 GB.
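One possible way to keep such memory-hungry tests from failing on 32 bit builds is to guard them with a platform check. The sketch below is an assumption about how that could look, not something badger does: the helper exceedsAddressSpace and the 3 GB working set are hypothetical, and the 2 GB limit is simply the figure quoted in this thread (the real limit depends on the OS and kernel configuration).

```go
package main

import (
	"fmt"
	"strconv"
)

// addressSpaceLimit32 is the 2 GB figure mentioned in the thread.
// Assumption: the real per-process limit varies by OS and kernel.
const addressSpaceLimit32 uint64 = 2 << 30

// exceedsAddressSpace (hypothetical helper) reports whether a test that
// needs needBytes of memory cannot fit on a platform whose native int
// is intSize bits wide.
func exceedsAddressSpace(needBytes uint64, intSize int) bool {
	return intSize < 64 && needBytes >= addressSpaceLimit32
}

func main() {
	need := uint64(3) << 30 // hypothetical 3 GB working set

	if exceedsAddressSpace(need, strconv.IntSize) {
		fmt.Println("skip: test needs more memory than a 32 bit process can address")
		return
	}
	fmt.Println("run: enough address space on this platform")
}
```

Inside a real test function, the same check could call t.Skip with an explanatory message instead of printing.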