SIGILL: illegal instruction on older CPU

Moved from GitHub dgraph/5427

Posted by miko:

What version of Dgraph are you using?

v20.03.1

Have you tried reproducing the issue with the latest release?

Yes. Actually, there is no such bug on the older version (I was running v1.2.0).

What is the hardware spec (RAM, OS)?

Hardware:
Vendor: Sun Microsystems
Version: S39_3B25
Release Date: 11/15/2007
System Information
Manufacturer: Sun Microsystems
Product Name: Sun Fire X2200 M2
Version: Rev 50

CPU:
vendor_id : AuthenticAMD
cpu family : 15
model : 65
model name : Dual-Core AMD Opteron™ Processor 2218

Steps to reproduce the issue (command/config used to run Dgraph).

Just load some data and query for it (I was using GraphQL, but I don't think that is relevant).

Expected behaviour and actual result.

The actual result is that the server (alpha) crashes with this error:

14:04:38.013454      13 admin.go:540] Successfully loaded GraphQL schema.  Serving GraphQL API.
SIGILL: illegal instruction
PC=0x1242db5 m=10 sigcode=2

goroutine 320 [running]:
github.com/dgryski/go-groupvarint.Decode4(0xc023998a10, 0x4, 0x4, 0xc023d38440, 0x11, 0x20)
	/go/pkg/mod/github.com/dgryski/go-groupvarint@v0.0.0-20190318181831-5ce5df8ca4e1/decode_amd64.s:16 +0x25 fp=0xc023998988 sp=0xc023998980 pc=0x1242db5
github.com/dgraph-io/dgraph/codec.(*Decoder).UnpackBlock(0xc023d3d680, 0xc0002bf180, 0x7f1f8345b7d0, 0x0)
	/ext-go/1/src/github.com/dgraph-io/dgraph/codec/codec.go:157 +0x20d fp=0xc023998a58 sp=0xc023998988 pc=0x12435ad
github.com/dgraph-io/dgraph/codec.(*Decoder).Seek(0xc023d3d680, 0x0, 0x1, 0xc023d1c420, 0xc023998b10, 0x0)
	/ext-go/1/src/github.com/dgraph-io/dgraph/codec/codec.go:195 +0x2c2 fp=0xc023998ad8 sp=0xc023998a58 pc=0x1243b62
github.com/dgraph-io/dgraph/posting.(*pIterator).init(0xc023998c98, 0xc023d21320, 0x0, 0x0, 0x0, 0x0)
	/ext-go/1/src/github.com/dgraph-io/dgraph/posting/list.go:144 +0x1a1 fp=0xc023998ba8 sp=0xc023998ad8 pc=0x1337021
github.com/dgraph-io/dgraph/posting.(*List).iterate(0xc023d21320, 0x20, 0x0, 0xc023998da8, 0x8, 0x8)
	/ext-go/1/src/github.com/dgraph-io/dgraph/posting/list.go:645 +0x111 fp=0xc023998d10 sp=0xc023998ba8 pc=0x1339cb1
github.com/dgraph-io/dgraph/posting.(*List).Uids(0xc023d21320, 0x20, 0x0, 0x0, 0x194f78e, 0x11, 0x0)
	/ext-go/1/src/github.com/dgraph-io/dgraph/posting/list.go:1004 +0x224 fp=0xc023998de0 sp=0xc023998d10 pc=0x133bca4
github.com/dgraph-io/dgraph/worker.(*queryState).handleUidPostings.func1(0x0, 0x1, 0x0, 0x98b566)
	/ext-go/1/src/github.com/dgraph-io/dgraph/worker/task.go:794 +0x11c4 fp=0xc023998f80 sp=0xc023998de0 pc=0x1411534
github.com/dgraph-io/dgraph/worker.(*queryState).handleUidPostings.func2(0xc023d212c0, 0xc00053d540, 0x0, 0x1)
	/ext-go/1/src/github.com/dgraph-io/dgraph/worker/task.go:811 +0x3a fp=0xc023998fc0 sp=0xc023998f80 pc=0x14119da
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc023998fc8 sp=0xc023998fc0 pc=0x9b4de1
created by github.com/dgraph-io/dgraph/worker.(*queryState).handleUidPostings
	/ext-go/1/src/github.com/dgraph-io/dgraph/worker/task.go:810 +0x3b9

I imagine that the bug is this single line:


running on a CPU not supporting SSE3.
While I understand that you may not support non-SSE3 CPUs, is there any way (such as compilation options) to still run the Dgraph server on this machine? v1.2.0 still runs OK on it.
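
To see which instruction-set extensions a CPU advertises, one option on Linux is to read the flags line in /proc/cpuinfo. The sketch below is only an illustration: has_cpu_flag is a hypothetical helper, not part of Dgraph, and it assumes the Linux flag names, where SSE3 is listed as pni and SSSE3 as ssse3. The thread does not state which extension the faulting instruction needs, so both flags are checked.

```shell
#!/bin/sh
# Hypothetical helper: succeed if CPU flag "$1" is listed in the cpuinfo
# file "$2" (defaults to /proc/cpuinfo, which exists on Linux only).
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # -w matches the flag as a whole word, so "sse" does not match "sse2".
    grep '^flags' "$file" 2>/dev/null | grep -qw -- "$flag"
}

# Linux names: pni = SSE3, ssse3 = SSSE3.
for f in pni ssse3; do
    if has_cpu_flag "$f"; then
        echo "$f: supported"
    else
        echo "$f: not listed"
    fi
done
```

A flag reported as "not listed" means a binary that uses those instructions unconditionally can be expected to fail with SIGILL on that machine.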

danielmai commented:

Hi @miko. Can you try with a Dgraph built without the assembly code by setting the build flag noasm? That would create a binary with the Go-based code from go-groupvarint instead of the assembly code.

If you clone this repo, you can build from master or from a particular git tag (e.g., v20.03.1) with make BUILD_TAGS="noasm" install:

git clone https://github.com/dgraph-io/dgraph
cd ./dgraph
git checkout v20.03.1
make BUILD_TAGS="noasm" install # installs dgraph binary in $GOPATH/bin

If the binary is built with the assembly code, then the log message [Decoder]: Using assembly version of decoder will be printed. Otherwise, it won’t print out anything about the decoder.
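
One way to check which decoder a given alpha picked is to search its log for that message. This is a minimal sketch that assumes the log has already been saved to a file; uses_asm_decoder and the alpha.log path are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if log file "$1" contains the message that
# is printed when the assembly decoder is in use.
uses_asm_decoder() {
    # -F treats the pattern as a fixed string, so "[" and "]" are literal.
    grep -qF '[Decoder]: Using assembly version of decoder' "$1" 2>/dev/null
}

# alpha.log is an assumed path; point it at wherever the alpha log is kept.
if uses_asm_decoder "${1:-alpha.log}"; then
    echo "assembly decoder in use"
else
    echo "assembly decoder message not found"
fi
```

When running in Docker, the log can be captured first by redirecting the output of docker logs for the alpha container to a file.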

miko commented:

Actually, I recompiled Dgraph on this old server (with “make image”), and it works OK, even though it prints [Decoder]: Using assembly version of decoder. I checked both v20.03.1 and master.
So it works if recompiled locally. Thanks for the tip.
(BTW: I always use Docker images.)
