Unexpected fault address error on one of the pods

I have 3 pods running Dgraph Alpha, each with its own 100 GB disk. Two of them are running fine and hold around 5-10 GB of data, but one pod is stuck in CrashLoopBackOff and fails to start. Any help would be greatly appreciated.

Here are the crash logs:

++ awk '{gsub(/\.$/,""); print $0}'
+ dgraph alpha --my=production-dgraph-alpha-1.production-dgraph-alpha-headless.default.svc.cluster.local:7080 --zero production-dgraph-zero-0.production-dgraph-zero-headless.default.svc.cluster.local:5080,production-dgraph-zero-1.production-dgraph-zero-headless.default.svc.cluster.local:5080,production-dgraph-zero-2.production-dgraph-zero-headless.default.svc.cluster.local:5080
[Decoder]: Using assembly version of decoder
Page Size: 4096
[Sentry] 2021/04/13 04:20:27 Integration installed: ContextifyFrames
[Sentry] 2021/04/13 04:20:27 Integration installed: Environment
[Sentry] 2021/04/13 04:20:27 Integration installed: Modules
[Sentry] 2021/04/13 04:20:27 Integration installed: IgnoreErrors
[Decoder]: Using assembly version of decoder
Page Size: 4096
[Sentry] 2021/04/13 04:20:28 Integration installed: ContextifyFrames
[Sentry] 2021/04/13 04:20:28 Integration installed: Environment
[Sentry] 2021/04/13 04:20:28 Integration installed: Modules
[Sentry] 2021/04/13 04:20:28 Integration installed: IgnoreErrors
I0413 04:20:28.222272      18 sentry_integration.go:48] This instance of Dgraph will send anonymous reports of panics back to Dgraph Labs via Sentry. No confidential information is sent. These reports help improve Dgraph. To opt-out, restart your instance with the --enable_sentry=false flag. For more info, see https://dgraph.io/docs/howto/#data-handling.
W0413 04:20:28.223089      18 run.go:573] --lru_mb is deprecated, use --cache_mb instead
I0413 04:20:28.403759      18 init.go:107]

Dgraph version   : v20.11.1
Dgraph codename  : tchalla-1
Dgraph SHA-256   : cefdcc880c0607a92a1d8d3ba0beb015459ebe216e79fdad613eb0d00d09f134
Commit SHA-1     : 7153d13fe
Commit timestamp : 2021-01-28 15:59:35 +0530
Branch           : HEAD
Go version       : go1.15.5
jemalloc enabled : true

For Dgraph official documentation, visit https://dgraph.io/docs/.
For discussions about Dgraph     , visit https://discuss.dgraph.io.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2020 Dgraph Labs, Inc.


I0413 04:20:28.403793      18 run.go:696] x.Config: {PortOffset:0 QueryEdgeLimit:1000000 NormalizeNodeLimit:10000 MutationsNQuadLimit:1000000 PollInterval:1s GraphqlExtension:true GraphqlDebug:false GraphqlLambdaUrl:}
I0413 04:20:28.403851      18 run.go:697] x.WorkerConfig: {TmpDir:t ExportPath:export NumPendingProposals:256 Tracing:0.01 MyAddr:production-dgraph-alpha-1.production-dgraph-alpha-headless.default.svc.cluster.local:7080 ZeroAddr:[production-dgraph-zero-0.production-dgraph-zero-headless.default.svc.cluster.local:5080 production-dgraph-zero-1.production-dgraph-zero-headless.default.svc.cluster.local:5080 production-dgraph-zero-2.production-dgraph-zero-headless.default.svc.cluster.local:5080] TLSClientConfig:<nil> TLSServerConfig:<nil> RaftId:0 WhiteListedIPRanges:[] MaxRetries:-1 StrictMutations:false AclEnabled:false AbortOlderThan:5m0s SnapshotAfter:10000 ProposedGroupId:0 StartTime:2021-04-13 04:20:27.867970504 +0000 UTC m=+0.017959390 LudicrousMode:false LudicrousConcurrency:2000 EncryptionKey:**** LogRequest:0 HardSync:false}
I0413 04:20:28.403907      18 run.go:698] worker.Config: {PostingDir:p PostingDirCompression:1 PostingDirCompressionLevel:0 WALDir:w MutationsMode:0 AuthToken: PBlockCacheSize:1395864371 PIndexCacheSize:751619276 WalCache:0 HmacSecret:**** AccessJwtTtl:0s RefreshJwtTtl:0s CachePercentage:0,65,35,0 CacheMb:0}
I0413 04:20:28.404151      18 log.go:295] Found file: 60 First Index: 1770001
I0413 04:20:28.404204      18 log.go:295] Found file: 61 First Index: 1800001
I0413 04:20:28.404298      18 storage.go:132] Init Raft Storage with snap: 1798300, first: 1798301, last: 1805564
I0413 04:20:28.404326      18 server_state.go:76] Setting Posting Dir Compression Level: 0
I0413 04:20:28.404338      18 server_state.go:120] Opening postings BadgerDB with options: {Dir:p ValueDir:p SyncWrites:false NumVersionsToKeep:2147483647 ReadOnly:false Logger:0x2e0ddf8 Compression:1 InMemory:false MemTableSize:67108864 BaseTableSize:2097152 BaseLevelSize:10485760 LevelSizeMultiplier:10 TableSizeMultiplier:2 MaxLevels:7 ValueThreshold:1024 NumMemtables:5 BlockSize:4096 BloomFalsePositive:0.01 BlockCacheSize:1395864371 IndexCacheSize:751619276 NumLevelZeroTables:5 NumLevelZeroTablesStall:15 ValueLogFileSize:1073741823 ValueLogMaxEntries:1000000 NumCompactors:4 CompactL0OnClose:false ZSTDCompressionLevel:0 VerifyValueChecksum:false EncryptionKey:[] EncryptionKeyRotationDuration:240h0m0s BypassLockGuard:false ChecksumVerificationMode:0 DetectConflicts:false managedTxns:false maxBatchCount:0 maxBatchSize:0}
unexpected fault address 0x7f020e493000
[Sentry] 2021/04/13 04:20:29 Sending fatal event [daf6feff71d8484abc452f463bd9b930] to o318308.ingest.sentry.io project: 1805390
fatal error: fault
[signal SIGBUS: bus error code=0x2 addr=0x7f020e493000 pc=0xa9b431]

goroutine 1 [running]:
runtime.throw(0x1d77e26, 0x5)
        /usr/local/go/src/runtime/panic.go:1116 +0x72 fp=0xc0002181b8 sp=0xc000218188 pc=0xa60af2
runtime.sigpanic()
        /usr/local/go/src/runtime/signal_unix.go:739 +0x485 fp=0xc0002181e8 sp=0xc0002181b8 pc=0xa77665
runtime.memmove(0x7f020e493000, 0xc00c7d0020, 0x14)
        /usr/local/go/src/runtime/memmove_amd64.s:183 +0x151 fp=0xc0002181f0 sp=0xc0002181e8 pc=0xa9b431
github.com/dgraph-io/badger/v3.(*logFile).bootstrap(0xc0004f8680, 0xc0001229c0, 0xc0001229c0)
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/memtable.go:628 +0x1b3 fp=0xc000218260 sp=0xc0002181f0 pc=0x123afb3
github.com/dgraph-io/badger/v3.(*logFile).open(0xc0004f8680, 0xc00c7cc050, 0xb, 0x42, 0x8000000, 0xa00000, 0xa)
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/memtable.go:560 +0x4ae fp=0xc000218328 sp=0xc000218260 pc=0x123ad4e
github.com/dgraph-io/badger/v3.(*DB).openMemTable(0xc000065800, 0x67, 0x42, 0xc0004cde10, 0xc0006680b0, 0x1f76a80)
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/memtable.go:130 +0x274 fp=0xc000218510 sp=0xc000218328 pc=0x1237a14
github.com/dgraph-io/badger/v3.(*DB).newMemTable(0xc000065800, 0x1d752a7, 0x1, 0x1d752a7)
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/memtable.go:153 +0x55 fp=0xc0002185a0 sp=0xc000218510 pc=0x1237e55
github.com/dgraph-io/badger/v3.Open(0x1d752a7, 0x1, 0x1d752a7, 0x1, 0x0, 0x7fffffff, 0x0, 0x1f94b80, 0x2e0ddf8, 0x1, ...)
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/db.go:304 +0x1185 fp=0xc000218c20 sp=0xc0002185a0 pc=0x1213385
github.com/dgraph-io/badger/v3.OpenManaged(...)
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/managed_db.go:26
github.com/dgraph-io/dgraph/worker.(*ServerState).initStorage(0x2bb6240)
        /ext-go/1/src/github.com/dgraph-io/dgraph/worker/server_state.go:123 +0x4f8 fp=0xc000219760 sp=0xc000218c20 pc=0x175dc38
github.com/dgraph-io/dgraph/worker.InitServerState()
        /ext-go/1/src/github.com/dgraph-io/dgraph/worker/server_state.go:54 +0xa5 fp=0xc0002197d0 sp=0xc000219760 pc=0x175d365
github.com/dgraph-io/dgraph/dgraph/cmd/alpha.run()
        /ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/cmd/alpha/run.go:700 +0x108a fp=0xc000219c38 sp=0xc0002197d0 pc=0x18b2e6a
github.com/dgraph-io/dgraph/dgraph/cmd/alpha.init.2.func1(0xc0004bd900, 0xc000667500, 0x0, 0x3)
        /ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/cmd/alpha/run.go:94 +0x65 fp=0xc000219c78 sp=0xc000219c38 pc=0x18b43e5
github.com/spf13/cobra.(*Command).execute(0xc0004bd900, 0xc000667470, 0x3, 0x3, 0xc0004bd900, 0xc000667470)
        /go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:830 +0x2c2 fp=0xc000219d50 sp=0xc000219c78 pc=0x12b2042
github.com/spf13/cobra.(*Command).ExecuteC(0x29b3000, 0x0, 0x0, 0x0)
        /go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:914 +0x30b fp=0xc000219e28 sp=0xc000219d50 pc=0x12b2c6b
github.com/spf13/cobra.(*Command).Execute(...)
        /go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:864
github.com/dgraph-io/dgraph/dgraph/cmd.Execute()
        /ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/cmd/root.go:72 +0x85 fp=0xc000219e68 sp=0xc000219e28 pc=0x194bf65
main.main()
        /ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/main.go:102 +0x171 fp=0xc000219f88 sp=0xc000219e68 pc=0x194d5b1
runtime.main()
        /usr/local/go/src/runtime/proc.go:204 +0x209 fp=0xc000219fe0 sp=0xc000219f88 pc=0xa63509
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc000219fe8 sp=0xc000219fe0 pc=0xa9a4e1

goroutine 19 [chan receive]:
github.com/golang/glog.(*loggingT).flushDaemon(0x2bb7780)
        /go/pkg/mod/github.com/golang/glog@v0.0.0-20160126235308-23def4e6c14b/glog.go:882 +0x8b
created by github.com/golang/glog.init.0
        /go/pkg/mod/github.com/golang/glog@v0.0.0-20160126235308-23def4e6c14b/glog.go:410 +0x274

goroutine 21 [select]:
go.opencensus.io/stats/view.(*worker).start(0xc00018a380)
        /go/pkg/mod/go.opencensus.io@v0.22.5/stats/view/worker.go:276 +0x105
created by go.opencensus.io/stats/view.init.0
        /go/pkg/mod/go.opencensus.io@v0.22.5/stats/view/worker.go:34 +0x68

goroutine 22 [chan receive]:
github.com/dgraph-io/dgraph/x.init.0.func1(0x1f948c0, 0xc000130030)
        /ext-go/1/src/github.com/dgraph-io/dgraph/x/metrics.go:291 +0xe5
created by github.com/dgraph-io/dgraph/x.init.0
        /ext-go/1/src/github.com/dgraph-io/dgraph/x/metrics.go:287 +0x93

goroutine 9 [chan receive]:
main.main.func2(0xc00003f590, 0x1e09538)
        /ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/main.go:60 +0xdc
created by main.main
        /ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/main.go:50 +0x16c

goroutine 157 [chan receive]:
github.com/getsentry/sentry-go.(*HTTPTransport).worker(0xc0006d8300)
        /go/pkg/mod/github.com/getsentry/sentry-go@v0.6.0/transport.go:303 +0x7d
created by github.com/getsentry/sentry-go.(*HTTPTransport).Configure.func1
        /go/pkg/mod/github.com/getsentry/sentry-go@v0.6.0/transport.go:174 +0x3e

goroutine 163 [select]:
github.com/dgraph-io/badger/v3/y.(*WaterMark).process(0xc0004d7a40, 0xc0004d7a10)
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/y/watermark.go:214 +0x2d4
created by github.com/dgraph-io/badger/v3/y.(*WaterMark).Init
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/y/watermark.go:72 +0x76

goroutine 164 [select]:
github.com/dgraph-io/badger/v3/y.(*WaterMark).process(0xc0004d7a70, 0xc0004d7a10)
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/y/watermark.go:214 +0x2d4
created by github.com/dgraph-io/badger/v3/y.(*WaterMark).Init
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/y/watermark.go:72 +0x76

goroutine 165 [select]:
github.com/dgraph-io/ristretto/z.(*AllocatorPool).freeupAllocators(0xc0000f3660)
        /go/pkg/mod/github.com/dgraph-io/ristretto@v0.0.4-0.20210122082011-bb5d392ed82d/z/allocator.go:383 +0x15b
created by github.com/dgraph-io/ristretto/z.NewAllocatorPool
        /go/pkg/mod/github.com/dgraph-io/ristretto@v0.0.4-0.20210122082011-bb5d392ed82d/z/allocator.go:323 +0xae

goroutine 166 [select]:
github.com/dgraph-io/ristretto.(*defaultPolicy).processItems(0xc0000b4040)
        /go/pkg/mod/github.com/dgraph-io/ristretto@v0.0.4-0.20210122082011-bb5d392ed82d/policy.go:102 +0xc5
created by github.com/dgraph-io/ristretto.newDefaultPolicy
        /go/pkg/mod/github.com/dgraph-io/ristretto@v0.0.4-0.20210122082011-bb5d392ed82d/policy.go:86 +0x129

goroutine 167 [select]:
github.com/dgraph-io/ristretto.(*Cache).processItems(0xc00009a080)
        /go/pkg/mod/github.com/dgraph-io/ristretto@v0.0.4-0.20210122082011-bb5d392ed82d/cache.go:418 +0x191
created by github.com/dgraph-io/ristretto.NewCache
        /go/pkg/mod/github.com/dgraph-io/ristretto@v0.0.4-0.20210122082011-bb5d392ed82d/cache.go:204 +0x308

goroutine 168 [select]:
github.com/dgraph-io/ristretto.(*defaultPolicy).processItems(0xc0000b40c0)
        /go/pkg/mod/github.com/dgraph-io/ristretto@v0.0.4-0.20210122082011-bb5d392ed82d/policy.go:102 +0xc5
created by github.com/dgraph-io/ristretto.newDefaultPolicy
        /go/pkg/mod/github.com/dgraph-io/ristretto@v0.0.4-0.20210122082011-bb5d392ed82d/policy.go:86 +0x129

goroutine 169 [select]:
github.com/dgraph-io/ristretto.(*Cache).processItems(0xc00009a100)
        /go/pkg/mod/github.com/dgraph-io/ristretto@v0.0.4-0.20210122082011-bb5d392ed82d/cache.go:418 +0x191
created by github.com/dgraph-io/ristretto.NewCache
        /go/pkg/mod/github.com/dgraph-io/ristretto@v0.0.4-0.20210122082011-bb5d392ed82d/cache.go:204 +0x308

goroutine 170 [select]:
github.com/dgraph-io/badger/v3.(*DB).monitorCache(0xc000065800, 0xc00001d5f0)
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/db.go:409 +0x19c
created by github.com/dgraph-io/badger/v3.Open
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/db.go:277 +0x885

goroutine 171 [select]:
github.com/dgraph-io/badger/v3.(*DB).updateSize(0xc000065800, 0xc00001d6e0)
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/db.go:1114 +0x158
created by github.com/dgraph-io/badger/v3.Open
        /go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2011.1/db.go:297 +0xa35
W0413 04:20:29.234595       1 sentry_integration.go:140] unable to read CID from file /tmp/dgraph-alpha-cid-sentry open /tmp/dgraph-alpha-cid-sentry: no such file or directory. Skip
[Sentry] 2021/04/13 04:20:29 Buffer flushed successfully.

@ibrahim can we help out quickly here?

Hey @Siddhant, can you check whether your disk is out of space? SIGBUS usually happens when you are out of disk space.
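To make the disk check concrete, here is a minimal Go sketch that reports how much space is left on the filesystem holding a given directory. It is only an illustration, assuming a Linux pod where `syscall.Statfs` is available; the helper names `diskUsage` and `percentFree` are made up for this example and are not part of Dgraph or Badger. You would point it at the volume that holds the `p` and `w` directories.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// diskUsage returns the total and available bytes of the filesystem
// that contains path. Linux-only: it relies on syscall.Statfs.
func diskUsage(path string) (total, avail uint64, err error) {
	var st syscall.Statfs_t
	if err = syscall.Statfs(path, &st); err != nil {
		return 0, 0, err
	}
	bsize := uint64(st.Bsize)
	// Bavail counts blocks available to unprivileged processes.
	return st.Blocks * bsize, st.Bavail * bsize, nil
}

// percentFree converts the two byte counts into a percentage.
func percentFree(total, avail uint64) float64 {
	if total == 0 {
		return 0
	}
	return float64(avail) / float64(total) * 100
}

func main() {
	dir := "." // hypothetical default; pass the data directory as the first argument
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}
	total, avail, err := diskUsage(dir)
	if err != nil {
		fmt.Fprintln(os.Stderr, "statfs failed:", err)
		os.Exit(1)
	}
	fmt.Printf("%s: %d of %d bytes available (%.1f%% free)\n",
		dir, avail, total, percentFree(total, avail))
}
```

Running `df -h` inside the pod gives the same information; the sketch is mainly useful if you want to log or alert on the number from a program.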

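If you look at badger/memtable.go at commit d918b9904b2ac71316a75e04fc54427a3bbd06bf in dgraph-io/badger on GitHub, we are copying the data to an mmapped file. That matches your stack trace, where the fault is raised in runtime.memmove called from (*logFile).bootstrap. So disk space seems to be the cause of the SIGBUS.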
