What version of Dgraph are you using?
dgraph version 22.0.1
Command used
curl -H 'content-type: application/graphql' -XPOST localhost:8080/admin -d 'mutation { restore(input:{ location: "s3://s3.us-east-1.amazonaws.com/systemdb-prod/v22.0.1/nightly/"}){ message code } }'
Command response:
{"data":{"restore":{"message":"Restore operation started.","code":"Success"}},"extensions":{"tracing":{"version":1,"startTime":"2022-12-03T03:44:11.755148468Z","endTime":"2022-12-03T03:44:12.127584021Z","duration":372435569}}}--- Online restore complete.
Setup:
6 node cluster (3xA, 3xZ) that has access to an s3 bucket.
The online restore process is started, then the alpha fails at the end of the process with the following output below. Lastly, the alphas get stuck in a crash loop.
I1203 03:24:05.048612 21 run.go:754] x.Config: {PortOffset:0 Limit:normalize-node=10000; query-timeout=0ms; mutations-nquad=1000000; query-edge=1000000; max-pending-queries=10000; disallow-drop=false; txn-abort-after=5m; mutations=allow; max-retries=-1 LimitMutationsNquad:1000000 LimitQueryEdge:1000000 BlockClusterWideDrop:false LimitNormalizeNode:10000 QueryTimeout:0s MaxRetries:-1 GraphQL:extensions=true; poll-interval=1s; lambda-url=; introspection=true; debug=false GraphQLDebug:false}
I1203 03:24:05.048659 21 run.go:755] x.WorkerConfig: {TmpDir:t ExportPath:export Trace:ratio=0.01; jaeger=; datadog= MyAddr:systemdb-alpha-0.systemdb-alpha-headless.systemdb-prod.svc.cluster.local:7080 ZeroAddr:[systemdb-zero-0.systemdb-zero-headless.systemdb-prod.svc.cluster.local:5080 systemdb-zero-1.systemdb-zero-headless.systemdb-prod.svc.cluster.local:5080 systemdb-zero-2.systemdb-zero-headless.systemdb-prod.svc.cluster.local:5080] TLSClientConfig:<nil> TLSServerConfig:<nil> Raft:snapshot-after-duration=30m; pending-proposals=256; idx=; group=; learner=false; snapshot-after-entries=10000 Badger:{Dir: ValueDir: SyncWrites:false NumVersionsToKeep:1 ReadOnly:false Logger:0xc000416820 Compression:1 InMemory:false MetricsEnabled:true NumGoroutines:8 MemTableSize:67108864 BaseTableSize:2097152 BaseLevelSize:10485760 LevelSizeMultiplier:10 TableSizeMultiplier:2 MaxLevels:7 VLogPercentile:0 ValueThreshold:1048576 NumMemtables:5 BlockSize:4096 BloomFalsePositive:0.01 BlockCacheSize:1395864371 IndexCacheSize:751619276 NumLevelZeroTables:5 NumLevelZeroTablesStall:15 ValueLogFileSize:1073741823 ValueLogMaxEntries:1000000 NumCompactors:4 CompactL0OnClose:false LmaxCompaction:false ZSTDCompressionLevel:0 VerifyValueChecksum:false EncryptionKey:[] EncryptionKeyRotationDuration:240h0m0s BypassLockGuard:false ChecksumVerificationMode:0 DetectConflicts:true NamespaceOffset:-1 managedTxns:false maxBatchCount:0 maxBatchSize:0 maxValueThreshold:0} WhiteListedIPRanges:[{Lower:0.0.0.0 Upper:255.255.255.255}] StrictMutations:false AclEnabled:false HmacSecret:**** AbortOlderThan:5m0s ProposedGroupId:0 StartTime:2022-12-03 03:24:04.469775855 +0000 UTC m=+0.206174361 Ludicrous:concurrency=2000; enabled=false LudicrousEnabled:false Security:whitelist=0.0.0.0/0; token= EncryptionKey:**** LogRequest:0 HardSync:false Audit:false}
I1203 03:24:05.048735 21 run.go:756] worker.Config: {PostingDir:/dgraph/p1 WALDir:w MutationsMode:0 AuthToken: HmacSecret:**** AccessJwtTtl:0s RefreshJwtTtl:0s CachePercentage:0,65,35 CacheMb:2048 Audit:<nil> ChangeDataConf:file=; kafka=; sasl_user=; sasl_password=; ca_cert=; client_cert=; client_key=; sasl-mechanism=PLAIN;}
I1203 03:24:05.048887 21 log.go:295] Found file: 1 First Index: 1
I1203 03:24:05.048941 21 storage.go:125] Init Raft Storage with snap: 0, first: 1, last: 21
I1203 03:24:05.048955 21 server_state.go:140] Opening postings BadgerDB with options: {Dir:/dgraph/p1 ValueDir:/dgraph/p1 SyncWrites:false NumVersionsToKeep:2147483647 ReadOnly:false Logger:0x3111300 Compression:1 InMemory:false MetricsEnabled:true NumGoroutines:8 MemTableSize:67108864 BaseTableSize:2097152 BaseLevelSize:10485760 LevelSizeMultiplier:10 TableSizeMultiplier:2 MaxLevels:7 VLogPercentile:0 ValueThreshold:1048576 NumMemtables:5 BlockSize:4096 BloomFalsePositive:0.01 BlockCacheSize:1395864371 IndexCacheSize:751619276 NumLevelZeroTables:5 NumLevelZeroTablesStall:15 ValueLogFileSize:1073741823 ValueLogMaxEntries:1000000 NumCompactors:4 CompactL0OnClose:false LmaxCompaction:false ZSTDCompressionLevel:0 VerifyValueChecksum:false EncryptionKey:[] EncryptionKeyRotationDuration:240h0m0s BypassLockGuard:false ChecksumVerificationMode:0 DetectConflicts:false NamespaceOffset:1 managedTxns:false maxBatchCount:0 maxBatchSize:0 maxValueThreshold:0}
I1203 03:24:05.643442 21 log.go:34] All 1480 tables opened in 246ms
I1203 03:24:05.645824 21 log.go:34] Discard stats nextEmptySlot: 0
I1203 03:24:05.646750 21 log.go:34] Set nextTxnTs to 16994037
I1203 03:24:05.650440 21 log.go:34] Deleting empty file: /dgraph/p1/000004.vlog
I1203 03:24:05.651294 21 groups.go:100] Current Raft Id: 0x1
I1203 03:24:05.651311 21 groups.go:116] Sending member request to Zero: id:1 addr:"systemdb-alpha-0.systemdb-alpha-headless.systemdb-prod.svc.cluster.local:7080"
I1203 03:24:05.651603 21 worker.go:114] Worker listening at address: [::]:7080
I1203 03:24:05.652537 21 run.go:567] Bringing up GraphQL HTTP API at 0.0.0.0:8080/graphql
I1203 03:24:05.652554 21 run.go:568] Bringing up GraphQL HTTP admin API at 0.0.0.0:8080/admin
E1203 03:24:05.652553 21 groups.go:1224] Error during SubscribeForUpdates for prefix "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x15dgraph.graphql.schema\x00": Unable to find any servers for group: 1. closer err: <nil>
I1203 03:24:05.652564 21 run.go:595] gRPC server started. Listening on port 9080
I1203 03:24:05.652860 21 run.go:596] HTTP server started. Listening on port 8080
I1203 03:24:05.751814 21 pool.go:164] CONN: Connecting to systemdb-zero-0.systemdb-zero-headless.systemdb-prod.svc.cluster.local:5080
I1203 03:24:05.759907 21 groups.go:135] Connected to group zero. Assigned group: 0
I1203 03:24:05.759924 21 groups.go:137] Raft Id after connection to Zero: 0x1
I1203 03:24:05.759960 21 pool.go:164] CONN: Connecting to systemdb-alpha-2.systemdb-alpha-headless.systemdb-prod.svc.cluster.local:7080
I1203 03:24:05.759975 21 pool.go:164] CONN: Connecting to systemdb-alpha-1.systemdb-alpha-headless.systemdb-prod.svc.cluster.local:7080
I1203 03:24:05.760034 21 pool.go:164] CONN: Connecting to systemdb-zero-2.systemdb-zero-headless.systemdb-prod.svc.cluster.local:5080
I1203 03:24:05.760055 21 pool.go:164] CONN: Connecting to systemdb-zero-1.systemdb-zero-headless.systemdb-prod.svc.cluster.local:5080
I1203 03:24:05.760084 21 draft.go:270] Node ID: 0x1 with GroupID: 1
I1203 03:24:05.760095 21 draft.go:279] RaftContext: id:1 group:1 addr:"systemdb-alpha-0.systemdb-alpha-headless.systemdb-prod.svc.cluster.local:7080"
I1203 03:24:05.760194 21 node.go:153] Setting raft.Config to: &{ID:1 peers:[] learners:[] ElectionTick:20 HeartbeatTick:1 Storage:0xc0000ab300 Applied:0 MaxSizePerMsg:262144 MaxCommittedSizePerReady:67108864 MaxUncommittedEntriesSize:0 MaxInflightMsgs:256 CheckQuorum:false PreVote:true ReadOnlyOption:0 Logger:0x3111300 DisableProposalForwarding:false}
[Sentry] 2022/12/03 03:24:05 Sending fatal event [970f5b301eab485092fc18b30cf5609b] to o318308.ingest.sentry.io project: 1805390
2022/12/03 03:24:05 proto: SchemaUpdate: illegal tag 0 (wire type 7)
Error while loading schema from db
github.com/dgraph-io/dgraph/x.Checkf
/home/runner/work/dgraph/dgraph/x/error.go:51
github.com/dgraph-io/dgraph/schema.LoadSchemaFromDb.func1
/home/runner/work/dgraph/dgraph/schema/schema.go:533
github.com/dgraph-io/badger/v3.(*Item).Value
/home/runner/go/pkg/mod/github.com/dgraph-io/badger/v3@v3.2103.1/iterator.go:100
github.com/dgraph-io/dgraph/schema.LoadSchemaFromDb
/home/runner/work/dgraph/dgraph/schema/schema.go:529
github.com/dgraph-io/dgraph/schema.LoadFromDb
/home/runner/work/dgraph/dgraph/schema/schema.go:502
github.com/dgraph-io/dgraph/worker.StartRaftNodes
/home/runner/work/dgraph/dgraph/worker/groups.go:153
github.com/dgraph-io/dgraph/dgraph/cmd/alpha.run.func4
/home/runner/work/dgraph/dgraph/dgraph/cmd/alpha/run.go:814
runtime.goexit
/opt/hostedtoolcache/go/1.18.5/x64/src/runtime/asm_amd64.s:1571
james@Jamess-MBP SystemDB % E1202 22:26:24.955626 63609 proxy_server.go:147] Error while proxying request: context canceled