Alpha keeps sending votes to deleted nodes, cannot elect a leader

Two nodes could no longer boot, so I removed them from the cluster. The remaining nodes still send vote requests (MsgPreVote) to them.

alpha log
I0224 04:25:12.407646      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:12.407653      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:14.907494      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:14.907549      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:14.907557      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:14.907572      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:14.907580      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:17.407577      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:17.407616      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:17.407622      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:17.407638      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:17.407646      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
W0224 04:25:18.407874      16 node.go:420] Unable to send message to peer: 0x4. Error: Do not have address of peer 0x4,
W0224 04:25:18.407887      16 node.go:420] Unable to send message to peer: 0x6. Error: Do not have address of peer 0x6,
I0224 04:25:19.907584      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:19.907611      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:19.907617      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:19.907647      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:19.907656      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:22.407662      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:22.407702      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:22.407713      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:22.407760      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:22.407787      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:24.907702      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:24.907741      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:24.907749      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:24.907773      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:24.907790      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:27.407663      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:27.407764      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:27.407776      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:27.407807      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:27.407821      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
W0224 04:25:28.408227      16 node.go:420] Unable to send message to peer: 0x6. Error: Do not have address of peer 0x6,
W0224 04:25:28.408266      16 node.go:420] Unable to send message to peer: 0x4. Error: Do not have address of peer 0x4,
I0224 04:25:29.907557      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:29.907588      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:29.907604      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:29.907619      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:29.907630      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:32.407556      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:32.407622      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:32.407629      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:32.407652      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:32.407661      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:34.907618      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:34.907652      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:34.907659      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:34.907676      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:34.907687      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:37.407588      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:37.407628      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:37.407634      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:37.407651      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:37.407664      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:39.907591      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:39.907642      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:39.907663      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:39.907680      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:39.907689      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
W0224 04:25:40.907928      16 node.go:420] Unable to send message to peer: 0x6. Error: Do not have address of peer 0x6,
W0224 04:25:40.907928      16 node.go:420] Unable to send message to peer: 0x4. Error: Do not have address of peer 0x4,
I0224 04:25:42.407559      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:42.407591      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:42.407596      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:42.407614      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:42.407622      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:44.957423      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:44.957469      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:44.957476      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:44.957498      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:44.957517      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:47.408363      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:47.408409      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:47.408416      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:47.408433      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:47.408444      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:50.107822      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:50.107853      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:50.107858      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:50.107891      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:50.107912      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
W0224 04:25:51.108360      16 node.go:420] Unable to send message to peer: 0x6. Error: Do not have address of peer 0x6,
W0224 04:25:51.108374      16 node.go:420] Unable to send message to peer: 0x4. Error: Do not have address of peer 0x4,
I0224 04:25:52.608440      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:52.608470      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:52.608476      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:52.608492      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:52.608506      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:55.108422      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:55.108446      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:55.108451      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:55.108473      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:55.108492      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 04:25:57.607956      16 log.go:34] 5 is starting a new election at term 19,
I0224 04:25:57.607983      16 log.go:34] 5 became pre-candidate at term 19,
I0224 04:25:57.607989      16 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 04:25:57.608005      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 04:25:57.608024      16 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,

What I Want to Do

Start the cluster

Dgraph Metadata

dgraph version
Dgraph version   : v20.11.1
Dgraph codename  : tchalla-1
Dgraph SHA-256   : cefdcc880c0607a92a1d8d3ba0beb015459ebe216e79fdad613eb0d00d09f134
Commit SHA-1     : 7153d13fe
Commit timestamp : 2021-01-28 15:59:35 +0530
Branch           : HEAD
Go version       : go1.15.5
jemalloc enabled : true

@ibrahim this looks like a Zero-has-gone-away issue. Can you have a look?

@zzl221000 Is there another Zero in the same cluster?

@chewxy
There are three zero nodes that look normal.
The other two groups (g1, g3) have no problems.

g2 alpha startup log
I0224 12:26:25.646258      17 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 12:26:28.017617      17 log.go:34] 5 is starting a new election at term 19,
I0224 12:26:28.017656      17 log.go:34] 5 became pre-candidate at term 19,
I0224 12:26:28.017663      17 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 12:26:28.017680      17 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 12:26:28.017690      17 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
I0224 12:26:29.846739      17 log.go:34] 5 is starting a new election at term 19,
I0224 12:26:29.846765      17 log.go:34] 5 became pre-candidate at term 19,
I0224 12:26:29.846770      17 log.go:34] 5 received MsgPreVoteResp from 5 at term 19,
I0224 12:26:29.846784      17 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 4 at term 19,
I0224 12:26:29.846810      17 log.go:34] 5 [logterm: 7, index: 313429415] sent MsgPreVote request to 6 at term 19,
[Decoder]: Using assembly version of decoder,
Page Size: 4096,
[Sentry] 2021/02/24 12:26:34 Integration installed: ContextifyFrames,
[Sentry] 2021/02/24 12:26:34 Integration installed: Environment,
[Sentry] 2021/02/24 12:26:34 Integration installed: Modules,
[Sentry] 2021/02/24 12:26:34 Integration installed: IgnoreErrors,
[Decoder]: Using assembly version of decoder,
Page Size: 4096,
[Sentry] 2021/02/24 12:26:34 Integration installed: ContextifyFrames,
[Sentry] 2021/02/24 12:26:34 Integration installed: Environment,
[Sentry] 2021/02/24 12:26:34 Integration installed: Modules,
[Sentry] 2021/02/24 12:26:34 Integration installed: IgnoreErrors,
I0224 12:26:34.871294      18 sentry_integration.go:48] This instance of Dgraph will send anonymous reports of panics back to Dgraph Labs via Sentry. No confidential information is sent. These reports help improve Dgraph. To opt-out, restart your instance with the --enable_sentry=false flag. For more info, see https://dgraph.io/docs/howto/#data-handling.,
I0224 12:26:35.077573      18 init.go:107] ,
,
Dgraph version   : v20.11.2,
Dgraph codename  : tchalla-2,
Dgraph SHA-256   : 0153cb8d3941ad5ad107e395b347e8d930a0b4ead6f4524521f7a525a9699167,
Commit SHA-1     : 94f3a0430,
Commit timestamp : 2021-02-23 13:07:17 +0530,
Branch           : HEAD,
Go version       : go1.15.5,
jemalloc enabled : true,
,
For Dgraph official documentation, visit https://dgraph.io/docs/.,
For discussions about Dgraph     , visit https://discuss.dgraph.io.,
,
Licensed variously under the Apache Public License 2.0 and Dgraph Community License.,
Copyright 2015-2020 Dgraph Labs, Inc.,
,
,
I0224 12:26:35.077620      18 run.go:696] x.Config: {PortOffset:6 QueryEdgeLimit:1000000 NormalizeNodeLimit:10000 MutationsNQuadLimit:1000000 PollInterval:1s GraphqlExtension:true GraphqlDebug:false GraphqlLambdaUrl:},
I0224 12:26:35.077664      18 run.go:697] x.WorkerConfig: {TmpDir:t ExportPath:export NumPendingProposals:256 Tracing:0.01 MyAddr:192.168.3.12:7086 ZeroAddr:[192.168.3.9:5080 192.168.3.11:5080 192.168.3.12:5080] TLSClientConfig:<nil> TLSServerConfig:<nil> RaftId:0 WhiteListedIPRanges:[{Lower:0.0.0.0 Upper:255.255.255.255}] MaxRetries:-1 StrictMutations:false AclEnabled:false AbortOlderThan:5m0s SnapshotAfter:10000 ProposedGroupId:0 StartTime:2021-02-24 12:26:34.441740675 +0000 UTC m=+0.022598203 LudicrousMode:false LudicrousConcurrency:2000 EncryptionKey:**** LogRequest:0 HardSync:false},
I0224 12:26:35.077726      18 run.go:698] worker.Config: {PostingDir:p PostingDirCompression:1 PostingDirCompressionLevel:0 WALDir:w MutationsMode:0 AuthToken: PBlockCacheSize:13958643712 PIndexCacheSize:7516192768 WalCache:0 HmacSecret:**** AccessJwtTtl:0s RefreshJwtTtl:0s CachePercentage:0,65,35,0 CacheMb:0},
I0224 12:26:35.077976      18 log.go:295] Found file: 10452 First Index: 313396289,
I0224 12:26:35.078024      18 log.go:295] Found file: 10453 First Index: 313426289,
I0224 12:26:35.078060      18 log.go:295] Found file: 10454 First Index: 313429177,
I0224 12:26:35.078091      18 log.go:295] Found file: 10455 First Index: 313429179,
I0224 12:26:35.078119      18 log.go:295] Found file: 10456 First Index: 313429403,
I0224 12:26:35.078151      18 log.go:295] Found file: 10457 First Index: 313429405,
I0224 12:26:35.078230      18 log.go:295] Found file: 10458 First Index: 313429408,
I0224 12:26:35.078263      18 log.go:295] Found file: 10459 First Index: 313429414,
I0224 12:26:35.078348      18 storage.go:132] Init Raft Storage with snap: 313422969, first: 313422970, last: 313429415,
I0224 12:26:35.078368      18 server_state.go:76] Setting Posting Dir Compression Level: 0,
I0224 12:26:35.078380      18 server_state.go:120] Opening postings BadgerDB with options: {Dir:p ValueDir:p SyncWrites:false NumVersionsToKeep:2147483647 ReadOnly:false Logger:0x2e0fef8 Compression:1 InMemory:false MemTableSize:67108864 BaseTableSize:2097152 BaseLevelSize:10485760 LevelSizeMultiplier:10 TableSizeMultiplier:2 MaxLevels:7 ValueThreshold:1024 NumMemtables:5 BlockSize:4096 BloomFalsePositive:0.01 BlockCacheSize:13958643712 IndexCacheSize:7516192768 NumLevelZeroTables:5 NumLevelZeroTablesStall:15 ValueLogFileSize:1073741823 ValueLogMaxEntries:1000000 NumCompactors:4 CompactL0OnClose:false ZSTDCompressionLevel:0 VerifyValueChecksum:false EncryptionKey:[] EncryptionKeyRotationDuration:240h0m0s BypassLockGuard:false ChecksumVerificationMode:0 DetectConflicts:false managedTxns:false maxBatchCount:0 maxBatchSize:0},
I0224 12:26:37.092917      18 log.go:34] All 1910 tables opened in 1.878s,
I0224 12:26:37.094156      18 log.go:34] Discard stats nextEmptySlot: 1,
I0224 12:26:37.095228      18 log.go:34] Set nextTxnTs to 421508262,
I0224 12:26:37.102716      18 log.go:34] Deleting empty file: p/000007.vlog,
I0224 12:26:37.105094      18 groups.go:99] Current Raft Id: 0x5,
E0224 12:26:37.105094      18 groups.go:1143] Error during SubscribeForUpdates for prefix "\x00\x00\vdgraph.cors\x00": Unable to find any servers for group: 1. closer err: <nil>,
I0224 12:26:37.105779      18 worker.go:104] Worker listening at address: [::]:7086,
E0224 12:26:37.107315      18 groups.go:1143] Error during SubscribeForUpdates for prefix "\x00\x00\x15dgraph.graphql.schema\x00": Unable to find any servers for group: 1. closer err: <nil>,
I0224 12:26:37.107322      18 run.go:519] Bringing up GraphQL HTTP API at 0.0.0.0:8086/graphql,
I0224 12:26:37.107356      18 run.go:520] Bringing up GraphQL HTTP admin API at 0.0.0.0:8086/admin,
I0224 12:26:37.107407      18 run.go:552] gRPC server started.  Listening on port 9086,
I0224 12:26:37.107418      18 run.go:553] HTTP server started.  Listening on port 8086,
I0224 12:26:37.205489      18 pool.go:162] CONNECTING to 192.168.3.9:5080,
I0224 12:26:37.209761      18 pool.go:162] CONNECTING to 192.168.3.11:5080,
I0224 12:26:37.212709      18 groups.go:127] Connected to group zero. Assigned group: 0,
I0224 12:26:37.212741      18 groups.go:129] Raft Id after connection to Zero: 0x5,
I0224 12:26:37.212820      18 pool.go:162] CONNECTING to 192.168.3.9:7087,
I0224 12:26:37.212947      18 pool.go:162] CONNECTING to 192.168.3.11:7087,
I0224 12:26:37.213009      18 pool.go:162] CONNECTING to 192.168.3.12:7087,
I0224 12:26:37.213068      18 pool.go:162] CONNECTING to 192.168.3.11:7086,
I0224 12:26:37.213122      18 pool.go:162] CONNECTING to 192.168.3.12:7085,
I0224 12:26:37.213172      18 pool.go:162] CONNECTING to 192.168.3.11:7085,
I0224 12:26:37.213215      18 pool.go:162] CONNECTING to 192.168.3.9:7085,
I0224 12:26:37.213276      18 pool.go:162] CONNECTING to 192.168.3.12:5080,
I0224 12:26:37.213354      18 draft.go:230] Node ID: 0x5 with GroupID: 2,
I0224 12:26:37.213448      18 node.go:152] Setting raft.Config to: &{ID:5 peers:[] learners:[] ElectionTick:20 HeartbeatTick:1 Storage:0xc0004083c0 Applied:313422969 MaxSizePerMsg:262144 MaxCommittedSizePerReady:67108864 MaxUncommittedEntriesSize:0 MaxInflightMsgs:256 CheckQuorum:false PreVote:true ReadOnlyOption:0 Logger:0x2e0fef8 DisableProposalForwarding:false},
W0224 12:26:37.213507      18 pool.go:267] Connection lost with 192.168.3.11:7086. Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 192.168.3.11:7086: connect: connection refused",
I0224 12:26:37.214590      18 node.go:310] Found Snapshot.Metadata: {ConfState:{Nodes:[4 5 6] Learners:[] XXX_unrecognized:[]} Index:313422969 Term:4 XXX_unrecognized:[]},
I0224 12:26:37.214614      18 node.go:321] Found hardstate: {Term:19 Vote:6 Commit:313429400 XXX_unrecognized:[]},
I0224 12:26:40.665160      18 node.go:326] Group 2 found 33127 entries,
I0224 12:26:40.665222      18 draft.go:1689] Restarting node for group: 2,
I0224 12:26:40.665251      18 node.go:189] Setting conf state to nodes:4 nodes:5 nodes:6 ,
I0224 12:26:40.665429      18 log.go:34] 5 became follower at term 19,
I0224 12:26:40.665456      18 log.go:34] newRaft 5 [peers: [4,5,6], term: 19, commit: 313429400, applied: 313422969, lastindex: 313429415, lastterm: 7],
I0224 12:26:40.665516      18 draft.go:180] Operation started with id: opRollup,
I0224 12:26:40.665605      18 draft.go:1084] Found Raft progress: 313424984,
I0224 12:26:40.665617      18 groups.go:807] Got address of a Zero leader: 192.168.3.11:5080,
I0224 12:26:40.667643      18 groups.go:821] Starting a new membership stream receive from 192.168.3.11:5080.,
I0224 12:26:40.669734      18 groups.go:838] Received first state update from Zero: counter:157533714 groups:<key:1 value:<members:<key:1 value:<id:1 group_id:1 addr:"192.168.3.9:7085" last_update:1613857276 > > members:<key:2 value:<id:2 group_id:1 addr:"192.168.3.11:7085" leader:true last_update:1614136754 > > members:<key:3 value:<id:3 group_id:1 addr:"192.168.3.12:7085" last_update:1614136751 > > tablets:<key:"Person.introduction" value:<group_id:1 predicate:"Person.introduction" > > tablets:<key:"RlNode.rlid" value:<group_id:1 predicate:"RlNode.rlid" on_disk_bytes:37705411951 uncompressed_bytes:54151613727 > > tablets:<key:"dgraph.cors" value:<group_id:1 predicate:"dgraph.cors" on_disk_bytes:199 uncompressed_bytes:75 > > tablets:<key:"dgraph.drop.op" value:<group_id:1 predicate:"dgraph.drop.op" > > tablets:<key:"dgraph.graphql.p_query" value:<group_id:1 predicate:"dgraph.graphql.p_query" > > tablets:<key:"dgraph.graphql.p_sha256hash" value:<group_id:1 predicate:"dgraph.graphql.p_sha256hash" > > tablets:<key:"dgraph.graphql.schema" value:<group_id:1 predicate:"dgraph.graphql.schema" > > tablets:<key:"dgraph.graphql.schema_created_at" value:<group_id:1 predicate:"dgraph.graphql.schema_created_at" > > tablets:<key:"dgraph.graphql.schema_history" value:<group_id:1 predicate:"dgraph.graphql.schema_history" > > tablets:<key:"dgraph.graphql.xid" value:<group_id:1 predicate:"dgraph.graphql.xid" > > tablets:<key:"dgraph.type" value:<group_id:1 predicate:"dgraph.type" on_disk_bytes:7939664069 uncompressed_bytes:21832806655 > > snapshot_ts:421498025 checksum:10544634265750131315 > > groups:<key:2 value:<members:<key:5 value:<id:5 group_id:2 addr:"192.168.3.12:7086" last_update:1614137647 > > members:<key:11 value:<id:11 group_id:2 addr:"192.168.3.11:7086" > > tablets:<key:"Company.province" value:<group_id:2 predicate:"Company.province" on_disk_bytes:3741156853 uncompressed_bytes:10175890425 > > tablets:<key:"Company.status" value:<group_id:2 predicate:"Company.status" 
on_disk_bytes:4312123758 uncompressed_bytes:12884788568 > > tablets:<key:"Company.wasManager" value:<group_id:2 predicate:"Company.wasManager" on_disk_bytes:4482552146 uncompressed_bytes:10514075761 > > tablets:<key:"Person.ownId" value:<group_id:2 predicate:"Person.ownId" on_disk_bytes:28718276337 uncompressed_bytes:37346019718 > > snapshot_ts:421498025 checksum:15197177210431107250 > > groups:<key:3 value:<members:<key:7 value:<id:7 group_id:3 addr:"192.168.3.12:7087" last_update:1614136741 > > members:<key:8 value:<id:8 group_id:3 addr:"192.168.3.11:7087" leader:true last_update:1614136751 > > members:<key:9 value:<id:9 group_id:3 addr:"192.168.3.9:7087" > > tablets:<key:"Company.invest" value:<group_id:3 predicate:"Company.invest" on_disk_bytes:166516771 uncompressed_bytes:480097056 > > tablets:<key:"Company.invested" value:<group_id:3 predicate:"Company.invested" on_disk_bytes:167049405 uncompressed_bytes:439429881 > > tablets:<key:"Company.name" value:<group_id:3 predicate:"Company.name" on_disk_bytes:14111513166 uncompressed_bytes:15219815621 > > tablets:<key:"Company.partner" value:<group_id:3 predicate:"Company.partner" on_disk_bytes:543355 uncompressed_bytes:1820222 > > tablets:<key:"Company.wasLegal" value:<group_id:3 predicate:"Company.wasLegal" on_disk_bytes:4415094885 uncompressed_bytes:14992721750 > > tablets:<key:"Company.wasPartner" value:<group_id:3 predicate:"Company.wasPartner" on_disk_bytes:189185 uncompressed_bytes:629503 > > tablets:<key:"Company.wasShareholder" value:<group_id:3 predicate:"Company.wasShareholder" on_disk_bytes:2714876089 uncompressed_bytes:5723542142 > > tablets:<key:"Person.avatar" value:<group_id:3 predicate:"Person.avatar" on_disk_bytes:188 move_ts:78779184 uncompressed_bytes:66 > > tablets:<key:"Person.legal" value:<group_id:3 predicate:"Person.legal" on_disk_bytes:4457950398 uncompressed_bytes:15264204181 > > tablets:<key:"Person.manager" value:<group_id:3 predicate:"Person.manager" on_disk_bytes:4013635446 
uncompressed_bytes:11134791727 > > tablets:<key:"Person.name" value:<group_id:3 predicate:"Person.name" on_disk_bytes:12340244275 uncompressed_bytes:21841655020 > > tablets:<key:"Person.pid" value:<group_id:3 predicate:"Person.pid" on_disk_bytes:192 move_ts:78128708 uncompressed_bytes:68 > > tablets:<key:"Person.shareholder" value:<group_id:3 predicate:"Person.shareholder" on_disk_bytes:2450703327 uncompressed_bytes:8568605697 > > snapshot_ts:421498025 checksum:9918253677290345795 > > zeros:<key:1 value:<id:1 addr:"192.168.3.9:5080" > > zeros:<key:2 value:<id:2 addr:"192.168.3.11:5080" leader:true > > zeros:<key:3 value:<id:3 addr:"192.168.3.12:5080" > > maxLeaseId:394750000 maxTxnTs:421520000 maxRaftId:11 removed:<id:4 group_id:2 addr:"192.168.3.11:7086" last_update:1614138676 > removed:<id:10 group_id:2 addr:"192.168.3.11:7086" > removed:<id:6 group_id:2 addr:"192.168.3.9:7086" last_update:1614140119 > cid:"6ee0e06c-baec-4d56-910a-58e3a391552e" license:<maxNodes:18446744073709551615 expiryTs:1616383230 enabled:true > ,
I0224 12:26:41.665616      18 groups.go:159] Server is ready,
I0224 12:26:41.665638      18 access_ee.go:390] ResetAcl closed,
I0224 12:26:41.665644      18 access_ee.go:311] RefreshAcls closed,
I0224 12:26:41.671116      18 graphql.go:41] ResetCors closed,
I0224 12:26:42.114167      18 admin.go:709] Successfully loaded GraphQL schema.  Serving GraphQL API.,

Since this is now seemingly a repeated issue, let’s see if we can find the root cause and fix it @ibrahim

It doesn’t look like 6 and 4 were actually removed using the /removeNode endpoint. If you don’t do that, they are still part of the group.
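For reference, a removal request to Zero could be built roughly as in the sketch below. It assumes Zero's HTTP endpoint listens on the default port 6080 (no port offset) and takes `id` and `group` as query parameters. The helper names and the address used in the example are illustrative, not taken from this cluster's configuration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def remove_node_url(zero_http_addr: str, raft_id: int, group_id: int) -> str:
    """Build the Zero /removeNode URL for one Alpha (hypothetical helper)."""
    query = urlencode({"id": raft_id, "group": group_id})
    return f"http://{zero_http_addr}/removeNode?{query}"


def remove_node(zero_http_addr: str, raft_id: int, group_id: int) -> str:
    """Send the request and return Zero's response body as text."""
    url = remove_node_url(zero_http_addr, raft_id, group_id)
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode()


if __name__ == "__main__":
    # Assumed address: the Zero leader's host from the logs, default HTTP port.
    # Only the URLs are printed here; call remove_node() to actually send them.
    for raft_id in (4, 6):
        print(remove_node_url("192.168.3.11:6080", raft_id, 2))
```

Afterwards, the removed IDs should show up in the `removed` list of Zero's state, which is what the state update further down in this thread is checked for.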

@mrjn Actually, both 6 and 4 were removed with the /removeNode endpoint. In the startup log of 5, I can see the first state update from Zero.

For easier reading, I have manually reformatted the log:

Received first state update from Zero: counter:157533714 
groups:<
key:1 value:
<
members:<key:1 value:<id:1 group_id:1 addr:"192.168.3.9:7085" last_update:1613857276 > > 
members:<key:2 value:<id:2 group_id:1 addr:"192.168.3.11:7085" leader:true last_update:1614136754 > > 
members:<key:3 value:<id:3 group_id:1 addr:"192.168.3.12:7085" last_update:1614136751 > > 
tablets:<key:"Person.introduction" value:<group_id:1 predicate:"Person.introduction" > > 
tablets:<key:"RlNode.rlid" value:<group_id:1 predicate:"RlNode.rlid" on_disk_bytes:37705411951 uncompressed_bytes:54151613727 > > 
tablets:<key:"dgraph.cors" value:<group_id:1 predicate:"dgraph.cors" on_disk_bytes:199 uncompressed_bytes:75 > > 
tablets:<key:"dgraph.drop.op" value:<group_id:1 predicate:"dgraph.drop.op" > > 
tablets:<key:"dgraph.graphql.p_query" value:<group_id:1 predicate:"dgraph.graphql.p_query" > > 
tablets:<key:"dgraph.graphql.p_sha256hash" value:<group_id:1 predicate:"dgraph.graphql.p_sha256hash" > > 
tablets:<key:"dgraph.graphql.schema" value:<group_id:1 predicate:"dgraph.graphql.schema" > > 
tablets:<key:"dgraph.graphql.schema_created_at" value:<group_id:1 predicate:"dgraph.graphql.schema_created_at" > > 
tablets:<key:"dgraph.graphql.schema_history" value:<group_id:1 predicate:"dgraph.graphql.schema_history" > > 
tablets:<key:"dgraph.graphql.xid" value:<group_id:1 predicate:"dgraph.graphql.xid" > > 
tablets:<key:"dgraph.type" value:<group_id:1 predicate:"dgraph.type" on_disk_bytes:7939664069 uncompressed_bytes:21832806655 > > 
snapshot_ts:421498025 checksum:10544634265750131315 > > 
groups:<key:2 value:<members:<key:5 value:<id:5 group_id:2 addr:"192.168.3.12:7086" last_update:1614137647 > > 
members:<key:11 value:<id:11 group_id:2 addr:"192.168.3.11:7086" > > 
tablets:<key:"Company.province" value:<group_id:2 predicate:"Company.province" on_disk_bytes:3741156853 uncompressed_bytes:10175890425 > > 
tablets:<key:"Company.status" value:<group_id:2 predicate:"Company.status" on_disk_bytes:4312123758 uncompressed_bytes:12884788568 > > 
tablets:<key:"Company.wasManager" value:<group_id:2 predicate:"Company.wasManager" on_disk_bytes:4482552146 uncompressed_bytes:10514075761 > > 
tablets:<key:"Person.ownId" value:<group_id:2 predicate:"Person.ownId" on_disk_bytes:28718276337 uncompressed_bytes:37346019718 > > 
snapshot_ts:421498025 checksum:15197177210431107250 > > 
groups:<key:3 value:<members:<key:7 value:<id:7 group_id:3 addr:"192.168.3.12:7087" last_update:1614136741 > > 
members:<key:8 value:<id:8 group_id:3 addr:"192.168.3.11:7087" leader:true last_update:1614136751 > > 
members:<key:9 value:<id:9 group_id:3 addr:"192.168.3.9:7087" > > 
tablets:<key:"Company.invest" value:<group_id:3 predicate:"Company.invest" on_disk_bytes:166516771 uncompressed_bytes:480097056 > > 
tablets:<key:"Company.invested" value:<group_id:3 predicate:"Company.invested" on_disk_bytes:167049405 uncompressed_bytes:439429881 > > 
tablets:<key:"Company.name" value:<group_id:3 predicate:"Company.name" on_disk_bytes:14111513166 uncompressed_bytes:15219815621 > > 
tablets:<key:"Company.partner" value:<group_id:3 predicate:"Company.partner" on_disk_bytes:543355 uncompressed_bytes:1820222 > > 
tablets:<key:"Company.wasLegal" value:<group_id:3 predicate:"Company.wasLegal" on_disk_bytes:4415094885 uncompressed_bytes:14992721750 > > 
tablets:<key:"Company.wasPartner" value:<group_id:3 predicate:"Company.wasPartner" on_disk_bytes:189185 uncompressed_bytes:629503 > > 
tablets:<key:"Company.wasShareholder" value:<group_id:3 predicate:"Company.wasShareholder" on_disk_bytes:2714876089 uncompressed_bytes:5723542142 > > 
tablets:<key:"Person.avatar" value:<group_id:3 predicate:"Person.avatar" on_disk_bytes:188 move_ts:78779184 uncompressed_bytes:66 > > 
tablets:<key:"Person.legal" value:<group_id:3 predicate:"Person.legal" on_disk_bytes:4457950398 uncompressed_bytes:15264204181 > > 
tablets:<key:"Person.manager" value:<group_id:3 predicate:"Person.manager" on_disk_bytes:4013635446 uncompressed_bytes:11134791727 > > 
tablets:<key:"Person.name" value:<group_id:3 predicate:"Person.name" on_disk_bytes:12340244275 uncompressed_bytes:21841655020 > > 
tablets:<key:"Person.pid" value:<group_id:3 predicate:"Person.pid" on_disk_bytes:192 move_ts:78128708 uncompressed_bytes:68 > > 
tablets:<key:"Person.shareholder" value:<group_id:3 predicate:"Person.shareholder" on_disk_bytes:2450703327 uncompressed_bytes:8568605697 > > 
snapshot_ts:421498025 checksum:9918253677290345795 > > 
zeros:<key:1 value:<id:1 addr:"192.168.3.9:5080" > > 
zeros:<key:2 value:<id:2 addr:"192.168.3.11:5080" leader:true > > 
zeros:<key:3 value:<id:3 addr:"192.168.3.12:5080" > > 
maxLeaseId:394750000 maxTxnTs:421520000 maxRaftId:11 
removed:<id:4 group_id:2 addr:"192.168.3.11:7086" last_update:1614138676 > 
removed:<id:10 group_id:2 addr:"192.168.3.11:7086" > 
removed:<id:6 group_id:2 addr:"192.168.3.9:7086" last_update:1614140119 > 
cid:"6ee0e06c-baec-4d56-910a-58e3a391552e" license:<maxNodes:18446744073709551615 expiryTs:1616383230 enabled:true > ,

So node 5 does receive the message that the nodes have been removed.
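The mismatch can be checked mechanically. The sketch below is only an illustration: it takes an abridged excerpt of the state update above, extracts the `removed` entries for group 2 with a regular expression, and compares them with the peer list that node 5 restored from its own Raft state (`newRaft 5 [peers: [4,5,6], ...]`). The function name `removed_ids` is made up for this example.

```python
import re

# Abridged excerpt of the "first state update from Zero" shown above.
STATE = (
    'removed:<id:4 group_id:2 addr:"192.168.3.11:7086" last_update:1614138676 > '
    'removed:<id:10 group_id:2 addr:"192.168.3.11:7086" > '
    'removed:<id:6 group_id:2 addr:"192.168.3.9:7086" last_update:1614140119 > '
)

# Peers node 5 restored locally, from "newRaft 5 [peers: [4,5,6], ...]".
RAFT_PEERS = {4, 5, 6}


def removed_ids(state: str, group_id: int) -> set:
    """Return the Raft IDs that Zero lists as removed for one group."""
    pattern = re.compile(r"removed:<id:(\d+) group_id:(\d+)")
    return {int(i) for i, g in pattern.findall(state) if int(g) == group_id}


# Peers that the local Raft config still contains although Zero removed them.
stale = RAFT_PEERS & removed_ids(STATE, 2)
print(sorted(stale))  # [4, 6]
```

Zero's view and the Alpha's restored Raft configuration disagree about 4 and 6, which matches the repeated MsgPreVote requests to those two IDs.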

@chewxy can you try this out? I suspect that the sequence in which this was done probably caused a quorum issue.
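As a rough illustration of the suspected quorum problem: Raft needs a strict majority of the configured voters to win an election. The sketch below applies that rule to the peer list from node 5's log (`peers: [4,5,6]`). It assumes every reachable voter grants its vote, and the function names are illustrative only.

```python
def quorum(voters: int) -> int:
    """Smallest number of votes that is a strict majority of `voters`."""
    return voters // 2 + 1


def can_win_election(peers, reachable) -> bool:
    """True if the reachable voters alone form a majority of the peers."""
    peer_set = set(peers)
    votes = len(peer_set & set(reachable))
    return votes >= quorum(len(peer_set))


peers = [4, 5, 6]  # configuration node 5 restored from its snapshot

print(quorum(len(peers)))                # 2
print(can_win_election(peers, [5]))      # False: only node 5's own vote
print(can_win_election(peers, [5, 6]))   # True: two of three voters
```

With 4 and 6 unreachable, node 5 only ever collects its own pre-vote, which is consistent with the election loop at term 19 in the logs.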

on it