Connection not found for insert

Hi, I can query, but I received an error message while doing an insert:

{
  "name": "t",
  "url": "http://localhost:8080/mutate?commitNow=true",
  "errors": [
    {
      "message": "cannot retrieve predicate information: No connection exists",
      "extensions": {
        "code": "ErrorInvalidRequest"
      }
    }
  ]
}

The dgraph alpha server message:

Move predicate request: predicate:"name" source_gid:1 dest_gid:2 txn_ts:3440019

The dgraph zero server message:
I0825 04:40:16.655843   30155 zero.go:438] Connected: cluster_info_only:true
I0825 04:40:16.656903   30155 zero.go:420] Got connection request: id:1 addr:"localhost:7080"
I0825 04:40:16.657287   30155 zero.go:551] Connected: id:1 addr:"localhost:7080"
I0825 04:47:30.954485   30155 tablet.go:208]

Groups sorted by size: [{gid:2 size:0} {gid:1 size:240067445}]

I0825 04:47:30.954536   30155 tablet.go:213] size_diff 240067445
I0825 04:47:30.954946   30155 tablet.go:108] Going to move predicate: [name], size: [70 MB] from group 1 to 2
I0825 04:47:30.955069   30155 tablet.go:135] Starting move: predicate:"name" source_gid:1 dest_gid:2 txn_ts:3440019
E0825 04:47:30.958280   30155 tablet.go:70] while calling MovePredicate: rpc error: code = Unknown desc = Unable to find a connection for group: 2

My dgraph version is:
v1.2.6
on ubuntu 18

If my understanding of the error message is correct,
Dgraph has detected more than one group and is trying to move a predicate to another group.
I don’t think I set up any new cluster, as this is a single machine that I SSH into.
Is there any way to solve this problem, please?

Hi @M.Lau, Dgraph is trying to move the predicate from one group to another. For more information about sharding, you can read here. Can you tell us a little more about the configuration of your cluster (number of Alphas and Zeros) and possible steps to reproduce this error on a newly started cluster (if you are able to reproduce it)?

Ok. I didn’t set up any sharding because this is a single-machine setup.

I have only 1 Zero and 1 Alpha. But I wonder whether this error happened because I accidentally forgot to shut down the original Alpha while starting a new Alpha in a new folder. Is this a possible reason why it is trying to split the files into two places?

Is there a way to disable auto-sharding or automatic predicate rebalancing, so that the user can decide manually when to do it?

You can set the rebalance interval by passing --rebalance_interval when you start Dgraph Zero; after each interval, Dgraph will try to redistribute predicates among groups. Currently you cannot disable sharding completely, but you can set the interval to a high value. If there is only a single group, though, this should not be something to worry about. Can you post the response of the /state endpoint of your Zero if you are still facing the same issue?
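For example, assuming the default ports (Zero’s HTTP port is 6080) — the interval value here is just illustrative:

```shell
# Start Zero with a very long rebalance interval so automatic
# predicate moves effectively never trigger:
dgraph zero --rebalance_interval 10000h

# Inspect the cluster state on Zero's HTTP port:
curl -s localhost:6080/state
```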

I have the state endpoint of the Alpha… I am not able to get the state endpoint of the Zero. What should I do to get it?

Here is the Alpha state endpoint. It shows two Alphas at 7080, which seems weird:

{
    "counter": "1660771",
    "groups": {
        "1": {
            "members": {
                "1": {
                    "id": "1",
                    "groupId": 1,
                    "addr": "localhost:7080",
                    "leader": true,
                    "lastUpdate": "1598330419"
                }
            },
            "tablets": {
                "action": {
                    "groupId": 1,
                    "predicate": "action"
                },
                ... predicates here 
            },
            "snapshotTs": "3395461",
            "checksum": "7004392364713691865"
        },
        "2": {
            "members": {
                "2": {
                    "id": "2",
                    "groupId": 2,
                    "addr": "IPADDR:7080",
                    "leader": true,
                    "lastUpdate": "1597913245"
                }
            }
        }
    },
    "zeros": {
        "1": {
            "id": "1",
            "addr": "localhost:5080",
            "leader": true
        }
    },
    "maxLeaseId": "952307",
    "maxTxnTs": "3450000",
    "maxRaftId": "2",
    "cid": "ddee15d0-416e-4b14-ac4f-fdad40d18875",
    "license": {
        "maxNodes": "18446744073709551615",
        "expiryTs": "1586669495"
    }
}

@M.Lau,

From the state information, you have two groups ("1" and "2"). You might have arrived in this state because you forgot to stop the old Alpha. Since you started another Alpha without stopping the first one, the Zero node might have done rebalancing.

Yes. I think it is because I accidentally started a Dgraph Alpha in the wrong directory, without the p and w folders, which resulted in 2 groups registered at Zero.

I removed the group by running: curl "localhost:6080/removeNode?group=2&id=2"
because I noticed that group 1, id 1, is holding the tablets…

I was able to run a query in Ratel successfully.

But when I tried to run a mutation subsequently and Ratel gave this error message:
{
    "name": "t",
    "url": "http://localhost:8080/mutate?commitNow=true",
    "errors": [
        {
            "message": "cannot retrieve predicate information: No connection exists",
            "extensions": {
                "code": "ErrorInvalidRequest"
            }
        }
    ]
}

can you help me please?

my current zero state is:
{
    "counter": "2181",
    "groups": {
        "1": {
            "members": {
                "1": {
                    "id": "1",
                    "groupId": 1,
                    "addr": "localhost:7080",
                    "leader": true,
                    "lastUpdate": "1598603128"
                }
            },
            "tablets": { ...(... removed ...) }
        },
        "2": {}
    },
    "zeros": {
        "1": {
            "id": "1",
            "addr": "localhost:5080",
            "leader": true
        }
    },
    "maxLeaseId": "1242364",
    "maxTxnTs": "30000",
    "maxRaftId": "2",
    "removed": [
        {
            "id": "2",
            "groupId": 2,
            "addr": "IPADR:7080",
            "leader": true,
            "lastUpdate": "1598597463"
        }
    ],
    "cid": "2793f804-3c1a-4b47-8da3-d72055972c53",
    "license": {
        "maxNodes": "18446744073709551615",
        "expiryTs": "1600924445",
        "enabled": true
    }
}

my dgraph alpha /health is showing that it is online:
curl localhost:8080/health

{"version":"v1.2.6","instance":"alpha","uptime":574}

I solved the problem by:

  1. getting maxLeaseId from curl localhost:6080/state
  2. shutting down dgraph zero & dgraph alpha
  3. deleting the zw folder
  4. restarting dgraph zero & dgraph alpha
  5. remembering to run:
    curl "localhost:6080/assign?what=uids&num=<maxLeaseId>"

This allowed me to insert (run mutations) into the Alpha.
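Put together, the extraction step looks roughly like this. This is a sketch: the /state response below is a cut-down, hypothetical sample (not the real one), so the parsing can be checked offline.

```shell
# On a live cluster the state would come from:  curl -s localhost:6080/state
# Here we use a sample (hypothetical) response instead:
state='{"maxLeaseId":"952307","maxTxnTs":"3450000"}'

# Pull the numeric maxLeaseId out of the JSON:
MAX_LEASE=$(echo "$state" | grep -oE '"maxLeaseId":"[0-9]*"' | grep -oE '[0-9]+')
echo "$MAX_LEASE"

# After stopping Zero and Alpha, deleting the zw folder, and restarting,
# re-lease UIDs past the old maximum. The quotes around the URL matter:
# without them the shell treats '&' as a background operator and the
# second query parameter is silently dropped.
#   curl "localhost:6080/assign?what=uids&num=${MAX_LEASE}"
```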

I hope this helps other users who are using a single-host setup.
How to replicate the problem:

  1. Start dgraph zero, dgraph alpha single host
  2. Shut down dgraph alpha
  3. cd to another directory
  4. start dgraph alpha
  5. shut down the wrong dgraph alpha
  6. cd to the original directory
  7. start the original dgraph alpha
  8. removeNode?group=2&id=2
  9. when running a mutation, you get the "No connection exists" error.
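For anyone trying to replicate this, the commands would look roughly like the following. This is a sketch: the directory names are illustrative, and the ports are the defaults (Zero gRPC on 5080, Zero HTTP on 6080).

```shell
# Terminal 1: start Zero and leave it running
dgraph zero

# Terminal 2: start Alpha in the correct data directory, then stop it (Ctrl-C)
cd /data/dgraph && dgraph alpha --zero localhost:5080

# Accidentally start Alpha from a different directory: with no p/w folders
# there, it registers with Zero as a brand-new node and gets its own group
cd /tmp/wrong-dir && dgraph alpha --zero localhost:5080

# Stop it, go back to the original directory, restart the original Alpha,
# then try to drop the now-empty group:
curl "localhost:6080/removeNode?group=2&id=2"
```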

Since the old Dgraph Alpha had been shut down, there was no transfer of tablets into the new Dgraph Alpha that was started in the wrong directory. All you need to do is shut down the new Dgraph Alpha, shut down Dgraph Zero, delete zw, and restart Dgraph Zero and Dgraph Alpha in the right directory.

I am not sure if this is a bug, but apparently Dgraph Zero is supposed to automatically remove groups that don’t have any Alphas in them. In my case, when I queried the state of Zero, I got 2 groups; the 2nd group shows "2": {}, which means there is no Alpha inside. So I think Dgraph Zero didn’t remove the 2nd group, even after I ran removeNode. I guess the existence of 2 groups caused Zero to try to shard, and the Dgraph Alpha was unable to serve the tablets, so no mutation could happen.
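A quick offline way to spot such an empty group in the /state JSON — shown here on a cut-down, hypothetical response:

```shell
# A group that appears as "2":{} has no Alpha members left in it.
state='{"groups":{"1":{"members":{"1":{"addr":"localhost:7080"}}},"2":{}}}'

# Count groups whose value is an empty object:
empty=$(echo "$state" | grep -oE '"[0-9]+":\{\}' | wc -l)
echo "$empty"
```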