Deleted nodes appearing in search when I include uid

I’m starting with something very basic. New nodes have a “bills” marker predicate (so I can find them using func: has(bills)) and a name as a string. It’s working fine.

I delete a node, and those triples are removed. But when I run:

query {
  bills(func: has(bills)) {
    uid
    expand(_all_)
  }
}

The deleted nodes appear in the results as empty nodes with just a uid, even though they clearly do not have a “bills” predicate. Why is this happening? Can I delete them permanently?

Interestingly, this doesn’t happen if I don’t include “uid” in the query, but I’m including it because I need the uid in the client application, and dgraph-js apparently doesn’t include it in the getJson() results otherwise.

Can you please show a mutation example? I feel that I know what is going on, but give me a mutation so I can be sure.

Hey Michel! (do you ever sleep?)

I run the mutation through dgraph-js, and the code looks like this:

        const txn = this.dgraphClient.newTxn();
        try {
            const mu: Mutation = new dgraph.Mutation();
            mu.setSetJson(data);
            await txn.mutate(mu);
            await txn.commit();
        } catch(e) {
            console.log(e);
        } finally {
            await txn.discard();
        }

And data looks like this:

{
    name: 'Testing',
    bills: true
}

EDIT

Apologies, it just dawned on me that you probably want a delete mutation, right? It looks like this (very similar):

        const txn = this.dgraphClient.newTxn();
        try {
            const mu: Mutation = new dgraph.Mutation();
            mu.setDeleteJson(data);
            await txn.mutate(mu);
            await txn.commit();
        } catch(e) {
            console.log(e);
        } finally {
            await txn.discard();
        }

        if (callback) callback(null);

and data is simple:

{
     uid: '0x12'
}

Am I doing something wrong?

Similarly, in Ratel’s “explorer” tab the empty nodes turn up under both the “name” and the “bills” predicates in the schema.

Yep, must be a coincidence.

Can you check your version? This was not what I anticipated, but I think it might have to do with the version.

when I run dgraph in a terminal I see:

Dgraph version   : v1.0.10

That’s odd. I’m using the same version and tested all the possibilities, but there’s no sign of this on my side. :confused:

It must be something in dgraph-js. Can you try it using only Ratel?
Go to http://play.dgraph.io , point it to your address, and redo the process, please.

JSON mutation example

{  
   "set":[  
     {"name": "Testing", "bills": true },
     {"name": "Testing2", "bills": true },
     {"name": "Testing3", "bills": false },
     {"name": "Testing4", "bills": false }
   ]
}

The Query

{
  va(func: has(bills)) {
    uid
    expand(_all_)
  }
}

JSON deletion example

{  
   "delete": [
      {
        "uid": "0x22"
      },
      {
        "uid": "0x21"
      }
   ]
}
Dgraph version   : v1.0.10
Commit SHA-1     : 8b801bd7
Commit timestamp : 2018-11-05 17:52:33 -0800
Branch           : HEAD

I really don’t see what it can be.

I’m trying out a bunch of things now to see.

In the example you’ve just set up, what happens if you run:

{
  va(func: has(_predicate_)) {
    uid
    expand(_all_)
  }
}

EDIT:

When I drop all keys and run everything from scratch the way you did, I initially don’t have the bug, but a short while later I do. Could it be something to do with how I’ve set things up?

I think I found the source of this. It may be that this issue was solved in the RDF context but not in the JSON one. We’re looking into it.

Hey @Awoogamuffin I have news,

Dgraph will soon change to a new type system, so this will be fixed by that.
For now you can try a “query hack” like the ones below.

{
  va(func: has(bills)) @filter(has(bills)) {
    uid
    expand(_all_)
  }
}

OR

{
  va(func: has(bills)) @filter(has(name)) {
    uid
    expand(_all_)
  }
}

Cheers.


Thanks so much for the time with all this!

I’ve found a quick hack was to use:

va(func: has(bills)) @cascade

which eliminates the phantom uids.
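Another workaround is to do the filtering client-side after parsing the response. This is just a sketch (isPhantom and dropPhantoms are helper names I made up, not dgraph-js APIs), assuming the query result is a flat array of plain JSON objects; a deleted node comes back as an object whose only key is uid:

```typescript
// A node returned by the query: always has a uid, plus any other predicates.
interface DgraphNode {
  uid: string;
  [predicate: string]: unknown;
}

// A "phantom" node is an object whose only remaining key is "uid".
function isPhantom(node: DgraphNode): boolean {
  const keys = Object.keys(node);
  return keys.length === 1 && keys[0] === "uid";
}

// Drop the phantoms, keeping only nodes that still carry real predicates.
function dropPhantoms(nodes: DgraphNode[]): DgraphNode[] {
  return nodes.filter((n) => !isPhantom(n));
}

// Example with the shape returned by the has(bills) query:
const result: DgraphNode[] = [
  { uid: "0x1" },                                  // deleted, phantom
  { uid: "0x2b", bills: false, name: "Testing4" }, // real node
];
console.log(dropPhantoms(result).length); // 1
```

The upside over @cascade is that it doesn’t change which predicates a node must have to be returned; the downside is an extra pass over the results in the client.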

My concern is that when I play around in Ratel, everything is fine, until suddenly it isn’t. I don’t know what sets it off. I try changing the search options, adding a trigram index to name, and it all works; then suddenly the phantom uids appear again.

Furthermore I’m concerned by the results of:

{
  va(func: has(_predicate_)) {
    uid
    expand(_all_)
  }
}

It shows the phantom uids of not only the recent mutations but all prior mutations (from before I dropped all predicates). Will they continue to accumulate indefinitely as I delete things? Is there no way to permanently delete a node?

EDIT

Stranger still, running that whole has(_predicate_) query doesn’t actually show all the phantom uids. Look at the result:

{
  "data": {
    "va": [
      {
        "uid": "0x1"
      },
      {
        "uid": "0x2"
      },
      {
        "uid": "0xd"
      },
      {
        "uid": "0xe"
      },
      {
        "uid": "0xf"
      },
      {
        "uid": "0x10"
      },
      {
        "uid": "0x11"
      },
      {
        "uid": "0x12"
      },
      {
        "uid": "0x13"
      },
      {
        "uid": "0x14"
      },
      {
        "uid": "0x15"
      },
      {
        "uid": "0x16"
      },
      {
        "uid": "0x17"
      },
      {
        "uid": "0x18"
      },
      {
        "uid": "0x19"
      },
      {
        "uid": "0x1a"
      },
      {
        "uid": "0x1b"
      },
      {
        "uid": "0x1c"
      },
      {
        "uid": "0x1d"
      },
      {
        "uid": "0x1e"
      },
      {
        "uid": "0x1f"
      },
      {
        "uid": "0x20"
      },
      {
        "uid": "0x21"
      },
      {
        "uid": "0x22"
      },
      {
        "uid": "0x23"
      },
      {
        "uid": "0x24"
      },
      {
        "uid": "0x25"
      },
      {
        "uid": "0x26"
      },
      {
        "uid": "0x27"
      },
      {
        "uid": "0x2b",
        "bills": false,
        "name": "Testing4"
      }
    ]
  },
  "extensions": {
    "server_latency": {
      "parsing_ns": 82557,
      "processing_ns": 10970042,
      "encoding_ns": 1092269
    },
    "txn": {
      "start_ts": 9969
    }
  }
}

Clearly not all used uids are present (0x3, for example). I’ve probably screwed something up with how I’ve set up Dgraph. I’ll delete everything (down to the p, w and zw directories) and start again. Sorry for all the annoyance!


That can be the case. But only if you don’t care about a flat object.

For now, there’s no way to delete those nodes. They’re considered deleted and should be ignored.

The _predicate_ predicate is a kind of root index. You’re going to find everything linked to it.

Good point, I’ll be sure to use the double filter hack instead! Thanks for pointing that out.

Ok, will do! Do they still take up memory though? I know the uid is just a long, but the OCD in me doesn’t like thinking of that used-up memory.

The LRU RAM is just used for cache. If you don’t query the empty nodes, they won’t be in memory.

You can follow this issue:
https://github.com/dgraph-io/dgraph/issues/2672

Yes, that seems like a similar issue. Especially the fact that it appears to happen randomly

I don’t think it’s similar to this issue (it looks similar, but it isn’t; at most it’s related), because that one is in an indexed context. What happens in your case, Michael Beeson, is that you deleted the nodes and they come back empty: clearly Dgraph is still finding the predicate that was erased on that node, so it became a “phantom”. So it’s different.

This random behavior is a totally different issue. If you can, please give a more detailed example to reproduce it. If I can’t reproduce it, there’s nothing I can fix.

But as I said, things are going to change soon anyway.

Cheers.

Hey Michel! I’ve finally had time to really test this, and I think maybe it does indeed have something to do with indexing.

So I wipe the database and start fresh, then create the nodes as in the example you made

{  
   "set":[  
     {"name": "Testing", "bills": true },
     {"name": "Testing2", "bills": true },
     {"name": "Testing3", "bills": false },
     {"name": "Testing4", "bills": false }
   ]
}

Then delete:

{  
   "delete": [
      {
        "uid": "0x22"
      },
      {
        "uid": "0x21"
      }
   ]
}

Then, I go into the schema and add an index to “name” (in this case I used trigram).

After that (I don’t know if this is necessary for the bug to show up), I delete another node.

All the while, the query:

{
  va(func: has(bills)) {
    uid
    expand(_all_)
  }
}

Is behaving as expected.

Then I wait for a few minutes (make a coffee, read the paper) and run the query again and boom! Phantom uid nodes appear. What’s more interesting, the uids that appear are the ones I deleted BEFORE adding the index to “name”. The one I deleted afterwards remains gone.

Don’t know if this is of any use to you guys, but there you have it. I’ve gone through this twice and it’s happened both times. The wait is not too long; maybe a few minutes (maybe until the next time Dgraph Zero does that “purged below…” thing?)

In the meantime, the @filter hack is working, but I do think this has something to do with indexes.

EDIT

After a bit more time, the node I deleted after adding the index turns up also


This problem is getting worse. If I use expand(_all_) in a query and I want to include uid, I get all these phantom nodes. If I want to use the @filter hack I can’t just use expand(_all_) (which I would rather do). After running some edits (which involve deleting billItems and then adding new ones) I get a right mess of phantom nodes, which will only multiply with each subsequent edit.

For example, this query:

{
  bills(func: has(billsType)) @filter(has(billsType)) {
    expand(_all_) {
      uid
      expand(_all_)
    }
  }
}

Gives me this result:

{
  "data": {
    "bills": [
      {
        "date": "2018-11-20T23:00:00Z",
        "billItems": [
          {
            "uid": "0x7e"
          },
          {
            "uid": "0x7f"
          },
          {
            "uid": "0x81"
          },
          {
            "uid": "0x82"
          },
          {
            "uid": "0x83"
          },
          {
            "uid": "0x84",
            "name": "Beer",
            "billItemsType": true,
            "amount": 325
          },
          {
            "uid": "0x85",
            "name": "Food",
            "billItemsType": true,
            "amount": 600
          },
          {
            "uid": "0x86",
            "name": "Food",
            "billItemsType": true,
            "amount": 600
          },
          {
            "uid": "0x87",
            "name": "Beer",
            "billItemsType": true,
            "amount": 325
          },
          {
            "uid": "0x88",
            "name": "Beer",
            "billItemsType": true,
            "amount": 325
          },
          {
            "uid": "0x89",
            "name": "Beer",
            "billItemsType": true,
            "amount": 325
          }
        ],
        "billsType": true,
        "uid": "0x80"
      }
    ]
  }
}

This only happens when I include “uid” in the query. If I don’t, the phantom nodes stay away. Maybe another solution would be a way to make dgraph-js include the uids in the results (by default it appears not to, unless I request uid in the query). Is there a way to do that?
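As a client-side stopgap for nested results like the billItems case, the parsed JSON can be walked recursively, dropping any child object that carries only a uid. This is just a sketch (stripPhantoms is a name I made up, not a dgraph-js API), assuming the response is plain JSON:

```typescript
// Plain JSON as returned by parsing a query response.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// A nested "phantom" is an object whose only key is "uid".
function isPhantomNode(value: Json): boolean {
  return (
    typeof value === "object" &&
    value !== null &&
    !Array.isArray(value) &&
    Object.keys(value).length === 1 &&
    "uid" in value
  );
}

// Recursively remove phantom objects from arrays anywhere in the result tree.
function stripPhantoms(value: Json): Json {
  if (Array.isArray(value)) {
    return value.filter((v) => !isPhantomNode(v)).map(stripPhantoms);
  }
  if (typeof value === "object" && value !== null) {
    const out: { [key: string]: Json } = {};
    for (const [k, v] of Object.entries(value)) {
      out[k] = stripPhantoms(v);
    }
    return out;
  }
  return value;
}
```

This keeps expand(_all_) usable in the query and scrubs the phantoms in one pass before the data reaches the rest of the application.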

But you are.

Try this:

{
  bills(func: has(billsType)) @filter(has(billsType)) {
    uid
    date
    billsType
    billItems @filter(has(billItemsType)) {
      uid
      expand(_all_)
    }
  }
}

Yes, I’m already implementing that. It’s just a pity that I can’t use expand(_all_).