Deleting nodes?

When I delete all edges of a node, I somewhat expected the “node” itself to be deleted as well.
I’m trying the following in a Ratel window.

------- SCHEMA /ALTER
person: uid @reverse .
person_name: string @index(exact,fulltext,term,trigram) .
person_address: string @index(exact) .
person_email: string @index(exact) @upsert .
------ MUTATE

{
  set {
  _:person1 <person> _:jondoe .
  _:jondoe <person_name> "Jon Doe" .
  _:jondoe <person_address> "Home of Jon Doe 42" .
  _:jondoe <person_email> "jon.doe@acme.com" .
  }
}

------- MUTATE

{
  set {
  _:person2 <person> _:janedoe .
  _:janedoe <person_name> "Jane Doe" .
  _:janedoe <person_address> "Home of Jane Doe 33" .
  _:janedoe <person_email> "jane.doe@acme.com" .
  }
}

------ QUERY

{
  listPerson(func: has(person)) {
    uid
    person {
      person_name
      person_address
      person_email
    }
  }
}

------- RESULT from above Query

{
  "data": {
    "listPerson": [
      {
        "uid": "0x4f",
        "person": [
          {
            "person_name": "Jon Doe",
            "person_address": "Home of Jon Doe 42",
            "person_email": "jon.doe@acme.com"
          }
        ]
      },
      {
        "uid": "0x51",
        "person": [
          {
            "person_name": "Jane Doe",
            "person_address": "Home of Jane Doe 33",
            "person_email": "jane.doe@acme.com"
          }
        ]
      }
    ]
  }
}

----- Then I delete the data: DELETE Person 1 /MUTATE

{
  delete {
    <0x51> * * .
  }
}

-------- DELETE Person 2

{
  delete {
    <0x4f> * * .
  }
}

--------- Then I repeat this query

{
  listPerson(func: has(person)) {
    uid
    person {
      person_name
      person_address
      person_email
    }
  }
}

---------- And get the following results

{
  "data": {
    "listPerson": [
      {
        "uid": "0x4f"
      },
      {
        "uid": "0x51"
      }
    ]
  }
}

-------- However I was expecting

{
  "data": {
    "listPerson": []
  }
}

What did I miss, where did I go wrong?

Since Dgraph is not deleting the “person” node, does that mean we will end up with a lot of empty nodes?
Yes, I know I could use @cascade, but wouldn’t the empty nodes still be in the database?

Any suggestions, or pointers to the documentation would be most appreciated…

Cheers
Erlend

Dgraph deletes only one level of your model. That is, it deletes just the requested node; it will not automatically expand and delete the nodes below it.

I believe there are two confusions here. The first is about the level of deletion (note from the docs: the patterns * P O and * * O are not supported, since it’s expensive to store/find all the incoming edges). The second is that you are creating several unnecessary nodes.

If you run:

{
  person_name(func: has(person_name)) {
    uid
    person_name
    person_address
    person_email
    _predicate_
  }
}

You will get this result:


{
  "data": {
    "person_name": [
      {
        "uid": "0x2733",
        "person_name": "Jon Doe",
        "person_address": "Home of Jon Doe 42",
        "person_email": "jon.doe@acme.com",
        "_predicate_": [
          "person_name",
          "person_email",
          "person_address"
        ]
      },
      {
        "uid": "0x2735",
        "person_name": "Jane Doe",
        "person_address": "Home of Jane Doe 33",
        "person_email": "jane.doe@acme.com",
        "_predicate_": [
          "person_name",
          "person_email",
          "person_address"
        ]
      }
    ]
  }
}

That means the data is still there. You have not deleted the users, only their links to the “root” node that points to them through “person”. See? You created 4 nodes with two operations that should have created 2 nodes (or one “root” node and then 2 child nodes). And if you are creating 4 nodes on purpose, keep in mind that you are deleting the parents, not the children.
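To make the point concrete, here is a minimal sketch in Python of a toy triple store. It is only an illustration, not how Dgraph works internally: it ignores indexes and @reverse, and the child UIDs 0x50 and 0x52 are hypothetical. It shows that removing every triple whose subject is a parent leaves the children’s data untouched.

```python
# Toy triple store: each triple is (subject, predicate, object).
# Illustrative model only; the child uids 0x50 and 0x52 are made up.
triples = {
    ("0x4f", "person", "0x50"),
    ("0x50", "person_name", "Jon Doe"),
    ("0x50", "person_email", "jon.doe@acme.com"),
    ("0x51", "person", "0x52"),
    ("0x52", "person_name", "Jane Doe"),
    ("0x52", "person_email", "jane.doe@acme.com"),
}


def delete_outgoing(store, subject):
    """Mimic `<subject> * * .`: drop only triples whose subject matches."""
    return {t for t in store if t[0] != subject}


def has(store, predicate):
    """Mimic `has(predicate)`: subjects with at least one such triple."""
    return sorted({s for s, p, _ in store if p == predicate})


# Delete both parents, as in the original question.
after = delete_outgoing(delete_outgoing(triples, "0x4f"), "0x51")

# The children still carry their names and emails.
remaining_names = has(after, "person_name")
print(remaining_names)
```

In this model the two child subjects are still returned by the name lookup after both parents are deleted, which matches the query result shown above.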

Now I’ll explain why you’re still seeing these UIDs. It is normal that they exist: remember that you declared “person: uid @reverse .”. This means the children hold information pointing back to these nodes. To eliminate it, you would have to run a second mutation on the children, removing their links to the parents and with them the remnants of “@reverse”.

Dgraph does not do this for you because scanning the whole DB for related parents is costly, and it would also be dangerous. It is better for the user to do it.

My recommendation:

To avoid creating multiple nodes called “Person”, I would recommend using the mutation below. That way you are only giving each node a “kind”, which is perfectly acceptable. (Note that this assumes “person” is declared as a string predicate; with “person: uid” as in your schema, a string value would be rejected.)

{
  set {
  _:jondoe <person> " " .
  _:jondoe <person_name> "Jon Doe" .
  _:jondoe <person_address> "Home of Jon Doe 42" .
  _:jondoe <person_email> "jon.doe@acme.com" .
  }
}

{
  listPerson(func: has(person)) {
    uid
    person_name
    person_address
    person_email
  }
}

Cheers

Also have a read of “Delete edge along with all values of pointing node” (post #11 by MichelDiz).

Thank you !

This really helped and cleared up some of my confusion.
What you explained makes complete sense.

I might have understood it better if the documentation example on how to give a node a type used person/address instead of foo/bar :)

You rock

Erlend

Some additions:

If you keep your model, I would recommend that you also collect the child nodes and delete them in the same batch.

{
  delete {
    <0x51> * * .
    <0x4f> * * .
    <0xa51> * * . # Delete the child node as well.
    <0xa4f> * * . # Delete the child node as well.
  }
}

In JSON it would look like:


{
  "delete": [
      { "uid": "0x51" },
      { "uid": "0x4f" },
      { "uid": "0xa51" },
      { "uid": "0xa4f" }
  ]
}
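If you collect the parent and child UIDs in application code, the JSON body above can be generated rather than typed by hand. This is a minimal sketch; the helper name build_delete_payload is hypothetical, and sending the payload to Dgraph is left out.

```python
import json


def build_delete_payload(uids):
    """Return a JSON delete mutation with one {"uid": ...} entry per uid."""
    # De-duplicate while keeping the caller's order.
    unique = list(dict.fromkeys(uids))
    return json.dumps({"delete": [{"uid": u} for u in unique]}, indent=2)


# Parents first, then the children collected from a previous query.
payload = build_delete_payload(["0x51", "0x4f", "0xa51", "0xa4f"])
print(payload)
```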

Now, if you want to keep “Person” as a single, immutable reference node, I would recommend that you create the “Person” node first. Then every new mutation you create uses its UID.

e.g:

{
  set {
  _:person1 <person> _:jondoe .
  _:person1 <name> "e.g This is a root" .
  }
}

We take the UID of that Person node, e.g. “0xf32b”, and use it in the mutation for every new person.

{
  set {
  <0xf32b> <person> _:jondoe .
  _:jondoe <person_name> "Jon Doe" .
  _:jondoe <person_address> "Home of Jon Doe 42" .
  _:jondoe <person_email> "jon.doe@acme.com" .
  }
}

But never delete this node (0xf32b). Always delete the target child node.

But if you still have any doubts, let me know.

Your recommendation is really helpful for me, thanks!

May I ask how to delete a node, not only its edges?
I used
<uid> <edge> * .
and also
<uid> * * .
in https://tour.dgraph.io/schema/7/
But the node itself still exists.

Cheers

In Dgraph you can delete the contents of a node, but its UID will remain. Do not worry about it.

Haha, that’s what I was trying to do before. I’ve been focusing on destroying its UID but failed. If there’s no need to worry about it, then I will leave it alone. Thanks a lot!

Here is a question about nodes.
If I delete all the contents of a node and leave it alone, that node will stay in the database forever (a node that only has a UID).

Suppose we have many nodes like this. Does this affect search efficiency or memory usage?

Nope. Dgraph uses index tables to execute functions, so if your nodes are empty, they are not indexed.

Hmm, I got it.
Thank you for your explanation

Same here. I use {"delete":{"uid":"0x1"}}. Even though the response reports success, when I query func: uid(0x1) the node still exists, along with all its edges.

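If you are using the latest version of Dgraph (v1.1.0), you need to define a type in the schema and assign it to the node, because from that version on S * * only deletes the predicates listed in the node’s type.
Please read this blog post.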

If there are millions or billions of “deleted” nodes, they can waste a lot of disk space. That is quite possible, especially since we can’t delete them directly. Honestly, I don’t like this flaw.

I don’t know how much space a single node takes. If we assume (just speculation) that it takes 1 byte (considering the number of bytes a hexadecimal value needs, although this is probably an exaggeration), we could say that 1 billion empty nodes could potentially take 1 gigabyte. But this needs to be checked.
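The estimate is simple arithmetic, sketched below. The per-node size is an assumption in both cases: 1 byte is the guess from this post, and 8 bytes is the raw width of a 64-bit UID. Neither figure is a measurement of what an empty node actually costs on disk.

```python
def raw_size_gb(node_count, bytes_per_node):
    """Back-of-envelope size in gigabytes (1 GB = 10**9 bytes)."""
    return node_count * bytes_per_node / 1e9


one_billion = 1_000_000_000
guess_1_byte = raw_size_gb(one_billion, 1)   # assumption from the post
guess_8_bytes = raw_size_gb(one_billion, 8)  # assumption: one raw 64-bit uid
print(guess_1_byte, guess_8_bytes)
```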

However, you can simply use upsert transactions everywhere and recycle the emptied nodes.
e.g.:

1 - Delete the node’s contents and mark it to be recycled

upsert {
  query {
    v as var(func: eq(email, "user@company1.io"))
  }

  mutation {
    # Clean the node
    delete {
      uid(v) * * .
    }
    # Set to Recycle
    set {
      uid(v) <Recycle> "true" .
    }
  }
}

2 - When creating a new entity, recycle the node.

Attention: multiple mutations in an upsert block do not work for now. Support is coming; see “Add support for multiple mutations in an upsert query” (Issue #3817, dgraph-io/dgraph on GitHub).

upsert {
  query {
    v as var(func: eq(email, "user@company1.io"))
    RecycledUID as var(func: has(Recycle), first: 1)
  }

  # If a user with this email exists, we update it.
  mutation @if(eq(len(v), 1)) {
    set {
      uid(v) <name> "Lucas" .
      uid(v) <email> "user@company1.io" .
      uid(v) <age> "31" .
    }
  }

  # Else we take a node marked to be recycled and mutate it.
  mutation @if(eq(len(v), 0) AND eq(len(RecycledUID), 1)) {
    set {
      uid(RecycledUID) <name> "Lucas" .
      uid(RecycledUID) <email> "user@company1.io" .
      uid(RecycledUID) <age> "31" .
    }
    delete {
      uid(RecycledUID) <Recycle> * .
    }
  }

  # Else, if the user doesn't exist and there are no UIDs
  # marked to be recycled left, we create a new node.
  mutation @if(eq(len(v), 0) AND eq(len(RecycledUID), 0)) {
    set {
      _:New <name> "Lucas" .
      _:New <email> "user@company1.io" .
      _:New <age> "31" .
    }
  }
}
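The three @if branches above boil down to a small decision rule. Here is a minimal sketch of that rule in plain Python, with no Dgraph client involved; the function name choose_target and the returned labels are hypothetical.

```python
def choose_target(existing_uids, recycled_uids, new_blank="_:New"):
    """Pick the subject for the set block, mirroring the three @if branches."""
    if len(existing_uids) == 1:
        # A user with this email exists: update it in place.
        return existing_uids[0], "update"
    if len(existing_uids) == 0 and len(recycled_uids) == 1:
        # No such user, but an emptied node is available: reuse its uid.
        return recycled_uids[0], "recycle"
    if len(existing_uids) == 0 and len(recycled_uids) == 0:
        # Nothing to reuse: fall back to a fresh blank node.
        return new_blank, "create"
    # No branch matches, e.g. more than one node shares the email.
    return None, "none"


print(choose_target(["0x10"], ["0x99"]))
print(choose_target([], ["0x99"]))
print(choose_target([], []))
```

Note the last case: if more than one node matches the email, none of the three conditions holds and nothing is written.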

May I ask why Dgraph doesn’t have real node deletion?

I don’t know; I think it is a design decision. The more you worry about subtasks, the more you overload the workload. It is a matter of cost-benefit. However, if you don’t care about cost-effectiveness, just use upserts.

Dgraph is built on a key-value database named Badger, and data is grouped only by predicate, not by node. Each group is called a shard, so the data of every node is scattered across many shards.

Let’s say you have a person node {uid: 0x1, name: 'Tom', age: 20}. It will be stored like this:

shard1:
key: <name, 0x1>
value: “Tom”

shard2:
key: <age, 0x1>
value: 20

Shards are scattered across servers, and every shard may be on a different server. So deleting a node entirely may force the engine to traverse lots of servers, which is expensive.

Well, I’m not sure about this; it is just my guess. Let me know if I am wrong.
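The layout guessed at above can be sketched as a toy model in Python: one dictionary per predicate stands in for a shard. This is only an illustration of the guess, not Dgraph’s or Badger’s real storage format, and the third predicate and UID 0x2 are made up.

```python
from collections import defaultdict

# One dict per predicate plays the role of a shard; keys are uids.
shards = defaultdict(dict)


def set_value(uid, predicate, value):
    """Store one value in the shard that owns this predicate."""
    shards[predicate][uid] = value


def delete_node(uid):
    """Remove uid from every shard; return how many shards were visited."""
    visited = 0
    for rows in shards.values():
        visited += 1
        rows.pop(uid, None)  # no-op if this shard never held the uid
    return visited


set_value("0x1", "name", "Tom")
set_value("0x1", "age", 20)
set_value("0x2", "email", "x@example.com")  # hypothetical unrelated node

# Without knowing which predicates 0x1 has, every shard must be checked.
visited = delete_node("0x1")
print(visited)
```

In this model, deleting one node means visiting every predicate shard, including ones that never held the node, which is the cost the post describes.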