How to delete duplicate nodes

I am trying to delete duplicate nodes, but I have not been able to.

This is how the data is visualized in Dgraph:

It is equivalent to this diagram drawn in draw.io:

I would like to delete the duplicates.

This is the code:


all_list = [
    ['A', 'C', ['F', 'G']],
    ['A', 'D', ['G', 'H']],
    ['A', 'E', ['F', 'G']],
    ['B', 'C', ['F', 'G']],
]

list_json = []
for i in range(len(all_list)):
    # Hashtag nodes for the tweet in row i
    list_hashtag = []
    for j in range(len(all_list[i][2])):
        print('------------------------------------------------------------------------------------------------')
        print(all_list[i][0], all_list[i][1], all_list[i][2][j])
        list_hashtag.append({
            "uid": f"_:{all_list[i][2][j]}",
            "hashtag": f"{all_list[i][2][j]}"
        })
    print('------------------------------------------------------------------------------------------------')
    # One user entry per row, with the tweet nested under "authored"
    list_json.append({
        "user_handle": f"{all_list[i][0]}",
        "user_name": f"{all_list[i][0]}",
        "uid": f"_:{all_list[i][0]}",
        "authored": [
            {
                "tweet": f"{all_list[i][1]}",
                "tagged_with": list_hashtag
            }
        ]
    })
p = {"set": list_json}

This is the query:

{
  tweet_graph(func: has(user_handle)) {
    user_name
    authored {
      tweet
      tagged_with {
        hashtag
      }
    }
  }
}

Hi @Mickey248,
It looks like we want to merge the “C” nodes. I am modeling this as a “data merging” problem and solving it entirely in DQL, so that the solution does not depend on any programming language.

Step 1
Let’s create a similar structure. If we know which nodes are duplicates, we need to tag them accordingly, so that this information is expressed in the data and can be queried.

{
  set{
    _:a <ilink> _:c1 .
    _:a <name> "A" .
    _:b <ilink> _:c2 .
    _:b <name> "B" .
    
    _:c1 <olink>  _:g .
    _:c1 <olink>  _:f .    
    _:c2 <olink>  _:f .
    _:g <name> "G" .
    _:f <name> "F" .
    
    
    _:c1 <name> "C1" .    
    _:c1 <tag> "duplicate" .
    _:c2 <name> "C2" .
    _:c2 <tag> "duplicate" .
  }
}

At this point, the graph around the C nodes looks as below, quite similar to what you have.

Step 2
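As a first mutation, we will add a new merged node and merge the outgoing links. Note that the queries in this and the following steps assume a schema with an index on tag (required by eq) and @reverse on ilink (required by ~ilink).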

# create a merged node with merged outgoing links
upsert{
  query{
    duplicates(func: eq(tag,"duplicate")){
      outs as olink
    }
    merged as var(func: eq(tag,"merged"))
  } 
  mutation{
    set{
      uid(merged) <olink> uid(outs) .
      uid(merged) <tag> "merged" .
      uid(merged) <name> "C-MERGED" .
    }
  }
}

The graph around the new merged C node looks as below.

Similarly, we can merge the incoming links using the mutation below.

# merge incoming links into newly merged nodes
upsert{
  query{
    duplicate(func: eq(tag,"duplicate")){
      ins as ~ilink
    }
    v as var(func: eq(tag,"merged"))
  } 
  mutation{
    set{
      uid(ins) <ilink> uid(v) .
    }
  }
}
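
To make the effect of the two merge mutations above easier to follow, here is a minimal in-memory sketch in Python. It does not talk to Dgraph at all; it only mimics the same edge rewrites on plain sets of (source, target) tuples, using the node names from Step 1. The function name merge_duplicates and the edge-set representation are assumptions made for illustration, not part of Dgraph or any client library.

```python
# In-memory illustration only: edges are (source, target) tuples.
# Node names mirror the Step 1 data; nothing here calls Dgraph.

def merge_duplicates(ilinks, olinks, duplicates, merged="C-MERGED"):
    """Return new (ilinks, olinks) with the duplicates' edges copied onto `merged`."""
    new_ilinks = set(ilinks)
    new_olinks = set(olinks)
    # Outgoing links: every target of a duplicate becomes a target of the merged node.
    for src, dst in olinks:
        if src in duplicates:
            new_olinks.add((merged, dst))
    # Incoming links: every node pointing at a duplicate also points at the merged node.
    for src, dst in ilinks:
        if dst in duplicates:
            new_ilinks.add((src, merged))
    return new_ilinks, new_olinks


ilinks = {("A", "C1"), ("B", "C2")}
olinks = {("C1", "G"), ("C1", "F"), ("C2", "F")}

merged_ilinks, merged_olinks = merge_duplicates(ilinks, olinks, {"C1", "C2"})
print(sorted(merged_olinks))
print(sorted(merged_ilinks))
```

Because sets are used, the link to F that both C1 and C2 share ends up on the merged node only once. The original C1 and C2 edges are still present at this point, just as in the graph after Step 2.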

Now, the graph around the merged C looks to be in the shape we need it to be.

Step 3
Finally, we can delete the links around the tagged duplicate nodes “C1” and “C2”. Please take due care while doing this step.

# delete links to and from the duplicate nodes
upsert{
  query{
    duplicate as var(func: eq(tag,"duplicate")){
      ins as ~ilink
      outs as olink
    }

  } 
  mutation{
    delete{
      uid(ins) <ilink> uid(duplicate) .
      uid(duplicate) <olink> uid(outs) .
    }
  }
}

C1 and C2 nodes are now orphans and can be cleaned up if required.
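
As a rough illustration of what "orphan" means here, the following Python sketch replays Step 3 on an in-memory copy of the graph and then lists the nodes that no link touches any more. It runs without Dgraph; the helper names drop_duplicate_links and find_orphans, and the tuple-based edge sets, are hypothetical and exist only for this example.

```python
# In-memory illustration only; node names follow the earlier steps.

def drop_duplicate_links(ilinks, olinks, duplicates):
    """Remove ilink edges ending at a duplicate and olink edges starting at one."""
    kept_ilinks = {(s, d) for s, d in ilinks if d not in duplicates}
    kept_olinks = {(s, d) for s, d in olinks if s not in duplicates}
    return kept_ilinks, kept_olinks


def find_orphans(nodes, *edge_sets):
    """Return the nodes that are not an endpoint of any remaining edge."""
    connected = set()
    for edges in edge_sets:
        for s, d in edges:
            connected.add(s)
            connected.add(d)
    return set(nodes) - connected


# Assumed state of the graph after Step 2 (merged node already linked).
nodes = {"A", "B", "C1", "C2", "C-MERGED", "F", "G"}
ilinks = {("A", "C1"), ("B", "C2"), ("A", "C-MERGED"), ("B", "C-MERGED")}
olinks = {("C1", "G"), ("C1", "F"), ("C2", "F"),
          ("C-MERGED", "G"), ("C-MERGED", "F")}

ilinks, olinks = drop_duplicate_links(ilinks, olinks, {"C1", "C2"})
orphans = find_orphans(nodes, ilinks, olinks)
print(sorted(orphans))
```

In Dgraph itself, the orphaned nodes still carry their name and tag values after Step 3, so cleaning them up is a separate delete.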

As mentioned earlier, this solution does not involve any Python code. Please review.
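
For readers who load the data from Python as in the question, the duplicates can also be avoided at load time. In the question's payload the nested tweet objects have no "uid", so each one becomes a new node even when the tweet text is the same. The sketch below gives every user, tweet, and hashtag a blank-node uid derived from its label, so that repeated labels refer to the same node. It relies on one assumption worth checking for your setup: blank-node labels are only shared within a single mutation request, so this helps only when the whole payload is sent in one request. The function build_payload and the user_, tweet_, and tag_ prefixes are made up for this example.

```python
# Sketch: build a JSON "set" payload in which every user, tweet, and hashtag
# carries a blank-node uid derived from its label. The prefixes are an
# illustrative choice to keep user "A" distinct from a hashtag named "A".

def build_payload(rows):
    users = {}
    for user, tweet, hashtags in rows:
        # Reuse the same user entry when the user appears in several rows.
        entry = users.setdefault(user, {
            "uid": f"_:user_{user}",
            "user_handle": user,
            "user_name": user,
            "authored": [],
        })
        entry["authored"].append({
            "uid": f"_:tweet_{tweet}",
            "tweet": tweet,
            "tagged_with": [
                {"uid": f"_:tag_{h}", "hashtag": h} for h in hashtags
            ],
        })
    return {"set": list(users.values())}


all_list = [
    ["A", "C", ["F", "G"]],
    ["A", "D", ["G", "H"]],
    ["A", "E", ["F", "G"]],
    ["B", "C", ["F", "G"]],
]

p = build_payload(all_list)
print(len(p["set"]))
```

With this payload, the tweet "C" authored by A and by B carries the same blank-node label in both places. If the data arrives over several requests, an upsert keyed on an indexed predicate would be needed instead.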

Thank you, this is very useful.