Query to delete duplicates

Hi

Is there a way to run a mutation that would remove only the extra links in a given graph? For example, given the following query:

{  
  offices(func: has(project.id)) @filter(gt(count(office),1)) {
    uid
    project.office
    project.title
    office {
      uid
      office.name
    }
  }
}

I can identify all the projects that are assigned to more than one office.

"data": {
    "offices": [
      {
        "uid": "0x2714",
        "project.office": "UK",
        "project.title": "Some project",
        "office": [
          {
            "uid": "0x1fc53",
            "office.name": "unassigned"
          },
          {
            "uid": "0x1fc57",
            "office.name": "some other office"
          },
          {
            "uid": "0x35bde",
            "office.name": "UK"
          }
        ]
      },

I’d like to remove all the edges to offices where project.office does not equal office.name.

In the example above, project.office is “UK”, so I’d like to remove the edges to the office nodes where office.name is not “UK”. I only want to remove the links to the offices, not the office nodes.

I would end up with …

"data": {
    "offices": [
      {
        "uid": "0x2714",
        "project.office": "UK",
        "project.title": "Some project",
        "office": [
          {
            "uid": "0x35bde",
            "office.name": "UK"
          }
        ]
      },

While I could write a chunk of code to remove the unwanted office edges, I think I should be able to do this in the database with a single mutation/query. I just haven’t quite managed to work out how to do it, so any pointers would be gratefully received.
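
For reference, the client-side fallback I have in mind would look roughly like the sketch below. It is only an illustration in Python: the function names `extra_office_edges` and `to_delete_nquads` are made up, the input is the query result above reduced to the fields that matter, and the N-Quad lines it prints would still have to be sent to Dgraph in a delete mutation. Skipping projects that have no `project.office` value is my own assumption.

```python
# Sample shaped like the query result above, reduced to the fields used here.
projects = [
    {
        "uid": "0x2714",
        "project.office": "UK",
        "office": [
            {"uid": "0x1fc53", "office.name": "unassigned"},
            {"uid": "0x1fc57", "office.name": "some other office"},
            {"uid": "0x35bde", "office.name": "UK"},
        ],
    },
]


def extra_office_edges(projects):
    """Return (project_uid, office_uid) pairs where office.name != project.office."""
    pairs = []
    for project in projects:
        wanted = project.get("project.office")
        if wanted is None:
            # Assumption: leave projects without a project.office value untouched.
            continue
        for office in project.get("office", []):
            if office.get("office.name") != wanted:
                pairs.append((project["uid"], office["uid"]))
    return pairs


def to_delete_nquads(pairs):
    """Format each pair as an N-Quad line naming only the edge, not the office node."""
    return "\n".join(f"<{p}> <office> <{o}> ." for p, o in pairs)


nquads = to_delete_nquads(extra_office_edges(projects))
print(nquads)
```

This works, but it means a round trip per batch of projects, which is why I would rather do it inside the database.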

Thanks in advance.
Mike

You could use an upsert block to do this: https://docs.dgraph.io/master/mutations/#upsert-block. If you have trouble writing the query, I’m happy to help. The query in the upsert block would select everything that you want to delete, and the mutation would then delete it.

I’d appreciate the assistance if you don’t mind! Thanks in advance.

Something like this:

upsert {
  query {
    Q1 as q(func: has(project.office)) {
      PO as project.office
      project.title
    }
    q2(func: uid(Q1)) {
      office @filter(NOT eq(office.name, val(PO))) {
        TODELETE as uid
        office.name
      }
    }
  }

  mutation {
    delete {
      uid(Q1) <office> uid(TODELETE) .
    }
  }
}

Exactly like that. Thank you.
