Upsert results in duplicate nodes

My upsert statement below always results in duplicate nodes.
Any help would be appreciated.

 upsert {
  query {
   student_uid as var(func: eq(Student.studentId, "009aec0b-0068-16bb-e914-379de14d4e9a")) {uid}
  }

  mutation {
   set {
    uid(student_uid) <Student> <_:my.org/Student/009aec0b-0068-16bb-e914-379de14d4e9a> .
    <_:my.org/Student/009aec0b-0068-16bb-e914-379de14d4e9a> <dgraph.type> "Student" .
    <_:my.org/Student/009aec0b-0068-16bb-e914-379de14d4e9a> <xid> "my.org/Student/009aec0b-0068-16bb-e914-379de14d4e9" .
    <_:my.org/Student/009aec0b-0068-16bb-e914-379de14d4e9a> <Student.studentId> "009aec0b-0068-16bb-e914-379de14d4e" .
   }
  }  
}
  • My schema
    type Student {
      studentId: String! @id
      courses: [Course] @hasInverse(field: student)
      xid: String!  @search(by: [hash])
    }
    

Also, I tried this variant

 upsert{
  query {
   student_uid as var(func: eq(xid, "my.org/Student/009aec0b-0068-16bb-e914-379de14d4e9")) {
    uid 
   }
  }

 mutation {
  set {
   uid(student_uid) <xid> "my.org/Student/009aec0b-0068-16bb-e914-379de14d4e9" .
   uid(student_uid) <Student.studentId> "009aec0b-0068-16bb-e914-379de14d4e9" .
   uid(student_uid) <dgraph.type> "Student" .
  }
 }
}

You are using a blank node in an upsert block. A blank node means “create a new node”, so every time you run the mutation it creates a new node and links it to the one you are targeting.
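
For illustration, here is a minimal sketch of the same upsert without any blank node. It assumes Student.studentId is indexed so that eq() works, and it reuses the predicate names and the ID value from this thread. Every triple uses uid(student_uid) as the subject, so an existing match is updated; when the query matches nothing, uid() on an empty variable behaves like a blank node and creates exactly one new node.

```
upsert {
  query {
    # Look up the node by its external id and store its uid in a variable
    student_uid as var(func: eq(Student.studentId, "009aec0b-0068-16bb-e914-379de14d4e9"))
  }

  mutation {
    set {
      # All triples share the same subject, so no extra node is created
      uid(student_uid) <Student.studentId> "009aec0b-0068-16bb-e914-379de14d4e9" .
      uid(student_uid) <xid> "my.org/Student/009aec0b-0068-16bb-e914-379de14d4e9" .
      uid(student_uid) <dgraph.type> "Student" .
    }
  }
}
```

The value in eq() has to be exactly the same string that the set block writes, otherwise the lookup never matches and each run adds a node.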

Can you explain what you are trying to do? It looks like you want to upsert the Student entity, but then why the uid(student_uid) <Student> <_:my.org/Student/009aec0b-0068-16bb-e914-379de14d4e9a> . triple?

I don’t recommend this approach. If you are relying on the XID predicate, you should never try to upsert that value.

  • Also, I tried this variant
   upsert {
    query {
     var(func: eq(Student.studenttId, "009aec0b-0068-16bb-e914-379de14d4e9")) {
      student_uid as uid
    }
   }

   mutation {
    set {
      uid(student_uid ) <xid> "my.org/Patient/009aec0b-0068-16bb-e914-379de14d4e9" .
      uid(student_uid ) <Student.studenttId> "009aec0b-0068-16bb-e914-379de14d4e9" .
      uid(student_uid ) <dgraph.type> "Student" .
     }
    }  
   }

And what happened? This query seems valid.

Duplicate nodes. Interestingly, both have the same uid and studentId.

{
 test2(func: has(Student.studentId)) {
  Student.studentId
 }
}

Reply below:

{
  "data": {
    "test2": [
      {
        "Student.studentId": "009aec0b-0068-16bb-e914-379de14d4e9"
      },
      {
        "Student.studentId": "009aec0b-0068-16bb-e914-379de14d4e9"
      }
    ]
  },
  "extensions": {
    "server_latency": {
      "parsing_ns": 76702,
      "processing_ns": 1287032,
      "encoding_ns": 22201,
      "assign_timestamp_ns": 677617,
      "total_ns": 2125653
    },
    "txn": {
      "start_ts": 1044
    },
    "metrics": {
      "num_uids": {
        "Student.studentId": 2,
        "_total": 2
      }
    }
  }
}

Not possible. Make sure to delete all the other duplicates and try again. The query is valid.
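
One way to check for leftovers before re-running the upsert is a query like the sketch below. The block name check is arbitrary; the predicate and the ID value are the ones used in this thread. It lists every node that currently carries that studentId together with its type, so more than one result means earlier duplicates are still in the store.

```
{
  # Every node returned here shares the same external id
  check(func: eq(Student.studentId, "009aec0b-0068-16bb-e914-379de14d4e9")) {
    uid
    dgraph.type
    Student.studentId
  }
}
```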

  • It’s a consistent repro for me.
  • I did DROP DATA and re-executed mutations and query
  • Query
    {
     test2(func: has(Student.studentId)) {
      Student.studentId
      uid
     }
    }
    
  • Mutation
     upsert{
      query {
       var(func: eq(Student.studentId, "009aec0b-0068-16bb-e914-379de14d4e9")) {
       student_uid as uid
      }
     }
     mutation {
      set {
        uid(student_uid ) <Student.studentId> "009aec0b-0068-16bb-e914-379de14d4e9" .
        uid(student_uid ) <dgraph.type> "Patient" .
       }
      }  
    }
    
  • Reply
    {
    "data": {
      "test2": [
        {
          "Student.studentId": "009aec0b-0068-16bb-e914-379de14d4e9",
          "uid": "0x4c2"
        },
        {
          "Student.studentId": "009aec0b-0068-16bb-e914-379de14d4e9",
          "uid": "0x4c3"
        }
      ]
    },
    "extensions": {
      "server_latency": {
        "parsing_ns": 56601,
        "processing_ns": 917524,
        "encoding_ns": 24700,
        "assign_timestamp_ns": 590615,
        "total_ns": 1633441
      },
      "txn": {
        "start_ts": 1094
      },
      "metrics": {
        "num_uids": {
          "Student.studentId": 2,
          "_total": 4,
          "uid": 2
        }
      }
    }
    }
    

Porsche, is that your name?

Look, computers don’t lie. I see your query, and I see it is valid. There’s no other indication of anything that would lead to duplication. Dgraph itself goes through a ton of self-testing steps before any feature ships, and it is not possible to get duplicates if your query is valid. Which means you haven’t dropped the data correctly.

And why is this UUID 009aec0b-0068-16bb-e914-379de14d4e9 a Student in the previous results and now a Patient?

In order to delete something, you need a perfectly aligned schema and also the correct <dgraph.type> value. If something is wrong in between, the data won’t be deleted.
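
As a rough sketch, the leftover nodes can also be removed by naming each predicate explicitly, which avoids depending on <dgraph.type> matching the schema the way a wildcard delete does. The variable name dup is made up for this example; the predicates are the ones that appear in this thread. Note that this removes the listed predicates from every node that matches, including the one you may want to keep, so the upsert has to be run again afterwards.

```
upsert {
  query {
    # Collect every node that carries the duplicated id
    dup as var(func: eq(Student.studentId, "009aec0b-0068-16bb-e914-379de14d4e9"))
  }

  mutation {
    delete {
      # Explicit predicates, so the delete does not rely on type information
      uid(dup) <Student.studentId> * .
      uid(dup) <xid> * .
      uid(dup) <dgraph.type> * .
    }
  }
}
```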

So, please, instead of trying Drop or something similar, delete all data from Dgraph manually by deleting all the files in the p, zw, and w directories. That way you start from scratch.

BTW, if you are going to rely on XID, you should read this: https://dgraph.io/docs/mutations/external-ids-upsert-block/#sidebar

Cheers!
