Custom XIDs for creating edges

zmateen · August 15, 2020, 7:40am

I’m bulk-loading data into Dgraph from an external source, as is. There’s a literal that uniquely identifies each node. Let’s call it “name”. Each node has a “name”. Some nodes (10% of the total) also have a literal “targetName” that specifies the “name” of another node, to which I need to create a “target” edge.
This is how I do it right now:

1. Bulk load the data.
2. Query for all nodes and return their "name"s and uids.
3. Externally create a uid to “name” hash.
4. Query for all nodes (with a has(targetName) filter).
5. Create the N-quad for each mutation, one at a time (using the hash, “name” and “targetName”).
6. Live-load the generated RDF.
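
For readers following along, the hash and N-quad steps above can be sketched roughly as follows. This is only an illustration: the function name `build_target_nquads` and the shape of the input (a list of dicts with `uid`, `name` and an optional `targetName`, as a query might return them) are assumptions, not something taken from the original workflow.

```python
def build_target_nquads(nodes):
    """Build one "target" N-quad per node that has a targetName.

    `nodes` is assumed to be a list of dicts such as
    {"uid": "0x1", "name": "a", "targetName": "b"}.
    """
    # The externally maintained "name" -> uid hash.
    uid_by_name = {node["name"]: node["uid"] for node in nodes}

    nquads = []
    for node in nodes:
        target_name = node.get("targetName")
        if target_name is None:
            continue  # most nodes have no targetName
        target_uid = uid_by_name.get(target_name)
        if target_uid is None:
            continue  # targetName points at a name that was not loaded
        nquads.append(f"<{node['uid']}> <target> <{target_uid}> .")
    return nquads


if __name__ == "__main__":
    sample = [
        {"uid": "0x1", "name": "a", "targetName": "b"},
        {"uid": "0x2", "name": "b"},
    ]
    print("\n".join(build_target_nquads(sample)))
```

The sketch makes the cost visible: the whole name-to-uid map has to be pulled out of the database before a single edge can be written.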

Currently, Step 2 is killing my efficiency by orders of magnitude, and it is a redundant, suboptimal way to go about this. Is there native Dgraph functionality that can help me achieve this?

Given that I’m maintaining the uniqueness of “name” externally anyway, it would be tremendously helpful if I could use it as an XID for live-load mutations, but I’m not sure whether XIDs serve that purpose.

anand · August 15, 2020, 2:08pm

Welcome @zmateen,

I am not sure if you have considered the upsert capability which might help in eliminating step 2 altogether.

Consider the schema:

<friend>: [uid] .
<friendName>: string @index(exact) .
<name>: string @index(hash) .

Consider a Person type built on these predicates. The name attribute is set up for identification, and friend is a relation between Person objects.

Given this, you could create a mutation for each x <friend> y as below. It will try to match each node by name and, if no match is found, create either or both of the “x” and “y” nodes. “john” and “steve” can be replaced by the unique identifiers you are using.

upsert {
  # john friend steve
  query {
    findX(func: eq(name, "john")) {
      x as uid
    }

    findY(func: eq(name, "steve")) {
      y as uid
    }
  }
  mutation {
    set {
      # set types
      uid(x) <dgraph.type> "Person" .
      uid(y) <dgraph.type> "Person" .
      # set relation friend
      uid(x) <friend> uid(y) .
      # set attributes
      uid(x) <name> "john" .
      uid(y) <name> "steve" .
    }
  }
}
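
If there are many pairs, writing these blocks by hand is not practical. Below is a minimal sketch of generating one upsert block per pair from a script. The helper names `escape` and `upsert_block` are made up for illustration, the predicate and type default to the ones in the example above, and the escaping assumes names are single-line strings.

```python
def escape(value: str) -> str:
    """Escape backslashes and double quotes for use inside a quoted literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def upsert_block(x_name: str, y_name: str,
                 predicate: str = "friend", type_name: str = "Person") -> str:
    """Return the text of an upsert block linking x_name to y_name."""
    x, y = escape(x_name), escape(y_name)
    return f"""upsert {{
  query {{
    findX(func: eq(name, "{x}")) {{
      x as uid
    }}
    findY(func: eq(name, "{y}")) {{
      y as uid
    }}
  }}
  mutation {{
    set {{
      uid(x) <dgraph.type> "{type_name}" .
      uid(y) <dgraph.type> "{type_name}" .
      uid(x) <{predicate}> uid(y) .
      uid(x) <name> "{x}" .
      uid(y) <name> "{y}" .
    }}
  }}
}}"""


if __name__ == "__main__":
    # Hypothetical pairs; in the original question these would be
    # (name, targetName) with predicate="target".
    for source, target in [("john", "steve")]:
        print(upsert_block(source, target))
```

Each generated block is sent as its own request, so no uid lookup table needs to be kept on the client side.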

You could load data directly through curl commands as mentioned here. You could also use the Ratel UI, which is definitely more user-friendly.
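
As a rough script equivalent of the curl approach, the sketch below posts an upsert block over HTTP using only the Python standard library. It assumes an Alpha reachable at `http://localhost:8080` and the `/mutate?commitNow=true` endpoint with an `application/rdf` content type, which is my understanding of the HTTP API for releases of that period; please check the documentation for your version. The function names are hypothetical.

```python
import urllib.request


def build_mutate_request(upsert_text: str,
                         base_url: str = "http://localhost:8080"):
    """Build (but do not send) a POST request carrying an upsert block.

    base_url is an assumption; point it at your own Alpha.
    """
    return urllib.request.Request(
        url=f"{base_url}/mutate?commitNow=true",
        data=upsert_text.encode("utf-8"),
        headers={"Content-Type": "application/rdf"},
        method="POST",
    )


def send(request) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    request = build_mutate_request("upsert { query { } mutation { set { } } }")
    print(request.get_method(), request.full_url)
    # send(request)  # uncomment once a Dgraph Alpha is actually running
```

Building the request separately from sending it makes it easy to inspect what would be posted before touching a live cluster.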
