Loading Edges into Dgraph database

CEMcCreadie · September 1, 2021, 11:56am

What I want to do

For the sake of an example, I have two very basic JSON files. One holds the data type Person, which contains a name and the name of the company the person works for; the other holds the data type Company, which contains the company's name.

My schema is

type Person {
  name
  company_name
  works_for: [Company]
}
type Company {
  company_name
}

And the example JSON files are:

Person.json

[
  {
     "name":"Example name",
     "company_name":"Acme"
  }
]

Company.json

[
  {
     "company_name":"Acme"
  }
]

I can load these in with the Live Loader very easily and get the following output.

I now want to create an edge “works_for” wherever company_name is the same for a person and a company. What is the best way to do this in a scalable manner?

MattH · September 1, 2021, 1:28pm

I’m using Bulk Loader to instantiate a graph, and just went through a similar exercise. In my example, I have Author and Book entities, with an “authored” edge. The .json for the authored edge looks like this:

[
  {
    "uid": "_:8d61c26e-a959-4383-bd6d-1cc922368688",
    "authored": {
      "uid": "_:0ad095ee-ef9a-4c10-af84-170da2d3c604"
    }
  }
]

In my scenario, I’m loading everything at once, so I’m using the uid/blank node feature to associate the specific author with the specific book.
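Applied to the Person/Company example from the question, the same blank-node idea could be sketched as below. This is only an illustration under stated assumptions: the helper names (blank_node, build_load_file) and the "_:company-<slug>" naming scheme are made up for this sketch, the two JSON arrays are assumed to be already parsed into Python lists, and everything is written into a single load file so that each blank-node label refers to the same node throughout one load.

```python
import json
import re


def blank_node(kind, key):
    """Build a blank-node label such as _:company-acme.

    The naming scheme is an assumption of this sketch. Names that differ
    only in punctuation or case map to the same label.
    """
    slug = re.sub(r"[^A-Za-z0-9]+", "-", key).strip("-").lower()
    return f"_:{kind}-{slug}"


def build_load_file(people, companies):
    """Return one list of nodes in which each Person links to its Company."""
    nodes = []
    known_companies = set()

    for company in companies:
        uid = blank_node("company", company["company_name"])
        known_companies.add(uid)
        nodes.append({
            "uid": uid,
            "dgraph.type": "Company",
            "company_name": company["company_name"],
        })

    for index, person in enumerate(people):
        node = {
            "uid": f"_:person-{index}",
            "dgraph.type": "Person",
            "name": person["name"],
        }
        target = blank_node("company", person["company_name"])
        # Only emit the edge when a matching Company exists in the input.
        if target in known_companies:
            node["works_for"] = [{"uid": target}]
        nodes.append(node)

    return nodes


if __name__ == "__main__":
    people = [{"name": "Example name", "company_name": "Acme"}]
    companies = [{"company_name": "Acme"}]
    print(json.dumps(build_load_file(people, companies), indent=2))
```

The join on company_name happens in Python before loading, so the loader only sees nodes whose edges are already expressed through blank-node labels.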

Some additional details here:

MichelDiz · September 1, 2021, 2:29pm

You can also use an Upsert Block. (There is a JSON version, which I believe works only via HTTP/cURL):

upsert {
  query {
    q(func: eq(company_name, "Acme")) {
      v as uid #Find the company
    }
  }

  mutation {
    set {
      _:NewUser  <name> "Example name" .
      _:NewUser  <dgraph.type> "Person" .
      _:NewUser  <works_for> uid(v) .
    }
  }
}

See https://dgraph.io/docs/mutations/upsert-block/#sidebar

You have to remove “company_name” from your Person nodes so that it does not cause confusion with the Company's “company_name”.

Live Loader has some options to do automatic upserts, but you have to use a flag to record the XIDs and also keep tracking them. (XIDs are external identifiers; in the case of Dgraph, blank nodes are treated as XIDs during the load.)

CEMcCreadie · September 2, 2021, 10:05am

Thanks for the help. Using your suggestion, I now have a Python script that upserts the edges node by node. It’ll do for now, but it isn’t really scalable. I guess that to bulk load a lot of data with performance like that described in “Loading close to 1M edges/sec into Dgraph” (Dgraph Blog), it needs to be in one big RDF file with the edges predefined?
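For illustration, one way such a script might cut down on round trips is to build a single upsert block for a whole batch of people rather than one request per node. The sketch below only assembles the upsert text; the function names (escape, build_batched_upsert) are hypothetical, sending the request to Dgraph is deliberately left out, and it assumes company_name is indexed so that eq() can be used on it.

```python
def escape(value):
    """Escape backslashes, double quotes and newlines for a quoted string literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def build_batched_upsert(people):
    """Build one upsert block covering every person in the batch.

    Each distinct company gets one query variable (c0, c1, ...), and each
    person gets a blank node (_:p0, _:p1, ...) linked to that variable.
    """
    queries = []
    triples = []
    company_vars = {}

    for index, person in enumerate(people):
        company = person["company_name"]
        if company not in company_vars:
            var = f"c{len(company_vars)}"
            company_vars[company] = var
            # Assumes company_name has an index that supports eq().
            queries.append(
                f'    {var} as var(func: eq(company_name, "{escape(company)}"))'
            )
        var = company_vars[company]
        name = escape(person["name"])
        triples.append(f'      _:p{index} <name> "{name}" .')
        triples.append(f'      _:p{index} <dgraph.type> "Person" .')
        triples.append(f'      _:p{index} <works_for> uid({var}) .')

    lines = ["upsert {", "  query {", *queries, "  }",
             "  mutation {", "    set {", *triples, "    }", "  }", "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    batch = [
        {"name": "Example name", "company_name": "Acme"},
        {"name": "Another name", "company_name": "Acme"},
    ]
    print(build_batched_upsert(batch))
```

Because each company is looked up once per batch instead of once per person, the number of requests grows with the number of batches rather than the number of nodes. A suitable batch size would need to be found by experiment.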

MichelDiz · September 2, 2021, 12:22pm

There is no limit on the size or the number of files; it depends on the resources available. And no, the edges don’t need to be predefined. You can connect them later.
