SQL to Dgraph Migration - Edges not creating

What I want to do

I have some relational database tables (PostgreSQL) which I’m trying to convert to Dgraph.

There are two tables, author and books, and then a map table supporting a many-to-many relationship between the two tables.

I’m using the bulk loader to load the data.

I’m having problems getting the mapping to work in Dgraph.

What I did

I exported the authors and books table from DBeaver directly to JSON.

For the mapping table, in order to get the data into what I think is the right format, I exported the table to .csv and then ran the csv2json command per the documentation.

Here are snippets from each of the three JSON files:

Author:

{
    "author_id" : "00004178-a951-4425-a260-40e548189169",
    "full_name" : "Joe Johnson",
	"dgraph.type" : "author"
}

Book:

{		
	"book_id" : "d6d30b03-27cf-4156-8b63-0218ff69a2bc",
	"title" : "Awesome Book Title",
    "date" : "2018-10-11",
	"dgraph.type" : "book"
}

The mapping file, authored.json:

{
    "uid": "_:00004178-a951-4425-a260-40e548189169",
    "authored": {
		"uid": "_:d6d30b03-27cf-4156-8b63-0218ff69a2bc"
	}
}

The Authors and Books appear to load fine, but the “authored” predicate doesn’t seem to work and the following query comes back empty:

  {
    authors(func:has(authored), first: 5) {
         full_name
         authored {
           expand(_all_)
         }
   }
 }

If I run the following mutation, the above query will find Bugs Bunny as expected (but nothing else):

{
	set {
		_:7b100b59-6c99-473e-be52-eb00ec1ee631 <author_id> "7b100b59-6c99-473e-be52-eb00ec1ee631" .
		_:7b100b59-6c99-473e-be52-eb00ec1ee631 <dgraph.type> "author" .
		_:7b100b59-6c99-473e-be52-eb00ec1ee631 <full_name> "Bugs Bunny" .
		_:7b100b59-6c99-473e-be52-eb00ec1ee631 <authored> _:25476291-e5a2-4867-bdba-ac09736323b0 .
		
		_:25476291-e5a2-4867-bdba-ac09736323b0 <book_id> "25476291-e5a2-4867-bdba-ac09736323b0" .
		_:25476291-e5a2-4867-bdba-ac09736323b0 <dgraph.type> "book" .
		_:25476291-e5a2-4867-bdba-ac09736323b0 <title> "1,239 Ways to Cook a Carrot" .
	}
}

Is there something wrong with the way I’m creating the authored.json?

Thanks in advance!

I think I caught that one of my issues with the authored.json is that the format should be:

{
    "author_id": "_:00004178-a951-4425-a260-40e548189169",
    "authored": {
      "book_id": "_:d6d30b03-27cf-4156-8b63-0218ff69a2bc"
    }
  }

I had copied the syntax from the example and didn’t catch that it probably needs to be the blank node unique external id predicate (xid).

With that said though, I fixed the authored.json, reloaded everything via the bulk loader and my authored edges still don’t seem to be creating…

Any other ideas would be appreciated - thanks in advance!

_:identifier is a blank node placeholder allowing you to reference a node with an unknown uid throughout a set block.

After the set happens, that blank is replaced with a dgraph created uid.

When you’re creating the edge, you need to reference the author and book by their uid, or set the author_id and book_id to be unique id’s so you can also use them to reference the node.

See https://dgraph.io/docs/mutations/blank-nodes/.

Thanks for the response. Yes, that’s what I’m trying to accomplish in the authored.json. Both author_id and book_id are uuid-based primary keys in the source database, which I’m trying to reference as unique xids.

Can the blank nodes syntax be used in a separate file with bulk loader? Do the the three files need to be concatenated?

I haven’t used Bulk Loader, but are are you using the --store_xids and --xidmap arguments?

That looks like it will persist your blank node identifier as an xid field on the created node so that operations in other files can be mapped back to the same created node.

The D.Type is case-sensitive. I think for your case it should be “Book” instead.

Thanks @MichelDiz.

My schema has the types defined as lowercase, but since it seems like the best practice is to start with upper-case letters, I’ll make the change.

In regards to the authored.json file that contains the author to book mappings, does this format above look like it should work, or do I need to switch Bulk Loader to use the --store_xids flag and add the xid predicate to both the author and book files, as suggested by @rcbevans?

If I could you please show what the new format should look like?

Current:

{
    "author_id": "_:00004178-a951-4425-a260-40e548189169",
    "authored": {
          "book_id": "_:d6d30b03-27cf-4156-8b63-0218ff69a2bc"
    }
  }

Would it just become:

{
    "xid": "_:00004178-a951-4425-a260-40e548189169",
    "authored": {
          "xid": "_:d6d30b03-27cf-4156-8b63-0218ff69a2bc"
    }
 }

Thank you

I dropped the schema and data. I then created three files with just one entry each (Author, Book, and the mapping). I just tried running Bulk Loader with both the --store_xids and xidmap settings on, but am still seeing the same behavior. It creates the Author and Book nodes. Then it creates two empty nodes with just the “authored” edge between them.

Am I off somewhere with quotes?

I’m going to try concatenating the files next to see if that works. If not, then I’ll try switching to RDF format.

The correct key to use is “uid” in order to the XIDS and mapping to work.

e.g:

{
    "uid" : "_:00004178-a951-4425-a260-40e548189169",
    "author_id" : "00004178-a951-4425-a260-40e548189169",
    "full_name" : "Joe Johnson",
	"dgraph.type" : "Author"
}
{		
    "uid" : "_:d6d30b03-27cf-4156-8b63-0218ff69a2bc",
	"book_id" : "d6d30b03-27cf-4156-8b63-0218ff69a2bc",
	"title" : "Awesome Book Title",
    "date" : "2018-10-11",
	"dgraph.type" : "Book"
}

and then

{
    "uid": "_:00004178-a951-4425-a260-40e548189169",
    "authored": {
		"uid": "_:d6d30b03-27cf-4156-8b63-0218ff69a2bc"
	}
}
1 Like

Thanks @MichelDiz and @rcbevans

The xid setting and uid syntax fixed it and the edges are working now. Thank you for your help!