Using a GraphQL Schema + Live Loader to ingest data

Hey there,
not sure if this is the right category but I’ll give it a shot.

I’m currently using Dgraph solely via a GraphQL Schema, meaning that I have a schema.graphql file which looks something like this:

type Person {
  id: ID!
  slug: String! @id
  firstName: String!
  lastName: String!
  posts: [Post] @hasInverse(field: author)
}

type Post {
  id: ID!
  slug: String! @id
  author: Person
}

# ... Other type definitions

Spinning up Dgraph and mutating / querying via the GraphQL endpoint works flawlessly.

However, given that I have a lot more data to ingest, inserting it via GraphQL mutations isn’t feasible. Live Loader seems to be the right option here. In particular, I need to rely on the --upsertPredicate option to ensure that future re-runs won’t overwrite data that is already present.

I followed the docs in https://dgraph.io/docs/deploy/fast-data-loading/live-loader/ but I just can’t make the connection between the upsertPredicate example and the GraphQL Schema I wrote.

After fiddling around a little bit I came up with this data.rdf file:

<jdoe> <Person.slug> "jdoe" .
<jdoe> <Person.firstName> "John" .
<jdoe> <Person.lastName> "Doe" .
<jdoe> <dgraph.type> "Person" .

I can now load this file with the following Live Loader command: dgraph live --files data.rdf --upsertPredicate "Person.slug"

It will only ever create one node given that the slug is used as the --upsertPredicate.

The downside is that this only works because I prefixed slug with Person (Person.slug), which means there’s no way to have a single .rdf file with multiple different types in it while still using the --upsertPredicate flag that way.

Can someone please explain how Live Loader should be used with an existing GraphQL Schema and the --upsertPredicate flag (a full example with edges between nodes would be super helpful!)?

Is using the slug fine or should I add xid to the GraphQL types? Is xid just a convention one should follow to provide one’s own ids, or is it relied upon by Dgraph internally?

And what’s the difference between:

<foobar>
<_:foobar>
_:foobar

Thanks a lot for working on Dgraph and taking the time to answer this question!

After spending more time searching through this forum and the docs I was finally able to figure out a way to use Live Loader in combination with a GraphQL-based schema.

As @pshaddel pointed out in the thread "How do you go from GraphQL to RDF for bulk loading?", you can specify the type and predicate names in the GraphQL Schema.

See the documentation here: https://dgraph.io/docs/graphql/dgraph/#mapping-graphql-to-a-dgraph-schema

Based on that I came up with the following solution.

The GraphQL Schema:

type Person {
  id: ID!
  xid: String! @id @dgraph(pred: "xid")
  slug: String! @id
  firstName: String!
  lastName: String!
  posts: [Post] @hasInverse(field: author)
}

type Post {
  id: ID!
  xid: String! @id @dgraph(pred: "xid")
  slug: String! @id
  author: Person
}

The .rdf file:

<person/jdoe> <Person.slug> "jdoe" .
<person/jdoe> <Person.firstName> "John" .
<person/jdoe> <Person.lastName> "Doe" .
<person/jdoe> <dgraph.type> "Person" .
<person/jdoe> <xid> "person/jdoe" .

<post/post-1> <Post.slug> "post-1" .
<post/post-1> <dgraph.type> "Post" .
<post/post-1> <xid> "post/post-1" .

<post/post-1> <Post.author> <person/jdoe> .

If I now run dgraph live --files data.rdf --upsertPredicate "xid" everything gets inserted accordingly. Running the command multiple times doesn’t insert the data multiple times (given that we’re using the xid to uniquely identify it).
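For more than a handful of nodes, writing these triples by hand gets tedious, so here is a minimal sketch of how the same lines could be generated. It assumes the layout used above (predicates named `Type.field`, the xid doubling as the subject); the helper names `rdf_literal`, `node_triples` and `edge_triple` and the `subject_prefix` parameter are made up for illustration and are not part of Dgraph.

```python
def rdf_literal(value: str) -> str:
    """Quote a string as an RDF literal, escaping backslashes, quotes and newlines."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def node_triples(type_name, xid, fields, subject_prefix=""):
    """Return the RDF lines for one node.

    type_name: GraphQL type name, also used as the predicate prefix.
    xid: external id, used as the subject and as the value of <xid>.
    fields: mapping of GraphQL field name -> string value.
    subject_prefix: hypothetical knob in case another subject form is preferred.
    """
    subject = f"<{subject_prefix}{xid}>"
    lines = [
        f"{subject} <{type_name}.{name}> {rdf_literal(value)} ."
        for name, value in fields.items()
    ]
    lines.append(f"{subject} <dgraph.type> {rdf_literal(type_name)} .")
    lines.append(f"{subject} <xid> {rdf_literal(xid)} .")
    return lines


def edge_triple(src_xid, predicate, dst_xid, subject_prefix=""):
    """Return the RDF line for an edge between two nodes identified by xid."""
    return f"<{subject_prefix}{src_xid}> <{predicate}> <{subject_prefix}{dst_xid}> ."


if __name__ == "__main__":
    lines = []
    lines += node_triples(
        "Person",
        "person/jdoe",
        {"slug": "jdoe", "firstName": "John", "lastName": "Doe"},
    )
    lines += node_triples("Post", "post/post-1", {"slug": "post-1"})
    lines.append(edge_triple("post/post-1", "Post.author", "person/jdoe"))
    print("\n".join(lines))
```

Redirecting the output to data.rdf gives a file that can be passed to the dgraph live command shown above.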

TBH I’m not entirely sure why this works (especially why I have to use the xid as a Node’s subject like <post/post-1>) but it does what it should do.

Would be awesome if someone from the Dgraph team could chime in if this is the best practice to accomplish this task. Thanks in advance!

This syntax is a bit dangerous for the parser. You’re lucky, because the upsertPredicate feature is new and probably runs before the parser does. You should use <_:person/jdoe> instead.

The upsertPredicate feature creates several upsert blocks under the hood, based on the blank nodes used. If the blank node is stored in the XID predicate, it will upsert correctly. That’s all.
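To make the "upsert blocks under the hood" idea more concrete, here is a rough sketch that renders the kind of DQL upsert block one could write by hand for a single node. This is only a mental model, not taken from the Live Loader source: the function name `upsert_block` is hypothetical, and the exact queries Live Loader generates may look different. It also assumes the upsert predicate is indexed, which the `@id` directive in the GraphQL schema should take care of.

```python
def upsert_block(xid, triples, upsert_predicate="xid"):
    """Render a DQL upsert block that looks a node up by xid and writes to it.

    triples: list of (predicate, object) pairs, with the object already
             formatted as an RDF term, e.g. ("Person.slug", '"jdoe"').
    If the query matches nothing, uid(v) in the mutation creates a new node;
    otherwise the triples are written to the node that was found.
    """
    set_lines = [f"      uid(v) <{pred}> {obj} ." for pred, obj in triples]
    # Store the xid itself so that the next run finds this node again.
    set_lines.append(f'      uid(v) <{upsert_predicate}> "{xid}" .')
    return "\n".join(
        [
            "upsert {",
            "  query {",
            f'    q(func: eq({upsert_predicate}, "{xid}")) {{',
            "      v as uid",
            "    }",
            "  }",
            "  mutation {",
            "    set {",
            *set_lines,
            "    }",
            "  }",
            "}",
        ]
    )


if __name__ == "__main__":
    print(upsert_block("person/jdoe", [("Person.slug", '"jdoe"')]))
```

The point is that the lookup and the write happen together, which is why running the load a second time updates the existing node instead of creating a duplicate.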

I see. That makes sense. Thanks for taking the time to reply.

Just a quick follow-up question: Is using xid: String! @id @dgraph(pred: "xid") on every GraphQL type in combination with the --upsertPredicate "xid" the correct approach / workaround here?

Overall this might be personal preference, but I just want to make sure that I’m not following bad practices when there’s a better solution.

I don’t see any problem.

Well, I’m not the guy to recommend best practices in GraphQL, but it looks like it won’t be a problem. And if you follow the docs, you’re good to go.

With my knowledge I can cover about 99% of DQL and 60% of GraphQL (I mean, Dgraph’s features). If you still need someone to rectify it I can ping for u.

I don’t see any problem.

Awesome. Thanks for confirming.

If you still need someone to rectify it I can ping for u.

All good. Thanks for the offer and taking the time to reply here.
