I have been using Dgraph recently to import CSV and JSON data. I am trying to ensure that I do not have duplicate values for some key reference data (e.g. currency values). I have multiple files that contain the CSV or JSON data.
With Neo4j, I can do upserts using a Cypher command called MERGE. It relies on a uniqueness constraint that must be defined before loading the data (see below). Based on that constraint, any value that already exists is matched rather than loaded again, so no duplicates are created.
// CURRENCY
CREATE CONSTRAINT ON (currency:Currency) ASSERT currency.currency_code IS UNIQUE;
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///Currency-Rates.csv' AS line
MERGE (currency:Currency {currency_code: line.Currency_Code});
With Dgraph, using an upsert block looks like this:
upsert {
  query {
    v as var(func: eq(currency.code, "CAD"))
  }
  mutation {
    set {
      uid(v) <currency.code> "CAD" .
      uid(v) <currency.name> "Canadian Dollar" .
    }
  }
}
So Dgraph does the upsert keyed on the "uid": the upsert block first runs the query, and depending on whether a matching "uid" exists, the mutation either updates that node or creates a new one.
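The workaround I can see is to generate one upsert block per CSV row programmatically before sending them to Dgraph. A minimal sketch of that generation step (assuming a Currency-Rates.csv with Currency_Code and Currency_Name columns; the column names and sample rows here are my own assumptions, not from any real file):

```python
import csv
import io

def make_upsert_block(code, name):
    """Build one Dgraph upsert block for a currency row.

    The block queries for an existing node with the same currency.code,
    then sets the predicates on either the matched uid or a new node.
    """
    return f"""upsert {{
  query {{
    v as var(func: eq(currency.code, "{code}"))
  }}
  mutation {{
    set {{
      uid(v) <currency.code> "{code}" .
      uid(v) <currency.name> "{name}" .
    }}
  }}
}}"""

# Sample rows standing in for Currency-Rates.csv (assumed columns).
sample_csv = """Currency_Code,Currency_Name
CAD,Canadian Dollar
USD,US Dollar
"""

blocks = [
    make_upsert_block(row["Currency_Code"], row["Currency_Name"])
    for row in csv.DictReader(io.StringIO(sample_csv))
]
print(blocks[0])
```

This still means one upsert block (and one transaction) per record, which is exactly what I would like to avoid.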
So my question: is there a way to use an input file to load Dgraph without writing an upsert block for every record in the file?