I'd like to be able to write an RDF file that, when loaded into Dgraph with `dgraph live -r myfile.rdf.gz`, never inserts duplicate triples, no matter how many times I run it.
From what I've read and tested so far, there doesn't seem to be a way to do this. So my last-ditch effort is to post here and see what you all think, or whether there are alternative suggestions.
The use case: our data is collected asynchronously and sent to the graph via events. I'd like each event to generate an RDF file that ends up being fed to dgraph live, but from what I'm seeing I'll actually have to write code that runs a transaction and a manual query for each predicate.
On a side note, I'm also not sure what to query to find out which node is the leader node. Querying http://127.0.0.1:8080/state doesn't return anything. It would be nice to have a list of HTTP endpoints in the docs, or maybe I missed it.
That's right. There is currently no way for Dgraph live to do this. You'd have to add an index on a predicate that stores your xid, query it, and reuse the uid it returns.
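A minimal sketch of that query-then-mutate upsert pattern. The in-memory store and the `xid` naming below are stand-ins for illustration; against a real cluster you'd run the same three steps (query the xid index, reuse or mint a uid, mutate) inside one client transaction:

```python
# Sketch of "index the xid, query it, reuse the uid" to avoid duplicates.
# uid_by_xid simulates an index on an xid predicate; triples simulates
# the stored graph. Both are assumptions for illustration only.

import itertools

_next_uid = itertools.count(1)
uid_by_xid = {}   # xid -> uid, what the indexed query would return
triples = set()   # stored triples; a set makes re-inserts a no-op

def upsert_triple(subject_xid, predicate, value):
    """Insert a triple keyed by an external id without creating duplicates."""
    # Step 1: "query" for an existing node with this xid.
    uid = uid_by_xid.get(subject_xid)
    if uid is None:
        # Step 2: no match, so mint a new uid and record the xid mapping.
        uid = f"0x{next(_next_uid):x}"
        uid_by_xid[subject_xid] = uid
    # Step 3: mutate using the resolved uid; re-running changes nothing.
    triples.add((uid, predicate, value))
    return uid

# Replaying the same event twice yields one node and one triple.
u1 = upsert_triple("user-42", "name", "Ada")
u2 = upsert_triple("user-42", "name", "Ada")
```

The important property is that replaying an event is idempotent: the second run resolves to the same uid and the mutation adds nothing new.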
The /state endpoint is served by Zero. So assuming Zero is serving HTTP on port 6080 (i.e. you started with offset -2000), you could go to http://127.0.0.1:6080/state.
So I have Zero running on port 5080 (Kubernetes HA setup), and if I run curl http://127.0.0.1:5080/state it returns `?`. Not sure what the question mark means.
@llonchj that's an interesting idea. So I could store a map on the local file system, look up uids, and maybe do a search/replace…
Or, because of concurrency, it might be better to just write my own script.
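For what it's worth, the search/replace idea can be sketched like this: keep a local xid → uid map and rewrite each event's RDF before loading it. The file name, the `<_:xid>` blank-node convention, and the uid allocator are all assumptions for illustration:

```python
# Sketch: persist a local xid -> uid map and rewrite RDF blank nodes
# (<_:some-xid>) to stable uids before feeding the file to the loader.
# MAP_PATH and the blank-node regex are illustrative assumptions.

import itertools
import json
import os
import re

MAP_PATH = "xidmap.json"

def load_map():
    """Load the persisted xid -> uid map, or start empty."""
    if os.path.exists(MAP_PATH):
        with open(MAP_PATH) as f:
            return json.load(f)
    return {}

def save_map(mapping):
    """Persist the map so later events reuse the same uids."""
    with open(MAP_PATH, "w") as f:
        json.dump(mapping, f)

def rewrite_rdf(lines, mapping, alloc_uid):
    """Replace <_:xid> blank nodes with known uids, allocating new ones."""
    out = []
    for line in lines:
        def sub(match):
            xid = match.group(1)
            if xid not in mapping:
                mapping[xid] = alloc_uid()
            return f"<{mapping[xid]}>"
        out.append(re.sub(r"<_:([^>]+)>", sub, line))
    return out

# Demo with an in-memory map: the same event rewritten twice resolves
# to the same uid, so reloading it cannot create a second node.
mapping = {}
counter = itertools.count(1)
def alloc_uid():
    return f"0x{next(counter):x}"

event = ['<_:user-42> <name> "Ada" .']
first = rewrite_rdf(event, mapping, alloc_uid)
second = rewrite_rdf(event, mapping, alloc_uid)
```

The concurrency concern above is real, though: two event handlers allocating uids against the same map would need a lock around the read-allocate-write cycle, which is one argument for a single owning script.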
@artooro If you always plan to use the same client to upload the data, then this should work. We store the xid => uid mapping in the --xidmap directory locally on the client.
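Concretely, repeated loads from the same machine can share that directory (a sketch; exact flag spellings may vary between Dgraph versions):

```shell
# First load: dgraph live writes its xid -> uid assignments into ./xidmap
dgraph live -r myfile.rdf.gz --xidmap ./xidmap

# Later loads from the same client reuse ./xidmap, so the same
# blank-node labels resolve to the same uids instead of new nodes.
dgraph live -r event2.rdf.gz --xidmap ./xidmap
```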