Continue a dataset load from where it stopped when a Live Load or Bulk Load has been interrupted for any number of reasons.
Why that wasn’t great, with examples
When an interrupt occurs and I try to run the load again, the load starts from scratch. This is not the desired result. Let's avoid spending time rewriting something that is already in the DB.
This issue is not just about duplicate nodes caused by a load retry; duplicated nodes can already be avoided by using the `--xidmap` flag.
e.g.:

```sh
./dgraph live -f test.rdf,other.rdf.gz -s test.schema --xidmap ./xd
```
Every time you reuse the xidmap mapping files, all previously mapped blank nodes are automatically resolved to their previously assigned UIDs, so retried triples are written to the same nodes.
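For context, here is a minimal sketch of the idea behind such a mapping. This is not Dgraph's actual xidmap implementation; the file format, type names, and UID allocation below are simplified assumptions, only meant to show why reuse prevents duplicates:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// xidMap is a hypothetical, simplified stand-in for the mapping kept by
// --xidmap: blank node label -> assigned UID, persisted across runs.
type xidMap struct {
	uids    map[string]uint64
	nextUID uint64
	log     *os.File // append-only file of "xid<TAB>uid" lines
}

func openXidMap(path string) (*xidMap, error) {
	m := &xidMap{uids: map[string]uint64{}, nextUID: 1}
	// Replay mappings persisted by earlier runs (best effort).
	if f, err := os.Open(path); err == nil {
		sc := bufio.NewScanner(f)
		for sc.Scan() {
			parts := strings.SplitN(sc.Text(), "\t", 2)
			if len(parts) != 2 {
				continue
			}
			uid, _ := strconv.ParseUint(parts[1], 10, 64)
			m.uids[parts[0]] = uid
			if uid >= m.nextUID {
				m.nextUID = uid + 1
			}
		}
		f.Close()
	}
	log, err := os.OpenFile(path, os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0o644)
	if err != nil {
		return nil, err
	}
	m.log = log
	return m, nil
}

// assign returns the UID already mapped to xid, or allocates and persists
// a new one. A re-run therefore resolves the same blank node to the same UID.
func (m *xidMap) assign(xid string) uint64 {
	if uid, ok := m.uids[xid]; ok {
		return uid // seen before: no duplicate node
	}
	uid := m.nextUID
	m.nextUID++
	m.uids[xid] = uid
	fmt.Fprintf(m.log, "%s\t%d\n", xid, uid)
	return uid
}

func main() {
	m, err := openXidMap("xidmap.tsv")
	if err != nil {
		panic(err)
	}
	defer m.log.Close()
	// Running this program twice prints the same UIDs both times.
	fmt.Println(m.assign("_:alice"), m.assign("_:bob"), m.assign("_:alice"))
}
```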
However, even with the blank nodes already mapped, the load itself still starts from scratch and re-sends every N-Quad. This issue is simply to request a "checkpoint" feature, to avoid spending days rewriting something that is already in the DB.
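To illustrate the shape of the feature being requested, here is a rough sketch of how a checkpoint could let the loader skip N-Quads it has already committed. This is not an existing Dgraph API; the checkpoint file, field names, batch size, and `commit` callback are all made up for illustration:

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

// checkpoint is a hypothetical record of load progress: for each input
// file, how many N-Quads have already been committed to the cluster.
type checkpoint struct {
	Applied map[string]int `json:"applied"` // file -> committed N-Quad count
}

func loadCheckpoint(path string) checkpoint {
	cp := checkpoint{Applied: map[string]int{}}
	if data, err := os.ReadFile(path); err == nil {
		json.Unmarshal(data, &cp) // best effort: a missing file means a fresh start
	}
	return cp
}

func (cp checkpoint) save(path string) error {
	data, _ := json.Marshal(cp)
	return os.WriteFile(path, data, 0o644)
}

// loadFile streams one RDF file, skipping N-Quads the checkpoint says are
// already in the DB, and checkpointing after every committed batch.
func loadFile(name, cpPath string, commit func(batch []string) error) error {
	cp := loadCheckpoint(cpPath)
	f, err := os.Open(name)
	if err != nil {
		return err
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	line, batch := 0, []string{}
	flush := func() error {
		if err := commit(batch); err != nil {
			return err // checkpoint still points at the last good batch
		}
		cp.Applied[name] = line
		batch = batch[:0]
		return cp.save(cpPath)
	}
	for sc.Scan() {
		line++
		if line <= cp.Applied[name] {
			continue // already committed by a previous, interrupted run
		}
		batch = append(batch, sc.Text())
		if len(batch) == 1000 { // hypothetical batch size
			if err := flush(); err != nil {
				return err
			}
		}
	}
	if len(batch) > 0 {
		if err := flush(); err != nil {
			return err
		}
	}
	return sc.Err()
}

func main() {
	commit := func(batch []string) error {
		fmt.Printf("committing %d N-Quads\n", len(batch))
		return nil // stand-in for a real mutation sent to the cluster
	}
	if err := loadFile("test.rdf", "load.checkpoint", commit); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

With something along these lines, a retried load would re-send at most one uncommitted batch instead of the whole dataset.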