Deal better with Unrecognized RDF types

Moved from GitHub dgraph/4915

Posted by MichelDiz:

Experience Report

reference: Support JSON-LD on Dgraph

What you wanted to do

Import RDF data from a public dump.

What you actually did

dgraph live -f ./agrovoc_2019-11-04_lod.nt --format=rdf

Processing data file "./agrovoc_2019-11-04_lod.nt"
[14:00:03-0300] Elapsed: 05s Txns: 470 N-Quads: 470000 N-Quads/s [last 5s]: 94000 Aborts: 0
Error while mutating: Attr: [http://www.w3.org/2004/02/skos/core#scopeNote] should have @lang directive in schema to mutate edge: [entity:6090028 attr:"http://www.w3.org/2004/02/skos/core#scopeNote" value:"Trozas que en el aserradero, se usan para la elaboraci\303\263n de la madera aserrada" lang:"es" ] s.Code Unknown
Error while processing data file "./agrovoc_2019-11-04_lod.nt": During parsing chunk in processLoadFile: while parsing line "<http://aims.fao.org/aos/agrovoc/xl_tr_433_1321789680651> <http://www.w3.org/2004/02/skos/core#notation> \"1339129579719\"^^<http://aims.fao.org/aos/agrovoc/AgrovocCode> .\n": Unrecognized rdf type http://aims.fao.org/aos/agrovoc/AgrovocCode

Example line from the dataset:

<http://aims.fao.org/aos/agrovoc/xl_cs_1299485709333> <http://www.w3.org/2004/02/skos/core#notation> "34938"^^<http://aims.fao.org/aos/agrovoc/AgrovocCode> .

Why that wasn’t great, with examples

We support the main RDF types from w3.org (https://docs.dgraph.io/mutations/#language-and-rdf-types), but that doesn’t cover all the types that exist in knowledge bases published as RDF.

My proposal is to accept these unrecognized types by default and store the type IRI in a facet on the edge.

e.g:

<http://aims.fao.org/aos/agrovoc/xl_cs_1299485709333> <http://www.w3.org/2004/02/skos/core#notation> "34938" (unknown_type="<http://aims.fao.org/aos/agrovoc/AgrovocCode>") .

This is just an abstract example; Dgraph is not meant to transform the RDF during a load. I’m only illustrating what I mean above.

If Dgraph supported this, the user could later do something with the stored type, for example with an upsert block: https://docs.dgraph.io/mutations/#external-ids-and-upsert-block
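
To make the proposed rewrite concrete, here is a minimal Python sketch that applies it as a standalone pre-processing step outside Dgraph. It is not how Dgraph works today. The function name unknown_type_to_facet and the RECOGNIZED_PREFIXES tuple are hypothetical; the prefix list is an assumption and should be checked against the documented list of supported types. The sketch assumes N-Triples input (no graph label), and I have not verified that the live loader accepts the output.

```python
import re

# Assumption: datatypes under these prefixes are the ones the loader accepts.
# Adjust this tuple to the list of supported types in the Dgraph docs.
RECOGNIZED_PREFIXES = ("http://www.w3.org/2001/XMLSchema#", "xs:", "geo:")

# Matches an N-Triples line that ends in a typed literal:
#   <subject> <predicate> "value"^^<datatype> .
TYPED_LITERAL = re.compile(r'^(?P<head>.*")\^\^<(?P<dtype>[^>]+)>\s*\.\s*$')


def unknown_type_to_facet(line: str) -> str:
    """Move an unrecognized datatype IRI into an unknown_type facet.

    Lines without a typed literal, or with a recognized datatype,
    are returned unchanged (minus the trailing newline).
    """
    stripped = line.rstrip("\n")
    match = TYPED_LITERAL.match(stripped)
    if match is None:
        return stripped  # plain or language-tagged literal, or an IRI object
    dtype = match.group("dtype")
    if dtype.startswith(RECOGNIZED_PREFIXES):
        return stripped  # keep recognized types as they are
    # Drop the ^^<datatype> suffix and keep the IRI as a string facet.
    return f'{match.group("head")} (unknown_type="<{dtype}>") .'


if __name__ == "__main__":
    sample = (
        "<http://aims.fao.org/aos/agrovoc/xl_cs_1299485709333> "
        "<http://www.w3.org/2004/02/skos/core#notation> "
        '"34938"^^<http://aims.fao.org/aos/agrovoc/AgrovocCode> .'
    )
    print(unknown_type_to_facet(sample))
```

Run on the dataset line quoted earlier, this produces the same line as the abstract example above. The value itself stays a plain string, so nothing is lost; the original type is only moved to where it can be queried later.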

MichelDiz commented:

This is similar to the solution for the store_xids flag: https://github.com/dgraph-io/dgraph/issues/5106

We could have a flag like --store_RDF_types.