Handling a conflicting schema error


#1

I’m currently trying to load a Freebase dump into Dgraph. At some point, I hit this error:

RDF doesn't match schema: Input for predicate rdf.freebase.com/ns/user.xandr.webscrapper.domain.ad_entry.ads_topic of type scalar is uid. Edge: entity:1434544 attr:"rdf.freebase.com/ns/user.xandr.webscrapper.domain.ad_entry.ads_topic" value_type:UID value_id:1464499

After I updated the schema with

<rdf.freebase.com/ns/user.xandr.webscrapper.domain.ad_entry.ads_topic>: uid .

The error message changed (though for a different line):

RDF doesn't match schema: Input for predicate rdf.freebase.com/ns/user.xandr.webscrapper.domain.ad_entry.ads_topic of type uid is scalar

As I understand it, in Freebase, objects of different types can be linked to subjects of a certain type through the same predicate. Is there a way to handle this in Dgraph, or is the only option to filter such conflicting entities out?


(Michel Conrado) #2

This is almost certainly because the predicate has been set as uid, but the dataset contains some triples with this very same predicate whose values are scalars. As long as the values are mixed, you will always get this error. So set it to string if you don’t need those values as edges.

However, if you still want an entity relationship, I recommend converting the string values into blank nodes, both where the node is the target of an edge and where it is the origin.

<_:1464499> <someorigim.ads_topic> "name" . 
# This is the origin side. The blank node has to be used everywhere this entity appears.
# Otherwise, if the value is not a blank node, it will be recognized as a scalar, and you will get that error.

<_:232320> <rdf.freebase.com/ns/user.xandr.webscrapper.domain.ad_entry.ads_topic> <_:1464499> .
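
A conversion like the one above could be scripted. The following is a minimal Python sketch, assuming N-Triples-style input with one triple per line. The function name `literal_to_blank_node`, the `label_predicate` argument, and the hash-derived `_:lit` labels are hypothetical choices for illustration, not anything Dgraph provides.

```python
import hashlib
import re

# Very loose N-Triples matcher: subject, predicate, object, trailing dot.
# Assumption: one triple per line, no whitespace inside subject or predicate.
TRIPLE_RE = re.compile(r'^(\S+)\s+(\S+)\s+(.+?)\s*\.\s*$')


def literal_to_blank_node(line, predicate, label_predicate):
    """Rewrite a triple whose object is a literal into two triples:
    one that stores the literal on a blank node, and one that links
    the original subject to that blank node.
    Lines that do not match are returned unchanged."""
    stripped = line.rstrip("\n")
    m = TRIPLE_RE.match(stripped)
    if m is None:
        return [stripped]
    subj, pred, obj = m.groups()
    if pred != predicate or not obj.startswith('"'):
        return [stripped]
    # Derive the label from the literal so equal literals share one node.
    digest = hashlib.md5(obj.encode("utf-8")).hexdigest()[:12]
    node = f"_:lit{digest}"
    return [
        f"{node} {label_predicate} {obj} .",
        f"{subj} {pred} {node} .",
    ]
```

Running every line of the dump through this function for the offending predicate should leave only node-valued objects for it, so the predicate could then be declared as `uid`. Whether hash-derived labels suit your data is a judgment call; they simply make the same string map to the same blank node.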

#3

Setting it to string does not help, because then the first error I mentioned appears again.

Regarding blank nodes: is there any way I can get the ids (e.g. 1464499, 1434544, etc.) for a particular node?


(Michel Conrado) #4

You are going to have to write a script for this. Regex, grep, or some other tool would also work.
Our datasets came from Freebase, but other devs (who no longer work for Dgraph) wrote Go scripts just for that. You can try venturing there.

like this
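
Independently of those Go scripts, a first step could be to find out which predicates are affected at all. Here is a rough Python sketch, assuming one N-Triples-style triple per line and treating any object that starts with a double quote as a scalar and everything else as a node. The name `find_mixed_predicates` is made up for this example.

```python
import re
from collections import defaultdict

# Very loose N-Triples matcher: subject, predicate, object, trailing dot.
TRIPLE_RE = re.compile(r'^(\S+)\s+(\S+)\s+(.+?)\s*\.\s*$')


def find_mixed_predicates(lines):
    """Return the sorted list of predicates that appear with both
    literal objects and node (uid) objects."""
    kinds = defaultdict(set)
    for line in lines:
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        m = TRIPLE_RE.match(line)
        if m is None:
            continue  # skip blank or malformed lines
        _, pred, obj = m.groups()
        kinds[pred].add("literal" if obj.startswith('"') else "uid")
    return sorted(p for p, seen in kinds.items() if len(seen) == 2)
```

For a large dump you would pass an open file object instead of a list, for example `find_mixed_predicates(open(path, encoding="utf-8"))`, where `path` points to an uncompressed copy of the data.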