Retrieve all predicates of all nodes

Hi Devs,

I want to retrieve all predicates of all nodes. I know that I can do this with:

{
  result (func: has(dgraph.type)) {
    uid
    expand(_all_) {
      uid
    }
  }
}

However, expand only works here if every node has a dgraph.type. Is there any way to retrieve all triples without requiring every node to have a type?

One alternative that exists is explicitly stating the predicates:

{
  prop as var(func: has(prop))
  edge as var(func: has(edge))
  
  result (func: uid(prop,edge)) {
    uid
    prop
    edge { uid }
  }
}

But for a huge schema this requires one var block per predicate, so the query becomes very large. Is there another way you can think of?
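For a wide schema, the explicit-predicate query above does not have to be written by hand. Here is a minimal Python sketch that generates it from a mapping of predicate names to a flag saying whether the predicate is a uid edge. The function name `build_query` and the variable names `v0`, `v1`, ... are made up for illustration. Wrapping every predicate in angle brackets is an assumption, made so that names with dots or other special characters stay valid; check it against your Dgraph version.

```python
from typing import Dict


def build_query(predicates: Dict[str, bool]) -> str:
    """Build one query with a var block per predicate.

    `predicates` maps a predicate name to True if it is a uid edge
    (rendered as `<pred> { uid }`) or False if it holds a value.
    """
    var_lines = []
    var_names = []
    body_lines = ["    uid"]
    for i, (name, is_edge) in enumerate(predicates.items()):
        var = f"v{i}"  # hypothetical variable name, avoids clashes with predicate names
        var_names.append(var)
        var_lines.append(f"  {var} as var(func: has(<{name}>))")
        suffix = " { uid }" if is_edge else ""
        body_lines.append(f"    <{name}>{suffix}")

    lines = ["{"]
    lines += var_lines
    lines.append("")
    lines.append(f"  result(func: uid({', '.join(var_names)})) {{")
    lines += body_lines
    lines += ["  }", "}"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_query({"prop": False, "edge": True}))
```

With hundreds of thousands of predicates a single generated query is likely to be too large, so splitting the predicate list into batches and running one query per batch is probably necessary.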

Yep, there is a “hack” (not sure if it still works). You can create a “dummy” type in Dgraph that lists all possible edges, and then use:

{
  result(func: has(anyPred)) {
    uid
    expand(MyDummyType) {
      uid
      expand(MyDummyType)
    }
  }
}

This is useful for datasets whose structure you don’t know and for which you want to create the schema types.
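Writing the dummy type by hand is impractical when there are many predicates, so it can be generated. This is a minimal Python sketch, assuming the newer type syntax in which a type lists predicate names only, and assuming that angle-bracketed names are accepted inside a type definition. `build_dummy_type` is a hypothetical helper name, and `MyDummyType` is taken from the query above.

```python
from typing import Iterable


def build_dummy_type(predicates: Iterable[str], type_name: str = "MyDummyType") -> str:
    """Return a type definition that lists every given predicate once."""
    lines = [f"type {type_name} {{"]
    # De-duplicate and sort so the output is stable between runs.
    for name in sorted(set(predicates)):
        lines.append(f"  <{name}>")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_dummy_type(["prop", "edge"]))
```

The resulting text would then be applied as a schema update before running the expand(MyDummyType) query.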

I presume this returns only uids that have anyPred, so I need a “seed” predicate that all nodes have, like xid or dgraph.type. I’ll give it a try with my 235k-predicate schema :smiley:

Hehehe, I’m curious: what are you modelling? I’d like to brainstorm whether that is the best way to model it.

I have loaded the DBpedia RDF graph of Wikipedia (also see Discussion: Wikipedia backed by DGraph - #6 by EnricoMi), which comes with a wide schema when you load the infobox properties. These are user-defined predicates, which are obviously noisy and number in the thousands. It is not a hand-crafted, well-designed schema, but a dirty real-world graph. I just wanted a wide, long-tail schema for benchmarking.

For reference, I have put the code to generate the DBpedia dataset online, linked from here: Pre-processing DBpedia dataset for Dgraph

Hi, I had a similar need, so I got the schema by sending a request to /query with the request body schema {}
@MichelDiz would there be any reason I should not do it this way?

This way what? Not sure what you mean.

Querying the schema isn’t related to this topic/issue.

@MichelDiz Querying the schema appears to return a list of all predicates.

{
  "data": {
    "schema": [
      {
        "predicate": "NRHP",
        "type": "default"
      },
      {
        "predicate": "actor.dubbing_performances",
        "type": "uid",
        "list": true
      },
      {
        "predicate": "actor.film",
        "type": "uid",
        "count": true,
        "list": true
      },
      {
        "predicate": "addr.city",
        "type": "default"
      },
      {
        "predicate": "addr.country",
        "type": "default"
      },
      {
        "predicate": "addr.housename",
        "type": "default"
      },
      {
        "predicate": "addr.housenumber",
        "type": "default"
      }
...
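To turn a response of this shape into a plain list of predicates, a few lines of Python are enough. This is a minimal sketch, assuming the response body has already been fetched as a JSON string with the data.schema layout shown above. `predicates_from_schema` is a hypothetical helper name, and the sample contains only two entries copied from the output above.

```python
import json
from typing import Dict


def predicates_from_schema(response_text: str) -> Dict[str, str]:
    """Map each predicate name in a schema response to its declared type."""
    payload = json.loads(response_text)
    return {
        entry["predicate"]: entry.get("type", "default")
        for entry in payload["data"]["schema"]
    }


if __name__ == "__main__":
    sample = """
    {"data": {"schema": [
      {"predicate": "NRHP", "type": "default"},
      {"predicate": "actor.film", "type": "uid", "count": true, "list": true}
    ]}}
    """
    preds = predicates_from_schema(sample)
    # Predicates of type uid are edges to other nodes.
    edges = [name for name, kind in preds.items() if kind == "uid"]
    print(preds)
    print(edges)
```

Note that this only lists the predicates that exist in the cluster; it says nothing about which nodes carry which predicate.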

@mr_rustbot At the time I thought the question was about a normal query block. The schema query is a totally different thing (and a well-known one, so I guess I didn’t suspect that he didn’t know about it). As far as I know, you can’t use the result of a schema query in any other block. So these are two different topics.

For sure, a schema query gives you all predicates available in the cluster, but they have no context. You get a list of all predicates, whereas the question was about nodes: all predicates of the given nodes. Nodes have context, a “body/object”, which a schema query doesn’t give us.