Figure out what's in a Database

Generally, I’m interested in how to explore a Dgraph database when you don’t know what is in there. Suppose someone has given you access to a Dgraph instance: what’s the first query you run to start working out what’s in there? Or is GraphQL± not the tool for the job and you should find the schema or get the information out-of-band?

My specific case is I’ve used the Bulk loader to load some RDF and let it construct the schema itself. The RDF looks like:

<> <> <> .
<> <> "bar" .
<> <> "foo" .

At this stage, I’ve got a few problems:

  1. I don’t know if the property name was loaded with the full URI or shortened somehow
  2. if full URIs are used, I can’t figure out how to escape the names in a GraphQL± query. Any queries that use the full URI complain about the : (colon) character. I’ve tried escaping with single quotes, double quotes and backticks but none seem to work. Google queries for “dgraph escape” aren’t yielding anything either.
  3. I can’t find the generated schema. I’m running in Docker and I’ve done a docker exec into the container to look around but can’t find a schema

Solutions to my specific case are appreciated but I’m more interested in the general skill of exploring a Dgraph DB with no prior knowledge. In a relational world I’d get some table names, describe them, then start doing some SELECT * queries to get a feel for the DB. In other NoSQL DBs there’s typically some wildcard-type query where you can start looking at some data, any data! In GraphQL± I’m stuck: I can’t figure out a SELECT * type query.

I’ve tried guessing at UIDs but no matter what UID I use, I get a response without any edge names:

  {
    f(func: uid(0x00)) {
      linked_to # here I'm guessing the RDF property was shortened
    }
  }

Thanks for the help

When exploring unknown data, I would recommend the following steps:

  1. First, query the schema.
  2. Then start looking at some of the nodes. E.g. use the has function to find all nodes with a particular predicate (discovered from the schema query).
  3. Next step would be to start expanding the graph using expand(_all_) and start to explore the relationships.
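
As a rough sketch of those three steps (name is only a placeholder here, so substitute whatever predicate the schema query actually reports; sample is an arbitrary block label):

```dql
# Request 1: list every predicate Dgraph knows about, with its type and indexes.
schema {}

# Request 2 (run separately): take a handful of nodes that carry a predicate
# found in the schema output and show what is attached to them.
# "name" is a placeholder, not a predicate from the data in this thread.
{
  sample(func: has(name), first: 5) {
    uid
    expand(_all_) {
      expand(_all_)
    }
  }
}
```

The first: 5 argument just keeps the sample small, and the nested expand(_all_) follows each edge one hop further. One caveat: I believe newer Dgraph releases resolve expand(_all_) through the type system, so on nodes without a type it may come back empty. In that case, list the predicates from the schema output explicitly instead.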

GraphQL± and Ratel are great for this, since Ratel helps to visualise the relationships (which would otherwise be harder to understand just looking at the raw JSON responses).

For predicate names, I would recommend using simplified predicate names, e.g. type and linked_to. You would have to pre-process your data, but it will make everything much easier for you. Although, if you really want to use URIs for predicates, you should be able to escape using a backslash.

Hi Peter,
thanks for the helpful tips. They’re exactly what I needed.

I don’t want to use URIs for predicates, that’s just what my RDF data has in it and I was trying to quickly get Dgraph up and running with my data to take it for a test drive. If I’m going to be pre-processing the RDF, I’ll strip the names down to simple ones like you suggest.

For what it’s worth, I can’t get that backslash escaping to work. A query like this works, so we know it’s a good starting point:

  {
    me(func: has(age)) {
      expand(_all_)
    }
  }

Then I change the predicate name to a full URI, which we expect to fail:

  me(func: has( {

…and we get our error:

Expected arg after func [has], but got item lex.Item [14] ":"

So then I backslash escape the : so the query is:

  me(func: has(http\:// {

but then we get a different error:

Expected arg after func [has], but got item lex.Item [1] "while lexing {\n me(func: has(http\\:// {\n expand(_all_)\n }\n}: Unrecognized character in inside a func: U+005C '\\'"

It appears to be unhappy with the backslash. It won’t be a problem when I stop using full URIs but I wanted to highlight that it didn’t work for me.


Ah, I was wrong about backslash working to escape. In that case, it does appear that it’s not possible to have : in a predicate name.

I just found out the proper answer to this: If you wrap the predicate in angle brackets, you can have : in the query. E.g. <two:words>.
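
For example, a minimal sketch with a made-up predicate IRI (http://example.org/linked_to is purely illustrative, not taken from the data above):

```dql
{
  # The angle brackets let the ":" and "/" characters through the lexer.
  # <http://example.org/linked_to> is a hypothetical predicate name.
  me(func: has(<http://example.org/linked_to>), first: 5) {
    uid
    <http://example.org/linked_to> {
      uid
    }
  }
}
```

As far as I can tell, the same bracketed form works both inside has and as a predicate in the query body.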

Thanks for looking that up @peter.

In hindsight it makes perfect sense, it’s similar to how SPARQL handles URIs.
