Creating Schema and loading data

Hi All,

I am new to Dgraph and have no experience with GraphQL. I am having trouble creating a schema and loading data with the bulk loader.
The schema required is as follows:

Nodes:
label: material
properties: name(string),type(string),number(float/int)
count: approx 2 million

Relationship:
label: parent (material->parent->material) directed
properties: rel_name(string),quantity(int/float)
count: 3 million

I also tried installing Dgraph on my local system, both with Docker and from a source build, but port 8080 is not reachable from dgraph-ratel.

Could you please share the schema and the data loading steps, with syntax?

Happy Weekend

Thanks & Regards,
Vinayak

Can you share details of what you have tried?

This depends on some context. Have you exposed the port? Are you using a native installation of Docker, or Docker in a VM? If it is a VM, you have to use the IP of the VM.

Do you intend to use GraphQL? If so, take some time to learn GraphQL first. There are plenty of tutorials and courses, and what they teach applies to Dgraph, which adds some extra notations/directives of its own.

There is no Bulk Loader for GraphQL as such. You can bulk load an existing Dgraph dataset that came from a GraphQL setup (one created on Dgraph), but the Bulk Loader is Dgraph-specific, so you have to build the dataset in Dgraph's structure to be able to use it. No off-the-shelf dataset is compatible with it.

One suggested model might be:

type Material {
  id: ID!
  name: String
  type: String
  number: Float
  children: [MaterialProgeny]
  parents: [MaterialProgeny]
}

type MaterialProgeny {
  id: ID!
  parent: Material @hasInverse(field: "children")
  child: Material @hasInverse(field: "parents")
  rel_name: String
  quantity: Float
}

And RDF for bulk insert would look something like:

_:foo <dgraph.type> "Material" .
_:foo <Material.name> "Foo" .
_:foo <Material.type> "typeA" .
_:foo <Material.number> "123" .
_:foo <Material.children> _:foo_bar .
_:foo_bar <MaterialProgeny.parent> _:foo .
_:foo_bar <dgraph.type> "MaterialProgeny" .
_:foo_bar <MaterialProgeny.rel_name> "Foo Bar" .
_:foo_bar <MaterialProgeny.quantity> "42" .
_:foo_bar <MaterialProgeny.child> _:bar .
_:bar <Material.parents> _:foo_bar .
_:bar <dgraph.type> "Material" .
_:bar <Material.name> "Bar" .
_:bar <Material.type> "typeB" .
_:bar <Material.number> "3.14" .
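
With a couple of million nodes you will want to generate these lines rather than write them by hand. Here is a minimal Python sketch that emits RDF in the same shape as above. The function names, the `_:m...` / `_:e...` blank-node naming and the idea that each material has a simple alphanumeric key are assumptions of mine for illustration, not anything Dgraph requires.

```python
import json


def literal(value):
    """Quote a value as an RDF string literal, escaping quotes and backslashes."""
    return json.dumps(str(value), ensure_ascii=False)


def material_triples(key, name, mtype, number):
    """RDF lines for one Material node; `key` (assumed alphanumeric) names the blank node."""
    node = f"_:m{key}"
    return [
        f'{node} <dgraph.type> "Material" .',
        f"{node} <Material.name> {literal(name)} .",
        f"{node} <Material.type> {literal(mtype)} .",
        f"{node} <Material.number> {literal(number)} .",
    ]


def relationship_triples(parent_key, child_key, rel_name, quantity):
    """RDF lines for one MaterialProgeny node linking a parent to a child."""
    parent = f"_:m{parent_key}"
    child = f"_:m{child_key}"
    edge = f"_:e{parent_key}_{child_key}"
    return [
        f'{edge} <dgraph.type> "MaterialProgeny" .',
        f"{edge} <MaterialProgeny.parent> {parent} .",
        f"{edge} <MaterialProgeny.child> {child} .",
        f"{edge} <MaterialProgeny.rel_name> {literal(rel_name)} .",
        f"{edge} <MaterialProgeny.quantity> {literal(quantity)} .",
        # Both directions are written out, as in the hand-written RDF above.
        f"{parent} <Material.children> {edge} .",
        f"{child} <Material.parents> {edge} .",
    ]


if __name__ == "__main__":
    lines = []
    lines += material_triples("1", "Foo", "typeA", 123)
    lines += material_triples("2", "Bar", "typeB", 3.14)
    lines += relationship_triples("1", "2", "Foo Bar", 42)
    print("\n".join(lines))
```

In practice you would loop over your CSV rows and write the lines to a file that the loader reads.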

FYI, not all Docker images contain Ratel. This changed in 21.03.

What image are you using and what version of that image?

Hi,

I am using Docker to run Dgraph on my local system.
Reference: https://dgraph.io/docs/deploy/download/
Steps followed for dgraph installation are as follows:

  1. docker pull dgraph/dgraph:v21.03.1
  2. You can test that it worked fine, by running:
     docker run -it dgraph/dgraph:v21.03.1 dgraph
  3. In one terminal, started a dgraph zero instance:
     docker run -it dgraph/dgraph:v21.03.1 dgraph zero
  4. docker run -it dgraph/dgraph:v21.03.0 dgraph bulk -s schema -f nodes.json
Error:
Encrypted input: false; Encrypted output: false
Schema path(schema) does not exist.
The json files are created using the following reference:
https://dgraph.io/docs/migration/loading-csv-data/
Schema is the same mentioned in one of the replies.
Please help me to resolve the issue.

Make sure the schema file is in the directory you are running the command from.

It's in the same directory, but I'm still facing the issue.
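
One thing that may be worth checking here (an assumption on my side, since the exact setup isn't shown): with `docker run`, a file in the host's current directory is not visible inside the container unless that directory is mounted. Below is a small Python sketch that only builds the `docker run` command with a bind mount and a working directory. The `/data` mount point and the `bulk_command` helper are hypothetical names chosen for illustration. It does not deal with networking, so the bulk loader would still need to be able to reach the running Zero instance.

```python
import os
import shlex


def bulk_command(host_dir, image="dgraph/dgraph:v21.03.1",
                 schema="schema", data="nodes.json"):
    """Build a docker run command that mounts host_dir so the container sees the files."""
    # /data is an arbitrary mount point picked for this sketch.
    mount = f"{os.path.abspath(host_dir)}:/data"
    return [
        "docker", "run", "-it",
        "-v", mount,       # bind-mount the host directory into the container
        "-w", "/data",     # run the command from inside the mounted directory
        image,
        "dgraph", "bulk", "-s", schema, "-f", data,
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it.
    print(shlex.join(bulk_command(".")))
```

The relative paths `schema` and `nodes.json` then resolve against `/data` inside the container, which is the host directory you mounted.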

See this video

also you may see this one related to Ratel

I used the schema you shared, with the following changes:

type Material {
  name: String
  type: String
  number: Float! @id
  children: [MaterialProgeny]
  parents: [MaterialProgeny]
}

type MaterialProgeny {
  id: ID!
  parent: Material @hasInverse(field: children)
  child: Material @hasInverse(field: parents)
  rel_name: String
  quantity: Float
}

Error while processing schema file “schema”: rpc error: code = Unknown desc = line 2 column 22: Expected new line after field declaration. Got @

Are you sending this via the GraphQL Admin endpoint? This is a GraphQL schema, so you should send it via the GraphQL Admin endpoint or via the GraphQL flag. There is also a trick you need for the Live Loader or Bulk Loader: add a fake DQL schema and then your GraphQL schema.

@MichelDiz I am new to Dgraph and still exploring it. Apologies, but I didn't understand the solution or how to move forward. The steps I followed are as follows:

  1. Installing Dgraph using Docker:
    docker run -it -p 5080:5080 -p 6080:6080 -p 8080:8080 -p 9080:9080 -p 8000:8000 -v ~/dgraph:/dgraph --name dgraph dgraph/standalone:v21.03.0
  2. Using the Docker CLI to load data.
    Directory: /dgraph/dataload, where the schema and the data loading JSON file are present.
  3. Executing the command:
    dgraph live -f data.json -s schema

Please correct the commands or share ones that can be used to load the data.

Thanks & Regards,
Vinayak

For a GraphQL schema you have to use the GraphQL flag. I think it is -g. Please use the command below to check:

dgraph live -h | grep graphql

So the final command would be something like

dgraph live -f data.json -s emptyText -g schema

You also have to pass an empty file to the -s flag.

By the way, Ratel isn't present in that image anymore, so you can remove the -p 8000:8000 mapping.

dgraph live -h | grep graphql
Blank output
dgraph live -f data.json -s emptyText -g schema
Error: unknown shorthand flag: ‘g’ in -g

Hmm, looks like only the Bulk Loader has it (see dgraph/run.go at 7531e95f9854f9f2315e5400a78cf43c080680d6 in dgraph-io/dgraph on GitHub). Well, just submit the schema at /admin (the GraphQL Admin endpoint) and load the data normally with the Live Loader. The Live Loader doesn't need the empty schema file, though (I was confusing it with the Bulk Loader).

Could you please share the commands or steps I can use? That would be helpful.

Nothing much different from what you are already doing. Just add the GraphQL schema at localhost:8080/admin using a GraphQL client, and you are done.
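
If you don't have a GraphQL client at hand, the request body is easy to build yourself. This is a small Python sketch, assuming the admin API's `updateGQLSchema` mutation. The helper name `schema_update_payload` is mine, and the function only builds the JSON body, which you would then POST to localhost:8080/admin with `Content-Type: application/json`.

```python
import json

# Admin mutation that replaces the current GraphQL schema.
UPDATE_SCHEMA = """
mutation ($schema: String!) {
  updateGQLSchema(input: { set: { schema: $schema } }) {
    gqlSchema { schema }
  }
}
"""


def schema_update_payload(schema_text):
    """Return the JSON request body carrying the schema as a GraphQL variable."""
    return json.dumps({
        "query": UPDATE_SCHEMA,
        "variables": {"schema": schema_text},
    })


if __name__ == "__main__":
    # A tiny stand-in schema; in practice read the contents of your schema file.
    print(schema_update_payload("type Material { id: ID! name: String }"))
```

Passing the schema as a variable avoids having to escape quotes and newlines inside the mutation text by hand.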

https://dgraph.io/docs/graphql/api/api-overview/

I have gone through the documentation but did not find how to pass the schema file in the URL. I think we need support from the Dgraph team or from someone who has already worked on this.
I have a couple of questions, and it would be great if you could answer them. I will discuss them with my team and get back to you, or contact the Dgraph team for further assistance:

  1. I want to load 1.5 million nodes and approximately 3.5 million relations between them. Each node has 24 properties and each relation has 7. How long will it take to load this into Dgraph?

  2. What hardware configuration is required, including the number of servers and other important details, to achieve this?

  3. Does Dgraph support multiple schemas on the same instance? For example, on one instance we want two graphs named graph1 and graph2 with independent schemas.

  4. Would you like to comment on the performance of aggregation and filter queries?

Please answer these questions. It will help us decide whether we can move forward with Dgraph.

Seems like you have many questions unrelated to this topic, but to help you along with loading your schema, here are the docs on updating a GraphQL schema: https://dgraph.io/docs/graphql/admin/#modifying-a-schema

eg:

curl -X POST localhost:8080/admin/schema --data-binary '@schema.graphql'
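
If you would rather do this from a script, here is a rough Python equivalent of the curl call, using only the standard library. The function names and the default base URL are assumptions for illustration. `schema_request` only builds the request, and `push_schema` actually sends it, so that one needs a running Alpha.

```python
import urllib.request


def schema_request(schema_bytes, base_url="http://localhost:8080"):
    """Build a POST to /admin/schema with the raw schema file contents as the body."""
    return urllib.request.Request(
        f"{base_url}/admin/schema",
        data=schema_bytes,
        method="POST",
    )


def push_schema(path, base_url="http://localhost:8080"):
    """Read the schema file at `path`, send it, and return the response body as text."""
    with open(path, "rb") as fh:
        req = schema_request(fh.read(), base_url)
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Assumes schema.graphql is in the current directory and Dgraph is reachable.
    print(push_schema("schema.graphql"))
```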

I agree with you. I am doing a POC on graph databases. I am not sure how to contact the support team, or whichever team can give me these answers.

Yeah, no problem. For visibility and topic awareness, I would just make a new topic with your questions above, and mark a solution on this thread if something here solved this issue.
