Bulk loader in a running Docker container

Hi guys. I have a Dgraph standalone running on Docker. The Docker Compose config is this one:

dgraph:
  image: dgraph/standalone:v20.11.0-rc1-95-g42cfb9636
  container_name: dgraph
  volumes:
    - dgraph-data:/dgraph
  ports:
    - 8080:8080
    - 9080:9080
    - 8000:8000
  restart: on-failure

I used “docker cp” to copy a data.json and a test.schema into my container. Then I opened a shell in the container and ran the following command:

dgraph bulk -f data.json -s test.schema

The problem is that I’m getting:

2020/12/14 14:29:47 listen tcp 127.0.0.1:8080: bind: address already in use

I don’t know how to solve it. We will have Dgraph running on a Kubernetes cluster, so it needs to run in containers. Also, for production I’m thinking about running something like this from my host:

docker exec e89dbc4ee96a sh -c 'dgraph bulk -f data.json -s test.schema'

Are there other options?

Be aware that the standalone image isn’t recommended for production, and it doesn’t give you the benefits of a cluster with multiple nodes (instances).

First, to make this work you have to build your own standalone image, because the bulk loader won’t work with a live cluster. You have to add a script that checks whether the bulk load has finished (something like “wait-for-it.sh”), so that the Alpha instance starts only after the bulk load is done.

This probably happens because you are trying to start a second Alpha inside the container. The standalone image has an Alpha and a Zero instance that run automatically.
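The startup order described above (bulk load first, Alpha afterwards) could be sketched roughly like this. It is only an illustration under several assumptions: the custom image has Python available, Zero listens on its default gRPC port 5080, the bulk loader writes to its default out/0/p directory, and Alpha reads its default p directory. The function names are made up for this example, and a plain shell entrypoint would do the same job.

```python
import os
import shutil
import socket
import subprocess
import time

ZERO_ADDR = "localhost:5080"  # assumed default Zero gRPC address


def bulk_command(data_file, schema_file, graphql_schema=None):
    """Build the bulk loader command line (hypothetical helper)."""
    cmd = ["dgraph", "bulk", "-f", data_file, "-s", schema_file, "--zero", ZERO_ADDR]
    if graphql_schema:
        # -g / --graphql_schema is the flag shown by `dgraph bulk -h`
        cmd += ["-g", graphql_schema]
    return cmd


def wait_for_port(host, port, timeout=60.0):
    """Poll until something accepts TCP connections on host:port."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            time.sleep(0.5)
    return False


def main(data_file="data.json", schema_file="test.schema"):
    # The bulk loader needs a running Zero, but no running Alpha.
    zero = subprocess.Popen(["dgraph", "zero"])
    if not wait_for_port("localhost", 5080):
        zero.terminate()
        raise SystemExit("Zero did not come up in time")

    # Only bulk load on the first start, when there is no postings dir yet.
    if not os.path.exists("p"):
        subprocess.run(bulk_command(data_file, schema_file), check=True)
        shutil.move(os.path.join("out", "0", "p"), "p")

    # Start Alpha only after the bulk load has finished.
    subprocess.run(["dgraph", "alpha", "--zero", ZERO_ADDR], check=True)
```

Calling main() as the container's entrypoint would replace the automatic Alpha/Zero startup of the stock standalone image, so nothing is holding the Alpha ports while the bulk loader runs.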

I see. I will try to create a new docker file running the bulk loader first.

For now, based on what you said, I tried the live loader instead. I just replaced the word “bulk” with “live”, that is:

dgraph live -f data.json -s test.schema

Now I got:

Error while processing schema file “test.schema”: rpc error: code = Unknown desc = line 2 column 15: Expected new line after field declaration. Got @
rpc error: code = Unknown desc = line 2 column 15: Expected new line after field declaration. Got @

The contents of my files are:

data.json

{
  "Person": [
    {
      "xid": "1",
      "name": "Robert"
    }
  ]
}

test.schema

type Person {
  xid: String! @id
  name: String @search(by: [exact])
}

I want to bulk insert data based on a GraphQL schema.

This schema is GraphQL. The live loader and the bulk loader have a flag to load GraphQL schemas. But with the bulk loader you have to pass an empty schema as the main one. I’m not sure whether that has been fixed.

Also, your dataset must be inserted through GraphQL mutations, or you have to change its format, e.g.:

[
  {
    "xid": "1",
    "Person.name": "Robert",
    "Person.age": "30",
    "Person.something": "else",
    "dgraph.type": "Person"
  }
]

I’m not sure about the xid field, i.e. whether you have to add “Person.” in front of it.
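For a file shaped like the data.json from the question, the reshaping could be done with a small script along these lines. It is a sketch, not a definitive converter: to_dgraph_json is a made-up helper name, and it assumes that every field, including xid, gets the “Type.” prefix, which is exactly the point left open above. Adjust it if xid turns out to need no prefix.

```python
import json


def to_dgraph_json(graphql_shaped):
    """Turn {"Person": [{...}, ...]} into a flat list of Dgraph-style nodes.

    Assumption: each GraphQL field maps to a predicate named "<Type>.<field>".
    """
    nodes = []
    for type_name, objects in graphql_shaped.items():
        for obj in objects:
            # Prefix every field with the type name.
            node = {f"{type_name}.{field}": value for field, value in obj.items()}
            # Tag the node with its type so type-based queries can find it.
            node["dgraph.type"] = type_name
            nodes.append(node)
    return nodes


# Example usage (file names taken from the question, output name is arbitrary):
# with open("data.json", encoding="utf-8") as src:
#     converted = to_dgraph_json(json.load(src))
# with open("data.dgraph.json", "w", encoding="utf-8") as dst:
#     json.dump(converted, dst, indent=2)
```

The converted file is a plain JSON array, which is the shape shown in the example above.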

Thanks, @MichelDiz. Where do I find this flag?

Here we go

Dgraph version   : v20.07.2
Dgraph codename  : shuri-2
Dgraph SHA-256   : 180143ae812441778007a9026d78df08bfaf35b85113971b3266d4697ff9b692
Commit SHA-1     : a7bc16d56
Commit timestamp : 2020-10-22 10:17:53 -0700
Branch           : HEAD
Go version       : go1.14.4


micheldiz@micheldizs-iMac-Pro ~ % dgraph bulk -h | grep graphql
 -g, --graphql_schema string    Location of the GraphQL schema file.

I think the live loader doesn’t have it. You should use the admin schema operation to add the GraphQL schema.
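As a rough sketch of that admin schema step: it assumes Alpha’s HTTP port is reachable at localhost:8080 (as in the compose file from the question) and that the schema is posted to the /admin/schema endpoint. The helper names are invented for this example, so check the endpoint against the docs for your Dgraph version.

```python
import urllib.request


def build_schema_request(schema_text, base_url="http://localhost:8080"):
    """Build (but do not send) a POST with the GraphQL schema as the body."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/admin/schema",
        data=schema_text.encode("utf-8"),
        method="POST",
    )


def upload_schema(path, base_url="http://localhost:8080"):
    """Read a schema file and send it to the running Alpha."""
    with open(path, encoding="utf-8") as fh:
        request = build_schema_request(fh.read(), base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


# Example usage against a running Alpha:
# print(upload_schema("test.schema"))
```

Once the GraphQL schema is in place, the live loader can be run with just the data file, provided the data uses the predicate names generated from the schema.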