Dgraph live crashes when loading 21mil sample freebase data into a t2.medium cluster

Moved from GitHub dgraph/4528

Posted by rahst12:

What version of Dgraph are you using?

1.1.1

Have you tried reproducing the issue with the latest release?

yes

What is the hardware spec (RAM, OS)?

t2.medium, centos

Steps to reproduce the issue (command/config used to run Dgraph).

  1. Install multi-host Dgraph with the following extra parameters:
docker-machine create --driver amazonec2 --amazonec2-instance-type t2.medium --amazonec2-ami ami-02eac2c0129f6376b --amazonec2-ssh-user centos aws01
docker-machine create --driver amazonec2 --amazonec2-instance-type t2.medium --amazonec2-ami ami-02eac2c0129f6376b --amazonec2-ssh-user centos aws02
docker-machine create --driver amazonec2 --amazonec2-instance-type t2.medium --amazonec2-ami ami-02eac2c0129f6376b --amazonec2-ssh-user centos aws03
  2. Set up Dgraph
  3. Get data:
wget "https://github.com/dgraph-io/benchmarks/blob/master/data/21million.rdf.gz?raw=true" -O 21million.rdf.gz -q
wget "https://raw.githubusercontent.com/dgraph-io/benchmarks/master/data/21million.schema" -O 21million.schema -q
  4. Run dgraph live:
dgraph live --files 21million.rdf.gz --alpha 172.31.86.65:9080,172.31.93.166:9080,172.31.87.103:9080 --zero 172.31.86.65:5080 --verbose -c 1 --schema 21million.schema
  5. Wait for error:
[06:33:15Z] Elapsed: 10m20s Txns: 4042 N-Quads: 4042000 N-Quads/s [last 5s]:  2800 Aborts: 0
[06:33:20Z] Elapsed: 10m25s Txns: 4049 N-Quads: 4049000 N-Quads/s [last 5s]:  1400 Aborts: 0
[06:33:25Z] Elapsed: 10m30s Txns: 4050 N-Quads: 4050000 N-Quads/s [last 5s]:   200 Aborts: 0
[06:33:30Z] Elapsed: 10m35s Txns: 4050 N-Quads: 4050000 N-Quads/s [last 5s]:     0 Aborts: 0
2020/01/09 06:33:30 transport is closing
github.com/dgraph-io/dgraph/x.Fatalf
        /tmp/go/src/github.com/dgraph-io/dgraph/x/error.go:101
github.com/dgraph-io/dgraph/dgraph/cmd/live.handleError
        /tmp/go/src/github.com/dgraph-io/dgraph/dgraph/cmd/live/batch.go:104
github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).request
        /tmp/go/src/github.com/dgraph-io/dgraph/dgraph/cmd/live/batch.go:156
github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).makeRequests
        /tmp/go/src/github.com/dgraph-io/dgraph/dgraph/cmd/live/batch.go:169
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:1357

Expected behaviour and actual result.

MichelDiz commented :

Hey @rahst12, for a live load you should upgrade to at least t2.xlarge (I recommend t2.2xlarge for long loads), and when it finishes just downgrade back to t2.medium.
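
The resize suggested above could be scripted roughly as follows. This is only a sketch, assuming the AWS CLI is installed and configured; the instance ID is a hypothetical placeholder, and `DRY_RUN`, `run` and `resize_instance` are illustrative names, not anything from Dgraph or docker-machine. With `DRY_RUN=1` (the default) it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: resize an EC2 instance by stop -> change type -> start.
# DRY_RUN=1 (default) prints the AWS CLI commands instead of running them.
set -u

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: resize_instance <instance-id> <instance-type>
resize_instance() {
  local id="$1" type="$2"
  run aws ec2 stop-instances --instance-ids "$id"
  run aws ec2 wait instance-stopped --instance-ids "$id"
  run aws ec2 modify-instance-attribute --instance-id "$id" --instance-type "Value=$type"
  run aws ec2 start-instances --instance-ids "$id"
  run aws ec2 wait instance-running --instance-ids "$id"
}

# Hypothetical instance ID; replace it with the real ID of each node.
resize_instance i-0123456789abcdef0 t2.xlarge
```

Run it once per node before the load, then again with `t2.medium` afterwards. Note that stopping an instance usually changes its public IP unless an Elastic IP is attached, so machines created with docker-machine may need `docker-machine regenerate-certs` after the restart.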

MichelDiz commented :

@rahst12 Try the new feature called ludicrous mode (eventually consistent writes; you're going to get much better results, I guarantee). It hasn’t been released yet, so use it sparingly, and disable it after you have loaded the dataset.

This mode is not recommended for financial systems.

When the next version is released, change the image tag from master to that version.

docker pull dgraph/dgraph:master
docker run -d -p 5080:5080 -p 6080:6080 -p 8080:8080 -p 9080:9080 -p 8000:8000 -v ~/dgraph:/dgraph --name dgraph dgraph/dgraph:master dgraph zero --ludicrous_mode
docker exec -d dgraph dgraph alpha --zero localhost:5080 --ludicrous_mode
docker exec -it dgraph sh

AND

curl --progress-bar -LS -o 21million.rdf.gz "https://github.com/dgraph-io/benchmarks/blob/master/data/release/21million.rdf.gz?raw=true"
curl --progress-bar -LS -o release.schema "https://github.com/dgraph-io/benchmarks/blob/master/data/release/release.schema?raw=true"

ls

Then

dgraph live -f 21million.rdf.gz -s release.schema --conc=100 --batch=10000

Decrease the values of --conc and --batch to suit your machine; test a few combinations to find out which works best.
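
One way to test the parameters is a small sweep script like the sketch below. The specific conc/batch values tried, and the `DRY_RUN`, `RDF_FILE`, `SCHEMA_FILE`, `live_cmd` and `sweep` names, are assumptions for illustration; only the `dgraph live` flags come from the command above. By default it just prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: time `dgraph live` across a few --conc/--batch combinations.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to run them.
set -u

RDF_FILE="${RDF_FILE:-21million.rdf.gz}"
SCHEMA_FILE="${SCHEMA_FILE:-release.schema}"
DRY_RUN="${DRY_RUN:-1}"

# Build the live-loader command line for one (conc, batch) pair.
live_cmd() {
  local conc="$1" batch="$2"
  echo "dgraph live -f $RDF_FILE -s $SCHEMA_FILE --conc=$conc --batch=$batch"
}

# Print (dry run) or run and time every combination.
sweep() {
  local conc batch cmd start
  for conc in 100 50 10; do
    for batch in 10000 1000; do
      cmd="$(live_cmd "$conc" "$batch")"
      if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"
      else
        start=$(date +%s)
        $cmd || echo "failed: $cmd" >&2
        echo "conc=$conc batch=$batch took $(( $(date +%s) - start ))s"
      fi
    done
  done
}

sweep
```

Each real run loads the whole file again, so for tuning it is probably better to point `RDF_FILE` at a smaller sample and to start from a clean cluster between runs.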

Thanks! I’m late to reply here, but I’m going to give this a shot over the next two weeks or so.