"A bigger dataset" fails to finish loading in docker on macOS

I’ve started familiarizing myself with Dgraph by following the docs and the tour.

When I run the dgraph binaries directly and load 1million.rdf.gz, there’s no problem; the terminal output is:

Processing 1million.rdf.gz
[    2s] Txns: 41 RDFs: 41000 RDFs/sec: 20459 Aborts: 0
[    4s] Txns: 91 RDFs: 91000 RDFs/sec: 22748 Aborts: 0
[    6s] Txns: 164 RDFs: 164000 RDFs/sec: 27322 Aborts: 0
[    8s] Txns: 233 RDFs: 233000 RDFs/sec: 29114 Aborts: 0
[   10s] Txns: 300 RDFs: 300000 RDFs/sec: 29999 Aborts: 0
[   12s] Txns: 368 RDFs: 368000 RDFs/sec: 30661 Aborts: 0
[   14s] Txns: 436 RDFs: 436000 RDFs/sec: 31143 Aborts: 0
[   16s] Txns: 497 RDFs: 497000 RDFs/sec: 31062 Aborts: 0
[   18s] Txns: 555 RDFs: 555000 RDFs/sec: 30829 Aborts: 0
[   20s] Txns: 625 RDFs: 625000 RDFs/sec: 31250 Aborts: 0
[   22s] Txns: 679 RDFs: 679000 RDFs/sec: 30860 Aborts: 0
[   24s] Txns: 745 RDFs: 745000 RDFs/sec: 31038 Aborts: 0
[   26s] Txns: 803 RDFs: 803000 RDFs/sec: 30883 Aborts: 0
Number of TXs run         : 845                                                                     
Number of RDFs processed  : 844056
Time spent                : 27.487949534s
RDFs processed per second : 31261

The docs make Docker look like the preferred way to deploy Dgraph, but I’ve been unable to load data when running in containers. I started with the Docker Compose example, but at first I couldn’t figure out how to mount the directory where I’d downloaded 1million.rdf.gz. The example has a “volumes” section under each service that looks like:

volumes:
      - type: volume
        source: dgraph
        target: /dgraph
        volume:
          nocopy: true

The Docker documentation doesn’t clarify what any of that means (or I couldn’t find where it does), and I wasn’t making any headway changing the source or target. But replacing that block with:

volumes:
      - /Users/nfeldman/learn/dgraph:/dgraph

“works” (files in ~/learn/dgraph are visible in the container), though I wasn’t sure why at first. The only other change I made was to set --lru_mb=4096.
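From what I can tell from the Compose file reference, the short form above is a bind mount; if I’m reading it right, the long-form equivalent would be something like:

volumes:
      - type: bind
        source: /Users/nfeldman/learn/dgraph
        target: /dgraph

whereas the example’s original block declares a named volume that lives inside Docker rather than pointing at a host directory.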

When I then run

docker exec -it dgraph_zero_1 dgraph live -r 1million.rdf.gz --zero localhost:5080 -d server:9080 -c 1

it fails with what looks like a connection timeout. It starts out OK:

Processing 1million.rdf.gz
[    2s] Txns: 35 RDFs: 35000 RDFs/sec: 17497 Aborts: 0
[    4s] Txns: 48 RDFs: 48000 RDFs/sec: 12000 Aborts: 0
[    6s] Txns: 78 RDFs: 78000 RDFs/sec: 12999 Aborts: 0
[    8s] Txns: 132 RDFs: 132000 RDFs/sec: 16500 Aborts: 0
...

showing a lower “RDFs/sec” figure on each update, then prints:

[ 1m34s] Txns: 608 RDFs: 608000 RDFs/sec:  6468 Aborts: 0
Error while mutating Assigning IDs is only allowed on leader.
[ 1m36s] Txns: 608 RDFs: 608000 RDFs/sec:  6333 Aborts: 1

and eventually terminates:

[ 5m38s] Txns: 668 RDFs: 668000 RDFs/sec:  1976 Aborts: 1
2019/02/18 01:30:39 transport is closing
github.com/dgraph-io/dgraph/x.Fatalf
	/ext-go/1/src/github.com/dgraph-io/dgraph/x/error.go:115
github.com/dgraph-io/dgraph/dgraph/cmd/live.handleError
	/ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/cmd/live/batch.go:140
github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).request
	/ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/cmd/live/batch.go:182
github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).makeRequests
	/ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/cmd/live/batch.go:194
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1333

I attempted to load the data in Docker multiple times (dropping the database and reloading the schema before each attempt), and it always failed. The triples-per-second rate always fell over time, but the “Error while mutating Assigning IDs” message didn’t always appear (and sometimes appeared more than once).
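For reference, between attempts I dropped everything through the /alter endpoint, something along these lines (assuming the default 8080 port mapping), and then re-posted the schema the same way:

curl -X POST localhost:8080/alter -d '{"drop_all": true}'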

It also fails in the same way if I don’t change the volumes block from what is given in the example and instead copy the archive into the container first, as suggested in this comment.

Why does it take so long and eventually fail to load in Docker?

Thanks.

The Tour and the Docs take different approaches. To proceed from where you are, you’ll need to get comfortable with Docker: how attaching volumes works, and so on.

What happened to you is unusual. Which version of Dgraph are you using?

A quick fix for now (you can come back to Docker later) is to run Dgraph from the binaries.
This video shows how: https://youtu.be/sxqRGfDL7Qw
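Roughly, something like this in two terminals (adjust --lru_mb to your RAM):

dgraph zero
dgraph alpha --lru_mb=2048 --zero localhost:5080

and then run dgraph live as in the tour.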

Cheers.

I do need to learn Docker properly, but I’d hoped the examples wouldn’t require much (or any) real Docker experience to get working.

Using the binaries works great, but I’d like to use Docker in the project I’m considering Dgraph for, which is why I’m still trying to get this working in Docker rather than proceeding with the binaries.

Re: version, the instance running in the container right now gives:

Dgraph version   : v1.0.11
Commit SHA-1     : b2a09c5b
Commit timestamp : 2018-12-17 09:50:56 -0800
Branch           : HEAD
Go version       : go1.11.1

when I do docker exec -it dgraph_server_1 dgraph help.

Thank you.

Are you using “localhost” or an IP address here? Docker doesn’t work without valid addresses, except in cases where you’ve set up a local DNS server on the Docker network.

When I was trying to get the command from the tour to work with the example Docker Compose, I read through Can’t execute tour part “a bigger dataset”. At some point I tried the command from this comment and have used it ever since. Since I don’t understand all of its options, that’s probably a mistake, but I took server to refer to the section in docker-compose.yml named server, copied from the example given here: https://docs.dgraph.io/get-started/#docker-compose. Does Docker Compose not create a network?

Yes, it does, but if you use another container you’ll have to join it to the network that Docker Compose created.
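For example, assuming the Compose project is named “dgraph” (so its default network is dgraph_default), a one-off container could join it like this (the data file would still need to exist inside that container):

docker run --rm -it --network dgraph_default dgraph/dgraph:latest \
    dgraph live -r 1million.rdf.gz --zero zero:5080 -d server:9080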

I think we have curl in that image. Can you curl “localhost:6080/state” and “zero:6080/state”? And “server:8080/” just in case, to see whether you get anything back.

Also, please share your complete Docker Compose file.

All done in the container:

curl -X GET 'localhost:6080/state' and curl -X GET 'zero:6080/state' return almost the same thing (“counter” changes each time):

{
    "counter": "3780",
    "groups": {
        "1": {
            "members": {
                "1": {
                    "id": "1",
                    "groupId": 1,
                    "addr": "server:7080",
                    "leader": true,
                    "lastUpdate": "1550462554"
                }
            },
            "tablets": {
                "_predicate_": {
                    "groupId": 1,
                    "predicate": "_predicate_",
                    "space": "31"
                },
                "dgraph.group.acl": {
                    "groupId": 1,
                    "predicate": "dgraph.group.acl",
                    "space": "39"
                },
                "dgraph.password": {
                    "groupId": 1,
                    "predicate": "dgraph.password",
                    "space": "37"
                },
                "dgraph.user.group": {
                    "groupId": 1,
                    "predicate": "dgraph.user.group",
                    "space": "43"
                },
                "dgraph.xid": {
                    "groupId": 1,
                    "predicate": "dgraph.xid",
                    "space": "36"
                },
                "director.film": {
                    "groupId": 1,
                    "predicate": "director.film"
                },
                "genre": {
                    "groupId": 1,
                    "predicate": "genre"
                },
                "initial_release_date": {
                    "groupId": 1,
                    "predicate": "initial_release_date"
                },
                "name": {
                    "groupId": 1,
                    "predicate": "name"
                }
            },
            "snapshotTs": "8795"
        }
    },
    "zeros": {
        "1": {
            "id": "1",
            "addr": "zero:5080",
            "leader": true
        }
    },
    "maxLeaseId": "3030000",
    "maxTxnTs": "10000",
    "maxRaftId": "1",
    "cid": "2db0697e-fe6c-424c-aef8-04f42795e5f1"
}

curl -I 'server:8080' returns a 200, and curl -X GET 'server:8080' returns “Dgraph browser is available for running separately using the dgraph-ratel binary”.

docker-compose.yml

version: "3.2"
services:
  zero:
    image: dgraph/dgraph:latest
    volumes:
      - /Users/nfeldman/learn/dgraph:/dgraph
    ports:
      - 5080:5080
      - 6080:6080
    restart: on-failure
    command: dgraph zero --my=zero:5080
  server:
    image: dgraph/dgraph:latest
    volumes:
      - /Users/nfeldman/learn/dgraph:/dgraph
    ports:
      - 8080:8080
      - 9080:9080
    restart: on-failure
    command: dgraph alpha --my=server:7080 --lru_mb=4096 --zero=zero:5080
  ratel:
    image: dgraph/dgraph:latest
    volumes:
     - /Users/nfeldman/learn/dgraph:/dgraph
    ports:
      - 8000:8000
    command: dgraph-ratel

volumes:
  dgraph:

Okay, all of that seems fine.

Can you go back to the original Docker Compose volume configs?

Then you do:

docker exec -it dgraph_zero_1 wget "https://github.com/dgraph-io/tutorial/blob/master/resources/1million.rdf.gz?raw=true" -O 1million.rdf.gz -q

or

docker exec -it dgraph_zero_1 curl -o 1million.rdf.gz -L https://github.com/dgraph-io/tutorial/blob/master/resources/1million.rdf.gz?raw=true

And then run the live load.
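That is, the same live load command as before, run from inside the Zero container:

docker exec -it dgraph_zero_1 dgraph live -r 1million.rdf.gz --zero localhost:5080 -d server:9080 -c 1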

Thank you. I’m not sure why, but this did work on the second try. However, it took two attempts, and the successful attempt took 3m56s vs. the 27s it took with the binaries. Is it expected to be dramatically slower in Docker?

Processing 1million.rdf.gz
[    2s] Txns: 14 RDFs: 14000 RDFs/sec:  6999 Aborts: 0
[    4s] Txns: 32 RDFs: 32000 RDFs/sec:  7996 Aborts: 0
[    6s] Txns: 51 RDFs: 51000 RDFs/sec:  8500 Aborts: 0
[    8s] Txns: 68 RDFs: 68000 RDFs/sec:  8500 Aborts: 0
[   10s] Txns: 97 RDFs: 97000 RDFs/sec:  9700 Aborts: 0
[   12s] Txns: 129 RDFs: 129000 RDFs/sec: 10748 Aborts: 0
[   14s] Txns: 161 RDFs: 161000 RDFs/sec: 11500 Aborts: 0
[   16s] Txns: 190 RDFs: 190000 RDFs/sec: 11875 Aborts: 0
[   18s] Txns: 229 RDFs: 229000 RDFs/sec: 12722 Aborts: 0
[   20s] Txns: 245 RDFs: 245000 RDFs/sec: 12250 Aborts: 0
[   22s] Txns: 274 RDFs: 274000 RDFs/sec: 12454 Aborts: 0
[   24s] Txns: 308 RDFs: 308000 RDFs/sec: 12833 Aborts: 0
[   26s] Txns: 326 RDFs: 326000 RDFs/sec: 12538 Aborts: 0
[   28s] Txns: 335 RDFs: 335000 RDFs/sec: 11964 Aborts: 0
[   30s] Txns: 355 RDFs: 355000 RDFs/sec: 11833 Aborts: 0
[   32s] Txns: 361 RDFs: 361000 RDFs/sec: 11281 Aborts: 0
[   34s] Txns: 373 RDFs: 373000 RDFs/sec: 10971 Aborts: 0
[   36s] Txns: 384 RDFs: 384000 RDFs/sec: 10667 Aborts: 0
[   38s] Txns: 397 RDFs: 397000 RDFs/sec: 10447 Aborts: 0
[   40s] Txns: 409 RDFs: 409000 RDFs/sec: 10225 Aborts: 0
[   42s] Txns: 428 RDFs: 428000 RDFs/sec: 10190 Aborts: 0
[   44s] Txns: 445 RDFs: 445000 RDFs/sec: 10114 Aborts: 0
[   46s] Txns: 458 RDFs: 458000 RDFs/sec:  9956 Aborts: 0
[   48s] Txns: 463 RDFs: 463000 RDFs/sec:  9646 Aborts: 0
[   50s] Txns: 466 RDFs: 466000 RDFs/sec:  9320 Aborts: 0
[   52s] Txns: 470 RDFs: 470000 RDFs/sec:  9038 Aborts: 0
[   54s] Txns: 474 RDFs: 474000 RDFs/sec:  8778 Aborts: 0
[   56s] Txns: 477 RDFs: 477000 RDFs/sec:  8518 Aborts: 0
[   58s] Txns: 479 RDFs: 479000 RDFs/sec:  8258 Aborts: 0
[  1m0s] Txns: 482 RDFs: 482000 RDFs/sec:  8033 Aborts: 0
[  1m2s] Txns: 486 RDFs: 486000 RDFs/sec:  7839 Aborts: 0
[  1m4s] Txns: 488 RDFs: 488000 RDFs/sec:  7625 Aborts: 0
[  1m6s] Txns: 492 RDFs: 492000 RDFs/sec:  7455 Aborts: 0
[  1m8s] Txns: 493 RDFs: 493000 RDFs/sec:  7250 Aborts: 0
[ 1m10s] Txns: 493 RDFs: 493000 RDFs/sec:  7043 Aborts: 0
[ 1m12s] Txns: 495 RDFs: 495000 RDFs/sec:  6875 Aborts: 0
[ 1m14s] Txns: 499 RDFs: 499000 RDFs/sec:  6743 Aborts: 0
[ 1m16s] Txns: 502 RDFs: 502000 RDFs/sec:  6605 Aborts: 0
[ 1m18s] Txns: 507 RDFs: 507000 RDFs/sec:  6500 Aborts: 0
[ 1m20s] Txns: 512 RDFs: 512000 RDFs/sec:  6400 Aborts: 0
[ 1m22s] Txns: 517 RDFs: 517000 RDFs/sec:  6305 Aborts: 0
[ 1m24s] Txns: 522 RDFs: 522000 RDFs/sec:  6214 Aborts: 0
[ 1m26s] Txns: 528 RDFs: 528000 RDFs/sec:  6140 Aborts: 0
[ 1m28s] Txns: 533 RDFs: 533000 RDFs/sec:  6057 Aborts: 0
[ 1m30s] Txns: 538 RDFs: 538000 RDFs/sec:  5978 Aborts: 0
[ 1m32s] Txns: 545 RDFs: 545000 RDFs/sec:  5924 Aborts: 0
[ 1m34s] Txns: 551 RDFs: 551000 RDFs/sec:  5862 Aborts: 0
[ 1m36s] Txns: 556 RDFs: 556000 RDFs/sec:  5792 Aborts: 0
[ 1m38s] Txns: 560 RDFs: 560000 RDFs/sec:  5714 Aborts: 0
[ 1m40s] Txns: 565 RDFs: 565000 RDFs/sec:  5650 Aborts: 0
[ 1m42s] Txns: 566 RDFs: 566000 RDFs/sec:  5549 Aborts: 0
[ 1m44s] Txns: 570 RDFs: 570000 RDFs/sec:  5481 Aborts: 0
[ 1m46s] Txns: 580 RDFs: 580000 RDFs/sec:  5472 Aborts: 0
[ 1m48s] Txns: 587 RDFs: 587000 RDFs/sec:  5435 Aborts: 0
[ 1m50s] Txns: 595 RDFs: 595000 RDFs/sec:  5409 Aborts: 0
[ 1m52s] Txns: 603 RDFs: 603000 RDFs/sec:  5384 Aborts: 0
[ 1m54s] Txns: 610 RDFs: 610000 RDFs/sec:  5351 Aborts: 0
[ 1m56s] Txns: 616 RDFs: 616000 RDFs/sec:  5310 Aborts: 0
[ 1m58s] Txns: 619 RDFs: 619000 RDFs/sec:  5246 Aborts: 0
[  2m0s] Txns: 622 RDFs: 622000 RDFs/sec:  5183 Aborts: 0
[  2m2s] Txns: 625 RDFs: 625000 RDFs/sec:  5123 Aborts: 0
[  2m4s] Txns: 631 RDFs: 631000 RDFs/sec:  5089 Aborts: 0
[  2m6s] Txns: 639 RDFs: 639000 RDFs/sec:  5071 Aborts: 0
[  2m8s] Txns: 646 RDFs: 646000 RDFs/sec:  5047 Aborts: 0
[ 2m10s] Txns: 653 RDFs: 653000 RDFs/sec:  5023 Aborts: 0
[ 2m12s] Txns: 658 RDFs: 658000 RDFs/sec:  4985 Aborts: 0
[ 2m14s] Txns: 664 RDFs: 664000 RDFs/sec:  4955 Aborts: 0
[ 2m16s] Txns: 668 RDFs: 668000 RDFs/sec:  4912 Aborts: 0
[ 2m18s] Txns: 674 RDFs: 674000 RDFs/sec:  4884 Aborts: 0
[ 2m20s] Txns: 679 RDFs: 679000 RDFs/sec:  4850 Aborts: 0
[ 2m22s] Txns: 685 RDFs: 685000 RDFs/sec:  4824 Aborts: 0
[ 2m24s] Txns: 691 RDFs: 691000 RDFs/sec:  4799 Aborts: 0
[ 2m26s] Txns: 700 RDFs: 700000 RDFs/sec:  4795 Aborts: 0
[ 2m28s] Txns: 709 RDFs: 709000 RDFs/sec:  4791 Aborts: 0
[ 2m30s] Txns: 717 RDFs: 717000 RDFs/sec:  4780 Aborts: 0
[ 2m32s] Txns: 725 RDFs: 725000 RDFs/sec:  4770 Aborts: 0
[ 2m34s] Txns: 733 RDFs: 733000 RDFs/sec:  4760 Aborts: 0
[ 2m36s] Txns: 740 RDFs: 740000 RDFs/sec:  4744 Aborts: 0
[ 2m38s] Txns: 746 RDFs: 746000 RDFs/sec:  4722 Aborts: 0
[ 2m40s] Txns: 754 RDFs: 754000 RDFs/sec:  4712 Aborts: 0
[ 2m42s] Txns: 761 RDFs: 761000 RDFs/sec:  4698 Aborts: 0
[ 2m44s] Txns: 769 RDFs: 769000 RDFs/sec:  4689 Aborts: 0
[ 2m46s] Txns: 777 RDFs: 777000 RDFs/sec:  4681 Aborts: 0
[ 2m48s] Txns: 782 RDFs: 782000 RDFs/sec:  4655 Aborts: 0
[ 2m50s] Txns: 782 RDFs: 782000 RDFs/sec:  4600 Aborts: 0
[ 2m52s] Txns: 782 RDFs: 782000 RDFs/sec:  4547 Aborts: 0
[ 2m54s] Txns: 783 RDFs: 783000 RDFs/sec:  4500 Aborts: 0
[ 2m56s] Txns: 784 RDFs: 784000 RDFs/sec:  4455 Aborts: 0
[ 2m58s] Txns: 786 RDFs: 786000 RDFs/sec:  4416 Aborts: 0
[  3m0s] Txns: 788 RDFs: 788000 RDFs/sec:  4378 Aborts: 0
[  3m2s] Txns: 790 RDFs: 790000 RDFs/sec:  4341 Aborts: 0
[  3m4s] Txns: 792 RDFs: 792000 RDFs/sec:  4304 Aborts: 0
[  3m6s] Txns: 794 RDFs: 794000 RDFs/sec:  4269 Aborts: 0
[  3m8s] Txns: 796 RDFs: 796000 RDFs/sec:  4234 Aborts: 0
[ 3m10s] Txns: 799 RDFs: 799000 RDFs/sec:  4205 Aborts: 0
[ 3m12s] Txns: 801 RDFs: 801000 RDFs/sec:  4172 Aborts: 0
[ 3m14s] Txns: 804 RDFs: 804000 RDFs/sec:  4144 Aborts: 0
[ 3m16s] Txns: 807 RDFs: 807000 RDFs/sec:  4117 Aborts: 0
[ 3m18s] Txns: 809 RDFs: 809000 RDFs/sec:  4086 Aborts: 0
[ 3m20s] Txns: 811 RDFs: 811000 RDFs/sec:  4055 Aborts: 0
[ 3m22s] Txns: 814 RDFs: 814000 RDFs/sec:  4030 Aborts: 0
[ 3m24s] Txns: 816 RDFs: 816000 RDFs/sec:  4000 Aborts: 0
[ 3m26s] Txns: 819 RDFs: 819000 RDFs/sec:  3976 Aborts: 0
[ 3m28s] Txns: 821 RDFs: 821000 RDFs/sec:  3947 Aborts: 0
[ 3m30s] Txns: 823 RDFs: 823000 RDFs/sec:  3919 Aborts: 0
[ 3m32s] Txns: 826 RDFs: 826000 RDFs/sec:  3896 Aborts: 0
[ 3m34s] Txns: 828 RDFs: 828000 RDFs/sec:  3869 Aborts: 0
[ 3m36s] Txns: 830 RDFs: 830000 RDFs/sec:  3843 Aborts: 0
[ 3m38s] Txns: 832 RDFs: 832000 RDFs/sec:  3817 Aborts: 0
[ 3m40s] Txns: 834 RDFs: 834000 RDFs/sec:  3791 Aborts: 0
[ 3m42s] Txns: 836 RDFs: 836000 RDFs/sec:  3766 Aborts: 0
[ 3m44s] Txns: 838 RDFs: 838000 RDFs/sec:  3741 Aborts: 0
[ 3m46s] Txns: 839 RDFs: 839000 RDFs/sec:  3712 Aborts: 0
[ 3m48s] Txns: 841 RDFs: 841000 RDFs/sec:  3689 Aborts: 0
[ 3m50s] Txns: 843 RDFs: 843000 RDFs/sec:  3665 Aborts: 0
Number of TXs run         : 845                                                                     
Number of RDFs processed  : 844056
Time spent                : 3m50.890242568s
RDFs processed per second : 3669
[ 3m52s] Txns: 845 RDFs: 844056 RDFs/sec:  3638 Aborts: 0
[ 3m54s] Txns: 845 RDFs: 844056 RDFs/sec:  3607 Aborts: 0
[ 3m56s] Txns: 845 RDFs: 844056 RDFs/sec:  3576 Aborts: 0

Nice! Well, my guess is that with “- /Users/nfeldman/learn/dgraph:/dgraph” you are mixing Darwin’s file system with Linux (Docker), which can cause instability. Or maybe you weren’t cleaning the path between attempts. When you mount a volume that way, Docker exposes everything in that path inside the container. It’s always good to start from scratch.
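To really start from scratch with the named-volume setup, something like this between attempts should do it:

docker-compose down -v    # stop the containers and remove the named volumes
docker-compose up -d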

About the performance: this depends a lot on your Docker configuration. A local Docker install usually doesn’t get 100% of the machine’s resources. Your containers may be limited to a single core, and you’re effectively running two CPU-hungry processes. Check your Docker configs.
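You can check what each container is actually getting with:

docker stats --no-stream dgraph_zero_1 dgraph_server_1

and on Docker for Mac, the CPUs and memory allotted to the Docker VM are set in its Preferences.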

OK. Clearly I have a lot to learn about Docker.
Thank you for your help and patience.
