Report a Dgraph Bug
When using live loader with a short-form S3 URI such as s3://dgraph-dev-backups/dataset/1million.schema, the load fails with:
Get "https://dataset.s3.dualstack.us-east-1.amazonaws.com/1million.schema": 301 response missing Location header
What version of Dgraph are you using?
Dgraph version : v21.03.0
Dgraph codename : rocket
Dgraph SHA-256 : b4e4c77011e2938e9da197395dbce91d0c6ebb83d383b190f5b70201836a773f
Commit SHA-1 : a77bbe8ae
Commit timestamp : 2021-04-07 21:36:38 +0530
Branch : HEAD
Go version : go1.16.2
jemalloc enabled : true
Have you tried reproducing the issue with the latest release?
Yep.
What is the hardware spec (RAM, OS)?
n/a
Steps to reproduce the issue (command/config used to run Dgraph).
- Run docker-compose up -d with the following docker-compose.yml:
version: "3.5"
services:
  zero:
    image: dgraph/dgraph:v21.03.0
    command: dgraph zero --my=zero:5080 --replicas 1 --raft idx=1
    container_name: zero
  alpha:
    image: dgraph/dgraph:v21.03.0
    environment:
      DGRAPH_ALPHA_SECURITY: whitelist=0.0.0.0/0
      AWS_ACCESS_KEY_ID: REDACTED
      AWS_SECRET_ACCESS_KEY: REDACTED
    command: dgraph alpha --my=alpha:7080 --zero=zero:5080
    container_name: alpha
- Download 1million dataset and upload to a bucket:
export AWS_PROFILE=dgraph-dev-backups
PREFIX=https://github.com/dgraph-io/benchmarks/raw/master/data/
FILES=(1million.schema 1million.rdf.gz)
for FILE in ${FILES[*]}; do
  curl --silent --location --remote-name $PREFIX/$FILE
  aws s3 cp $FILE s3://dgraph-dev-backups/dataset/
done
- Live Load from S3 Bucket
docker exec -t alpha dgraph live -C \
-s s3://dgraph-dev-backups/dataset/1million.schema \
-f s3://dgraph-dev-backups/dataset/1million.rdf.gz \
-z zero:5080 \
-a alpha:9080
Expected behavior and actual result.
Expected
I would expect the live load to finish without a stack trace and print a summary like:
Number of TXs run : 1042
Number of N-Quads processed : 1041684
Time spent : 2m4.010134056s
N-Quads processed per second : 8400
Actual
A stack trace:
Processing schema file "s3://dgraph-dev-backups/dataset/1million.schema"
2021/06/02 01:52:00 Get "https://dataset.s3.dualstack.us-east-1.amazonaws.com/1million.schema": 301 response missing Location header
Error while reading file
github.com/dgraph-io/dgraph/x.Checkf
/ext-go/1/src/github.com/dgraph-io/dgraph/x/error.go:51
github.com/dgraph-io/dgraph/dgraph/cmd/live.(*loader).processSchemaFile
/ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/cmd/live/run.go:257
github.com/dgraph-io/dgraph/dgraph/cmd/live.run
/ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/cmd/live/run.go:799
github.com/dgraph-io/dgraph/dgraph/cmd/live.init.0.func1
/ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/cmd/live/run.go:134
github.com/spf13/cobra.(*Command).execute
/go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:830
github.com/spf13/cobra.(*Command).ExecuteC
/go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:914
github.com/spf13/cobra.(*Command).Execute
/go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:864
github.com/dgraph-io/dgraph/dgraph/cmd.Execute
/ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/cmd/root.go:78
main.main
/ext-go/1/src/github.com/dgraph-io/dgraph/dgraph/main.go:99
runtime.main
/usr/local/go/src/runtime/proc.go:225
runtime.goexit
Workaround
Using the full long form of the S3 URI, which includes the regional endpoint, works:
docker exec -t alpha dgraph live -C \
-s s3://s3.us-east-2.amazonaws.com/dgraph-dev-backups/dataset/1million.schema \
-f s3://s3.us-east-2.amazonaws.com/dgraph-dev-backups/dataset/1million.rdf.gz \
-z zero:5080 \
-a alpha:9080
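Until the short form is fixed, the rewrite in the workaround can be scripted. The following is only a sketch: to_long_s3_uri is a hypothetical helper, not part of Dgraph or the AWS CLI, and it assumes the long form always follows the s3://s3.<region>.amazonaws.com/<bucket>/<key> pattern used above. The region has to be supplied by the caller (us-east-2 for this bucket).

```shell
# Hypothetical helper (not part of Dgraph): rewrite a short-form S3 URI
#   s3://<bucket>/<key>
# into the long form that worked in the workaround:
#   s3://s3.<region>.amazonaws.com/<bucket>/<key>
to_long_s3_uri() {
  uri="$1"
  region="$2"
  # Strip the scheme, leaving "<bucket>/<key>".
  rest="${uri#s3://}"
  # Reject input without the s3:// scheme, an empty path, or a missing region.
  if [ "$rest" = "$uri" ] || [ -z "$rest" ] || [ -z "$region" ]; then
    echo "usage: to_long_s3_uri s3://<bucket>/<key> <region>" >&2
    return 1
  fi
  printf 's3://s3.%s.amazonaws.com/%s\n' "$region" "$rest"
}
```

For example, to_long_s3_uri s3://dgraph-dev-backups/dataset/1million.schema us-east-2 prints the -s value used in the workaround, so the result can be passed to dgraph live through command substitution.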