I want to run the Dgraph live loader to load .rdf files into a server. It works fine when I manually download the .rdf.gz file, unzip it, and run the live loader locally with the following command:
dgraph live -f g01.rdf -a alpha:9080
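(As far as I can tell, the live loader can also read gzipped RDF directly, so the unzip step may be unnecessary; the same command pointed at the compressed file:)

dgraph live -f g01.rdf.gz -a alpha:9080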
However, when I try to read g01.rdf.gz directly from AWS S3, it always reports that there are no files in the folder. Here is the command I tried:
dgraph live -C -f s3:///bucket-name/directory-with-rdf -a alpha:9080
I have set my access key ID and secret access key as environment variables, so that part should be fine. Can anyone help me with this? Thanks!
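For reference, the credentials were set roughly like this (the variable names are the ones the AWS SDKs expect; values elided):

export AWS_ACCESS_KEY_ID="<access-key-id>"
export AWS_SECRET_ACCESS_KEY="<secret-access-key>"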
@mattZhang17 I think I found the problem and the solution. The S3 URI has to be the long form (s3://s3.<region>.amazonaws.com/<bucket>), instead of the short form (s3:///<bucket>) used with the AWS CLI.
Following up further: I realized that I had not used the triple-slash short form of the S3 URL, e.g. s3:///<my-bucket>. After making this correction, I did not encounter any problems, and thus I was not able to reproduce the issue.
I ran the command inside the container, as it has the dgraph binary in it. The container has AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY set as environment variables.
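If the variables are only set on the host shell rather than in the container, they can also be passed through at exec time; a sketch, assuming the container is named alpha and the credentials are exported on the host:

docker exec -t \
  -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
  -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \
  alpha \
  dgraph live -f s3:///<bucket>/<path>/ -z zero:5080 -a alpha:9080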
Both types of S3 URLs worked for me:
docker exec -t alpha \
  dgraph live \
    -s s3:///<bucket>/<path>/1million.schema \
    -f s3:///<bucket>/<path>/ \
    -z zero:5080 \
    -a alpha:9080

docker exec -t alpha \
  dgraph live \
    -s s3://s3.<region>.amazonaws.com/<bucket>/<path>/1million.schema \
    -f s3://s3.<region>.amazonaws.com/<bucket>/<path>/ \
    -z zero:5080 \
    -a alpha:9080
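For completeness, here is how the schema and data files were staged into the bucket in the first place: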
mkdir data && pushd data
PREFIX="https://github.com/dgraph-io/benchmarks/raw/master/data"
FILES=(1million.schema 1million.rdf.gz)
export AWS_PROFILE="<profile-with-priv>"
# download data and schema from GitHub, then upload to S3
for FILE in "${FILES[@]}"; do
  curl --silent --location --remote-name "$PREFIX/$FILE"
  aws s3 cp "$FILE" "s3://<bucket>/<path>/"
done
# verify
aws s3 ls "s3://<bucket>/<path>/"
popd
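To sanity-check the load afterwards, you can query Alpha over HTTP; a sketch, assuming Alpha's HTTP port 8080 is published and the 1million dataset (which has a name predicate) was loaded. On older Dgraph versions the content type is application/graphql+- rather than application/dql:

curl --silent -H 'Content-Type: application/dql' \
  -d '{ q(func: has(name), first: 3) { name } }' \
  http://localhost:8080/query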