Live uploader: "No data files found in" error

I’m using the live uploader to load data files into Dgraph.

  • It works fine if I pass the full path to a data file.
  • I get a “No data files found” error if I pass a directory path as the value of --files.

As per the docs link, passing a directory should work.

Can someone help here?

  1. Put all gzip data files in a local c:\dgraph directory
  2. On my local Windows machine, start the latest dgraph container
    docker run -ti --add-host host.docker.internal:host-gateway --network host -v c:\dgraph:/root/dgraph --name dgraph dgraph/dgraph:latest dgraph zero
    
  3. Port-forward zero and alpha from the cloud cluster
    # On a new terminal: zero load balancer
    kubectl port-forward svc/dgraph-dgraph-zero 5080:5080 -n dgraph
    
    # On a new terminal: alpha load balancer
    kubectl port-forward svc/dgraph-dgraph-alpha 9080:9080 -n dgraph
    
  4. Connect to the local dgraph container
    # get container_id by running docker ps -a
    docker exec -it <container_id> /bin/sh
    
  5. Run dgraph live, passing the folder path to the --files parameter
    dgraph live --files /root/dgraph/upload/ --schema /root/dgraph/schema/Person.rdf --alpha host.docker.internal:9080 --zero host.docker.internal:5080 --format=rdf --upsertPredicate "xid"
    
    # Notice this will fail with the error below
    Dgraph version   : v21.12.0
    Dgraph codename  : zion
    Dgraph SHA-256   : 078c75df9fa1057447c8c8afc10ea57cb0a29dfb22f9e61d8c334882b4b4eb37
    Commit SHA-1     : d62ed5f15
    Commit timestamp : 2021-12-02 21:20:09 +0530
    Branch           : HEAD
    Go version       : go1.17.3
    jemalloc enabled : true
    
    For Dgraph official documentation, visit https://dgraph.io/docs.
    For discussions about Dgraph     , visit http://discuss.dgraph.io.
    For fully-managed Dgraph Cloud   , visit https://dgraph.io/cloud.
    
    Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
    Copyright 2015-2021 Dgraph Labs, Inc.
    
    Running transaction with dgraph endpoint: host.docker.internal:9080
    
    Processing schema file "/root/dgraph/schema/Person.rdf"
    Processed schema file "/root/dgraph/schema/Person.rdf"
    
    No data files found in /root/dgraph/upload
    
  6. If I run step #5 passing an explicit data file name instead of the directory, it works fine (see the command sketch just below).
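For reference, the working variant of step #5 looks roughly like this (the file name here is only illustrative, not my actual file name):

    dgraph live --files /root/dgraph/upload/part-000.csv.gz --schema /root/dgraph/schema/Person.rdf --alpha host.docker.internal:9080 --zero host.docker.internal:5080 --format=rdf --upsertPredicate "xid"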

We don’t support Windows anymore.

I don’t quite understand what is happening. You mention “Cloud” and show a K8s command. OK, but you are running a zero instance locally? Why? If your cluster is in the Cloud, you don’t need a zero instance locally. You just need to run a live loader instance. Nothing more.
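If the only goal is to load data into the cloud cluster, a rough sketch would be to run just the live loader in a throwaway container, reusing the mount, endpoints and flags already shown in the steps above (untested on Windows; the flags are simply copied from step #5):

    docker run --rm -ti --add-host host.docker.internal:host-gateway --network host -v c:\dgraph:/root/dgraph dgraph/dgraph:latest dgraph live --files /root/dgraph/upload/ --schema /root/dgraph/schema/Person.rdf --alpha host.docker.internal:9080 --zero host.docker.internal:5080 --format=rdf --upsertPredicate "xid"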

The schema in Dgraph isn’t RDF; it is plain text.

Do the files have the .rdf extension?
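To illustrate what I mean by plain text: a schema file usually looks something like this (these predicates are just an example, not your actual Person.rdf):

    # Plain-text DQL schema, not RDF
    xid: string @index(exact) @upsert .
    name: string @index(term) .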

This issue is now resolved.

The issue was that my data files had a .csv.gz extension; Dgraph expects them to be .rdf.gz.
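In case someone else hits this: assuming the files actually contain RDF and only the extension is wrong, a bulk rename like this is enough (just a sketch, run inside the container):

    # Rename *.csv.gz to *.rdf.gz so the live loader picks them up
    cd /root/dgraph/upload
    for f in *.csv.gz; do mv "$f" "${f%.csv.gz}.rdf.gz"; done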

Just to update…

  1. By cloud, I meant AKS
  2. My zero and alpha are running in AKS
  3. To run the live uploader, I was running an instance of the dgraph container on Windows and port-forwarding to AKS
  4. The live uploader is working flawlessly

Dgraph expects the files to be .gz, .rdf, or JSON.

CSV? Dgraph doesn’t support CSV via live or bulk.

What else is blocking you?

  • What is the optimal number of files and file size that can be submitted to the live uploader? I’m asking because dgraph is getting killed with 500 files of 50 MB each. My dgraph command is…

    dgraph live --files /root/dgraph/data/1642449041/diagnosis --schema /root/dgraph/schema/Patient.rdf --alpha host.docker.internal:9080 --zero host.docker.internal:5080 --format=rdf --upsertPredicate "xid" -b 50000
    
  • With Ratel, edge/node labels weren’t getting rendered properly. See the picture below…

These seem like two new questions; you may want to post them as separate topics for better SEO.

No idea; it is unusual (at least for me) to load several files with the live loader. In general it is one big RDF file, sometimes gigabytes in size. I guess this is some OS limitation, maybe a RAM limitation, or maybe something in the Cobra lib. Just guessing.
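If you want to rule that out, one thing to try (just a sketch, assuming the files under that directory are gzipped RDF) is to combine them into a single gzipped file and load that instead:

    # Merge many .rdf.gz files into one single-stream gzip file
    zcat /root/dgraph/data/1642449041/diagnosis/*.rdf.gz | gzip > /root/dgraph/data/combined.rdf.gz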

There are some rules that I don’t remember now. Also, some predicates are only rendered if you zoom in. Share the query and a copy of the result in JSON.