Need more guidance about importing data into Dgraph Cloud

Hi there,

I just started trying Dgraph Cloud and was immediately confused about importing data at scale. I was able to launch a backend, define a schema, and add a couple of records via mutation through the web UI. But I couldn’t find a clear step-by-step tutorial on importing data into the cloud backend. Here is a list of my questions, and thank you in advance.

  1. What’s the difference between the CLI tools ‘slash-graphql’ and ‘dgraph’? Is slash-graphql a dedicated tool for Dgraph Cloud, and dgraph a more generic tool for the self-deployed version? How do I configure user credentials for whichever tool is better suited to interacting with the cloud?

  2. I found something about importing data with slash-graphql here. Something like:
     $ slash-graphql import-data -e https://frozen-mango.cloud.dgraph.io/graphql -t <apiToken> ./import-directory
     What data format is expected in this “import-directory”? What’s the difference between this and the dgraph live way of importing data?

  3. dgraph live needs to know the Dgraph Alpha and Dgraph Zero addresses. How do we find these addresses in the Dgraph Cloud web UI? I also saw that it requires a ‘.schema’ file. Is this file the same as the ‘.graphql’ file that we can download from the web UI?

Hope to get answers soon so I can keep exploring. Dgraph seems to be a great option for Graph DB.

Thanks a lot!

Hi there @chqsark, welcome to the Dgraph forums.

Allow me to answer some of your questions:

First, some terminology (please pay attention to the formatting):

  1. Dgraph is a database.
  2. Dgraph Cloud, or SlashGraphQL, is basically what you would get if Amazon RDS and Hasura had a baby.
  3. dgraph is a program. It has many subprograms. The main use case of this program is the database (you run dgraph zero and dgraph alpha).
  4. slash-graphql is a CLI application to manage and manipulate Dgraph Cloud instances.
  5. dgraph live is the Dgraph live loader, which allows you to load data into an already running database.

Consequently, you do not have to run dgraph if you are using Dgraph Cloud (for the most part).

We’re currently working on the future trajectory of slash-graphql, so I shan’t comment much there. We currently encourage the use of the Cloud API to programmatically interact with Dgraph Cloud instances.

And now, to answer your question: I think you mostly got it right. You use dgraph live to load data into the database. You can find the address in the Overview section of your Dgraph Cloud instance. There should be a link called GraphQL endpoint. The same endpoint, minus the /graphql path, is where your alpha and zero sit. You will need to specify the ports… so I will have to check and get back to you.
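
To illustrate the "endpoint minus /graphql" step, here is a minimal sketch that extracts the host from a GraphQL endpoint URL using only the Python standard library. The helper names `backend_host` and `target_address` are hypothetical, and the port is deliberately left as a caller-supplied parameter because the correct port values are not given in this thread.

```python
from urllib.parse import urlsplit


def backend_host(graphql_endpoint: str) -> str:
    """Return the host of a GraphQL endpoint, dropping the scheme and the /graphql path."""
    parts = urlsplit(graphql_endpoint)
    if not parts.hostname:
        raise ValueError(f"not a full URL: {graphql_endpoint!r}")
    return parts.hostname


def target_address(graphql_endpoint: str, port: int) -> str:
    """Build a host:port string; the port must come from the Dgraph Cloud docs or support."""
    return f"{backend_host(graphql_endpoint)}:{port}"


# The example endpoint from the question above.
endpoint = "https://frozen-mango.cloud.dgraph.io/graphql"
print(backend_host(endpoint))  # frozen-mango.cloud.dgraph.io
```

Once the ports are known, `target_address(endpoint, port)` would give the host:port form that tools such as dgraph live expect for their address options.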

Thanks @chewxy, that clarifies a lot of things.

I was able to interact with Dgraph Cloud via the Python client. However, there seems to be a problem with insertion.

First, I added some sample data via the API Explorer, using a mutation API call against the schema I defined in the ‘Schema’ section of the Dgraph web UI.

Second, I was able to query that data through the Dgraph Python client.

Third, I added one more data item through the Python API. Something like:

p = {
    'Product.id': "1",
    'Product.risks': [
        {
            'Risk.product_id': "1",
            'Risk.job_id': "xxx",
            'Risk.word': "xxx",
            'Risk.serial_number': "xxx",
            'Risk.severity': 3,
            'Risk.raw_score': 99.9,
            'Risk.amount': 5,
        },
    ],
}

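# `client` is an already-connected pydgraph.DgraphClient.
txn = client.txn()
try:
    resp = txn.mutate(set_obj=p)
    print(resp)
    txn.commit()
except pydgraph.AbortedError as e:
    print(e)
finally:
    # Cleans up if the mutation or commit failed.
    txn.discard()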

Then I couldn’t find the new item in the Dgraph web UI via an API Explorer query. The query was something like the following, and it returned all the samples I added through the API Explorer but not the one I inserted with the Python client.

query MyQuery {
  queryProduct {
    id
    risks {
      severity
      raw_score
      amount
    }
  }
}
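
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the GraphQL layer selects nodes by their `dgraph.type`, and the payload written through the Python client above does not set one. The sketch below shows a hypothetical helper, `with_types`, that infers a type name from the `Type.field` key prefix and adds `dgraph.type` to every nested node. It uses only the standard library and does not talk to Dgraph; the payload is a shortened version of the one above.

```python
def with_types(node: dict) -> dict:
    """Return a copy of `node` with 'dgraph.type' inferred from its 'Type.field' keys."""
    tagged = {}
    type_names = set()
    for key, value in node.items():
        # 'Product.id' -> type name 'Product'; skip Dgraph's own 'dgraph.*' keys.
        if "." in key and not key.startswith("dgraph."):
            type_names.add(key.split(".", 1)[0])
        if isinstance(value, list):
            tagged[key] = [with_types(v) if isinstance(v, dict) else v for v in value]
        elif isinstance(value, dict):
            tagged[key] = with_types(value)
        else:
            tagged[key] = value
    # Only tag when the prefix is unambiguous and no type was set by the caller.
    if len(type_names) == 1 and "dgraph.type" not in tagged:
        tagged["dgraph.type"] = type_names.pop()
    return tagged


p = {
    'Product.id': "1",
    'Product.risks': [{'Risk.product_id': "1", 'Risk.severity': 3}],
}
typed = with_types(p)
print(typed['dgraph.type'])                      # Product
print(typed['Product.risks'][0]['dgraph.type'])  # Risk
```

If the assumption holds, passing `typed` instead of `p` to `txn.mutate(set_obj=...)` would be the thing to try.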

Fourth, at this point, the Python client query returned the correct number of data items (the items I added through the API Explorer plus the 1 item I added through the client). However, the resulting JSON seems to be missing fields. My query looks like this:

query = """query all($a: string) {
  all(func: eq(Product.id, $a))
  {
    Product.id
    Product.Risks {
        Risk.product_id
        Risk.job_id
        Risk.word
        Risk.serial_number
        Risk.severity
        Risk.raw_score
        Risk.amount
    }
  }
}"""
variables = {'$a': '1'}

txn = client.txn()
res = txn.query(query, variables=variables)

# If not doing a mutation in the same transaction, simply use:
# res = client.txn(read_only=True).query(query, variables=variables)

pp = json.loads(res.json)
print(pp)
# Print results.
print('Number of products": {}'.format(len(pp['all'])))
for p in pp['all']:
  print(p)

And the above code only prints:

{'all': [{'Product.id': '1'}, {'Product.id': '1'}, {'Product.id': '1'}, {'Product.id': '1'}]}
Number of products": 4
{'Product.id': '1'}
{'Product.id': '1'}
{'Product.id': '1'}
{'Product.id': '1'}

So why are we missing everything else except for the ‘id’?

In short, 2 questions.

  1. Why do the Python client and the API Explorer see inconsistent data?
  2. What did I do wrong in the query so that some fields are missing from the result?

Thanks a lot!
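
One thing that may be worth checking for the second question: the mutation above writes `Product.risks`, while the DQL query asks for `Product.Risks`, and predicate names are case-sensitive. The sketch below is a small standard-library check, not part of pydgraph, that lists predicates requested in a query but never written by a payload. The helper names are hypothetical, the regular expression assumes predicates of the form `Type.field` as used in this thread, and the payload and query are shortened versions of the ones above.

```python
import re


def payload_predicates(node: dict) -> set:
    """Collect every key used in a nested mutation payload."""
    names = set()
    for key, value in node.items():
        names.add(key)
        children = value if isinstance(value, list) else [value]
        for child in children:
            if isinstance(child, dict):
                names |= payload_predicates(child)
    return names


def query_predicates(query: str) -> set:
    """Collect names of the form Type.field that appear in a DQL query string."""
    return set(re.findall(r"\b[A-Za-z_]\w*\.\w+\b", query))


payload = {
    'Product.id': "1",
    'Product.risks': [{'Risk.severity': 3}],
}
query = """query all($a: string) {
  all(func: eq(Product.id, $a)) {
    Product.id
    Product.Risks {
      Risk.severity
    }
  }
}"""

# Predicates the query asks for that the payload never wrote.
print(sorted(query_predicates(query) - payload_predicates(payload)))  # ['Product.Risks']
```

If that mismatch is the cause, querying `Product.risks` instead should return the nested fields.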