Throughput of Dgraph

Hello everyone!

How many updates per second can a Dgraph database handle? I would like to achieve around 30-50 update requests per second. Is that possible, and if so, what is the best way to achieve it?

Right now, I’ve deployed the standalone image in a Docker container with 2 GB of RAM and 1 vCPU.
I’ve created a type that looks like this:

type Sensor {
    id: ID!
    xid: String! @id @search
    value: Int! @search
    timestamp: DateTime!
    unit: String!
}

I’ve created a single node of that type, and now I would like to update its value and timestamp properties several times per second. To do this, I’ve used a Python script built on GraphqlClient. This is the code:

from python_graphql_client import GraphqlClient
client = GraphqlClient(endpoint="http://localhost:8080/graphql")
data = client.execute(query=someQuery)

Unfortunately, after 5–6 updates I get a ConnectionError (Max retries exceeded with url). Could I achieve this rate using the Dgraph Python client?

Also, I saw that the GraphQL API supports some kind of subscription. Is there a way to subscribe to the database so that I get an event notification whenever something is updated?

Thanks in advance!

The limit depends on quite a number of factors, so there’s no “stock” answer, but Dgraph can manage your expected rate of 30-50 updates per second without blinking. The following code inserts 1,000 “Sensor” nodes in about 120-140 milliseconds.

import datetime

from python_graphql_client import GraphqlClient

# Instantiate the client with an endpoint.
client = GraphqlClient(endpoint="http://localhost:8080/graphql")

# Create the query string and variables required for the request.
query = """
mutation AddSensor($input: [AddSensorInput!]!) {
  addSensor(input: $input) {
    sensor {
      id
    }
  }
}
"""

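# Build 1,000 sensor inputs to send in a single mutation.
# (Named "sensors" so the built-in list type is not shadowed.)
sensors = []
for i in range(1, 1001):
    sensors.append({"xid": "xid" + str(i), "value": i*i, "timestamp": "2019-01-03", "unit": "C"})

start = datetime.datetime.now()
data = client.execute(query=query, variables={"input": sensors})
end = datetime.datetime.now()
elapsed = end - start

# total_seconds() covers the whole duration; .microseconds is only the sub-second part.
print(len(sensors), "records in", elapsed.total_seconds() * 1000, "milliseconds")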

I’ve put this code in my Dgraph sandbox if you want to try it yourself: GitHub - matthewmcneely/dgraph-sandbox at example/python-graphql-client

Regarding the subscription API, I’d invite you to look through the documentation here: GraphQL Subscriptions - GraphQL

Thanks for the answer! Unfortunately, it doesn’t fully clear up my confusion, probably because my question was a bit unclear.

The example that you provided is a form of “batching”, which I would like to avoid if possible.

My current setup subscribes to an MQTT server, and whenever a new message is received it should be sent to the database. I would like to avoid batching because, for instance, if a batch consists of 20 updates for a single sensor, I would only get the last one as a notification in the subscription.

Therefore, I need to call client.execute() whenever a new message is received.

Also, I would like to know whether the kind of update you proposed ensures that sensors are added in the correct order, like [xid1, xid2, xid3, …]. I’ve tested this using @withSubscription, and I didn’t get the correct order.

OK, so instead of appending to the array in the loop, call client.execute() inside the loop. That would mimic a message coming off the queue.
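
A minimal sketch of that per-message pattern. It assumes Dgraph generates an updateSensor mutation taking an UpdateSensorInput (a filter plus a set patch) for the Sensor type above. The helper names build_update and handle_message are hypothetical, and execute stands for any callable with the same keyword arguments as client.execute, so the sketch can be tried without a running server:

```python
# Sketch: one GraphQL update per incoming message (no batching).
# Assumption: Dgraph exposes updateSensor / UpdateSensorInput for the Sensor type.
UPDATE_SENSOR = """
mutation UpdateSensor($patch: UpdateSensorInput!) {
  updateSensor(input: $patch) {
    sensor {
      xid
      value
      timestamp
    }
  }
}
"""


def build_update(xid, value, timestamp):
    """Build the variables that patch value and timestamp of one sensor."""
    return {
        "patch": {
            "filter": {"xid": {"eq": xid}},
            "set": {"value": value, "timestamp": timestamp},
        }
    }


def handle_message(execute, xid, value, timestamp):
    """Send a single update; `execute` is e.g. client.execute (hypothetical wiring)."""
    return execute(query=UPDATE_SENSOR, variables=build_update(xid, value, timestamp))


if __name__ == "__main__":
    # Stand-in for client.execute that only records what would be sent.
    sent = []

    def fake_execute(query, variables):
        sent.append(variables)
        return {"data": {"updateSensor": {"sensor": [{"xid": "xid1"}]}}}

    for reading in (21, 22, 23):
        handle_message(fake_execute, "xid1", reading, "2019-01-03T00:00:00Z")
    print(len(sent), "updates sent")
```

With python_graphql_client, the MQTT message callback would call handle_message(client.execute, xid, value, timestamp) once per received message.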

Ordering in subscriptions is not guaranteed. You can mitigate that with your own incrementing counter (a common tactic in pub-sub systems).
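
One way to sketch that counter tactic: the publisher stamps every update with an increasing sequence number, and the subscriber ignores any notification older than the one it last applied. The seq field and both class names below are hypothetical. The Sensor type above has no such field, so it would have to be added to the schema, and this counter only works within a single publishing process:

```python
import itertools
import threading


class SequenceStamper:
    """Publisher side: attach an increasing `seq` to each outgoing update."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._lock = threading.Lock()

    def stamp(self, update):
        # The lock keeps numbering strictly increasing across threads.
        with self._lock:
            seq = next(self._counter)
        return {**update, "seq": seq}


class LatestOnly:
    """Subscriber side: keep the newest state per sensor, drop stale events."""

    def __init__(self):
        self.latest = {}

    def accept(self, event):
        current = self.latest.get(event["xid"])
        if current is not None and event["seq"] <= current["seq"]:
            return False  # arrived out of order; a newer event was already applied
        self.latest[event["xid"]] = event
        return True


if __name__ == "__main__":
    stamper = SequenceStamper()
    first = stamper.stamp({"xid": "xid1", "value": 21})
    second = stamper.stamp({"xid": "xid1", "value": 22})

    view = LatestOnly()
    # Deliver the notifications in the wrong order on purpose.
    print(view.accept(second), view.accept(first), view.latest["xid1"]["value"])
```

If every intermediate value matters rather than only the latest one, the subscriber could buffer events and sort them by seq instead of dropping the late ones.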