Dgraph benchmark

I'm running a benchmark against Dgraph. My cluster consists of 3 Alphas and 3 Zeros: the Alphas are deployed on AWS m5.large instances and the Zeros on AWS t3.medium instances. Each test request includes 1 query and 2 mutations. The cluster only reaches about 40 QPS of throughput; the benchmark results look like this:

1. wrk2 parameters: wrk2 -t5 -c20 -R40 -d2m

  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   363.30ms  196.14ms   1.43s    73.15%
    Req/Sec    19.42      4.29    31.00     77.21%
  Latency Distribution (HdrHistogram - Recorded Latency)
 50.000%  329.73ms
 75.000%  453.38ms
 90.000%  606.21ms
 99.000%    1.06s 
 99.900%    1.24s 
 99.990%    1.44s 
 99.999%    1.44s 
100.000%    1.44s 
#[Mean    =      363.302, StdDeviation   =      196.141]
#[Max     =     1434.624, Total count    =         4384]
#[Buckets =           27, SubBuckets     =         2048]
----------------------------------------------------------
  4785 requests in 2.00m, 1.38MB read
Requests/sec:     39.86
Transfer/sec:     11.79KB

alpha   cpu 48%   iops 366
zero    cpu 34%   iops 244

2. wrk2 parameters: wrk2 -t5 -c40 -R80 -d2m
50.000%  973.82ms
 75.000%    1.11s 
 90.000%    1.26s 
 99.000%    1.54s 
 99.900%    1.70s 
 99.990%    1.85s 
 99.999%    1.85s 
100.000%    1.85s 
#[Mean    =      978.974, StdDeviation   =      217.198]
#[Max     =     1853.440, Total count    =         4508]
#[Buckets =           27, SubBuckets     =         2048]
----------------------------------------------------------
  4973 requests in 2.00m, 1.44MB read
Requests/sec:     41.42
Transfer/sec:     12.26KB
alpha   cpu 46%   iops 277
zero    cpu 46%   iops 156

Why are resources not being fully utilized?

@dmitry did a benchmark as well.

Did I hit 1B+ Transactions?

Maybe the link can help!

@abhijit-kar Thanks for the reply. This is my query result:

{"data":{"code":"Success","message":"Done","uids":[]},"extensions":{"server_latency":{"parsing_ns":30952,"processing_ns":1583801668},"txn":{"start_ts":415734,"commit_ts":415798,"preds":["1-ww\/attr\/enterFee","1-ww\/attr\/socialLevel","1-ww\/attr\/storeTime","1-_predicate_","1-ww\/attr\/manageLevel","1-ww\/attr\/tryFlag"]}}}

processing_ns: 1583801668 is about 1.58 s, which looks too long, but the load on my Dgraph instances is not very high.
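For readability I pull the latency numbers out of the response with jq (same mutate call as above, request body elided):

curl -s -H "X-Dgraph-MutationType: json" -X POST 172.30.29.135:8087/mutate?startTs=0 -d '...' \
  | jq '.extensions.server_latency'
# => { "parsing_ns": 30952, "processing_ns": 1583801668 }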

What do your inserts look like with regard to predicates? I.e., do you have lots of predicates with a high cardinality of values, or just a couple of predicates that you write to heavily?

@seanlaff Thanks for the reply. My schema looks like this:

<xid>:string @index(hash) @upsert .
<rel1>:uid @reverse .
<rel2>:uid @reverse .
<attr1>:int @index(int) .
<attr2>:int @index(int) .
<attr3>:int @index(int) .
<attr4>:int @index(int) .
<attr5>:int @index(int) .
<attr6>:int @index(int) .
<attr7>:int @index(int) .

Each test request goes through three steps:

1. check whether the xid exists
2. create the node
3. set its attributes

Dgraph seems to top out at about 40 QPS. Even when I scale the Dgraph instances (more vCPUs, more memory), it still only delivers about 40 QPS. I'm confused.

What do these parameters mean?

Also, have you tried increasing the concurrency of your requests? Can you share the exact queries and mutations, with a script showing how to reproduce these results? The throughput numbers depend on your queries/mutations and on the machine that you are using. If you can share exact steps to reproduce, we could help here.

@pawan Thanks for the reply.

wrk2 -t5 -c20 -R40 -d2m

wrk2 is a modern HTTP benchmarking tool (https://github.com/giltene/wrk2).

The parameters mean:
-t, --threads     <N>  Number of threads to use
-c, --connections <N>  Connections to keep open
-d, --duration    <T>  Duration of test 
-R, --rate        <T>  work rate (throughput)     
                           in requests/sec (total)    
                           [Required Parameter]
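
A full invocation points at whatever endpoint the service exposes, for example (the URL here is a placeholder, not my real one):

wrk2 -t5 -c20 -R40 -d2m http://<php-service-host>/<endpoint>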

Test model:

wrk2 -> PHP service -> dgraph

Test env:

Dgraph v1.0.16 deployed on an AWS m5d.xlarge (single host).
The m5d.xlarge instance has 130 GB of NVMe local storage.

Test schema:

<xid>:string @index(hash) @upsert .
<rel1>:uid @reverse .
<rel2>:uid @reverse .
<attr1>:int @index(int) .
<attr2>:int @index(int) .
<attr3>:int @index(int) .
<attr4>:int @index(int) .
<attr5>:int @index(int) .
<attr6>:int @index(int) .
<attr7>:int @index(int) .
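
I load the schema through Dgraph's /alter endpoint (same Alpha address as the requests below; schema abridged here, full version above):

curl -X POST 172.30.29.135:8087/alter -d '
<xid>:string @index(hash) @upsert .
<rel1>:uid @reverse .
<attr1>:int @index(int) .
'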

Each test request includes query/mutate operations:

Start a transaction

curl -H "Content-Type: application/graphql+-" -X POST 172.30.29.135:8087/query -d '
{
    q(func: eq(xid, ["ww_u_167507121152"])) { uid xid }
}
'
curl -H "X-Dgraph-MutationType: json" -X POST 172.30.29.135:8087/mutate?startTs=0 -d '
{
    "set":[
        {
            "xid":"ww_u_167507121152",
            "uid":"_:ww_u_167507121152"
        }
    ]
}
'
Commit transaction

curl -H "Content-Type: application/graphql+-" -X POST 172.30.29.135:8087/query -d '
{
    q(func: eq(xid, ["ww_u_167507121152"])) { uid xid }
}
'
curl -H "X-Dgraph-MutationType: json" -X POST 172.30.29.135:8087/mutate?startTs=0 -d '
{
    "set":[
        {
            "attr1":49800,
            "attr2":100,
            "attr3":100,
            "attr4":0,
            "attr5":1596526586,
            "uid":"0x9d99f4"
        }
    ]
}
'
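
The "Start a transaction" / "Commit transaction" steps go through the raw HTTP transaction API: the first query returns txn.start_ts, which is passed as startTs on the following requests, and the commit posts back the keys collected from the mutation responses. A sketch of the commit call (the values are placeholders, assuming the v1.0 raw-HTTP flow):

curl -X POST '172.30.29.135:8087/commit?startTs=415734' -d '
["<key-from-mutation-response>", "<key-from-mutation-response>"]
'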

Test result:

wrk2 -t2 -c5 -R80 -d2m

  Latency Distribution (HdrHistogram - Uncorrected Latency (measured without taking delayed starts into account))
 50.000%   66.50ms
 75.000%   84.42ms
 90.000%  101.57ms
 99.000%  139.01ms
 99.900%  536.06ms
 99.990%    1.23s 
 99.999%    1.31s 
100.000%    1.31s 
----------------------------------------------------------
  7141 requests in 2.00m, 1.89MB read
Requests/sec:     59.50
Transfer/sec:     16.15KB

cpu 40%
DiskWriteOps 1500

You can try to reproduce it with these steps. Thank you so much.

So I noticed that you specify R (rate) as 40/sec and you are getting the same throughput, which seems good to me. Maybe you should try larger values of R to test the limits here?

Also you seem to be using a really old version of Dgraph. I’d recommend using the latest version as we would have made many improvements since 1.0.16.

@pawan When I change R to 80, the test result is:

wrk2 -t2 -c5 -R80 -d2m

  Latency Distribution (HdrHistogram - Uncorrected Latency (measured without taking delayed starts into account))
 50.000%   66.50ms
 75.000%   84.42ms
 90.000%  101.57ms
 99.000%  139.01ms
 99.900%  536.06ms
 99.990%    1.23s 
 99.999%    1.31s 
100.000%    1.31s 
----------------------------------------------------------
  7141 requests in 2.00m, 1.89MB read
Requests/sec:     59.50
Transfer/sec:     16.15KB

cpu 40%
DiskWriteOps 1500

CPU and disk do not appear to be the bottleneck.
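
One sanity check on the numbers (Little's law: throughput cannot exceed open connections divided by mean latency) suggests the client's connection count may be the cap, since the bound for the earlier -c40/-R80 run matches what was observed almost exactly:

echo "scale=1; 40 / 0.979" | bc   # -c40 run, mean latency 978.974 ms -> ~40.8 req/s (observed 41.42)
echo "scale=1; 5 / 0.0665" | bc   # -c5 run, ~66.5 ms median latency -> ~75 req/s ceiling

So it may be worth rerunning with more connections as well as a higher rate before concluding that Dgraph is the bottleneck.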

I will run the same test case on the latest version.