Hi everyone!
We are adapting our production environment to Dgraph.
We have a cluster with a replication factor of 3 on r5d.2xlarge AWS instances (64GB RAM, 8 cores, SSD): 3 Alphas and 3 Zeros, one of each on every node.
When I run this mutation, it takes about 27s to get a response (from the JS client, and also with curl from localhost).
This is my schema.
I also tried splitting it into several serial requests; each request takes about 1s, but in total it still finishes in about 28s.
What can I change to get better write speed?
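For reference, the mutation is sent roughly like this from the JS client (a minimal sketch with dgraph-js; the payload, endpoint, and chunking helper are placeholders rather than our real data):

// Sketch of the two write paths compared above: one big mutation vs. serial chunks.
const dgraph = require("dgraph-js");
const grpc = require("grpc");

// Hypothetical helper that splits the payload for the serial test.
function chunks(arr, size) {
  const out = [];
  for (let i = 0; i < arr.length; i += size) out.push(arr.slice(i, i + size));
  return out;
}

// One large mutation, committed in a single request (~27s for us).
async function writeOnce(client, records) {
  const mu = new dgraph.Mutation();
  mu.setSetJson(records);
  mu.setCommitNow(true);
  await client.newTxn().mutate(mu);
}

// Serial chunks: each commits in ~1s, but the total is still ~28s.
async function writeSerially(client, records) {
  for (const part of chunks(records, 1000)) {
    const mu = new dgraph.Mutation();
    mu.setSetJson(part);
    mu.setCommitNow(true);
    await client.newTxn().mutate(mu);
  }
}

const stub = new dgraph.DgraphClientStub("localhost:9080", grpc.credentials.createInsecure());
const client = new dgraph.DgraphClient(stub);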
dmai
(Daniel Mai)
December 6, 2018, 8:57pm
2
What version of Dgraph are you running and what’s the Zero and Alpha config set up? Are all these instances in the same region?
My cluster runs in the same region across 3 availability zones: eu-west-1a, eu-west-1b, eu-west-1c.
I used version 1.0.10 and also 1.0.11-rc4.
The commands for the Zeros are:
docker run -d --name=zero --hostname=1.zero.weave.local --restart unless-stopped \
  -v /dgraph/zero/w:/dgraph/zw \
  -p 6080:6080 -p 5080:5080 \
  dgraph/dgraph:v1.0.11-rc4 \
  dgraph zero --my=1.zero.weave.local:5080 --idx 1 --replicas 3
and for the Alphas:
docker run -d --name=alpha --hostname=1.alpha.weave.local --restart unless-stopped \
  -v /dgraph/export:/dgraph/export -v /dgraph/alpha/w:/dgraph/w -v /dgraph/alpha/p:/dgraph/p \
  -p 8080:8080 -p 9080:9080 \
  dgraph/dgraph:v1.0.11-rc4 \
  dgraph alpha --export=/dgraph/export --my=1.alpha.weave.local:7080 --lru_mb 21504 \
    --zero 1.zero.weave.local:5080 -p /dgraph/p --idx=1 -w /dgraph/w \
    --badger.vlog=disk --whitelist 172.17.0.0:172.20.0.0 --bindall=true \
    --custom_tokenizers=/dgraph/plugins/nfd.so
dmai
(Daniel Mai)
December 6, 2018, 9:26pm
4
I ran a cluster and added your schema and mutation (thanks for sharing). I also compiled your nfd tokenizer, which you recently shared in a GitHub issue, for the nfd index.
I ran a 1 Zero/1 Alpha cluster locally with v1.0.10 and v1.0.11-rc4, ran the schema update, and then the mutation. Both the schema update (2 secs) and the mutation (1 sec) finished quickly.
selmeci:
--badger.vlog=disk
Is there a reason to set this when you said your setup is with SSDs?
dmai
(Daniel Mai)
December 6, 2018, 9:27pm
5
Can you clarify what you mean here? What happened for the 1s request and for the 28s request? Is 28s the total time taken to run the several requests in serial?
selmeci
December 14, 2018, 1:39pm
6
My test env has only 16GB of RAM and there was an issue with replication… but when I tried running it with this command:
docker run -d --name=alpha --hostname=1.alpha.weave.local --restart unless-stopped \
  -v /dgraph/export:/dgraph/export -v /dgraph/alpha/w:/dgraph/w -v /dgraph/alpha/p:/dgraph/p \
  -p 8080:8080 -p 9080:9080 \
  dgraph/dgraph:v1.0.11-rc4 \
  dgraph alpha --export=/dgraph/export --my=1.alpha.weave.local:7080 --lru_mb 21504 \
    --zero 1.zero.weave.local:5080 -p /dgraph/p --idx=1 -w /dgraph/w \
    --badger.vlog=mmap --badger.tables=ram --whitelist 172.17.0.0:172.20.0.0 \
    --bindall=true --custom_tokenizers=/dgraph/plugins/nfd.so
the response time was the same.
selmeci
December 14, 2018, 1:40pm
7
Yes, 28s is the total time for all the requests in serial.
dmai
(Daniel Mai)
December 14, 2018, 5:37pm
8
Can you share the server_latency numbers in your responses? There are three of them: parsing_ns, processing_ns, and encoding_ns.
selmeci
December 14, 2018, 5:42pm
9
How can I get it in the JavaScript client?
BTW, I tried converting my JSON to an RDF file and checking how fast it goes with dgraph live, but it is slower.
Processing mutation.rdf.gz
[ 2s] Txns: 0 RDFs: 0 RDFs/sec: 0 Aborts: 0
[ 4s] Txns: 0 RDFs: 0 RDFs/sec: 0 Aborts: 0
[ 6s] Txns: 1 RDFs: 1000 RDFs/sec: 167 Aborts: 0
[ 8s] Txns: 1 RDFs: 1000 RDFs/sec: 125 Aborts: 0
[ 10s] Txns: 1 RDFs: 1000 RDFs/sec: 100 Aborts: 1
[ 12s] Txns: 1 RDFs: 1000 RDFs/sec: 83 Aborts: 1
[ 14s] Txns: 1 RDFs: 1000 RDFs/sec: 71 Aborts: 2
[ 16s] Txns: 1 RDFs: 1000 RDFs/sec: 62 Aborts: 2
[ 18s] Txns: 1 RDFs: 1000 RDFs/sec: 56 Aborts: 3
[ 20s] Txns: 1 RDFs: 1000 RDFs/sec: 50 Aborts: 4
[ 22s] Txns: 1 RDFs: 1000 RDFs/sec: 45 Aborts: 4
[ 24s] Txns: 1 RDFs: 1000 RDFs/sec: 42 Aborts: 5
[ 26s] Txns: 1 RDFs: 1000 RDFs/sec: 38 Aborts: 5
[ 28s] Txns: 1 RDFs: 1000 RDFs/sec: 36 Aborts: 6
[ 30s] Txns: 1 RDFs: 1000 RDFs/sec: 33 Aborts: 6
[ 32s] Txns: 2 RDFs: 2000 RDFs/sec: 62 Aborts: 6
[ 34s] Txns: 2 RDFs: 2000 RDFs/sec: 59 Aborts: 6
[ 36s] Txns: 2 RDFs: 2000 RDFs/sec: 56 Aborts: 7
[ 38s] Txns: 2 RDFs: 2000 RDFs/sec: 53 Aborts: 7
[ 40s] Txns: 2 RDFs: 2000 RDFs/sec: 50 Aborts: 7
[ 42s] Txns: 2 RDFs: 2000 RDFs/sec: 48 Aborts: 8
[ 44s] Txns: 2 RDFs: 2000 RDFs/sec: 45 Aborts: 9
[ 46s] Txns: 2 RDFs: 2000 RDFs/sec: 43 Aborts: 9
[ 48s] Txns: 2 RDFs: 2000 RDFs/sec: 42 Aborts: 10
[ 50s] Txns: 2 RDFs: 2000 RDFs/sec: 40 Aborts: 10
[ 52s] Txns: 2 RDFs: 2000 RDFs/sec: 38 Aborts: 11
[ 54s] Txns: 2 RDFs: 2000 RDFs/sec: 37 Aborts: 11
[ 56s] Txns: 3 RDFs: 3000 RDFs/sec: 54 Aborts: 11
[ 58s] Txns: 3 RDFs: 3000 RDFs/sec: 52 Aborts: 11
[ 1m0s] Txns: 3 RDFs: 3000 RDFs/sec: 50 Aborts: 12
[ 1m2s] Txns: 3 RDFs: 3000 RDFs/sec: 48 Aborts: 13
[ 1m4s] Txns: 3 RDFs: 3000 RDFs/sec: 47 Aborts: 13
[ 1m6s] Txns: 3 RDFs: 3000 RDFs/sec: 45 Aborts: 14
[ 1m8s] Txns: 3 RDFs: 3000 RDFs/sec: 44 Aborts: 14
[ 1m10s] Txns: 3 RDFs: 3000 RDFs/sec: 43 Aborts: 15
[ 1m12s] Txns: 3 RDFs: 3000 RDFs/sec: 42 Aborts: 15
[ 1m14s] Txns: 3 RDFs: 3000 RDFs/sec: 41 Aborts: 15
[ 1m16s] Txns: 4 RDFs: 4000 RDFs/sec: 53 Aborts: 15
[ 1m18s] Txns: 4 RDFs: 4000 RDFs/sec: 51 Aborts: 16
[ 1m20s] Txns: 4 RDFs: 4000 RDFs/sec: 50 Aborts: 16
[ 1m22s] Txns: 4 RDFs: 4000 RDFs/sec: 49 Aborts: 17
[ 1m24s] Txns: 4 RDFs: 4000 RDFs/sec: 48 Aborts: 17
[ 1m26s] Txns: 4 RDFs: 4000 RDFs/sec: 47 Aborts: 18
[ 1m28s] Txns: 5 RDFs: 4537 RDFs/sec: 52 Aborts: 18
[ 1m30s] Txns: 5 RDFs: 4537 RDFs/sec: 50 Aborts: 18
[ 1m32s] Txns: 5 RDFs: 4537 RDFs/sec: 49 Aborts: 19
[ 1m34s] Txns: 5 RDFs: 4537 RDFs/sec: 48 Aborts: 19
[ 1m36s] Txns: 5 RDFs: 4537 RDFs/sec: 47 Aborts: 20
[ 1m38s] Txns: 5 RDFs: 4537 RDFs/sec: 46 Aborts: 20
[ 1m40s] Txns: 6 RDFs: 5537 RDFs/sec: 55 Aborts: 20
[ 1m42s] Txns: 6 RDFs: 5537 RDFs/sec: 54 Aborts: 20
[ 1m44s] Txns: 6 RDFs: 5537 RDFs/sec: 53 Aborts: 20
[ 1m46s] Txns: 6 RDFs: 5537 RDFs/sec: 52 Aborts: 21
[ 1m48s] Txns: 6 RDFs: 5537 RDFs/sec: 51 Aborts: 21
Number of TXs run : 7
Number of RDFs processed : 6537
Time spent : 1m48.420856484s
RDFs processed per second : 60
selmeci
December 14, 2018, 5:48pm
10
I do not know whether it matters, but my dataset already has about 125,000,000 nodes.
But CPU and RAM usage are OK.
dmai
(Daniel Mai)
December 14, 2018, 7:20pm
11
You can get the server latency numbers from the Response#extensions.server_latency field.
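For the JS client, a minimal sketch along these lines should print them; this assumes the dgraph-js gRPC client, and the Latency getter names below are the protobuf-generated ones (an assumption on my part):

// Run a mutation and log the server-side latency fields from the response.
const dgraph = require("dgraph-js");
const grpc = require("grpc");

async function mutateAndLogLatency(payload) {
  const stub = new dgraph.DgraphClientStub("localhost:9080", grpc.credentials.createInsecure());
  const client = new dgraph.DgraphClient(stub);

  const mu = new dgraph.Mutation();
  mu.setSetJson(payload);
  mu.setCommitNow(true);

  const response = await client.newTxn().mutate(mu);
  const latency = response.getLatency();
  if (latency) {
    console.log("parsing_ns:   ", latency.getParsingNs());
    console.log("processing_ns:", latency.getProcessingNs());
    console.log("encoding_ns:  ", latency.getEncodingNs());
  }

  stub.close();
}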
selmeci
December 14, 2018, 8:55pm
12
Response#extensions is nil for me, but .getLatency returns this: 14097299, 27136539745
selmeci
December 15, 2018, 4:53pm
13
I tried a cluster of 3 nodes with a replication factor of 1, but the write speed is the same (about 30s). Why? Why doesn't sharding help?
dmai
(Daniel Mai)
December 19, 2018, 10:02pm
14
I took another look at your schema. A number of predicates have a count index, which is expensive for writes. If the count indexes aren't required for your use case, then your mutation response times should improve significantly without them.
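Dropping @count is just a schema alteration. A minimal sketch with dgraph-js (the predicate name and remaining directives here are hypothetical, since I'm not copying your exact schema):

// Re-apply a predicate's schema without @count so the count index is dropped.
const dgraph = require("dgraph-js");
const grpc = require("grpc");

async function dropCountIndex() {
  const stub = new dgraph.DgraphClientStub("localhost:9080", grpc.credentials.createInsecure());
  const client = new dgraph.DgraphClient(stub);

  // Before: friend: uid @count @reverse .
  // After:  friend: uid @reverse .
  const op = new dgraph.Operation();
  op.setSchema("friend: uid @reverse .");
  await client.alter(op);

  stub.close();
}

dropCountIndex().catch((err) => console.error(err));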