DGraph Times Out Processing Graph

Moved from GitHub dgraph/4231

Posted by jm4games:

Dgraph Version?

Dgraph version : v1.1.0
Dgraph SHA-256 : 7d4294a80f74692695467e2cf17f74648c18087ed7057d798f40e1d3a31d2095
Commit SHA-1 : ef7cdb28

Hardware Spec?

OS: Arch Linux, (running dgraph in docker env)
CPU: AMD 3900x
RAM: 64GB

Steps to reproduce the issue (command/config used to run Dgraph).

schema.txt - graph layout
trait.txt - initial seed data
graph.gz - (full graph)

Commands:
Update schema through dgraph web UI (copy and paste schema.txt into schema bulk edit)
Then run the following from a shell in the directory where the downloaded trait.txt.gz and graph.gz exist.

$ curl -X POST -H 'Content-Encoding: gzip' -H "Content-Type: application/rdf" localhost:8080/mutate\?commitNow=true --data-binary @trait.txt.gz
$ curl -X POST -H 'Content-Encoding: gzip' -H "Content-Type: application/rdf" localhost:8080/mutate\?commitNow=true --data-binary @graph.gz

Expected behaviour and actual result.

Dgraph should process the graph, but instead spins on it and never completes processing.

Other

I know the graph itself is sound, as I have processed smaller data sets with the exact same layout. It seems that once a graph reaches a certain size, dgraph can no longer process it. Maybe I have exceeded some sort of internal limit. I have also attached a log of the dgraph server, and it looks like raft consensus is timing out.

Files

graph.gz
error.log
schema.txt
trait.txt.gz

MichelDiz commented :

What exactly are you doing? Is it a bulk load? live load?

The file graph.gz isn’t correct for bulk. Bulk and Live loading only accept N-Quads or JSON. No upserts, and not even this format:

mutation {
  set {

  }
}

If you wanna change something, you should do it after the load via upsert. Not during the load.
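For reference, a rough sketch of what Live load expects: plain N-Quads (optionally gzipped) plus the schema, with no mutation or upsert wrapper. The predicates below are made up, and the flags are from v1.1 (check dgraph live --help on your version):

# made-up sample data, plain N-Quads
cat > sample.rdf <<'EOF'
_:alice <name> "Alice" .
_:alice <has_trait> _:t1 .
_:t1 <trait_name> "Example" .
EOF
gzip sample.rdf

# assuming alpha gRPC on localhost:9080 and zero on localhost:5080
dgraph live -f sample.rdf.gz -s schema.txt -a localhost:9080 -z localhost:5080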

What is the config of your cluster? How many groups and instances do you have?

jm4games commented :

This is on a single box using 1 dgraph/zero/ratel server (e.g. the getting started docker-compose). This uses the standard localhost:8080/mutate endpoint (with commitNow=true). The graph.gz is a dump of what I am posting to the endpoint.

MichelDiz commented :

How much RAM does it consume doing this?

jm4games commented :

It would spike occasionally and then eventually sit at 6~10 GB. I would have to run it again to make sure, and I won’t be at my dev box for a while. But graph.gz is only ~168 MB unzipped, so I am kind of surprised it was taking so much RAM.

jm4games commented :

@MichelDiz I updated the description with proper repro steps (sorry, should have done that to begin with). Got onto a laptop and performed the repro; I killed dgraph once it started to exceed 10 GB of RAM.

Also fixed traits.txt (now traits.txt.gz)

MichelDiz commented :

As you have a lot of RAM available, try to run your cluster with these flags (Just for this situation):

--badger.tables=ram
--badger.vlog=mmap

dgraph alpha -h

  --badger.tables string   [ram, mmap, disk] Specifies how Badger LSM tree is stored.
                           Option sequence consume most to least RAM while providing
                           best to worst read performance respectively. (default "mmap")
  --badger.vlog string     [mmap, disk] Specifies how Badger Value log is stored.
                           mmap consumes more RAM, but provides better performance.
                           (default "mmap")
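
One way to apply them is to adjust the alpha command in your compose file; a sketch (the --my, --zero and --lru_mb values are just guesses at a typical single-node setup):

dgraph alpha --badger.tables=ram --badger.vlog=mmap \
  --my=server:7080 --zero=zero:5080 --lru_mb=4096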

jm4games commented :

OK, I will have to give it a shot tonight; my current laptop doesn’t have nearly as much RAM :slight_smile:

MichelDiz commented :

What I have indicated is a way of assessing whether it could be bottlenecked by I/O.

I would then recommend that you have an actual cluster, where each group has its own resources. A single instance with a slow HDD tends to have poor performance. But if you have multiple instances with their own resources and well-defined groups, even with slow HDDs, there is no performance loss.

Also, Dgraph/BadgerDB works better with SSD/NVMe.
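
As a rough sketch of what I mean by separate instances (hostnames and --lru_mb are placeholders; with zero's default --replicas=1 each alpha becomes its own group):

dgraph zero --my=host1:5080 &
dgraph alpha --my=host1:7080 --zero=host1:5080 --lru_mb=4096 &
dgraph alpha --my=host2:7080 --zero=host1:5080 --lru_mb=4096 &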

jm4games commented :

So I am running on a Corsair MP600, which gets about 4250 MB/s in write throughput, so I doubt it’s I/O bound. You won’t even see those speeds in a small cluster setup. The only thing I can think of that might affect this is that I am using a docker volume, and that could be I/O limited.
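
A quick way to rule the volume out (the container name and path here are just placeholders for whatever the compose file creates):

docker stats server_1        # watch BLOCK I/O and memory while the mutation runs
docker exec server_1 sh -c 'dd if=/dev/zero of=/dgraph/ddtest bs=1M count=1024 oflag=direct; rm -f /dgraph/ddtest'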

jm4games commented :

So it seems these flags did not make any difference. I’ve also noticed that since the request failed to fully process, dgraph tries to recover it when I restart the server. The recovered “job?” causes the server to start spinning again. The server also doesn’t seem to be responsive to simple query commands.

jm4games commented :

Also, I’ve been monitoring the server trying to process the job for over 15 minutes. It is sitting at 7.5 GB of RAM use (which has stayed consistent since it was reached, though slowly rising) and fluctuates between using 2 vcores and all 24 vcores. I don’t see much I/O activity (via iotop). I suspect maybe Golang’s GC is kicking in for a massive cleanup when it hits all 24 cores?
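
If it helps, alpha exposes Go's pprof handlers on its HTTP port, so a profile can be grabbed while it spins; a sketch, assuming the default 8080:

go tool pprof http://localhost:8080/debug/pprof/heap      # where the RAM is going
go tool pprof http://localhost:8080/debug/pprof/profile   # 30s CPU profile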

MichelDiz commented :

Quick question: how do you run it? 185 MB via Ratel?

jm4games commented :

How do I run the request? I run all servers via docker-compose and send the request via localhost:8080/mutate, which is alpha.

jm4games commented :

Looks like the job finally finished (but didn’t commit the graph changes):

server_1  | W1101 14:16:42.413590       1 draft.go:916] Inflight proposal size: 189262822. There would be some throttling.
server_1  | W1101 14:16:42.413866       1 draft.go:958] Raft.Ready took too long to process: Timer Total: 1.064s. Breakdown: [{proposals 1.064s} {disk 0s} {advance 0s}] Num entries: 0. MustSync: false
server_1  | I1101 14:58:11.000551       1 draft.go:637] Blocked pushing to applyCh for 41m28.587s
server_1  | W1101 14:58:11.000616       1 draft.go:958] Raft.Ready took too long to process: Timer Total: 41m28.587s. Breakdown: [{proposals 41m28.587s} {disk 0s} {advance 0s}] Num entries: 0. MustSync: false
server_1  | I1101 14:58:13.865419       1 log.go:34] 1 is starting a new election at term 5
server_1  | I1101 14:58:13.865450       1 log.go:34] 1 became pre-candidate at term 5
server_1  | I1101 14:58:13.865457       1 log.go:34] 1 received MsgPreVoteResp from 1 at term 5
server_1  | I1101 14:58:13.865498       1 log.go:34] 1 became candidate at term 6
server_1  | I1101 14:58:13.865504       1 log.go:34] 1 received MsgVoteResp from 1 at term 6
server_1  | I1101 14:58:13.865546       1 log.go:34] 1 became leader at term 6
server_1  | I1101 14:58:13.865554       1 log.go:34] raft.node: 1 elected leader 1 at term 6
server_1  | I1101 14:58:14.765325       1 groups.go:808] Leader idx=0x1 of group=1 is connecting to Zero for txn updates
server_1  | I1101 14:58:14.765349       1 groups.go:817] Got Zero leader: zero:5080

MichelDiz commented :

Ah, just saw your update in the first comment:

curl -X POST -H 'Content-Encoding: gzip' -H "Content-Type: application/rdf" localhost:8080/mutate\?commitNow=true --data-binary @graph.gz

jm4games commented :

FYI, the server is sitting at 8.8 GB of RAM use after job completion. That seems really high given how small the data set was.

MichelDiz commented :

Well, I’ve been running your dataset for the last 26 minutes. The only thing in the logs is the transaction being aborted (maybe some typo in the dataset). There’s no issue as far as I can tell.

(I’m not using Docker. Just a local instance of Dgraph on Darwin.)

It uses 10.13 GB of RAM, which is normal, and has some spikes in CPU, which is also normal due to the background activities.

Using an Upsert Block with 185 MB of data (decompressed) was never tested, nor was it the intended use of that kind of query. Upsert is very powerful, and it must be used with care.

I believe the issue below covers what you need:

https://github.com/dgraph-io/dgraph/issues/4048

jm4games commented :

I would be surprised if there was a typo in the data set, since it is programmatically generated and it works on smaller data sets (but I guess there is always the chance). Are you saying I should use the mutate endpoint instead and should expect better performance? I really just use upsert so that I can get the trait node IDs and insert new data with them in one call.
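
To be concrete, this is the shape of what I’m sending, just much larger (predicate and variable names here are made up, not my real schema, and eq() assumes trait_name is indexed):

curl -X POST -H "Content-Type: application/rdf" localhost:8080/mutate\?commitNow=true --data-binary @- <<'EOF'
upsert {
  query {
    t as var(func: eq(trait_name, "Example"))
  }
  mutation {
    set {
      _:item <name> "New item" .
      _:item <has_trait> uid(t) .
    }
  }
}
EOF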

MichelDiz commented :

I would personally recommend that you do this in small chunks with longer intervals, and add load-balancing logic to your code.

For example, instead of firing all chunks for just one instance, you can fire like this:

localhost:8080/mutate\?commitNow=true --data-binary @graphchunk1.gz
localhost:8081/mutate\?commitNow=true --data-binary @graphchunk2.gz
localhost:8082/mutate\?commitNow=true --data-binary @graphchunk3.gz
localhost:8083/mutate\?commitNow=true --data-binary @graphchunk4.gz

Send chunks to each available instance. That way you split the load well across your cluster.
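
A sketch of that round-robin firing, assuming the chunks already exist as separate, self-contained files and four alphas are exposed on 8080-8083:

i=0
for f in graphchunk*.gz; do
  port=$((8080 + i % 4))
  curl -X POST -H 'Content-Encoding: gzip' -H "Content-Type: application/rdf" \
    "localhost:${port}/mutate?commitNow=true" --data-binary @"$f"
  i=$((i + 1))
done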