Bulk upload with dgraph-js

Hi,

I’m trying to upload a lot of data with dgraph-js. I’m getting errors such as (1) “NQuad count in the request: 1934360, is more that threshold: 1000000” and (2) “Transaction has been aborted. Please retry”.

I guess the second one (2) shows up because I’m uploading too fast, and the first one is self-explanatory. Is the only option to send smaller requests and slow them down? I need to upload the data as fast as I can.

We recommend that you send batches of 5k to 10k per transaction. You should write code that manages the load for you, just like the bulk and live loaders do. You can also write some logic to balance the load between instances, which will make the load smoother. Never concentrate the load on a single instance; it will OOM fast.
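A minimal sketch of that batching idea, assuming the data is already an array of N-Quad lines and that the 5k to 10k figure counts N-Quads. The names `chunkNquads`, `uploadInBatches` and `sendBatch` are hypothetical helpers, not part of dgraph-js; `sendBatch` is whatever function commits one batch in its own transaction.

```javascript
// Split an array of N-Quad lines into batches of at most `size` lines.
function chunkNquads(lines, size = 5000) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < lines.length; i += size) {
    batches.push(lines.slice(i, i + size));
  }
  return batches;
}

// Send every batch through `sendBatch` (hypothetical: one transaction per
// call), keeping at most `concurrency` requests in flight at a time.
async function uploadInBatches(lines, sendBatch, { batchSize = 5000, concurrency = 4 } = {}) {
  const batches = chunkNquads(lines, batchSize);
  let next = 0; // index of the next batch to hand out
  let sent = 0; // number of N-Quad lines committed so far

  async function worker() {
    while (next < batches.length) {
      const batch = batches[next++];
      await sendBatch(batch.join("\n"));
      sent += batch.length;
    }
  }

  const workers = Array.from({ length: Math.min(concurrency, batches.length) }, () => worker());
  await Promise.all(workers);
  return sent;
}
```

With dgraph-js, `sendBatch` would presumably build a `dgraph.Mutation`, call `setSetNquads(payload)` and `setCommitNow(true)` on it, and pass it to `client.newTxn().mutate(...)`; check those calls against the client version you use. Lowering `concurrency` is a gentler way to slow down than adding fixed delays between chunks.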

Dgraph has some default limits that you can work around by changing config flags. To find the flag you need (I don’t know it off the top of my head), run dgraph alpha -h | grep nquad, or grep for a similar word. You will surely find a related flag.

But keep in mind that pushing your cluster hard is not recommended. Balance the load.
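One simple way to spread the load is round-robin: each batch goes to the next instance in turn. The sketch below is only an illustration; `roundRobin` is a hypothetical helper and the targets can be anything (client objects, addresses). As far as I know, the dgraph-js `DgraphClient` constructor also accepts several client stubs and picks among them itself, so verify that before rolling your own.

```javascript
// Return a function that cycles through `targets` in order, forever.
function roundRobin(targets) {
  if (!Array.isArray(targets) || targets.length === 0) {
    throw new Error("at least one target is required");
  }
  let i = 0;
  return () => {
    const target = targets[i];
    i = (i + 1) % targets.length; // wrap around after the last target
    return target;
  };
}

// Usage: the addresses here are placeholders, not real endpoints.
const pickAlpha = roundRobin(["alpha-1:9080", "alpha-2:9080", "alpha-3:9080"]);
const firstTarget = pickAlpha();  // "alpha-1:9080"
const secondTarget = pickAlpha(); // "alpha-2:9080"
```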

Cheers.

Ok, thanks. I’m trying to do some load balancing.

To get rid of the “Transaction has been aborted. Please retry” error I have to make the script very slow. I get this error when I send bulk data (even when it’s well below the NQuad count threshold!), so I have to chunk the data into small pieces and put a huge delay between chunks. Would the dedicated version help with that?

What is the best (fastest) way to upload data to the cloud?

What is that?

Use the live loader or the bulk loader.

The live and bulk loaders have all the logic needed to load and retry when necessary. If you want a safe load, you should replicate that logic in JS.
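A rough sketch of what such retry logic could look like in JS, assuming an aborted transaction surfaces as an error whose message contains "aborted" (as in the error quoted earlier in this thread). This is not the loaders' actual code; `withRetry`, `sleep` and the option names are hypothetical, and the backoff numbers are arbitrary starting points.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Assumption: aborted transactions are recognisable by their error message.
const looksAborted = (err) => /aborted/i.test(String(err && err.message));

// Run `fn` and retry it with exponential backoff plus jitter while it
// fails with a retryable error, up to `retries` additional attempts.
async function withRetry(fn, { retries = 5, baseDelayMs = 100, isRetryable = looksAborted } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries || !isRetryable(err)) throw err;
      // Delay doubles each attempt; jitter keeps parallel workers from
      // retrying in lockstep and colliding again.
      const delay = baseDelayMs * 2 ** attempt * (0.5 + Math.random() / 2);
      await sleep(delay);
    }
  }
}
```

Each retry should run the whole batch in a fresh transaction, so `fn` needs to create its own transaction rather than reuse an aborted one. Backing off only after an abort tends to be faster overall than a fixed delay between all chunks.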

I mean a dedicated cloud server. I’m using the shared version right now.

Is it possible to use them with the cloud version of the service?

It depends on your needs. For sure a dedicated or bare-metal setup is N times better. That’s something to evaluate case by case.

As far as I know it is possible; you should open a ticket there and request it. Using the live loader is pretty simple: just enable DQL (there is an option for that) and start the load locally, pointing it at the remote server.