How to insert batch data faster with mutation?

What I want to do

I’m using this kind of mutation with the Python client to insert batch data. But it’s a bit slow, about 25s for 1000 triples. I’d like to ask if there are faster ways to do it? Thank you very much~

nquad = """
<0x01> <friend> <0x02> .
<0x01> <friend> <0x03> .
<0x01> <friend> <0x03> ."""

txn = self.client.txn()
try:
    mutation = txn.create_mutation(set_nquads=nquad)
    request = txn.create_request(mutations=[mutation], commit_now=True)
    txn.do_request(request)
finally:
    txn.discard()
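One thing that often helps regardless of the client is to send fewer, larger mutations, so the commit overhead is paid once per chunk rather than once per small batch. A minimal sketch of that idea; the `chunk_nquads` helper and the chunk size are my own illustration, not a pydgraph API, and the real transaction calls are shown only as comments since they need a running server:

```python
# Sketch: group N-Quad lines into larger mutations so each transaction
# commits a chunk of triples instead of a handful.
# chunk_nquads is illustrative, not part of pydgraph.

def chunk_nquads(triples, chunk_size=1000):
    """Yield newline-joined N-Quad strings of at most chunk_size triples."""
    for i in range(0, len(triples), chunk_size):
        yield "\n".join(triples[i:i + chunk_size])

# 5000 example triples (hypothetical UIDs).
triples = [f"<0x01> <friend> <0x{i:02x}> ." for i in range(2, 5002)]

for nquads in chunk_nquads(triples, chunk_size=1000):
    # With a real client each chunk would be committed like:
    #   txn = client.txn()
    #   try:
    #       mutation = txn.create_mutation(set_nquads=nquads)
    #       request = txn.create_request(mutations=[mutation], commit_now=True)
    #       txn.do_request(request)
    #   finally:
    #       txn.discard()
    pass
```

Tuning the chunk size (a few thousand triples per commit is a common starting point) usually beats committing tiny batches.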

Dgraph metadata

dgraph version

The version of Dgraph I am using is v21.03.2.

Try it with the live loader and compare speeds. If it’s faster, you can look into why, since it uses the same API.

If it’s the same-ish then you will probably want to profile your system.


Hey, I looked into the live loader before, but since I have to use the Python client, I’m not sure I can use it. I’m thinking of using async functions to execute the mutations concurrently. Are multiple threads supported in the Python client? I didn’t find this in the documentation.

Thank you very much~
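For what it’s worth, a plain thread pool is one way to try concurrent commits without Ray or asyncio, since gRPC channels in Python are generally safe to share across threads. A hedged sketch: `send_chunk` here is my own stand-in that just counts triples, and with a real server it would build a transaction and call `do_request` as in the snippet earlier in the thread:

```python
# Sketch: commit N-Quad chunks concurrently from a thread pool.
# send_chunk is a stand-in for the real transaction logic; with pydgraph
# it would create a txn, build a mutation, and do_request(commit_now=True).
from concurrent.futures import ThreadPoolExecutor

def send_chunk(nquads):
    # Stand-in: count triples instead of talking to a server.
    # Real version (assumption, not tested here):
    #   txn = client.txn()
    #   try:
    #       mu = txn.create_mutation(set_nquads=nquads)
    #       req = txn.create_request(mutations=[mu], commit_now=True)
    #       txn.do_request(req)
    #   finally:
    #       txn.discard()
    return nquads.count(" .")

chunks = [
    "<0x01> <friend> <0x02> .\n<0x01> <friend> <0x03> .",
    "<0x01> <friend> <0x04> .",
]

with ThreadPoolExecutor(max_workers=4) as pool:
    sent = list(pool.map(send_chunk, chunks))

total = sum(sent)
```

One caveat: concurrent transactions that touch the same UIDs or indexed predicates can abort with conflicts, so you may need retry logic around each commit.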

I was not suggesting using the live loader instead of Python to build your application, just using it to identify whether your Dgraph instance (which I assume you are running yourself) or your program is the cause of the slowness.

If the live loader is faster, you know there is something you can do in your program to make it faster, because both the live loader and the Python bindings use the exact same gRPC interface to Dgraph. If the live loader is not much faster, you should look at properly resourcing your Dgraph instance.

OK, I will try this, thank you~.

But I have another question. I’m using a distributed framework called Ray to accelerate it, but it seems that when I use Dgraph in my code, it doesn’t run concurrently; the tasks are lined up. So I’d like to ask if there are restrictions on the client connection?

I don’t know how Ray does its thing (maybe it’s just a multiprocess orchestrator?), and I have never seriously used Python for anything, so I could not help you there. The bindings are a pretty thin wrapper around the gRPC stubs (if they are anything like the Go Dgraph bindings, which I am far more familiar with), so I doubt there is much in there that could be interfering; that is conjecture, though.

OK, maybe not a very helpful reply. Sorry!