Batch upserts in dgo

milosgajdos · March 10, 2021, 8:34pm

What I want to do

I’m wondering if there is any way I can do batch upserts into dgraph.

I understand dgraph live loader has that functionality and I feel like it would be super handy in dgo, too.

I’m basically trying to upsert a bunch of predicates into dgraph and I don’t want to do it one by one. Mind you, we’re not talking about gazillions of entries. At the same time, it’s not a small amount, and doing one-by-one round trips is extremely inefficient.

Now, a single upsert is pretty simple, I do something like this:

	query := `
	{
		entity(func: eq(xid, "` + e.XID() + `")) {
			e as uid
		}
	}
	`

	obj := &Entity{
		UID:   "uid(e)",
		XID:   e.XID(),
		Name:  e.Name(),
		DType: []string{"Entity"},
	}

	// do the JSON encoding dance here
	mu := &dgapi.Mutation{
		SetJson: pb,
	}

	req := &dgapi.Request{
		Query:     query,
		Mutations: []*dgapi.Mutation{mu},
		CommitNow: true,
	}

	// execute the transaction

Now, this is all nice, etc., but I’m not sure how I would go about doing a batch upsert; in particular, I’m not sure what the query should look like.

Creating a dgapi.Mutation for every item in the batch and appending it to the Mutations slice is indeed possible, but the problem there is the query, which is “global” per dgapi.Request, not per mutation.

Am I missing something, or is this not possible and I really do need to upsert one by one?

Dgraph metadata

dgraph version

[Decoder]: Using assembly version of decoder
Page Size: 4096

Dgraph version   : v20.11.0
Dgraph codename  : tchalla
Dgraph SHA-256   : 8acb886b24556691d7d74929817a4ac7d9db76bb8b77de00f44650931a16b6ac
Commit SHA-1     : c4245ad55
Commit timestamp : 2020-12-16 15:55:40 +0530
Branch           : HEAD
Go version       : go1.15.5
jemalloc enabled : true

For Dgraph official documentation, visit https://dgraph.io/docs/.
For discussions about Dgraph     , visit http://discuss.dgraph.io.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2020 Dgraph Labs, Inc.

chewxy · March 11, 2021, 12:21am

I’m the current maintainer of dgo. I’m a bit swamped atm. If you send a PR I’d be happy to accept it.

milosgajdos · March 11, 2021, 9:58am

Yeah, I feel ya, time is my enemy, too

I should have some free time next week hopefully, so will give it a proper think and hopefully hack something up

iluminae · March 15, 2021, 3:43pm

Yea I do this by hashing the variables coming out of the query for use across all mutations. My use case is streaming data into dgraph from pubsub and I want to batch unrelated data into fewer round-trips to the server. My only query is a bunch of:

query {
  UidForX as var(func: eq(tenant.xid, "X"))
  UidForY as var(func: eq(tenant.xid, "Y"))
}

Just to do an xid:uid lookup for every node in the mutations. My problem, of course, is that a failure in any part of the batch fails the whole batch, and I have to try again later. (This happens mostly from aborts, since multiple pods are writing to dgraph simultaneously.) My personal wish list for mutations is a gRPC streaming API… but I don’t know what the semantics would be exactly.

Just wanted to put my use case on here in case it helps with design.
