Dgraph is using already assigned uids after bulk import

pablog · January 12, 2020, 4:39pm

Hello.
Not sure if this is a bug or the expected behaviour.

I’m running dgraph bulk on one machine and, once the process finishes, I’m copying the p folder to a different machine where alpha runs (and zero too).

I have recently noticed that once I get dgraph running with the bulk import, when I run a mutation containing a blank uid, dgraph is assigning an already assigned uid (like 0x1). This is causing some problems because it is actually overwriting data. I expected zero to actually calculate the last used uid and continue assigning them from that.

The first questions here are:

Is it the expected behaviour?
Am I doing it wrong because the zero instance used to create the bulk import is the one expected to run with this data?

Then, how to proceed? Two ideas come to mind:

Get the last assigned uid in the zero instance I have used to create the bulk import and request assigning the same number of uids in the final instance
Just copy the zw directory to the final zero instance
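
A minimal sketch of the first idea, assuming you already know the highest uid used by the bulk load and the highest uid leased by the final zero. The function names and the example numbers are hypothetical; only the /assign endpoint with its what and num parameters comes from this thread.

```python
from urllib.parse import urlencode


def uids_to_lease(bulk_max_uid: int, target_max_uid: int) -> int:
    """Number of uids the final zero must hand out to move past the bulk-loaded range."""
    # Never return a negative count if the final zero is already ahead.
    return max(0, bulk_max_uid - target_max_uid)


def assign_url(zero_addr: str, num: int) -> str:
    """Build the zero /assign URL that requests `num` uids."""
    query = urlencode({"what": "uids", "num": num})
    return f"http://{zero_addr}/assign?{query}"


# Hypothetical values: bulk load used uids up to 10000, final zero has leased up to 1.
gap = uids_to_lease(bulk_max_uid=10000, target_max_uid=1)
print(assign_url("localhost:6080", gap))
# prints http://localhost:6080/assign?what=uids&num=9999
```

The URL is only built here, not requested, so the sketch can be checked without a running zero.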

Derived from these questions, I have another one: how to retrieve the last used uid? So far I haven’t found an easy way but I have two workarounds:

dgraph debug -w <zw_dir> shows it
curl "localhost:6080/assign?what=uids&num=1" shows you the next one (quote the URL so the shell does not interpret the &)

Thanks

dmai · January 12, 2020, 5:22pm

You should be running the same Dgraph Zero instance for the bulk loader and for the active cluster. Dgraph Zero handles the UID assignment.

As for figuring out the last used UID, those workarounds work. You can also check Zero’s /state page, which shows the maxLeaseId. That’s not necessarily the last used UID; rather, it is the upper bound of the UID range that Zero has leased, which is a good approximation. For example, the maxLeaseId could be 10000 while Zero has only handed out UIDs up to 9942.
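
As a rough illustration, reading maxLeaseId out of a /state response could look like the sketch below. The sample payload is hypothetical and trimmed to a single field; the real response contains more fields, and whether the value arrives as a string or a number is an assumption, so the code accepts both.

```python
import json


def max_lease_id(state_json: str) -> int:
    """Extract maxLeaseId from the JSON text of a Zero /state response."""
    state = json.loads(state_json)
    # int() accepts both "10000" and 10000, whichever form the server emits.
    return int(state["maxLeaseId"])


# Hypothetical, trimmed payload for illustration only.
sample = '{"maxLeaseId": "10000"}'
print(max_lease_id(sample))
# prints 10000
```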

pablog · January 12, 2020, 5:37pm

Thanks @dmai. Makes sense
