Dgraph is reusing already assigned uids after bulk import

Hello.
Not sure if this is a bug or the expected behaviour.

I’m running dgraph bulk on one machine and, once the process finishes, I’m copying the p folder to a different machine where alpha runs (and zero too).

I have recently noticed that once I get dgraph running with the bulk-imported data, when I run a mutation containing a blank uid, dgraph assigns an already assigned uid (like 0x1). This is causing problems because it overwrites existing data. I expected zero to work out the last used uid and continue assigning from there.

The first questions here are:

  • Is this the expected behaviour?
  • Am I doing it wrong because the zero instance used to create the bulk import is the one expected to run with this data?

Then, how to proceed? Two ideas come to mind:

  • Get the last assigned uid in the zero instance I have used to create the bulk import and request assigning the same number of uids in the final instance
  • Just copy the zw directory to the final zero instance

Derived from these questions, I have another one: how to retrieve the last used uid? So far I haven’t found an easy way but I have two workarounds:

  • dgraph debug -w <zw_dir> shows it
  • curl "localhost:6080/assign?what=uids&num=1" shows you the next one

Thanks


You should be running the same Dgraph Zero instance for the bulk loader and for the active cluster. Dgraph Zero handles the UID assignment.

As for figuring out the last used UID, those workarounds work. You can also check Zero’s /state page, which shows the maxLeaseId. That’s not necessarily the last used UID; rather, it tells you the highest possible UID that Zero could have handed out, which is a good approximation. For example, the maxLeaseId could be 10000 while Zero has only handed out UIDs up to 9942.
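If you want to check this from a script instead of the browser, something like the following might work. It is only a rough sketch: it assumes /state returns JSON with maxLeaseId as a top-level key (as a string or a number), which may differ between Dgraph versions, and the function names and the localhost:6080 default are just placeholders for illustration.

```python
import json
import urllib.request


def max_lease_id(state_json: str) -> int:
    """Extract maxLeaseId from the JSON text served by Zero's /state page."""
    state = json.loads(state_json)
    # Assumption: maxLeaseId is a top-level key; int() accepts it whether
    # it is encoded as a JSON string ("10000") or a JSON number (10000).
    return int(state["maxLeaseId"])


def fetch_max_lease_id(zero_addr: str = "localhost:6080") -> int:
    """Fetch /state from a running Zero and return its maxLeaseId."""
    # zero_addr is a hypothetical default; point it at your Zero's HTTP port.
    with urllib.request.urlopen(f"http://{zero_addr}/state") as resp:
        return max_lease_id(resp.read().decode("utf-8"))
```

Calling fetch_max_lease_id() against the Zero used for the bulk load and against the Zero in the live cluster lets you compare the two values. If the live one is lower, that Zero never leased the UIDs the bulk loader used.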


Thanks @dmai. Makes sense