How feasible is it to assign all uids

vinaypillai · December 22, 2020, 6:52am

Hi! In order to avoid querying for uids to mutate nodes, I was interested in using a 64-bit hash to assign uids based on a unique external identifier, and had a series of questions regarding the viability of the endeavour. Is it possible to assign all uids in the uint64 range? How could it best be done, and would it cause any problems with how the database functions? Are there any reserved uid ranges that should be avoided?

MichelDiz · December 22, 2020, 1:15pm

Technically, yes.

Only via the Zero endpoint /assign?what=uids&num=100. See: More about Dgraph Zero - Deploy.

Nope. I think only the uid 0x0 should be avoided.
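
For reference, a minimal sketch of leasing uids through the /assign endpoint mentioned above. The Zero HTTP address (http://localhost:6080), the helper names, and the assumption that the response is a JSON map with decimal startId and endId fields are illustrative and should be checked against the Dgraph Zero docs for your version.

```python
import json
from typing import Tuple
from urllib.parse import urlencode
from urllib.request import urlopen


def build_assign_url(zero_http_addr: str, num: int) -> str:
    """Build the Zero /assign URL for leasing `num` uids."""
    query = urlencode({"what": "uids", "num": num})
    return f"{zero_http_addr}/assign?{query}"


def parse_lease(body: str) -> Tuple[int, int]:
    """Parse a lease response into an inclusive (start, end) uid range.

    Assumption: the body is a JSON map with startId and endId fields
    holding decimal values.
    """
    data = json.loads(body)
    return int(data["startId"]), int(data["endId"])


def lease_uids(zero_http_addr: str, num: int) -> Tuple[int, int]:
    """Ask Zero for `num` uids (performs a network call)."""
    with urlopen(build_assign_url(zero_http_addr, num)) as resp:
        return parse_lease(resp.read().decode("utf-8"))
```

Calling lease_uids("http://localhost:6080", 100) against a running Zero would be the typical usage; the two pure helpers can be checked without a cluster.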

vinaypillai · December 22, 2020, 5:57pm

Hi @MichelDiz,
So I started testing mass allocating uids and it seemed to mostly work, though I did run into a problem with the maxLeaseId. Although the /assign endpoint does let you allocate 2^64 - 1 uids, this seemingly triggers an overflow in the maxLeaseId, which apparently gets set to (2^64 - 1) + 10000. As a result, the highest realistically allocatable uid seems to be (2^64 - 1) - 10000, which would then leave the maxLeaseId at 2^64 - 1. However, this still leaves the remaining 10000 uids potentially reusable by the controller. Any suggestions as to how to get around this?
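
The arithmetic behind this can be sketched as follows. Python integers do not overflow, so uint64 wraparound is emulated with a mask; the 10000 buffer is the value reported in this post, not something confirmed from the Dgraph source, and the function names are made up for illustration.

```python
UINT64_MAX = 2**64 - 1
LEASE_BUFFER = 10000  # assumption: the extra amount reported in the post above


def wrap_uint64(value: int) -> int:
    """Emulate uint64 arithmetic, where values wrap modulo 2**64."""
    return value & UINT64_MAX


def max_lease_after(highest_uid: int) -> int:
    """maxLeaseId as described in the post: highest uid plus the buffer."""
    return wrap_uint64(highest_uid + LEASE_BUFFER)


# Leasing the whole space wraps the counter around to a small value.
overflowed = max_lease_after(UINT64_MAX)

# Stopping 10000 short lands exactly on the top of the uint64 range.
safe = max_lease_after(UINT64_MAX - LEASE_BUFFER)

print(overflowed, safe == UINT64_MAX)
```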

MichelDiz · December 22, 2020, 6:00pm

Personally I have no idea. I never did this before. Maybe @dmai could help.

dmai · December 22, 2020, 10:58pm

Hm… there’s an enhancement we can do here to prevent the overflow when the leased UIDs go over uint64.

Are you concerned about collisions with a 64-bit hashing scheme? The automatic UID assignment by Dgraph guarantees that each node you add has a unique UID.

vinaypillai · December 22, 2020, 11:33pm

Collisions are a pretty big concern we have, making hashing a fairly flawed solution to our problem, perhaps to the point of not really being viable. For our use case, we will have a dataset of around 600m-1.5b nodes and we need an efficient way to perform weekly batch updates. In order to make the process relatively performant, we’re experimenting with different methods to avoid unnecessary queries to handle the xid to uid mapping for updates.
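
To put a rough number on the collision concern, here is a sketch using the standard birthday approximation. It assumes an ideal hash that spreads xids uniformly over all 2^64 values; a real hash function and the excluded uid 0x0 would shift the result slightly.

```python
import math

HASH_SPACE = 2**64


def collision_probability(num_nodes: int, space: int = HASH_SPACE) -> float:
    """Approximate P(at least one collision) among `num_nodes` uniform hashes.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * space)).
    """
    expected_pairs = num_nodes * (num_nodes - 1) / 2 / space
    return -math.expm1(-expected_pairs)


for n in (600_000_000, 1_500_000_000):
    p = collision_probability(n)
    print(f"{n:>13,} nodes: P(at least one collision) ~ {p:.3f}")
```

Under these assumptions the chance of at least one collision is on the order of 1% at 600m nodes and 6% at 1.5b nodes, which is why a bare 64-bit hash looks risky at this scale.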

dmai · December 23, 2020, 2:47am

If you have a separate store of the xid/uids, then you can use that to send mutations directly for the appropriate nodes in Dgraph. The /assign endpoint lets you manage your own range of UIDs to do that. Other users have also asked about managing their own UIDs which is where /assign helps.
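
A minimal sketch of what such a separate xid/uid store could look like, assuming a range of uids has already been leased through /assign. The class and function names are hypothetical, the map lives in memory only, and a real setup at this scale would need a persistent store.

```python
import json
from typing import Dict


class XidUidStore:
    """Hypothetical in-memory xid -> uid map backed by a leased uid range."""

    def __init__(self, start_uid: int, end_uid: int) -> None:
        if start_uid < 1 or end_uid < start_uid:
            raise ValueError("need a non-empty range that excludes uid 0x0")
        self._next = start_uid
        self._end = end_uid  # inclusive upper bound of the lease
        self._uids: Dict[str, int] = {}

    def uid_for(self, xid: str) -> int:
        """Return the uid already mapped to `xid`, or take the next leased one."""
        uid = self._uids.get(xid)
        if uid is None:
            if self._next > self._end:
                raise RuntimeError("leased uid range exhausted; lease more uids")
            uid = self._next
            self._next += 1
            self._uids[xid] = uid
        return uid


def set_nquad(store: XidUidStore, xid: str, predicate: str, value: str) -> str:
    """Format one N-Quad for a string value, addressing the node by uid."""
    return f"<{hex(store.uid_for(xid))}> <{predicate}> {json.dumps(value)} ."


store = XidUidStore(1, 100)
print(set_nquad(store, "user-42", "name", "Alice"))
```

Because the uid is looked up locally, the mutation can be sent without first querying Dgraph for the node.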

vinaypillai · December 23, 2020, 3:22am

Is the live loader xidmap well-suited for handling high volumes of uids, or do you think it would be a better idea to go down the route of setting up a separate store for the uid/xid map?
