Unique Indexes on predicates

tamethecomplex · November 17, 2017, 11:38am

Hi Dgraph team,

This is a followup from a couple of posts by @calummoore linked below.

I am working on writing several inbound integrations to dgraph, and it would be extremely helpful to have a ‘unique’ tokenizer on string indexes that would guarantee that only one node in the whole database has the same value for that predicate. @calummoore, hopefully I am accurately representing your original request. Is this something you guys can see implementing in the short term?

calummoore · November 17, 2017, 11:45am

Hi

Dgraph now supports transactions! So all of the above can now be done using them.

If you want to assert a unique index, you would:

1. Create a transaction (this is done locally by the client).
2. Send a query (via the transaction) to check whether the predicate value already exists.
3. If it does, abort. If it doesn’t, send a mutation to add the key as part of the transaction.
4. Commit the transaction. Dgraph will abort if the query/mutation is no longer valid.
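
The four steps above can be sketched roughly as follows. This is only an illustration: `Store`, `Txn`, and `SetUnique` are hypothetical names for an in-memory stand-in, not the Dgraph client API, and the version-based conflict check is a simplification that assumes nothing about how Dgraph itself detects conflicts.

```go
package main

import (
	"errors"
	"sync"
)

// ErrDuplicate means the check in step 2 found an existing value.
var ErrDuplicate = errors.New("value already exists for predicate")

// ErrAborted means a concurrent commit invalidated what this transaction read.
var ErrAborted = errors.New("transaction aborted: conflict")

// Store is a hypothetical in-memory stand-in for the database. It maps
// "predicate=value" keys to a node UID and keeps a version counter per key.
type Store struct {
	mu       sync.Mutex
	values   map[string]string
	versions map[string]int
}

func NewStore() *Store {
	return &Store{values: map[string]string{}, versions: map[string]int{}}
}

// Txn remembers the versions it read and buffers writes until Commit.
type Txn struct {
	store  *Store
	reads  map[string]int
	writes map[string]string
}

func (s *Store) NewTxn() *Txn {
	return &Txn{store: s, reads: map[string]int{}, writes: map[string]string{}}
}

// Query reports which UID (if any) holds the key and records the version seen.
func (t *Txn) Query(key string) (string, bool) {
	t.store.mu.Lock()
	defer t.store.mu.Unlock()
	t.reads[key] = t.store.versions[key]
	uid, ok := t.store.values[key]
	return uid, ok
}

// Mutate buffers a write; nothing is visible to others until Commit.
func (t *Txn) Mutate(key, uid string) { t.writes[key] = uid }

// Commit applies the buffered writes only if every key read is unchanged.
func (t *Txn) Commit() error {
	t.store.mu.Lock()
	defer t.store.mu.Unlock()
	for key, seen := range t.reads {
		if t.store.versions[key] != seen {
			return ErrAborted
		}
	}
	for key, uid := range t.writes {
		t.store.values[key] = uid
		t.store.versions[key]++
	}
	return nil
}

// SetUnique walks through the four steps for one predicate value.
func SetUnique(s *Store, predicate, value, uid string) error {
	txn := s.NewTxn() // step 1: create the transaction locally
	key := predicate + "=" + value
	if _, exists := txn.Query(key); exists { // step 2: check for the value
		return ErrDuplicate // step 3: abort by discarding the transaction
	}
	txn.Mutate(key, uid) // step 3: otherwise add the key
	return txn.Commit()  // step 4: commit, which may abort on conflict
}
```

If two writers both pass the check for the same value, only the first commit succeeds; the second gets `ErrAborted` and would need to retry, at which point its check finds the value and it stops.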

You can find out how to create a transaction for your given client on the client docs.

EDIT: you can read more about transactions here: Releasing distributed transactions in v0.9 - Dgraph Blog

tamethecomplex · November 17, 2017, 11:56am

Hi @calummoore, makes sense. I was hoping to have the guarantee of uniqueness without having to write the extra check in #2 above. Do you think this is overkill or would this be a useful feature regardless?

Another reason I can think of that enforced uniqueness would be useful is if there are multiple people developing an app on top of a given instance, and someone creates a bug that fails to check uniqueness before writing. Having this rock-solid uniqueness guarantee at the database level would still be a useful feature to have, IMHO.

mrjn · November 17, 2017, 8:19pm

We’ve planned this issue for v0.9:

github.com/dgraph-io/dgraph

Assign single UID to predicate

opened 11:06PM - 22 Sep 17 UTC

closed 03:44AM - 11 Jan 19 UTC

calummoore

kind/feature

As posted on the [community forum](https://discuss.hypermode.com/t/single-uid-assign…ed-to-predicate/1787):

It would be useful to be able to assign a `uid` to a predicate directly without adding it to a Posting List. For example, I might have a predicate `teacher`, where it is only ever possible to have one `teacher` at a time. At the moment I would have to do the following to update the predicate, which means 2 round trips and is unnecessarily verbose.

```
mutation {
  delete {
    <0x4> teacher * .
  }
}

mutation {
  set {
    <0x4> teacher <0x9> .
  }
}
```

Ideally, you would want to be able to specify in the schema that that predicate only accepts a single uid.

```
mutation {
  schema {
    teacher: uid @unary
  }
}
```

It would also have the added benefit of being able to return an object instead of an array when the unary predicate is being queried, as it would know there could only ever be one node/object.

@pawan is assigned to this.

tamethecomplex · November 17, 2017, 8:30pm

Thanks @mrjn, but I’m not sure we’re referring to the same thing. The @unary predicate seems like it would apply to a Node → Predicate → Node relationship where there can be only one such relationship per origin node. But for the “unique index”, I was thinking:

Node -> Predicate with UIX -> "Unique literal value"

Where there is a database-wide guarantee that for “Predicate with UIX”, one and only one node has a given “Unique literal value”.

That way if I have ten source systems all sending entities into dgraph and populating “Predicate with UIX” with hypothetically conflicting values, I have a uniqueness guarantee at the database level that can never be violated even if someone messes up by not checking first to see if that value already exists. Basically extending the enforced uniqueness that UID’s have to other predicates as well.
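
To make the semantics I have in mind concrete, here is a minimal sketch of the guarantee. It is purely hypothetical: `UniqueIndex` and its `Set` method are names I made up to model the behavior, not an existing Dgraph feature, and the sketch assumes each node holds at most one value for the predicate.

```go
package main

import (
	"fmt"
	"sync"
)

// UniqueIndex is a hypothetical model of a "Predicate with UIX": for any
// literal value, at most one node in the whole store may hold it.
type UniqueIndex struct {
	mu      sync.Mutex
	owner   map[string]string // literal value -> UID of the node holding it
	valueOf map[string]string // UID -> the value that node currently holds
}

func NewUniqueIndex() *UniqueIndex {
	return &UniqueIndex{owner: map[string]string{}, valueOf: map[string]string{}}
}

// Set assigns value to the node uid. The write is rejected when a different
// node already holds the value, whether or not the caller checked first.
func (ix *UniqueIndex) Set(uid, value string) error {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	if holder, taken := ix.owner[value]; taken && holder != uid {
		return fmt.Errorf("unique index violation: %q already belongs to %s", value, holder)
	}
	// Release the node's previous value so another node may claim it later.
	if old, ok := ix.valueOf[uid]; ok {
		delete(ix.owner, old)
	}
	ix.owner[value] = uid
	ix.valueOf[uid] = value
	return nil
}
```

The point is that the check lives inside the store: a source system that writes blindly still gets an error instead of creating a duplicate.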

mrjn · November 17, 2017, 8:49pm

This is similar to the login system described in the blog post:

q := fmt.Sprintf(`
    {
        login_attempt(func: eq(email, %q)) {
            checkpwd(pass, %q)
        }
    }
`, email, pass)
resp, err := txn.Query(ctx, q)

In this example, we check that the email is unique by using a hash index on the email attribute.

That’s the benefit of transactions: such checks can be done entirely in client code. If you have multiple people writing to Dgraph, they should all be running this logic transactionally.

I now realize this is the same answer @calummoore gave.

tamethecomplex · November 18, 2017, 2:42pm

Thanks guys. As I understand it, transactions used in this way will ensure that uniqueness is not violated due to concurrent write operations conflicting with each other.

It doesn’t reach the original goal I had in mind of making it impossible to have a duplicate value for that predicate in the database, which could happen if someone wrote without checking to see if the value already exists. Maybe this is OK though, provided no one writes directly to the database but rather goes through an API that enforces this check for them.

My frame of reference for this request is from the relational database world, where a unique index on a column would prohibit a duplicate value ever existing in that table for that column. I like the “security blanket” that this database-level check provides, especially for cases where a violation would break the application. So that even if someone messes up with a mutation, the constraint can never be violated because the database would reject the transaction. But maybe this does not translate naturally to Dgraph or the use cases you anticipate?

Either way, thanks for your responses, and thanks @calummoore for writing the client library for node which I expect we’ll be integrating into our code base within the next couple of months.
