Can a predicate add a unique index?

Can a predicate add a unique index?

If you mean a UNIQUE constraint like in SQL, the answer is no. We don’t support UNIQUE at the moment, and it doesn’t make sense to support UNIQUE, as it would quite defeat the purpose of a graph database.

I don’t completely agree with this statement.

A graph database’s purpose is to store data in a graph format instead of an RDBMS. Having a unique constraint would not defeat this purpose; it would increase the database’s effectiveness and broaden the applications it can support.

Furthermore, uniqueness is already somewhat supported with xids. The problem is that it is limited to a single predicate per type, and the ID type cannot be used alongside it. If I wanted to add a UNIQUE constraint on an emailAddress predicate on a Person type, then every person would be known only by their email address.
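If I understand the xid mechanism being referred to (the GraphQL layer’s @id directive, linked later in this thread), a minimal sketch of that schema, with a hypothetical Person type, would be:

```
# Hypothetical GraphQL schema sketch for the situation described above.
# At the time of this discussion, a type could have only one @id field,
# and it could not be combined with the built-in ID type.
type Person {
  emailAddress: String! @id   # the email becomes the node's only unique handle
  name: String
}
```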

What if their email address changes? Right now, we would have to create a new person with the new email and then update ALL references from the old person to the new person, which is not an ideal situation.

But what about null values under a unique constraint? Coming from the SQL world, every null is treated as its own unique value, allowing multiple rows to hold null and still satisfy the constraint. This is useful when we want to enforce a unique emailAddress without requiring one. In Dgraph there is no way to do this. :frowning:

It should be feasible though, because Dgraph already solves the null-value problem by not creating an edge at all when the value is null. So under the hood there could theoretically be a unique constraint that, when a predicate is set, looks at all of the existing values for that predicate (e.g. via the has(Person.emailAddress) function) and then checks whether the new value already exists in that var block.

I’m not saying this would be the most performant way, but it is a feasible one.
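As a rough sketch of that idea (this is not an existing Dgraph feature), the lookup could be expressed as a DQL query along these lines, assuming a hypothetical email value and a hash or exact index on Person.emailAddress:

```
{
  # Hypothetical pre-write check: collect any node already holding this value.
  # Assumes a schema entry like: Person.emailAddress: string @index(hash) .
  existing as var(func: eq(Person.emailAddress, "alice@example.com"))

  # A non-zero count here means the value is already taken.
  check(func: uid(existing)) {
    count(uid)
  }
}
```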

Update: This is probably more specific to the GraphQL endpoint, its schema rules, and the generated mutations than it is to DQL. The main difference is that in DQL a predicate can be reused across multiple types, for instance name: a type Person can have a name and a type Dog can have a name. If we were to do unique in DQL it would be more difficult to implement, because the underlying logic would have to decide whether the predicate should be unique per type or unique system-wide. Using the example above, the has(name) function would match nodes of all types instead of a specific type. This is actually one advantage (or disadvantage, depending on viewpoint) of the GraphQL schema translation into a DQL schema.
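To illustrate the per-type versus system-wide distinction with a small DQL sketch (the Person and Dog types and their data are assumed here): has(name) alone matches every type that reuses the predicate, while scoping to one type needs an explicit filter.

```
{
  # Matches every node with a name, whether it is a Person, a Dog, or anything
  # else that reuses the predicate.
  everything(func: has(name)) {
    name
  }

  # The same check restricted to a single type requires a type filter.
  only_people(func: has(name)) @filter(type(Person)) {
    name
  }
}
```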

@amaster507 For DQL, I think if the user uses the Upsert Pattern he is good to go. You just need to add a few blocks checking the constraints you need. The annoying problem here is that the Upsert Block can’t throw custom errors (there is a ticket for this, though), which would be useful for cases like this. But it is usable.
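For reference, a minimal sketch of that Upsert Pattern, assuming the Person.emailAddress predicate from earlier and an illustrative email value, could look like this. The conditional mutation runs only when no node with that value exists; as noted, a failed condition simply results in no mutation rather than a custom error.

```
upsert {
  query {
    # Assumes a schema entry like: Person.emailAddress: string @index(hash) .
    v as var(func: eq(Person.emailAddress, "alice@example.com"))
  }

  # Create the new Person only if no existing node holds that email.
  mutation @if(eq(len(v), 0)) {
    set {
      _:new <Person.emailAddress> "alice@example.com" .
      _:new <dgraph.type>         "Person"            .
    }
  }
}
```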

I think that UNIQUE constraints can at some point be a barrier. They can increase latency, as that field must be checked all the time in any circumstance. I prefer investing in Upsert instead of constraints that would drag Dgraph down and cause users to complain more and more about performance.

Cheers.

I think that UNIQUE constraints can at some point be a barrier.

I think that NOT having UNIQUE constraints can at some point be a barrier.

They can increase latency, as that field must be checked all the time in any circumstance.

Could you explain this part a bit more? Are you saying that for queries that don’t attempt to mutate a unique field, Dgraph would still need to check that the unique constraint wasn’t violated? Or are you saying that every time a unique field is mutated, Dgraph would need to check that the unique constraint isn’t being violated?

You just need to add a few blocks checking the constraints you need.

Is there less latency in forcing the client to perform constraint checks on data stored in Dgraph than in having Dgraph perform those same checks itself?

PS. Note that my point is about adding a constraint directive in Dgraph itself; in GraphQL you can have a kind of constraint logic, but it is not a native thing, it is logic in the GraphQL layer.

Why not? In my view, there’s no free energy. Any extra thing you add to a system will cost “energy”.

Any operation that relies on extra lookups will, logically, increase latency. That’s not an issue per se, but the Upsert logic does the job. We don’t need to maintain extra directives for that, IMHO.

BTW, I think that the GraphQL layer has some kind of constraint logic (e.g. https://dgraph.io/docs/graphql/schema/ids - but that isn’t a native index in Dgraph). But that is another thing in another realm.

No, the point is, we already have ways to do complex constraints via the Upsert Mutation. Why add one more thing to maintain? The Upsert Block is a powerful query; you can do really complex things with it. It needs some extra features and fixes to be even more powerful, but it does the job already.

Let the DB fly free, and when you need to drag it with constraints, use the Upsert Block approach.

Cheers.
