Based on my testing, it does not. For a high-volume system, I am therefore finding it hard to see whether the pros outweigh the cons of the @upsert directive.
Pros: No duplicate data
Cons: Potentially a lot of failures when trying to create different instances of the same data (in concurrent transactions)
This issue is happening again. I thought my tests were verifying/clarifying things for me. It's possible I have inadvertently introduced a bug (but my code is very light at the moment). I can revert to and test earlier code, but the question below would still be of a lot of value for me to understand…
I will summarise my use case. Firstly here is my schema:
The first upsert works, the second one fails with the exception:
Exception with upsert for ID “ed829200-96ac-11ea-b073-5b088a523213”, Label ocode, Node OcodeIdentity(ocode=ed829200-96ac-11ea-b073-5b088a523213, browsed=null)
java.lang.RuntimeException: java.util.concurrent.CompletionException: io.dgraph.TxnConflictException: Transaction has been aborted. Please retry
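Since the exception message says "Please retry", one common way to handle it is a bounded retry loop around the whole upsert. Below is a minimal sketch, not the dgraph4j API: `UpsertRetrySketch`, `withRetry` and `isTxnConflict` are hypothetical names, and the conflict is detected by class name through the cause chain (RuntimeException -> CompletionException -> TxnConflictException, as in the trace above) so the snippet compiles without the client library. It assumes each attempt opens a fresh transaction.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

public class UpsertRetrySketch {

    // Walks the cause chain looking for the conflict exception named in the stack trace.
    // Matching on the class name is an assumption made to keep this sketch dependency-free.
    static boolean isTxnConflict(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if ("io.dgraph.TxnConflictException".equals(t.getClass().getName())) {
                return true;
            }
        }
        return false;
    }

    // Runs the attempt up to maxAttempts times, retrying only on transaction conflicts.
    // The attempt should start a NEW transaction every time it is called.
    static <T> T withRetry(Callable<T> attempt, int maxAttempts, long baseDelayMillis)
            throws Exception {
        for (int i = 1; ; i++) {
            try {
                return attempt.call();
            } catch (Exception e) {
                if (!isTxnConflict(e) || i >= maxAttempts) {
                    throw e; // not a conflict, or out of attempts
                }
                // Linear backoff plus jitter so concurrent writers do not retry in lockstep.
                long jitter = ThreadLocalRandom.current().nextLong(baseDelayMillis + 1);
                Thread.sleep(baseDelayMillis * i + jitter);
            }
        }
    }
}
```

Usage would look like `withRetry(() -> upsertOcode(id), 5, 20)`, where `upsertOcode` stands for whatever method runs the query and mutation in a single transaction. This does not reduce the number of conflicts; it only stops them from surfacing as failures.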
So ocode now uses the exact index.
I left userseed on the term index.
The upsert logic for both nodes is the same though, and I fully expect my userseed upserts to break if I throw enough concurrency at them… Can someone explain the relationship between upserts and indexes, and any other considerations? Thanks
Thanks, @damienburke, for raising your concern. The @upsert directive checks for conflicts between concurrently running txns that mutate the same data. How that check is done depends on the type of index on the predicate. Conflicts are detected using a key that is a function of the predicate plus a value token (generated by the index), and the tokens depend on the type of index in use. For example, let's take the following two concurrent txns:
_:node_1 <foo> "name" .
_:node_2 <foo> "(other) name" .
If we use a hash index on predicate foo, the keys generated for the two txns will be different: the first depends on foo + "name" and the second on foo + "(other) name". But if we use a term index on predicate foo, two keys will be generated for the second txn (because of the two tokens "other" and "name"), which are functions of foo + "other" and foo + "name". The second of these is the same as the key of the first txn, so the two txns conflict and one of them will abort. I hope this helps you understand the relation between upsert and indices. Feel free to ask any follow-up questions.
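To make the key comparison above concrete, here is a small simulation. It is only a sketch of the idea, not Dgraph's actual tokenizers or key encoding: `ConflictKeySketch`, `wholeValueKeys` and `termKeys` are made-up names, the "key" is just the predicate and token joined with `|`, and the term tokenizer is approximated by lowercasing and splitting on non-alphanumeric characters.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

public class ConflictKeySketch {

    // Stand-in for an index that emits ONE token for the whole value (hash-like).
    static Set<String> wholeValueKeys(String predicate, String value) {
        return Collections.singleton(predicate + "|" + value);
    }

    // Stand-in for a term index: one key per word in the value.
    static Set<String> termKeys(String predicate, String value) {
        Set<String> keys = new LinkedHashSet<>();
        for (String token : value.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                keys.add(predicate + "|" + token);
            }
        }
        return keys;
    }

    // Two txns conflict in this model when their key sets share at least one key.
    static boolean conflicts(Set<String> a, Set<String> b) {
        return !Collections.disjoint(a, b);
    }

    public static void main(String[] args) {
        String first = "name";
        String second = "(other) name";

        System.out.println("whole-value index conflict: "
                + conflicts(wholeValueKeys("foo", first), wholeValueKeys("foo", second)));
        System.out.println("term index conflict: "
                + conflicts(termKeys("foo", first), termKeys("foo", second)));
    }
}
```

With the two values from the example, the whole-value keys are disjoint (no conflict), while the term keys share `foo|name` (conflict). Under this model, a predicate holding UUID-like values with a whole-value index should only conflict when two txns write the same value, whereas a term index can conflict whenever two different values share a word.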