Has() not working until transaction is committed

Moved from GitHub dgraph/3841

Posted by AlexandreDaSilva:

  • What version of Dgraph are you using?
    Dgraph version : v1.0.16
    Commit SHA-1 : 0590ee95
    Commit timestamp : 2019-07-11 11:52:54 -0700
    Branch : HEAD
    Go version : go1.12.5

  • Have you tried reproducing the issue with latest release?
    Yes, v1.1.0-rc1, and the behavior is the same.

  • What is the hardware spec (RAM, OS)?
    Windows 10 Pro, version 1903
    Intel Core i7-9750H
    16GB of RAM
    SSD 512GB

  • Steps to reproduce the issue (command/config used to run Dgraph).

From a clean install of Dgraph, I create the following schema:

<Profile.age>: int .
<Profile.firstname>: string @index(hash, trigram) .
<Profile>: string @index(exact) .

I use the Profile predicate to type my nodes.

In the same transaction, I try to create a node, then verify it exists.

const txn = client.newTxn();

const p = {
    Profile: "__type",
    "Profile.age": 10,
    "Profile.firstname": "u1"
};

const mu = new Mutation();
mu.setSetJson(p);
await txn.mutate(mu);

const result = await txn.query(`
    {
        profiles(func: has(Profile)) {
            uid
            expand(_all_) {
                uid
            }
        }
    }
`);

const json = result.getJson();

// Log: Object {profiles: Array(0)}
// Should be: Object {profiles: Array(1)}
console.log(json); 

await txn.commit();
  • Expected behaviour and actual result.

The above code snippet should return one result. has() function should work during a transaction and behave like the others functions (for example, eq). Currently, this code snippet doesn’t work, and I get no result.

If I replace has(Profile) with eq(Profile, "__type"), I can find my node and I get the expected result.
If I execute the query using has(Profile) after the transaction has been committed, I can find my node.

After some digging, I found that the last version where has() works as I think it should during a transaction is v1.0.9. Starting from v1.0.10, the behavior is different.

#2724 was introduced in v1.0.10, it may be related.

manishrjain commented :

Somewhere in those commits last year, there was one which removed an overlay of uncommitted transactions that we were managing on top of the committed data set. It was causing us performance issues to maintain that.

With those changes, has and most indices would only work once the txn gets committed, not before. So, this is expected behavior.

AlexandreDaSilva commented :

Coming from relational databases with ACID transactions, I find this behavior odd. When you are in a transaction, reads and writes should be consistent. If I can’t see the data I just wrote in the same transaction, I think it can lead to fragile code, especially in large projects with many developpers.

Suppose I have a function that checks if a promotional code has been applied to a basket, and applies it if it has never been applied. For some reason, because there are many developpers on the same project, this function is called multiple times in the same transaction: this will lead to the promotional code being applied multiple times.

This example is not the best in terms of best practices, but it illustrates the risk of having duplicated data or incoherent state. When you compose multiple functions in a same transaction, there is always a risk that some of the functions may execute database queries to verify preconditions, possibly the same ones, and act accordingly.

In my case, every query is at risk, because I filter every query by type (e.g. @filter(has(Profile))). Therefore, when I compose multiple functions in the same transaction, I would have to think that I just can’t see in queries the data I just created. In a large code base, I think it is not viable. I have a lot of queries and mutations in my project, and I’m afraid that this change in behaviour will have subtle repercussions.

I don’t know if I make sense, what do you think about it?

In any case, I would have liked to have been informed of this change in behaviour more clearly before proceeding with the upgrade, because it is not nothing :). When I encountered this problem, I consulted the documentation and the changelog to find out more about the behavior of has() and other functions in transaction, and I found nothing to answer my question. If you think this is expected, I think it should be indicated.

mangalaman93 commented :

related to #3721

AlexandreDaSilva commented :

Thanks for the link. I read #3721 and #3103, and I’m kind of puzzled by your stance on transactions. I would like to open a debate on this subject and discuss it with any of you, because I think it is important.

What you call “an inconvenience caused by the internal design of Dgraph”, for someone who has experience with relational databases supporting ACID transactions, I would call it a bug.

When I read, for example, “Multiple read - mutate pairs of operations on the same data inside the same transaction can lead to stuff like this” or “For now only do one read and mutate per transaction just to be safe”, it is really surprising for me because I didn’t have this kind of information when I started using Dgraph, and that explains a whole bunch of strange behaviors I’ve had to deal with, especially having made my own Object Graph Mapper. The documentation should really emphasize this kind of behaviour, and the practices that result from it, because for someone who comes from the world of SQL, these are not expected behaviours for an ACID-compliant database.

Personally, I do a lot of mutations and queries in a same transaction, sometimes on the same data, because the transaction is useful for 2 things:

  1. Of course, it ensures that two simultaneous transactions do not produce an inconsistent state, and all other related ACID guarantees.

  2. But also, and I think this is an important feature, it lets you revert all the actions at anytime by discarding the transaction.

And that, I can’t do it if I can’t do all the queries and mutations corresponding to a certain workflow in one and the same transaction. By dividing the workflow into several transactions, this adds to the developer’s responsibility to manage data recovery in the event of an error. Badly handled, this can result in incoherent states in the database. I had already been a little surprised by your position when you suggested to users who encounter a TxnTooBig error to divide the workload into several transactions, because this prevents them from benefiting from point #2 mentioned above.

This also joins my previous message: when you start to have a complex application, divided into several modules that interact with each other, if you try to perform several actions in a workflow that execute each of the queries and mutations within a single transaction, it is difficult to know who did what, and if I can’t rely on the read state of the database to know the state of my transaction at a time T, it can cause design problems.

That being said, I understand that Dgraph is intended for heavy use requiring good performance, and that this may influence the design of the underlying application. But in this case, it should be better highlighted, because for someone who uses Dgraph as a day-to-day database for moderate workloads, I don’t expect to have to change the way I architecture my application, as I could do with a SQL database for example.

I’m sorry if this message seems to be offensive, that’s not the goal and English is not my native language. I am interested in the development of Dgraph, I use it in production in a large project, and I want to bring my point of view, in relation to my experience. :slight_smile:

mooncake4132 commented :

@AlexandreDaSilva Dgraph also had performance issues with big transactions. See #3046. I haven’t retested so I don’t know if performance has been improved since.

I also came from a SQL world and agree with most of what @AlexandreDaSilva said. I understand Dgraph is still very young project and have other priorities but I do hope Dgraph team can put more love into Dgraph’s transaction system.