This may be obvious, but how can a single transaction write data and then read that data back to the client? If the transaction is immediately committed, it can’t be rolled back later if something else errors. Am I missing something obvious here?
I thought I had read somewhere that you could read-write-read-commit in a single transaction (can’t find it now though, so obviously I may be wrong). Thanks for the clarification. I haven’t needed to handle transactions by hand yet, but it’s good to know what is under the hood.
Well, Dgraph can do mutations and queries (like an upsert) in the transaction context. But I don’t think you can query your own writes “on the fly”, you know? You can only query data that already exists in the cluster (that’s what an Upsert Block does: it uses the transaction context in one batch). You can’t open a transaction, mutate, and then query for that mutation without committing. This could be possible, but I don’t see why you would need it. You can control this in your application before deciding to send anything to the DB. Unless your need is to hold the transaction open for days or weeks, which also doesn’t make sense in DB transaction logic. Right?
I think things like rollback and handling the transaction on the fly could come in the future. But this sounds more like a nice-to-have than an actual necessity. Please correct me if I’m wrong.
Honestly, I think this is pretty fundamental behavior that I hope Dgraph solves for in the near future.
There are cases where you’d want to roll back the transaction after inserting data. For example, you would often want to roll back transactions in tests so that each test is isolated and doesn’t bleed into other test contexts. Write → read and assert schema → rollback.
But here is a common API scenario:
Mutate: create user with organization, 2 nodes.
Query: user and its associated org
Random application layer bug. The mutation has already committed, but the application layer will send an error code to the client.
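To make the scenario above concrete, here is a minimal sketch in Python. It assumes a database that lets a transaction read its own uncommitted writes, which is exactly the point under discussion in this thread. `FakeTxn`, `transaction`, and `create_user` are hypothetical names for illustration and are not part of any Dgraph client.

```python
from contextlib import contextmanager


class FakeTxn:
    """Hypothetical in-memory stand-in for a database transaction."""

    def __init__(self, store):
        self.store = store   # committed data shared by all transactions
        self.pending = {}    # writes staged by this transaction only

    def mutate(self, key, value):
        self.pending[key] = value

    def query(self, key):
        # Staged writes shadow committed data for this transaction.
        if key in self.pending:
            return self.pending[key]
        return self.store.get(key)

    def commit(self):
        self.store.update(self.pending)
        self.pending = {}

    def discard(self):
        self.pending = {}


@contextmanager
def transaction(store):
    """Commit if the block finishes normally, discard if it raises."""
    txn = FakeTxn(store)
    try:
        yield txn
        txn.commit()
    except Exception:
        txn.discard()
        raise


def create_user(store, fail=False):
    with transaction(store) as txn:
        # Mutate: create user with organization, 2 nodes.
        txn.mutate("org:1", {"name": "Acme"})
        txn.mutate("user:1", {"name": "Sam", "org": "org:1"})
        # Query: user and its associated org, before committing.
        user = txn.query("user:1")
        org = txn.query(user["org"])
        if fail:
            # Simulated application layer bug: nothing gets committed.
            raise RuntimeError("application layer bug")
        return {"user": user, "org": org}
```

With this shape, `create_user(store, fail=True)` leaves `store` untouched, so the error code sent to the client matches what is actually in the database. If the mutation and the query have to live in two separate transactions, that guarantee is lost.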
In the linked thread the answer provided does state:
The way Dgraph works needs an extra transaction. Cuz the data needs to be distributed(some cases replicated) across the cluster. Without that confirmation, you can’t query the data.
This is actually rather interesting and I’d love to understand this better myself, @MichelDiz. Your answer seems to imply that Dgraph only replicates on commit. So the transaction is performed against the replica you are accessing, and then once the commit is initiated, it ensures all replicas catch up?
Does this leave room for inconsistent transactions or breaking consensus? Or is it because all writes are forwarded to a leader, applied, and then replicated?
I agree with @samfinan: performing this in two transactions has implications. For example, when we use transactions as a testing paradigm, we can’t safely roll back to reset the test state. This is a common testing pattern. My workaround is to label the nodes/parts of the schema and purge them after the test runs. In other cases, not being able to read within the transaction imposes design implications, not to mention the occasional performance cost.
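The label-and-purge workaround could look roughly like this. It is a sketch against a plain dict standing in for the database; `new_run_label`, `insert_node`, `purge_label`, and the `test_label` field are all hypothetical names, and a real version would issue mutations and a delete through your client instead.

```python
import uuid


def new_run_label():
    """Return a unique tag for one test run."""
    return "test-run-" + uuid.uuid4().hex


def insert_node(db, uid, data, label=None):
    """Store a node, tagging it with the test-run label when one is given."""
    node = dict(data)
    if label is not None:
        node["test_label"] = label
    db[uid] = node


def purge_label(db, label):
    """Delete every node carrying the label; return how many were removed."""
    doomed = [uid for uid, node in db.items() if node.get("test_label") == label]
    for uid in doomed:
        del db[uid]
    return len(doomed)
```

Each test run gets its own label from `new_run_label()`, passes it to every `insert_node` call, and calls `purge_label` in teardown. Unlike a rollback, the test data is really committed in between, so concurrent tests can see each other's nodes unless their queries also filter on the label.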
Cockroach and even Mongo are capable of this. It’s not to poo-poo Dgraph; I’m just curious, because Cockroach in particular uses a Raft model. So as users we kind of expect/hope this isn’t a stretch for Dgraph. If it is, understanding the limitation is critical for our future planning.
Not exactly. I’m not 100% aware of its guts. But the data will be available only after the commit. You can’t run queries against uncommitted mutations, not even in the “cluster RAM”. Also, the process of writing is a “distribution”; replication is available only if you set a replication configuration. And this happens right away; the commit is just an ending/confirmation.
No, in the paper:
Dgraph is a distributed database with a native graph backend. It is the only
native graph database to be horizontally scalable and support full
ACID-compliant cluster-wide distributed transactions. In fact, Dgraph is the
first graph database to have been Jepsen tested for transactional
Out of curiosity, can you share a doc about this? I want to understand how other DBs do it and why.
If I understand @samfinan correctly, this actually does work – at least, I’ve done this, mostly successfully – mutate and then query in the same transaction. The documentation here suggests this is supported: https://dgraph.io/docs/clients/overview/
A transaction always sees the database state at the moment it began, plus any changes it makes — changes from concurrent transactions aren’t visible.
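The quoted sentence describes snapshot reads plus read-your-own-writes. Here is a toy Python model of just that rule. It is not how Dgraph is implemented; `Store` and `Txn` are hypothetical names, and conflict detection on commit is deliberately left out.

```python
class Store:
    """Toy committed state; a real database is far more involved."""

    def __init__(self):
        self.data = {}

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.data)  # state at the moment the txn began
        self.writes = {}                  # this transaction's own changes

    def set(self, key, value):
        self.writes[key] = value

    def get(self, key):
        # Own writes first, then the begin-time snapshot.
        if key in self.writes:
            return self.writes[key]
        return self.snapshot.get(key)

    def commit(self):
        # No conflict checks here; a real system would abort on conflicts.
        self.store.data.update(self.writes)


store = Store()
setup = store.begin()
setup.set("name", "old")
setup.commit()

t1 = store.begin()
t2 = store.begin()
t1.set("name", "new")
own_view = t1.get("name")        # t1 sees its own uncommitted write
other_view = t2.get("name")      # t2 does not see t1's uncommitted write
t1.commit()
after_commit = t2.get("name")    # t2 still reads from its begin-time snapshot
fresh_view = store.begin().get("name")  # a new transaction sees the commit
```

Under this rule, mutate-then-query inside one transaction should return only the new value, which matches what I see most of the time.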
I think there’s a bug that sometimes causes the wrong data to be returned: the query will sometimes return an array containing both the old and the new value, rather than just the new value, but there’s a workaround. I’m not sure though, so I haven’t filed it as an actual bug report.