Current status of Jepsen tests

Hi guys! I noticed that on GitHub a number of issue threads opened by aphyr (Kyle) are still open in the dgraph-io/dgraph issue tracker.

Can these issues be addressed/closed? What’s the current status of them? Have they been fixed in the latest version? Are there any outstanding Jepsen issues that have not been resolved?

Is there any way for the public to see the status of the nightly Jepsen test runs?

Also, are there plans to do a new Jepsen test? Is that why these threads were never closed?

Sorry for all the questions :stuck_out_tongue: Thanks for any info!

I’ll go through the issues later, but I believe all of them were addressed before the latest Jepsen report was published.

Currently we don’t share the results of our Jepsen tests. No plans for another Jepsen report at the moment. The issues are most likely still open because we forgot to close them.

Currently all the Jepsen tests pass on master, but we are still working on running the nightly tests reliably. Some runs crash early because of problems in Jepsen during cluster setup, not because of a problem in Dgraph itself. We are working on making the tests more reliable, but at the moment there are no failing tests, meaning tests where the Jepsen analysis itself fails.

The move tablet tests were failing before. Is that no longer the case?

I believe this was fixed when the issues with split posting lists were fixed. I haven’t seen a failure since, but I’ll keep those issues open for now and check again when we have stable nightly tests. The last time I ran the full suite, all the tests were passing.

I closed a couple of issues that are marked as fixed in the Jepsen report.

Could we run these tests repeatedly to verify that they have been fixed? The nightly run would be good, but it won’t run the same test 20-30 times.
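For what it's worth, here is a rough sketch of what a repeat run could look like. It just runs one command N times and tallies the exit codes. Everything in it is an assumption on my part: the helper name `run_repeatedly` is made up, the command in the demo is a stand-in rather than a real Jepsen invocation, and I'm assuming a non-zero exit code means the run failed or crashed. It also doesn't tell a crash during cluster setup apart from a failed analysis, so the output of any non-zero run would still need a manual look.

```python
import subprocess
import sys
from collections import Counter


def run_repeatedly(command, runs=20):
    """Run `command` (a list of arguments) `runs` times and tally its exit codes."""
    exit_codes = Counter()
    for attempt in range(1, runs + 1):
        # Capture output so the per-run summary stays readable.
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        exit_codes[result.returncode] += 1
        status = "ok" if result.returncode == 0 else f"exit {result.returncode}"
        print(f"run {attempt}/{runs}: {status}")
    return exit_codes


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; swap in the real test
    # invocation (workload and nemesis of your choice) when using it.
    demo_command = [sys.executable, "-c", "print('pretend test run')"]
    tally = run_repeatedly(demo_command, runs=3)
    non_zero = sum(count for code, count in tally.items() if code != 0)
    print(f"{non_zero} of {sum(tally.values())} runs exited non-zero")
```

Passing the command as a list keeps the sketch independent of any particular test CLI, so the same loop could be pointed at whichever workload and nemesis combination needs the extra runs.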

Any specific workload and nemesis combination that I should try?

Move tablet is the one that was breaking. I think all the issues mentioned above are probably for move tablet.