Formal methods for Dgraph


(Guido Stepken) #1

When reading this: http://jepsen.io/analyses/dgraph-1-0-2 , for me it again became clear, that you can’t test quality into complex software. If you don’t systematically cover all possible scenarios for all numbers of engines, shards, functions … in a database cluster in a reproducible way, you get lost. Adding a single function can have unforeseen side effects, that must be somehow represented in the source code. For a very good reason there is a wide use of code generators, finite state machine compilers (See e.g. Ragel) to cover at least all scenarios within a certain context. Getting ‘states’ right in a cluster of undetermined number of machines in all possible configurations and setups, even mixed modes, is almost impossible. Number of scenarios, that must be tested, explodes exponentially. Adding one single function here makes writing hundreds of thousands of test scenarios neccessary.

That’s what e.g. Leslie Lamport had to consider when implementing Microsofts multi paradigm CosmosDB.

Just my 2¢


(James) #2

that post is a very interesting test result,
but the tested version is outdated.

Great to see that DGraph contracted Jepsen for testing. Its a marvelous approach to getting better software bugs discovered and fixed and improving database performance and stability. Great decision! :sunglasses:


(Daniel Mai) #3

I’d also love to see more formal methods being used (e.g., Leslie Lamport’s TLA+).