At the end of 2020, Forrester assessed the Graph Data Platform landscape, providing a report on “The 12 Providers That Matter Most and How They Stack Up”. Dgraph was thrilled to be featured as a contender, particularly as The Forrester Wave™ showcased that our product offering is the strongest in the category, and even ranked above some in the next competitive tier.
According to the report’s inclusion criteria, companies had to offer an enterprise-class graph data platform, a standalone graph data platform, a publicly available release, and a referenceable install base. More interesting to our team, there also had to be customer interest in the product, with multiple Forrester inquiry calls about the vendor in the past 12 months – clearly, people are asking about us.
This inclusion marks the long path we have traveled since the initial v0.1 release of Dgraph in Dec 2015. Dgraph is now the most popular open-source graph database on GitHub and also the youngest company in the graph landscape to be featured in Forrester’s report.
So, we’re really grateful to Forrester and Noel Yuhanna for including us in this report. There are, though, certain categories of evaluation where we feel we could have been ranked higher. In this blog post, I’ll talk about the categories where we received an average score but feel we deserve a top score.
Data loading, Performance and Scalability
“When we performed a stress test on a thousand concurrent queries, Dgraph was still able to maintain a response time of 50ms, while simultaneously achieving 15000/s QPS; the performance is excellent.” Pan Gao, Chief Search Architect at KE Holdings
Dgraph’s history starts at Google, where I had worked on a project to build Google’s first graph indexing and serving system. It was built to scale to serve Google’s Knowledge Graph via Search.
Outside Google, Dgraph was built from the ground up to serve terabytes of graph data while providing low-latency, high-throughput operations with arbitrary-depth joins – scalability and performance are in Dgraph’s DNA. In fact, the D in Dgraph stands for “distributed”.
Dgraph is natively distributed with synchronous replication. It shards and balances data automatically so that our users’ graphs can scale. In fact, Dgraph is one of the few offerings on the market that performs really well on commodity hardware, without requiring 100s of GBs of RAM to load everything into memory before executing queries.
Our customers are running dozens of terabytes of data in a single Dgraph cluster, sharded and replicated 3-ways to run a 12-node cluster. Dgraph’s bulk loader can load data at the rate of millions of edges per second on a 32-core AWS machine out of the box.
“When we completed our build, the Core Symbology database had 160 million nodes with two billion edges and many billions of facets. I found that Dgraph traverses FactSet’s large graph quickly with most queries requiring less than 20 milliseconds.” Mark Boxall, Principal Software Engineer at FactSet
Transactions, High-Availability and Fault Tolerance
Dgraph is heavily inspired by Google’s Spanner and has a very strong consistency model. It provides MVCC, read snapshots and distributed transactions. Dgraph is the only graph database in the Forrester report to have gone through the rigorous Jepsen test, a state-of-the-art analysis for distributed systems. In an effort to improve the safety of distributed databases, queues, and consensus systems, Jepsen has tested databases like Postgres, MongoDB, CockroachDB, and Cassandra, and is renowned for its “no punches pulled” analysis of databases.
Dgraph provides cluster-wide distributed ACID transactions, snapshot isolation and linearizable reads. In fact, as Dgraph’s Jepsen test shows, it provides strong transactional guarantees in the face of machine crashes, network partitions, clock skews, disk failures, and many other fault conditions. It is designed to survive machine, rack, and datacenter crashes. As such, it provides high availability without losing transactional guarantees. I wrote about our experience solving issues found by Jepsen here.
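To make "MVCC with read snapshots" concrete, here is a toy, single-process sketch in Python. It is a generic textbook-style illustration, not Dgraph's implementation: the names `ToyMVCCStore`, `ToyTxn`, and `ConflictError`, the single logical clock, and the first-committer-wins conflict rule are all assumptions made for this example.

```python
class ConflictError(Exception):
    """Raised when another transaction committed a key we also wrote."""


class ToyMVCCStore:
    """Toy multi-version store (hypothetical; illustrative only)."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # one logical clock for start and commit timestamps

    def _tick(self):
        self._clock += 1
        return self._clock

    def begin(self):
        # Each transaction reads from the snapshot defined by its start timestamp.
        return ToyTxn(self, self._tick())

    def read_at(self, key, ts):
        # Return the newest value committed at or before ts, or None.
        value = None
        for commit_ts, candidate in self._versions.get(key, []):
            if commit_ts <= ts:
                value = candidate
        return value

    def latest_commit_ts(self, key):
        versions = self._versions.get(key, [])
        return versions[-1][0] if versions else 0


class ToyTxn:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}  # buffered writes, invisible to others until commit

    def get(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store.read_at(key, self.start_ts)

    def set(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First committer wins: abort if any written key changed after we started.
        for key in self.writes:
            if self.store.latest_commit_ts(key) > self.start_ts:
                raise ConflictError(key)
        commit_ts = self.store._tick()
        for key, value in self.writes.items():
            self.store._versions.setdefault(key, []).append((commit_ts, value))
        return commit_ts
```

In this sketch, a transaction that started before a concurrent commit keeps seeing the old value, and it is aborted if it later tries to overwrite the same key. A real distributed database additionally has to agree on timestamps and commits across machines, which this toy does not attempt.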
Deployment Options and Cloud
Dgraph is designed to be a cloud-native database. It not only survives Kubernetes, but actually thrives in it. Most of our customers use k8s to run Dgraph, and so do we. With Slash GraphQL and Dgraph Cloud, we have a solid cloud offering, making it easier to adopt Dgraph.
Moreover, under Apache 2.0, Dgraph is a liberally licensed open-source database. Anyone can analyze the entire source-code of Dgraph, modify it and run it on-premises without ever talking to us.
App Development
Dgraph is the only database in the report to provide GraphQL support out of the box. GraphQL, since its launch in 2015, has taken the world by storm. There is a foundation around it, lots of big companies have switched to it and many developers are actively building their apps in GraphQL.
Dgraph adopted GraphQL as its query language, with JSON and gRPC outputs, from its early days, making it modern and friendly for today’s developers.
The benefit of GraphQL is that the query structure naturally models the graph that the query is traversing. A GraphQL query is like a structured explanation of the graph you’d like as a result. Other graph query languages (and even SQL joins) follow edges in the graph, but return lists of results, losing the relationships between the entities in the response.
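The difference between a nested, query-shaped result and a flat list of rows can be sketched in a few lines of Python. This is a toy in-memory illustration, not Dgraph's API or a real GraphQL engine: `GRAPH`, `SELECTION`, `resolve`, and `flat_rows` are hypothetical names, and the selection dictionary stands in for a query like `{ name friends { name friends { name } } }`.

```python
# Hypothetical toy graph: node id -> attributes and outgoing "friends" edges.
GRAPH = {
    "alice": {"name": "Alice", "friends": ["bob", "carol"]},
    "bob": {"name": "Bob", "friends": ["carol"]},
    "carol": {"name": "Carol", "friends": []},
}

# Stand-in for the query { name friends { name friends { name } } }.
# None marks a scalar field; a nested dict marks an edge to follow.
SELECTION = {"name": None, "friends": {"name": None, "friends": {"name": None}}}


def resolve(node_id, selection):
    """Walk the selection and return a result with the same nested shape."""
    node = GRAPH[node_id]
    result = {}
    for field, sub_selection in selection.items():
        if sub_selection is None:
            result[field] = node[field]
        else:
            result[field] = [resolve(child, sub_selection) for child in node[field]]
    return result


def flat_rows(node_id):
    """The same two-hop traversal returned as flat rows, like an inner join."""
    rows = []
    for friend in GRAPH[node_id]["friends"]:
        for friend_of_friend in GRAPH[friend]["friends"]:
            rows.append((
                GRAPH[node_id]["name"],
                GRAPH[friend]["name"],
                GRAPH[friend_of_friend]["name"],
            ))
    return rows
```

Calling `resolve("alice", SELECTION)` yields a tree in which Bob and Carol both appear under Alice, each with their own `friends` list. `flat_rows("alice")` yields one tuple per complete path, so the caller has to regroup rows to recover the structure, and Carol's direct friendship with Alice is lost because she has no friends of her own.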
Because Dgraph was built for native GraphQL support, Dgraph’s unique approach to graphs solves common GraphQL problems by default, without any special code or handling. Issues like the N+1 problem, under- and over-fetching, overloading the endpoint with tons of calls, and scale are solved in the database instead of requiring ongoing problem-solving efforts.
So a GraphQL solution built with Dgraph doesn’t have the engineering concerns that keep GraphQL engineers up at night.
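For readers unfamiliar with the N+1 problem, the following Python sketch counts backend round trips for two ways of answering "all authors with their posts". It only illustrates the general pattern and says nothing about how Dgraph executes queries internally: `CountingStore`, `naive_resolver`, and `single_traversal` are hypothetical names invented for this example.

```python
class CountingStore:
    """Hypothetical backend that counts how many round trips it receives."""

    def __init__(self, posts_by_author):
        self.posts_by_author = posts_by_author
        self.round_trips = 0

    def list_authors(self):
        self.round_trips += 1
        return list(self.posts_by_author)

    def posts_for(self, author):
        self.round_trips += 1
        return list(self.posts_by_author[author])

    def authors_with_posts(self):
        # One round trip: the traversal from authors to posts happens in the store.
        self.round_trips += 1
        return {a: list(p) for a, p in self.posts_by_author.items()}


def naive_resolver(store):
    """One call for the author list, then one call per author: N+1 round trips."""
    return {author: store.posts_for(author) for author in store.list_authors()}


def single_traversal(store):
    """The whole nested result is produced by a single request."""
    return store.authors_with_posts()
```

With three authors, `naive_resolver` makes four round trips while `single_traversal` makes one, and both return the same data. When the data store itself understands the nested query, the application layer has no per-parent lookups to batch or cache.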
“From scalability, fault-tolerance, and read speed, to handling joins and rebalance of data across shards, Dgraph offers a powerful suite of tools that every GraphQL developer wants. No ORMs, no N+1 problem, GraphQL queries, Dgraph is essential to QuillerBee’s growth efforts.” - Abhijit Kar, Software Engineer at QuillerBee
We are Solid Contenders
Contender: a person who tries to win something in a contest; especially: a person who has a good chance of winning – Merriam-Webster
At Dgraph Labs, we take pride in our pioneering work on graph database technology. We are confident that we are building the most technologically advanced solution on the market.
Over the last five years, we have worked tirelessly to deliver a cutting-edge graph database that is blazingly fast, highly scalable, and robust. As engineers, we are proud of the research that has gone into this reinvention of graph systems and the results it has produced.
Dgraph is an early-stage startup; in fact, it is the youngest company in Forrester’s list. Even though we haven’t yet garnered the depth and breadth of customers that some older graph platforms have built, Dgraph has quickly become the top choice for startups and Fortune 500 companies building graph databases in production environments.
With 15k GitHub stars, 7M Docker pulls, and 100 billion queries run over Dgraph instances in 2020, Dgraph has seen a strong market response in a short span of time.
“Dgraph won because it was open source and, according to Gao, only Dgraph was able to handle the volume of data KE had to manage and still deliver query results in milliseconds.” - TechRepublic, China’s Zillow alternative goes open source to scale a 10 billion node graph database
Thanks for reading this blog post!
Curious about the tech? Read the Dgraph paper. If a performant, developer-friendly, graph system could be useful to you, do check out Dgraph’s cloud offering. Also, if you want to help us solve challenging problems, we’re hiring!
This is a companion discussion topic for the original entry at https://dgraph.io/blog/post/dgraph-forrester/