TL;DR;
If you are a newer user, or even an engineer, CEO, CTO, etc., I think it is VERY important for you to go back and try to get a grasp on Dgraph from the very beginning. What was the driving factor to even create Dgraph? What was the reason for each of the main pivots? What effect did these pivots have on the company and the project itself? At the very least, read the quote snippets at the bottom that I have compiled; they tell the history I am going to outline.
I’m prompted to write this in response to the conversation from this recent article alongside some personal conversations amongst Dgraph dev peers.
First, a driving principle that I wholeheartedly believe in:
“Those who cannot remember the past are condemned to repeat it.” – George Santayana, The Life of Reason, 1905. From the series Great Ideas of Western Man.
I am not writing this to blame or call names, although some names will be used to help tell the story. What is important is to learn from this history that I believe many, even including the new organization taking over R&D/support, may not be aware of.
In late 2015, a decision was made to use GraphQL as the main query language for a new graph database. That decision set in motion where we are today with Dgraph. Not long after, having run into the realization that GraphQL is not a database query language, another choice was made: to deviate from the GraphQL specification. The result initially became known as GraphQL± and is now known as DQL.
If you have been a software developer for any length of time, you have made these same kinds of decisions. You see a problem and find a solution that at the time looks to be the right choice. Later on, you realize that your decision has flaws and you have to pivot. I know I have been down this same path many times.
This is the history of Dgraph: design decisions with big pivots. Sometimes a pivot leads to a complete rebuild (Java → Rust), while other times it leads to a completely new-looking, and newly named, project (Grakn → TypeDB). The difference with Dgraph is that almost every time Dgraph has pivoted, backward compatibility has been maintained at almost all costs. This seemingly has led to half-baked solutions that are 80% completed. The last 20% of every project is always the hardest and usually becomes the 300%, lol. There is one big instance where I saw this pattern was not followed: replacing RocksDB with Badger. Side thought: the claim that “data is handled directly by Dgraph, and not given off to another database layer” is not true at all.
So we have these pivots so far in Dgraph: (these may be out of order, I tried)
- GraphQL → GraphQL± (DQL)
- Non Transactional ACID → Transactional ACID
- No Schema Support → Schema and Schemaless Support
- RocksDB → Badger
- Commitment to Gremlin, Cypher, and GraphQL Support → Only GraphQL support
- Non Spec GraphQL → native GraphQL Spec
- MultiOS (including Windows) Support → No Windows Support
- Authorization Hooks support → Query only JS Lambda hooks
- Primary DB → Secondary Blockchain Indexing DB (failed attempt)
- Vector Support with Lambdas → Native Vector Support (ongoing?)
With most of these pivots, a common trend has been to leave the former solution at 80% completed. DQL is missing a ton of features that were promised and that we were led to believe were close to release, but then the GraphQL spec came into view and those features have been mostly untouched ever since. Auth within spec-compliant GraphQL was developing rapidly, and then along came Lambda hooks, and the Auth limitations have been mostly untouched ever since. Do I even need to mention again the commitment to add support for Gremlin and Cypher, only to “punt” on that commitment?
I can agree that Vector Databases have some interesting use cases and are currently where investors are looking, but we (I can speak for many in the community) believe that making another pivot will leave the existing 80% completed project in the dust, with only small fixes that add value to the main push of vector support. Side thought: maybe we’ll finally get actual list support instead of just sets and maps, as vectors need true list support, I would think.
Pivots are usually done for one of three reasons: 1) Financial Steering, 2) Technical Challenges, or 3) User/Developer Adoption. Dgraph has a history of making big pivots for the adoption of a few users/developers, and seemingly ridiculous pivots because of financial steering. This in turn caused much frustration among existing developers, who were promised new features and then left in the dust, with those features never seeing the light of day while Dgraph catered to someone else. This has happened more than once with Dgraph, and it is very hard to keep trust and hope in Dgraph during these transitions. I have seen this both second hand and first hand.
Another problem arises with these pivots due to user adoption, because if not managed very carefully and strategically, you can easily lose both the existing users and the new users you are trying to cater to. You have to decide if this is a risk worth taking before making such a pivot. It can be hard to get the pulse of the current users to see if they are willing to take a step back in priority so that the project can “evolve” into something better. Note, not all evolutions are better.
If I look at the activity of once-new developers and projects using Dgraph in one way or another, I can’t help but see a big drop-off. I’m sure I’m not the only one who has seen this. I want to carefully caution the new team against making this new pivot without being very, very transparent about the decisions being made, as this is still considered an open source project. If it wasn’t open source, then you could just ignore us, the community, and do whatever money dictates you do.
If you are a newer user, or even an engineer, CEO, CTO, etc., I think it is VERY important for you to go back and try to get a grasp on Dgraph from the very beginning. What was the driving factor to even create Dgraph? What was the reason for each of the main pivots? What effect did these pivots have on the company and the project itself? At the very least, read the snippets below that I have compiled; they tell the history I have outlined above.
[October 22, 2015] I’m working on building Dgraph… It’s still early stage, and I’m debating which graph query language to support. Facebook just launched GraphQL… But, I’ve also heard a lot about Gremlin. What do you think of them? I don’t want to stretch out too thin and support both, at least at this stage. Which one do you think would be worth aiming at (given it’s a new graph database)? - Manish Rai Jain
[October ??, 2015] I like GraphQL, it has most of the nice properties of MQL1. Gremlin has more Hadoop support if that matters - Manish’s ex-manager at Google
[December 1, 2015] Thanks for your advice! Went with GraphQL, quite like the query language so far.
[4/18/2016] Dgraph… is a native graph database in the sense that the data is handled directly by Dgraph, and not given off to another database layer. Apart from use with diverse social and knowledge graphs, Dgraph can also be used to: build real-time recommendation engines, do semantic search, pattern matching, serve relationship data, and serve web apps via GraphQL… - Introducing Dgraph by Manish Rai Jain
[6/21/2016] Hey, I just checked out the demo, and it looks like the query language is similar to GraphQL in syntax, but doesn’t follow many critical parts of the specification, meaning it won’t be able to work with GraphQL clients such as Apollo Client and Relay that expect spec-compliant results. - @stubailo
[6/21/2016] Yeah, I know that our implementation of GraphQL isn’t exactly as mentioned in the spec. This is because GraphQL is meant as a REST API replacement, and not really a graph query language. So, we’re making modifications to it to ensure it can operate as a full-fledged graph query language… At some point, once we’re mature enough and have better understanding and implementation of GraphQL, we can release our mods to the official spec; and see whether they should live separately or be merged into the official GraphQL spec. - @manishrjain
[6/21/2016] I would argue it’s a bit misleading to specify that you are using GraphQL as the query language, even though it is not compatible with any GraphQL tools. - @stubailo
[6/21/2016] Honestly, my concern here is that we’re figuring out about GraphQL as we’re moving forward. And I just don’t know to what extent can we push it to behave like a Graph language and yet remain within its specs. If we’re going to deviate anyway, I don’t want to push too hard to stay within the specs. OTOH, if there is a convergence path, then I’m all for it. GraphQL ecosystem is surely growing, and we’d love to be part of that. - @manishrjain
[6/21/2016] I think if you find that GraphQL doesn’t work as a language for your database (I’ll admit it is a bit misleadingly named because it isn’t really a query language for graphs), it could be good to rephrase the documentation and marketing to say “uses a query language similar to GraphQL” so that people know what to expect. - @stubailo
[6/30/2016] I’ve been thinking about this over the past days. A lot of people get interested in Dgraph because of GraphQL, and so I think it would be worth our effort if we try to bring our implementation as close to the spec as possible. - @manishrjain
[3/28/2017] I think the GraphQL spec is not versatile. Everything has to be keys and values; there are no functions and no function chaining. …GraphQL has a lot of unnecessary stuff that we’ve decided to avoid implementing at all. So, I doubt we’d be able to reach parity with GraphQL. I think what we can do is to build a JS library that helps people interact with Dgraph. - @manishrjain
[7/26/2016] We need to have a thorough review of our QL, and see if we can get it to be close to compatibility with GraphQL. - @manishrjain
[11/15/2017] We’ll make a push to try to reconcile GraphQL± with GraphQL past v1.0. - @manishrjain
[12/31/2017] Dgraph uses GraphQL +/- only as a query concept. Dgraph does not have a schema defined as GraphQL… Basically if the Dgraph is to accept GraphQL natively, it will have to create a context mimicking the original idea. - @MichelDiz
[1/1/2018] Support GraphQL spec - @manishrjain
[1/18/2018] Dgraph needs to natively handle standard GraphQL queries, or GraphQL queries should be “compiled” into GraphQL+/- (or other supported language e.g. gramlin, cypher). Regardless, the result would be an opinionated way to structure the graph data. Is that fair to say? So, rather than “Make Dgraph work with standard GraphQL”, should we instead focus on how such an opinionated data model would look? - @ptpaterson
[1/1/2018] Yeah, we’ll try and support [GraphQL] as close to the spec as possible. I think GraphQL compatibility is needed by a lot of users. - @manishrjain
[6/27/2018] Dgraph already natively supports a modified version of GraphQL. So, supporting the official spec would be native and should perform better than the overlay support that Neo4j and others have implemented. - @manishrjain
[8/10/2018] The authors of GraphQL have stressed on multiple occasions that it isn’t intended as a complete query language for traversing graph dbs, or server-to-server. It’s a server-client API. That’s why Dgraph modified the syntax in the first place. Far better than standard GraphQL would be full support of both Gremlin and Cypher… Having client-compatible GraphQL really doesn’t accomplish much because 99% of the time Dgraph will still have to pass through the API server anyway to implement the things that are far outside the scope of responsibility for a database. - @frankdugan3
[8/10/2018] At the end of the day, it doesn’t matter what the technicalities are and what happens under the hood. Having no first class GraphQL<->Client support has inhibited dgraphs outlook as a deal breaker. - @D1no
[8/11/2018] This issue has the participation of less than 1% of the Dgraph userbase because connecting directly to the client via GraphQL is simply not a normal expectation for a DB. - @frankdugan3
[8/12/2018] Dgraph has many other areas of growth that are FAR more important for adoption before this proposal. When a team is evaluating an up-and-coming DB, the primary question is not going to be, “Did they shoehorn the API query language we like into the client drivers?” The questions are going to be the fit to domain model, the constraints, the type system, the reliability, the scalability, the financial situation of the company developing it, etc… Some people are asking for direct-to-client GraphQL API’s for databases, but far from the majority. I used to think this was important, but I think that was my inexperience showing. The idea of being able to bypass writing the API server is tempting, but it only works in very, very simple scenarios, and it burdens the DB with many problematic concerns that already have great solutions in API frameworks. - @frankdugan3
[8/12/2018] I personally find dgraph after 1,5 years In a “stuck in the middle approach” of not using a industry standard DSL like Cypher but also not implementing / caring / driving “a recent” data layer innovation like GraphQL. - @D1no
[12/18/2018] I’m building this, which will address lots of the GraphQL points for Dgraph. - @michaeljcompton
[12/20/2018] Update: We’ve punted on Cypher and Gremlin support for the roadmap. The focus is on GraphQL and other features mentioned here. Two of them are already being worked upon, i.e. binary backups and access control lists. - @manishrjain
[1/14/2019] Support official GraphQL spec natively - @manishrjain
[1/14/2019] GraphQL is a great language for apps to be built on – and that’s the aim here, is to support it to allow building apps easier on Dgraph. Dgraph is a great graph DB, but also a great, general purpose primary DB for apps; and we see more and more people/companies use Dgraph to build apps. - @manishrjain
[1/17/2019] That is smart and for that reason, I think it would be a wasted effort to offer a pure GraphQL connection to Dgraph, with the intentions of using Dgraph as a “backend” for client-side apps. There must be a layer of business logic in front of Dgraph and behind the GraphQL endpoint for any sized application to be safe and work smartly. - @smolinari
[1/22/2019] Dgraph shouldn’t be placed in the same boat as Prisma, AWS AppSync or PostGraphile. It is ONLY a database, currently and I highly doubt it will get to be much more or rather, I don’t think it should. - @smolinari
[10/29/2019] October 22, 2015… I’m [Manish] working on building Dgraph… It’s still early stage, and I’m debating which graph query language to support. Facebook just launched GraphQL… But, I’ve also heard a lot about Gremlin. What do you think of them? I don’t want to stretch out too thin and support both, at least at this stage. Which one do you think would be worth aiming at (given it’s a new graph database)? The reply was quick, short and sweet. I [ex-manager at Google] like GraphQL, it has most of the nice properties of MQL1. Gremlin has more Hadoop support if that matters… Thanks for your advice! Went with GraphQL, quite like the query language so far. And that was how GraphQL became the native query language for a new database called Dgraph. …we were having doubts about whether GraphQL can really be a query language for a database. We were no longer convinced that the official spec matched the needs of a database… So, we realized we needed to build something custom into the spec to allow for inserting and modifying data in a standardized way. …we deviated just enough from the GraphQL spec that we could no longer call it GraphQL. So, we switched the name of our query language to GraphQL±. Plus, because we added things to the query language…, and Minus, because we removed things from it… We did not want to deviate from the spec, we just had to do so to allow continue using GraphQL as our native query language, while building a graph database… Dgraph’s basis in GraphQL means that it is the closest thing available to a native GraphQL database… Ultimately, we feel interoperability with the growing GraphQL ecosystem is too important for our users to not be addressed… Today, we are changing that… putting together a native spec-compliant GraphQL support into Dgraph tapping into the power of GraphQL±… - Building a Native GraphQL Database: Challenges, Learnings and Future by Manish Rai Jain and Michael Compton
[10/29/2019] Breaking changes in 1.1 caused (and are still causing) us real headaches. As we continue to invest real effort in applications using Dgraph, could we have some clarity… - mikehawkes
[2/23/2020] Data can be retrieved from Dgraph using GraphQL and a modified version of GraphQL, called GraphQL±. GraphQL± has most of the same properties as GraphQL. But, adds various properties which are important for a database, like query variables, functions and blocks. More information about how the query language came to be and the differences between GraphQL and GraphQL± can be found in this blog post. - Dgraph Paper
[4/26/2020] I’m interested in graph databases and have taken a look at Dgraph… I think GraphQL is the wrong choice for a graph query language. GraphQL has nothing to do with graphs, except it’s unfortunate naming choice. It was designed to query specific parts of JSON data from an API endpoint… I guess you made it work with GraphQL±… However, the underlying assumption is still wrong… Put it simply, using GraphQL for a graph database query language is like trying to use a fork as a knife. I wish Dgraph would implement a better QL. It doesn’t have to be an existing QL like Cypher or Gremlin, personally I don’t like either of them. Whatever query language you choose in the end, it better not be a fork when what’s needed is a knife. - bsquaster
[6/3/2020] As of 20.03.0 Dgraph has support for spec compliant GraphQL. - @michaeljcompton
References