Is anyone using Dgraph in production?

Dgraph seems like a really interesting project. I’m looking to migrate an Elasticsearch database that contains ~500 million objects (150 GB of data) to a graph database. The goal is better data retrieval across various document types where denormalization is not an option.

I’ve been looking at Neo4j and it seems interesting, but running it would cost over 200K in licensing.

Has anyone built a large Dgraph stack in production? What issues did you face along the way?

We are close to it.
The issues we are facing now are mostly related to temporary unavailability while loading data into Dgraph. Also, sometimes Dgraph stops handling all requests with “DEADLINE_EXCEEDED” (with endless Raft elections in the logs), and nothing helps short of a complete redeploy.
(These problems are on 1.0.8; we haven’t tested 1.0.10 yet.)

My worry is about the sustainability of the project. RethinkDB was promising, but then they ran out of funding and couldn’t figure out a sustainable business model. Now the product is dead.

Well guys, no fears. Just get involved with Dgraph. Push it to the limits, create issues, suggest things, send us PRs, and so on. No project dies if the community gets involved proactively.

The more people use Dgraph, even for testing or small study projects, the more people give constructive criticism with solid arguments, and the more people discuss Dgraph here and outside the community, the better. This is like firewood for the fireplace.

As for who is using Dgraph: we have some well-known names, but we cannot reveal who they are. They are actively using Dgraph and working closely with us to improve it.

Cheers.

What would be an alternative?

The only serious alternative to Dgraph is ArangoDB at the moment…

JanusGraph is another open-source distributed graph DB with production deployments.

Intuit has built K-Atlas on top of Dgraph - Intuit K-Atlas

VMware open sourced their Project Purser last year; it was probably used internally long before going public.

A handful of other funded startups list Dgraph on their StackShare profiles.

Dgraph is also being used in production at Fortune 500 companies and being trialed at various popular Silicon Valley companies.

Hi everyone!

I’ve suggested Dgraph as the core database tech in my company for an important project. It’ll be used alongside other technologies such as Elasticsearch and Minio.

Before settling on Dgraph we evaluated it alongside ArangoDB and JanusGraph.

We discarded JanusGraph because we preferred not to deploy and support a Cassandra installation (the only reliable way of getting distributed data, AFAIK).
We discarded ArangoDB (which I liked a lot too). Although they mentioned distributed data, they indicated that better distributed algorithms were available only under a license.

We then investigated the Dgraph project. Distributed data as a first-class citizen was a must, and it seemed more plausible with an RDF database. (LPG-based graph databases, such as Neo4j and ArangoDB, are much harder to distribute.)

The main drawback of RDF (for us) was the query language (usually SPARQL), which was far from friendly. But again, Dgraph uses a GraphQL-like syntax.

So far it looked very appealing.

Finally came maturity, which is, well… at least in our experience, a bit underrated. It may sound harsh, but let me explain why.

Years ago we had a huge data search problem. Solr and OrientDB had been around for a while, and nevertheless… we went for Elasticsearch. At the time, they had just released version 1.4. The company was not nearly the size it is now. It turned out to be a great technical choice, and we saw certain cues that made us believe the project would grow. It did much more than we could have imagined.

Long story short, we made an educated guess and bet on what seemed to have a good future.

We are not only choosing Dgraph. We are betting on it. Dgraph has certain red flags, but all technologies, new and old, do. Let’s see how they work them out. In our experience, technologies mostly get better given time.

Elasticsearch had data loss and weird cluster problems in its beginning. So did Kafka. Now Elastic.co and Confluent.io are doing quite well, and have a great reputation.

We are going into production in June with Dgraph as our core database. The best we can do is participate in its promotion, issue tracking, or whatever else we can, since it’s in our own best interest that this technology takes off and turns into a successful company.

Oops… seems like I went on a bit too long.

Well, just hope this helps.

Glad to hear that you’re going into production with Dgraph, @dszanto. Let us know how we can help!

Dgraph is built on a solid foundation: Badger

Badger is used by many projects, especially in blockchain. We regularly discover new projects using Badger as their core KV store. We stopped listing them because the list was getting ridiculously long.

Dgraph is a distributed database, and building distributed systems is hard. But that’s exactly why we do this. Simplicity comes at the cost of solving the difficult problems transparently. Our users don’t need to know how hard it is, they just reap the benefits.

We are a small team but we are working diligently to get issues fixed as quickly as possible and at the same time making improvements and adding features. Our work and reputation are our most valuable assets. But if you find any issues, please submit them to our issue tracker.

@dodyg and this is a shame as their cross-datacenter replication is really neat.

@mrjn Could we get the names of these Fortune 500 and Silicon Valley companies using Dgraph? It’ll help me a lot with my pitch to my higher-ups to use Dgraph as our primary database for our new products. We need a natively distributed graph database, and currently only Dgraph fits the bill. Thanks for your work!

Hey George,

We’re not allowed to use their names in public. But, you can talk to @santo – he can help you with your queries.

While I acknowledge there’s always a risk with any startup, Dgraph has both funding from top VCs in the valley (Redpoint and Bain Capital) and a healthy, growing user base. RethinkDB had other issues: they were competing in the same field as MongoDB, and MongoDB took off way faster than Rethink. Dgraph is coming into a field (graphs) where there’s a clear need for a scalable, fast graph solution, and no other graph DB comes close to the level of scale and speed that Dgraph provides. We have a great product and an expanding market. Not to mention, Dgraph is open source under the Apache license, so it can be used, modified, etc. forever.

Those issues are now resolved.

I have been running Dgraph (v1.2.1) in production and found some problems:
1. Single tenancy. One cluster can only hold one graph. I created an issue about this early on, but it is still not supported.
2. Snapshot and rollup. While Dgraph takes a snapshot or performs a rollup, writes are blocked, and I lost data during the blocking. I found an existing issue about it and hope it can be solved.
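
For anyone hitting the same blocked writes, one common mitigation is to retry on the client side instead of dropping the data. Below is a minimal sketch of retry with exponential backoff, assuming the write fails with a timeout-style error while the snapshot or rollup is running. `write_fn` and `TransientWriteError` are hypothetical placeholders and not part of any Dgraph client; you would wrap your own client's mutation call and raise (or map to) that error yourself.

```python
import random
import time


class TransientWriteError(Exception):
    """Hypothetical error: raised by the caller's write function on a timeout."""


def write_with_retry(write_fn, payload, max_attempts=5, base_delay=0.5,
                     max_delay=8.0, sleep=time.sleep):
    """Call write_fn(payload), retrying transient failures with backoff.

    write_fn is whatever function performs the write in your application
    (an assumption for this sketch). The last failure is re-raised so the
    caller can queue or log the payload rather than lose it silently.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return write_fn(payload)
        except TransientWriteError:
            if attempt == max_attempts:
                raise
            # Double the wait on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 10% jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
```

This only helps if the blocking is temporary (a few seconds, as described above). If the final attempt still fails, the payload should go to a local queue or log so it can be replayed later.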

Hi @geoyws, I am sending you a direct message

Hi @Willem520,

Thanks for sharing this feedback.

Regarding 1: We understand and agree that multi-tenancy is an important feature. It is on our roadmap and we are actively working on it.

Regarding 2. We have a pull request open with some optimizations. You will find the link in this other discussion: Would snapshot stop mutation and read - #3 by ashishgoswami . You might want to subscribe to the corresponding GitHub ticket to get a mail notification when the issue is closed: when alpha Creating snapshot, the response time writing or querying of txn is takes several seconds · Issue #4250 · dgraph-io/dgraph · GitHub

I hope the above helps,