Millions (eventually maybe billions?) of documents. Each document is a mix of attributes (simple values or arrays) + full text + an optional location, and so forth.
From these documents we are extracting entities & relationships and performing various enrichments. Generally we expect the number of entities to grow quickly at first, then level off.
Relationships can be between entities, between documents, and across the two. Queries include full-text search, geographic queries, graph traversals, simple attribute comparisons, etc.
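Just to make the shape of the data concrete, here is a minimal sketch of how this kind of model might map onto a Dgraph schema via the pydgraph client. The predicate and type names (`doc.title`, `mentions`, `entity.name`, etc.) are made up for illustration, and it assumes a Dgraph Alpha listening on `localhost:9080`:

```python
import pydgraph

# Hypothetical single-store model: documents, entities, and the edges between
# them all live in Dgraph, with fulltext and geo indexes on the document side.
client_stub = pydgraph.DgraphClientStub("localhost:9080")
client = pydgraph.DgraphClient(client_stub)

schema = """
doc.title:    string   @index(fulltext) .
doc.tags:     [string] @index(exact) .
doc.location: geo      @index(geo) .
entity.name:  string   @index(term) .
mentions:     [uid]    @reverse .

type Document {
  doc.title
  doc.tags
  doc.location
  mentions
}

type Entity {
  entity.name
}
"""
client.alter(pydgraph.Operation(schema=schema))
client_stub.close()
```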
One way to do this, of course, is to store the documents in something like Cassandra and store the graph in… a graph database. In that case we basically store a pointer to the document in the graph database.
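A rough sketch of that pointer approach, using Dgraph as the graph side purely for illustration: the full document body stays in Cassandra under its row key, and the graph node keeps only that key plus the extracted entities. The predicate names and key value are hypothetical:

```python
import pydgraph

# Hypothetical two-store layout: Cassandra holds the document body; the graph
# node carries only the Cassandra row key and edges to extracted entities.
client_stub = pydgraph.DgraphClientStub("localhost:9080")
client = pydgraph.DgraphClient(client_stub)

txn = client.txn()
try:
    txn.mutate(
        set_obj={
            "dgraph.type": "Document",
            "doc.cassandra_key": "some-cassandra-row-key",  # pointer back to the Cassandra row (placeholder)
            "mentions": [
                {"uid": "_:acme", "dgraph.type": "Entity", "entity.name": "Acme Corp"},
            ],
        },
        commit_now=True,
    )
finally:
    txn.discard()
client_stub.close()
```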
Another is to store everything in one database.
My question is how well DGraph supports this latter use case and, if it does, what gotchas/design suggestions you'd recommend to minimize refactoring downstream.
My frame of reference on this sort of problem is ArangoDB & Neo4J.
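For reference, this is the kind of mixed query the one-database approach would need to serve, sketched in DQL through pydgraph. It reuses the made-up predicates from the schema sketch above; the search terms, coordinates, and radius are arbitrary:

```python
import pydgraph

# Sketch of a single query hitting fulltext, geo filtering, and graph
# traversal in one round trip.
client_stub = pydgraph.DgraphClientStub("localhost:9080")
client = pydgraph.DgraphClient(client_stub)

query = """
{
  docs(func: anyoftext(doc.title, "merger acquisition"), first: 10)
       @filter(near(doc.location, [-122.42, 37.77], 10000)) {
    doc.title
    mentions {
      entity.name
      ~mentions { doc.title }   # reverse edge: other documents mentioning the same entity
    }
  }
}
"""
resp = client.txn(read_only=True).query(query)
print(resp.json)
client_stub.close()
```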
@briantrusso I have a similar use case although on a different scale.
Have you looked at SciDB? It’s used at CERN to store a few hundred petabytes while enabling in-DB cluster processing & machine learning.
I did a long evaluation across different systems and here are my insights:
ArangoDB fell flat because of missing GraphQL support; otherwise, it would be a contender. We did a test with Neo4J, but concurrency performance was very problematic, so we dropped it after countless issues. TigerGraph is my personal preference and certainly has the best overall package, but it also lacks GraphQL support. DGraph fell flat as well because it caused way too many problems during a test deployment, so we were never really able to do a proper test. We are not using it.
Were you able to get DGraph to a usable state?
Currently, FaunaDB and SciDB are the closest contenders for building my system.
@marvin-hansen could you share what the root of the problem was with your original installation/trial of DGraph? It might be a problem for others as well.