Live synchronization between Dgraph and other databases


(Murray Altheim) #1

Hi,

For the past year I’ve been investigating various ways of building out a
graph database (using Neo4j) that would serve as a backbone graph for an
enterprise MySQL database. This has been treated as a POC, with an eye
toward eventually migrating portions of the enterprise database over to the
graph database.

This means a) we need to initially perform a mass import of data from the
source MySQL database, then b) either periodically or event-driven we’d
need to synchronize the graph and MySQL databases as users modify
their data (this system has a UI front end). BTW, both the enterprise
application and the graph database application are written in Java.

I have quite a lot of experience with Neo4j and enjoy working with it. I’ve
found Cypher invaluable in various tasks such as modifying the graph,
merging nodes, etc. and at the scale we’re dealing with (millions but not
billions of nodes and edges) things work fine. (so yes, I’d be very much
in favor of Dgraph supporting Cypher…)

Where we hit a hitch was in the synchronization requirements. Neo4j
has an ETL (Extract Transform Load) tool but it’s quite limited in functionality,
and while fine for an initial load, it’s not fit for purpose in doing live synchronization.

So with all that said, what I’m wondering is: given an enterprise application
written in Java and a requirement to build a graph database that remains
synchronized with some of the core entities and relationships, particularly
during a migration period, what would Dgraph’s engineers recommend as a
means of keeping these two databases synchronized?

I’m likely to be using the Java Dgraph client, but as my Go skills improve it
might be possible to build a pure Go application, though selling that to my
team would be more difficult; at this point we’re entirely a Java shop.

Thanks for any advice,

Cheers,

Murray


(Michel Conrado) #2

For now we don’t have anything like that for “live sync”. But you could do something with the new JSON support in the loaders: you can point “Dgraph Live” (the live loader) at any JSON file and keep consistency.

Another thing you could do is a kind of T junction (dual writes; I don’t know if there is a technical name for this) using GraphQL. In the future this will be easier, but today you would have to create a GraphQL server for Dgraph and for your other database in the same code, forcing writes to happen in both DBs at the same time. It’s kind of a “hack”, but it’s possible.
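A minimal sketch of that dual-write idea in Java, since the original poster's stack is Java. All names here (`Store`, `DualWriter`) are hypothetical illustrations, not the dgraph4j or any GraphQL server API; in practice one `Store` would wrap a JDBC `DataSource` and the other the Dgraph Java client.

```java
// Hypothetical abstraction over a writable store: one implementation would
// wrap MySQL (JDBC), the other Dgraph (dgraph4j mutations).
interface Store {
    void write(String payload);
}

// The single mutation handler writes to both stores so they stay in step.
// If the secondary write fails, the caller must compensate the primary
// (e.g. roll back the SQL transaction or issue an undo mutation).
class DualWriter {
    private final Store primary;   // e.g. MySQL
    private final Store secondary; // e.g. Dgraph

    DualWriter(Store primary, Store secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    void write(String payload) {
        primary.write(payload);
        try {
            secondary.write(payload);
        } catch (RuntimeException e) {
            // Without a distributed transaction this is best-effort:
            // surface the failure so the primary write can be compensated.
            throw new IllegalStateException(
                "secondary write failed; compensate primary", e);
        }
    }
}
```

The weakness Michel hints at is visible here: there is no cross-database transaction, so the compensation path is the application's responsibility.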


(Murray Altheim) #3

Hi Michel,

Thanks for replying so quickly.

So it sounds like I’m basically in the same place with both Neo4j and Dgraph, i.e., in either case I’ll be required to write my own synchronization functionality using some kind of event trigger or periodic sync command. Since the enterprise MySQL database is essentially the master and the graph db the slave, we were thinking of creating a SQL change table where we’d write a journal of changes that could be retrieved by a call on a web service endpoint, after which the graph db would begin a sync process.
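The change-table approach described above can be sketched in Java. Everything here is a hypothetical illustration of the design, not an existing API: the `ChangeJournal` stands in for the SQL change table behind the web service endpoint, and the graph side is reduced to a `Consumer` where a real implementation would issue dgraph4j (or Neo4j) mutations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// One row of the hypothetical SQL change table: a monotonically increasing
// sequence number plus enough detail to replay the change on the graph side.
class Change {
    final long seq;
    final String entity; // e.g. "customer:42"
    final String op;     // e.g. "upsert" or "delete"
    Change(long seq, String entity, String op) {
        this.seq = seq; this.entity = entity; this.op = op;
    }
}

// Stands in for the change table exposed via the web service endpoint.
class ChangeJournal {
    private final List<Change> rows = new ArrayList<>();
    private long nextSeq = 1;

    synchronized void record(String entity, String op) {
        rows.add(new Change(nextSeq++, entity, op));
    }

    // All changes with seq > afterSeq, in commit order.
    synchronized List<Change> changesSince(long afterSeq) {
        List<Change> out = new ArrayList<>();
        for (Change c : rows) {
            if (c.seq > afterSeq) out.add(c);
        }
        return out;
    }
}

// Periodic sync worker: remembers the last sequence number it applied and
// replays anything newer against the graph store.
class SyncWorker {
    private long lastApplied = 0;
    private final ChangeJournal journal;
    private final Consumer<Change> graphWriter;

    SyncWorker(ChangeJournal journal, Consumer<Change> graphWriter) {
        this.journal = journal;
        this.graphWriter = graphWriter;
    }

    // One polling cycle; a scheduler (or an event trigger) would drive this.
    void syncOnce() {
        for (Change c : journal.changesSince(lastApplied)) {
            graphWriter.accept(c); // e.g. an upsert mutation via dgraph4j
            lastApplied = c.seq;   // advance only after a successful apply
        }
    }
}
```

Tracking `lastApplied` is what makes the sync resumable: a crashed or delayed worker simply picks up from the last sequence number it committed.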

It’s all doable, but that’s a lot of machinery to build, and it’s a shame that neither graph database provider offers such functionality, as it seems to be a common requirement (i.e., keeping two enterprise databases in sync, a typical feature of SQL databases).

But I understand you’re a small team with limited resources (same as Neo4j, according to their engineer), so I applaud Dgraph’s decision to open-source their code, which at least permits your users to understand the Dgraph code base and perhaps create compatible tools, either closed or open source (depending on their own business requirements), such as a sync tool. It’s an absolute requirement of our project, so until I can propose a pragmatic solution we can’t really move ahead with an implementation.

I’ll take this information back to our team when discussing the continued possibility of using Dgraph for our application.

Cheers,

Murray


(system) closed #4

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.