Flink / Spark connector to Dgraph

Would it be possible to add Flink and Spark connectors?
As Dgraph supports real-time updates, it makes sense to enable real-time push-down updates from analytics / ETL engines such as Spark and/or Flink.
Flink already has an integrated push-down connector to RocksDB.

Data flow model:
Kafka → Flink → Dgraph
Once data is enriched through Flink in real time, it would update the existing graph in Dgraph. If required, the data could also be lifted entirely into Flink/Spark for MLlib.

Imagine collecting live signals from mobile phones, including lat, long, user_id, and timestamp (TS). Each log is then transformed into geoCity, geoCountry, date / time of day (TOD) / day of week (DOW), gender, and interests by matching it against static tables in Flink. The graph is updated as the data comes through, so that the user has low-latency access to live data.
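
To make the enrichment step concrete, here is a minimal sketch in plain Python (not Flink code). The field names, the `USER_PROFILES` table, and the `enrich` function are all hypothetical stand-ins for whatever the real job would use. The geoCity / geoCountry lookup is left out because it needs a reverse geocoder.

```python
from datetime import datetime, timezone

# Hypothetical static table keyed by user_id, standing in for the
# static tables that the Flink job would join against.
USER_PROFILES = {
    "u42": {"gender": "f", "interests": ["cycling", "jazz"]},
}

# Index matches datetime.weekday(), where Monday is 0.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def enrich(log, profiles=USER_PROFILES):
    """Derive date/TOD/DOW from the timestamp and join with the profile table.

    `log` is assumed to be a dict with keys lat, long, user_id, ts,
    where ts is a Unix timestamp in seconds (UTC).
    """
    ts = datetime.fromtimestamp(log["ts"], tz=timezone.utc)
    profile = profiles.get(log["user_id"], {})
    return {
        "user_id": log["user_id"],
        "lat": log["lat"],
        "long": log["long"],
        "date": ts.date().isoformat(),
        "hour": ts.hour,  # time of day, reduced to the hour
        "day_of_week": DAY_NAMES[ts.weekday()],
        "gender": profile.get("gender"),  # None if the user is unknown
        "interests": profile.get("interests", []),
    }
```

In a real pipeline this would be the body of a map or join operator, and each enriched record would then be handed to the Dgraph sink.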

I am happy to contribute if you can point me in the right direction.

Hey @karolsudol,

Sounds like we need to write a Flink connector, like these here:
https://github.com/apache/flink/tree/master/flink-streaming-connectors

I’m not sure what sort of output you get from Flink. What format is it in? If it is in RDF format, that’d be relatively straightforward. Otherwise, we’ll have to write a parser for that data format.
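
If the Flink output is a flat record rather than RDF, the conversion could look roughly like this sketch in plain Python. The function names and the choice of predicate names (taken directly from the record keys) are assumptions for illustration only. Every value is emitted as a plain string literal, and the subject is a blank node built from the record's id, so mapping to existing nodes in the graph is left out.

```python
def escape_literal(value):
    """Escape backslashes, double quotes, and newlines for a quoted RDF literal."""
    return (str(value)
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n"))


def to_nquads(record, id_field="user_id"):
    """Turn a flat dict into a list of N-Quad-style triple lines.

    Assumes record[id_field] is safe to use as a blank-node label.
    List values produce one triple per element; None values are skipped.
    """
    subject = f"_:{record[id_field]}"
    lines = []
    for key, value in record.items():
        if value is None:
            continue
        values = value if isinstance(value, list) else [value]
        for v in values:
            lines.append(f'{subject} <{key}> "{escape_literal(v)}" .')
    return lines


if __name__ == "__main__":
    sample = {"user_id": "u42", "city": "Oslo", "interests": ["cycling", "jazz"]}
    print("\n".join(to_nquads(sample)))
```

The resulting lines could then be batched and sent to Dgraph as a mutation; how the batching and the client call are done would depend on the connector design.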

I see quite a few search engines already integrated with Flink, so this is something we can prioritize right away. If you can explain what’s involved here, we can dig in.
