Spark Connector for Dgraph

Thanks @amanmangal
Let me know if you need any help from my side.

Alright, so for now, I just asked sbt to choose a particular Guava version -

libraryDependencies += ("com.google.guava" % "guava" % "16.0.1").force()
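On a recent sbt, dependencyOverrides is the more idiomatic way to pin a transitive dependency than force(); a minimal sketch, assuming the rest of the build stays the same -

// Pin the Guava version that every transitive dependency resolves to.
dependencyOverrides += "com.google.guava" % "guava" % "16.0.1"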

The current code works, but Dgraph4J is not happy with this version of Guava. I am working on building a shadow jar that allows you to keep two versions of Guava on the classpath.

I have raised a PR on the Dgraph4J repo for shading the dependencies that cause conflicts. We will shade more as we find issues going forward. It will take us a couple of days to get it onto Maven Central. Until then, you can point to the jar manually by following the steps below -

You will have to compile Dgraph4J locally by running the following command from the root folder -

./gradlew shadowJar -x test

and then point to the local shadow jar in your project by modifying the line -

libraryDependencies += "io.dgraph" % "dgraph4j" % "1.7.3" from "path to shadow jar"
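The from modifier expects a URL rather than a bare file path, and the Gradle Shadow plugin usually writes its jar under build/libs/. A minimal sketch with a hypothetical local path (substitute the actual location and name of your shadow jar) -

// The file URL below is only an illustration; point it at your actual shadow jar.
libraryDependencies += "io.dgraph" % "dgraph4j" % "1.7.3" from "file:///path/to/dgraph4j/build/libs/dgraph4j-shadow.jar"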

Yeah sure, let me try doing this.

Let me know once the jar is available on Maven Central; I shall start using that instead.

Thanks a lot for the help.
– Eshwar


Hi, I also need a Spark connector for Dgraph. If you have finished your Spark connector, could you share your code?

Hi all,

I have released a working version of our Spark Dgraph Connector. Sources and details are available on GitHub. I would love to hear about your experience using the connector so I can focus on adding the most useful features next.
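For anyone looking for a quick start, here is a minimal read sketch; the data source name and the alpha address are assumptions based on the project README at the time of writing, so please check the GitHub repo for the exact package names and options -

// Load all triples of a Dgraph cluster into a Spark DataFrame.
// Source name and the "localhost:9080" target are assumptions; see the repo README.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("dgraph-read").getOrCreate()

val triples = spark.read
  .format("uk.co.gresearch.spark.dgraph.triples")
  .load("localhost:9080")

triples.printSchema()
triples.show(10, truncate = false)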

Enrico


Thanks @EnricoMi, the repo looks really impressive already. I’ll try this out over the next few weeks, and try to contribute back where I can :+1:


I think with the latest release of the Spark Dgraph Connector, this discussion can be marked as solved.

The connector partitions the graph by predicates and nodes, so it can load large (many nodes) and wide (many predicates) graphs into Spark. It also supports filter and projection pushdown, so you can quickly load tiny sub-graphs.
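As an illustration of the pushdown behaviour, a projection plus a predicate filter like the sketch below should only transfer the selected columns and the matching predicate from the cluster rather than the whole graph; the source, column, and predicate names are assumptions, so verify them against the schema reported by printSchema() -

// A projection and a filter the connector can push down to Dgraph,
// so only the matching data is read (source, column, and predicate names are assumptions).
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().appName("dgraph-subgraph").getOrCreate()

val names = spark.read
  .format("uk.co.gresearch.spark.dgraph.triples")
  .load("localhost:9080")
  .select("subject", "predicate", "objectString")
  .where(col("predicate") === "name")

names.show(10, truncate = false)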