Dgraph as a gateway for data federation

imkleats · December 5, 2019, 11:16pm

Hey there. Please bear with me - I’m an economist who is a graph enthusiast, not an engineer - but I was wondering if anyone more knowledgeable or skilled than myself might have some thoughts, feedback or critique about the idea of Dgraph serving as a gateway for data federation.

Problem Statement

There are many cases where you might have multiple data stores holding related data that have been kept separate for any number of reasons (e.g. slow migration to Dgraph as a primary data store, legacy systems, tightly controlled access to sensitive/protected data, etc.). It would be pretty nice if you could simply query across them like:

{
   dgraphResource {
       randomStringPredicate
       ...
       someExternalResource {
          externalPredicates...
          predicateRefToDgraphResource {
             otherDgraphPredicates
          }
       }
   }
}
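
For comparison, here is a rough Go sketch of the stitching an application has to do by hand today to answer a query shaped like the one above. Everything in it is a hypothetical stand-in: the two maps play the role of Dgraph and the external store, and the type, field, and function names (DgraphResource, ExternalResource, federate) are made up for illustration and are not part of any Dgraph API.

```go
package main

import "fmt"

// Hypothetical stand-ins only: no real Dgraph or external-store client is used.

// DgraphResource models a node that lives in Dgraph.
type DgraphResource struct {
	RandomString string
	ExternalIDs  []string // keys of related records in the external store
}

// ExternalResource models a record that lives outside Dgraph.
type ExternalResource struct {
	Value     string
	DgraphRef string // UID of a Dgraph node this record points back to
}

// ExternalView and ResourceView mirror the nesting of the example query.
type ExternalView struct {
	Value      string
	Referenced string // a predicate of the Dgraph node that was pointed back to
}

type ResourceView struct {
	RandomString string
	External     []ExternalView
}

// federate walks Dgraph -> external store -> Dgraph, as the query above does.
func federate(uid string, dg map[string]DgraphResource, ext map[string]ExternalResource) (ResourceView, error) {
	root, ok := dg[uid]
	if !ok {
		return ResourceView{}, fmt.Errorf("dgraph resource %q not found", uid)
	}
	view := ResourceView{RandomString: root.RandomString}
	for _, id := range root.ExternalIDs {
		e, ok := ext[id]
		if !ok {
			continue // skip dangling references to the external store
		}
		ev := ExternalView{Value: e.Value}
		if ref, ok := dg[e.DgraphRef]; ok {
			ev.Referenced = ref.RandomString
		}
		view.External = append(view.External, ev)
	}
	return view, nil
}

func main() {
	dg := map[string]DgraphResource{
		"0x1": {RandomString: "alpha", ExternalIDs: []string{"e1"}},
		"0x2": {RandomString: "beta"},
	}
	ext := map[string]ExternalResource{
		"e1": {Value: "from-legacy-db", DgraphRef: "0x2"},
	}
	view, err := federate("0x1", dg, ext)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", view)
}
```

The point of a federation gateway would be to make this hand-written join unnecessary, so that the traversal across stores happens behind a single query.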

Possible Approach

My understanding is that Dgraph currently shards based on predicate and, when querying, uses gRPC to convey the operation expected from any given shard as the subgraph is traversed. What if… these “external”/non-Dgraph predicate references could be routed to a specific cluster/node that could receive this gRPC request and return a conforming gRPC response, leaving it to whoever implements the logic behind the scenes to decide how that response is generated?

Edit: After some more source spelunking, here are some additional entry points for this implementation:

ProcessGraph (query/query.go) calls createTaskQuery to generate a protobuf for the query subgraph.
The query protobuf is then passed through the worker’s ProcessTaskOverNetwork (worker/task.go) method.
That method calls the worker’s groups to see which gid (group id) is serving the tablet for the attribute key on the query protobuf by polling the Zero server ShouldServe method.
If the attribute isn’t being served by any tablets (which it isn’t because the whole idea is external attributes), it’s going to yield a gid of 0 which will throw an errNonExistentTablet error and an empty Result.
If a gid could be received, it would be passed through to processWithBackupRequest that would use the gid to reference two server addresses to process the request.

So, a solution might involve registering a Tablet with the Zero server that houses all external attributes as defined in the schema, creating a Group (with gid) to serve that Tablet which routes to other backend sources, and otherwise letting Dgraph handle the query/result RPCs as it normally would.

Feedback?
It’s probably a harebrained thought, but I’d love to hear your feedback (even if it’s negative).

michaelcompton · December 6, 2019, 1:44am

Hi, thanks for your thoughts.

Not harebrained at all! In fact, it’s such a good idea that I’ve been having it myself.

We’ve been thinking about Dgraph supporting federation as part of our GraphQL offering (which will land in a Dgraph release real soon, but you can already see it here: https://graphql.dgraph.io/). That will support Apollo federation (we already have a version that does), which is a way (not the only way) of supporting federated queries that look quite like your example with GraphQL.

But we’re also looking at supporting custom logic in our GraphQL offering and that will allow mixing our GraphQL and external results - which could be a local calculation or a call to something external.

So we aren’t quite there, and it’s not clear how we’ll pick our path towards these sorts of things, but it’s a great idea.

imkleats · December 6, 2019, 2:51am

Thanks for the quick response! I mean, I was aware that federation in itself is a good idea and certainly not a novel idea to have. I’ll be interested to check out your code and future examples/blogs around Apollo Federation. To be perfectly honest, I was looking for something not-Apollo that could be strongly integrated with other Go microservices. Being able to connect through the standard dgo client but getting all that juicy federation was a big part of what motivated the potential approach I pitched. Anyways, thanks again. Exciting times ahead, no doubt.

davemaier · September 28, 2020, 8:08pm

Hi, is there any news about this?
It seems like there is not much more information on Dgraph with Apollo federation except for this rather dated thread. I’m currently doing some technology research for a microservice that is intended to work under Apollo federation, and Dgraph looks really cool to me, so this would be extremely useful.
