Hi DGraph Community!
I’ve spent the past couple of weeks experimenting with DGraph and its GraphQL implementation.
I absolutely love DGraph and DQL, but not so much its GraphQL endpoint.
Here are some considerations from someone who has built countless GraphQL servers.
The idea of having an automatic GraphQL endpoint is nice, but in practice the current implementation is only useful for “pet” projects or prototyping, not for starting a “real” project.
Here is why:
- Designing the Contract: When starting a new project or startup idea, you want to be able to iterate fast. However, you also want as much control as possible over the schema and database design, so that the project can evolve as the idea and the business grow.
Designing a good GraphQL schema at the start brings huge benefits, especially since the clients are built around it. You can start with DGraph directly serving the data behind the GraphQL server, but at some point you want to be able to transparently replace DGraph and position it behind a microservice, or at least be prepared to.
Having the queries and mutations automatically generated constrains us to develop our clients and future microservices around a convention that we might not like and cannot change.
The inputs are also super important. I want to be able to expose a minimal set of params in my GraphQL inputs and perform the filtering logic on the backend. At the moment DGQL generates very complex and powerful inputs that allow querying almost the entire database from the UI.
I feel very uneasy deploying that much discoverability to production.
Touching again on the design aspect, I’m very much constrained by the way DGraph designs my inputs, so if I ever want a GraphQL schema that is simpler by nature, I can’t achieve that.
- GraphQL is the Database and the Database is GraphQL: I think this is the wrong approach.
I believe that the API layer should be decoupled as much as possible from the database. There could be similarities between the 2, but they shouldn’t mutate one another.
If I change my API contract, I shouldn’t have to worry that my database will also change, and vice versa.
What I do think is right is to have a mapping of some sort that tells the 2 contracts how they relate to each other. It could be conventional or coded in.
Example:
## GraphQL Schema
type User {
  id: ID!
  fullname: String
  age: Int
}
## DGraph Schema
User.id: uid
User.name: string
User.surname: string
User.age: int
type User {
  id
  name
  surname
  age
}
The 2 types are almost identical, except that the DGraph schema has name and surname fields where the GraphQL schema has fullname. We might be able to map 80% of the fields automatically, and for the ones we can’t, the user can write a little piece of code (in the resolver or elsewhere) to say how the field is computed.
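To make the mapping idea concrete, here is a minimal sketch in plain Go. It is not a DGraph API: `dbUser`, `fieldResolver`, `userFields` and `resolveUser` are hypothetical names, and the record is assumed to be already loaded from the database. Fields that map one-to-one are trivial entries, while `fullname` is the hand-written exception.

```go
package main

import (
	"fmt"
	"strings"
)

// dbUser mirrors the DGraph-side User type from the example above.
// The Go shape is an assumption made for this sketch.
type dbUser struct {
	ID      string
	Name    string
	Surname string
	Age     int
}

// fieldResolver computes one GraphQL field from the stored record.
type fieldResolver func(u dbUser) any

// userFields maps each GraphQL field of User to the way it is derived.
// id and age map one-to-one; fullname is the custom, coded-in mapping.
var userFields = map[string]fieldResolver{
	"id":  func(u dbUser) any { return u.ID },
	"age": func(u dbUser) any { return u.Age },
	"fullname": func(u dbUser) any {
		return strings.TrimSpace(u.Name + " " + u.Surname)
	},
}

// resolveUser builds the API-shaped response for the requested fields.
func resolveUser(u dbUser, requested []string) (map[string]any, error) {
	out := make(map[string]any, len(requested))
	for _, f := range requested {
		r, ok := userFields[f]
		if !ok {
			return nil, fmt.Errorf("unknown field %q", f)
		}
		out[f] = r(u)
	}
	return out, nil
}

func main() {
	u := dbUser{ID: "0x1", Name: "Ada", Surname: "Lovelace", Age: 36}
	res, err := resolveUser(u, []string{"id", "fullname"})
	fmt.Println(res, err)
}
```

With a table like this, renaming a database predicate only touches one entry, and the GraphQL contract stays the same.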
The main advantage of this is that we can evolve the API and the database independently, and we have full control over both.
The disadvantage is that we now have 2 schemas to maintain (which makes a lot of sense to me).
- I want a powerful GraphQL server but I don’t have control over it: This is what is currently happening with the DGraph GraphQL implementation.
There is currently too much declarative logic in the wrong place (the schema). For example, I can do HTTP requests directly from the schema (OMG), plus authentication, authorization, custom DQL queries, etc…
I think that in a medium-sized project this approach will end up too complex to even understand what’s going on (hence very good for pet projects).
What about:
- Input validations
- Input sanitisation
- Input aggregation
- transformation
- etc… etc…
I believe all of the above, including authentication and authorization, has to be done outside the schema definition, preferably at the application level.
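As a rough illustration of what "at the application level" could look like, here is a small Go sketch that sanitises and validates an input before anything is sent to the database. The `authorsInput` type, the `maxLimit` value and the date rule are all assumptions made up for this example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// authorsInput is a hypothetical, deliberately small API input.
type authorsInput struct {
	Since string
	Limit int
}

// maxLimit is an assumed upper bound on the page size.
const maxLimit = 50

// sanitise trims whitespace and clamps the page size.
func sanitise(in authorsInput) authorsInput {
	in.Since = strings.TrimSpace(in.Since)
	if in.Limit <= 0 || in.Limit > maxLimit {
		in.Limit = maxLimit
	}
	return in
}

// validate rejects input that should never reach the database.
func validate(in authorsInput) (time.Time, error) {
	since, err := time.Parse("2006-01-02", in.Since)
	if err != nil {
		return time.Time{}, errors.New("since must be a YYYY-MM-DD date")
	}
	if since.After(time.Now()) {
		return time.Time{}, errors.New("since must not be in the future")
	}
	return since, nil
}

func main() {
	in := sanitise(authorsInput{Since: " 2021-10-20 ", Limit: 1000})
	since, err := validate(in)
	fmt.Println(in.Limit, since.Format("2006-01-02"), err)
}
```

None of this needs to live in the schema: it is ordinary code that runs before the query is forwarded.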
Ok, we have Lambda resolvers for this. But HEY! I want to be able to use Go, Rust, Java, PHP, Haskell, etc… Why do I need to deploy another system which I don’t have control over? It’s also JavaScript!! (BTW I love JS too, I’m just being THAT GUY now.)
My point is that since I would need Lambda resolvers for all of my resolvers anyway, because I’m doing the validation, sanitisation, etc… at the application level, I could at least have control of the runtime at this point (if I felt like it).
Summary
To summarise, as a developer who wants to build a future startup or personal project, I want:
- Use a technology that allows me to design my contracts A-Z
- Use a technology that doesn’t get in my way once my idea grows
- I want to be able to evolve the system, not rewrite it
- I want to iterate fast
Alright, I think those are the main problems I’m seeing with the current GraphQL implementation, but please don’t think that DGraph is the only one guilty of the points above.
Hasura, PostGraphile, and a million other automatic GraphQL server generators suffer from the same problem.
But since I really fell in love with DGraph, I’m just passing by to see if you guys are happy to be a little different and try to do this right.
Alright fenos, how would you do it?
I don’t have the right answer ready for you, but I have some ideas. They may be right or wrong, but at least they can break the ice.
Let me start by saying that I understand the technical problem behind building an automatic GraphQL server or query planner. The less flexible the schema is, the easier it becomes.
However, if we stick with trying to accomplish our goals mentioned above, we can take 2 routes:
- Create a library that parses the GraphQL query at runtime and generates DQL query plans to send directly to DGraph.
- Make the DGraph GraphQL implementation super flexible, which would allow me to develop my database and GraphQL API independently, as well as hook it into my own server so that I can do validation / sanitisation / authz etc… in my own runtime and just proxy the query / mutation operation to DGQL.
A library
Pros: Having a library which does the hard work gives us the most flexibility overall
- Can be used in our own runtime
- Can be easily extended / configured
- Schema less cluttered
- Fully decoupled from the database
Cons:
- A million languages to implement it in.
- Query planner can get tricky
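To give a feel for the library route, here is a very small Go sketch of the "generate DQL from a GraphQL selection" step. It skips GraphQL parsing entirely: `selection` stands for an already-parsed query, and the `predicates` table with its `Author.name` predicate is an assumed mapping, not something DGraph provides. A real query planner would also have to handle arguments, nesting, filters and pagination.

```go
package main

import (
	"fmt"
	"strings"
)

// selection is a hypothetical, already-parsed GraphQL query: one root
// field plus the scalar fields selected under it.
type selection struct {
	Root   string   // GraphQL root field, reused as the DQL block name
	Type   string   // DGraph type assumed to back the root field
	Fields []string // GraphQL field names requested by the client
}

// predicates maps GraphQL field names to DGraph predicates (assumed names).
var predicates = map[string]string{
	"id":   "uid",
	"name": "Author.name",
}

// toDQL renders a DQL query block, aliasing each predicate back to the
// GraphQL field name so the response already has the API shape.
func toDQL(s selection) (string, error) {
	var b strings.Builder
	fmt.Fprintf(&b, "{\n  %s(func: type(%s)) {\n", s.Root, s.Type)
	for _, f := range s.Fields {
		p, ok := predicates[f]
		if !ok {
			return "", fmt.Errorf("no mapping for field %q", f)
		}
		fmt.Fprintf(&b, "    %s: %s\n", f, p)
	}
	b.WriteString("  }\n}")
	return b.String(), nil
}

func main() {
	q, err := toDQL(selection{Root: "authors", Type: "Author", Fields: []string{"id", "name"}})
	fmt.Println(q, err)
}
```

Even this toy version shows why the query planner is listed under the cons: every GraphQL feature needs its own translation rule.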
A smarter DGraph GraphQL
This could be the “kill 2 birds with one stone” approach.
What we need to change:
1. I can define my own inputs and return types
2. Decouple the GraphQL schema from the database schema
Point 2 is fairly simple. Just store the 2 schemas independently.
Point 1 instead is the tricky bit. What we can do here is:
- Tell DGraph the known mappings between the schemas, so that it knows what to select.
- Once we receive a query that looks like this:
query Authors {
  bestSellingAuthors(since: "2021-10-20") {
    id
    name
  }
  authors(filter: BEST_SELLING) {
    id
    name
  }
}
We will send this pseudo mapping along with the original GraphQL query:
map[string]string{
  "bestSellingAuthors": "@filter (lt(since, $since))",
  "authors": "if $filter == BEST_SELLING $mapping.bestSellingAuthors($since)",
}
Then DGraph should be able to resolve the queries.
In pseudo-code, I imagine the whole implementation (using my own runtime) looking something like this:
app := dgraph.NewGraphQLServer()
app.GraphQLSchema("../*.graphql")
app.DGraphSchema("../*.dgraphql")
app.Resolver("bestSellingAuthors", dgraph.Resolver{
	// OnQuery customises the query that DGraph will build
	OnQuery: func(builder, args) {
		return builder.Filter("bestSellingAuthors").Eq(args["since"])
	},
	// Resolve wraps the generated resolver with custom logic
	Resolve: func(_, args, ctx, info, next) {
		// custom logic: validate, authorize, authenticate, etc...
		resultFromDGraph := next(args, ctx, info)
		// return or transform
		return resultFromDGraph
	},
})
app.Run(8000)
Now we should be able to have everything a developer wants (if the above code worked, eheh).
- I can evolve my own database / schema
- I can customize my own queries
- I can implement my business logic in my own runtime
- I let DGraphQL do the hard thinking of using my modifications (if any) to produce an optimised query
- We could also optionally opt in to the conventional inputs so that DGraph GraphQL works as it does now.
- If a resolver is not implemented (like the one above), it will try its best to return data (using the mapping between the 2 schemas or the conventional mapping), ignoring any inputs that are not conventional ones.
- I can iterate super fast and be ready for the future.
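The Resolve / next idea from the pseudo-code can be tried out today in plain Go, without any DGraph involvement. The sketch below is only an illustration of the wrapping pattern: `args`, `result`, `next`, `resolver` and `fakeDGraph` are hypothetical names, and `fakeDGraph` stands in for whatever would actually run the generated query.

```go
package main

import (
	"errors"
	"fmt"
)

// args and result are simplified stand-ins for GraphQL arguments and data.
type args map[string]any
type result []map[string]any

// next is whatever actually fetches the data, e.g. a generated DQL query.
type next func(a args) (result, error)

// resolver wraps next with application-level logic.
type resolver func(a args, n next) (result, error)

// bestSellingAuthors validates the input, delegates to next, then
// transforms the output before it reaches the client.
func bestSellingAuthors(a args, n next) (result, error) {
	if s, ok := a["since"].(string); !ok || s == "" {
		return nil, errors.New("since is required")
	}
	res, err := n(a)
	if err != nil {
		return nil, err
	}
	// Transform: drop an internal field that the API should not expose.
	for _, row := range res {
		delete(row, "internalScore")
	}
	return res, nil
}

// Compile-time check that the function matches the resolver shape.
var _ resolver = bestSellingAuthors

// fakeDGraph stands in for the database call so the sketch runs offline.
func fakeDGraph(a args) (result, error) {
	return result{{"id": "0x1", "name": "Ada", "internalScore": 9}}, nil
}

func main() {
	res, err := bestSellingAuthors(args{"since": "2021-10-20"}, fakeDGraph)
	fmt.Println(res, err)
}
```

The point of the pattern is that the business logic lives in my own runtime, while the data fetching behind `next` can stay fully generated.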
I have a few other ideas, but I believe that this is already a lot to digest.
Please, DGraph team and community, let me know what you think about my points. I’m very open to anything you want to add / improve / criticise.
Note: It will take me a few edits to get the formatting right eheh
Regards