Hello there. I’m not so new to the overall concept of GraphQL, but I’ve never hosted my own. I came across Dgraph and Slash, and it looks amazing to me. I would like to experiment with it a bit. There are just a few questions that I cannot really find clear answers to, that I hope someone in here might answer.
Connecting to Dgraph / Slash directly from the front-end
Say I would be building a React app and I want to query some data. If I send GraphQL queries to the back-end (e.g. Slash) directly, that would expose my full schema, no? Is that considered bad practice anywhere? Is Dgraph / Slash intended to be queried directly by a front-end?
Context: With a regular GraphQL setup a front-end could use a public schema, as far as I know, whereas trusted back-ends could use unexposed mutations (or REST calls) to do what they need to do, and the outside world will never know of their existence. Since Dgraph is fully accessed by GraphQL query language, would that mean that a front-end would possibly include only part of the schema (only the “public” part), whereas a back-end would perhaps know about the full schema? Or is it possible to query the whole schema, anyway, from some public endpoint (which would make that useless, because anyone can find out about the full schema)?
Rate limiting, max query depth, etc.
I know there are a few ways to protect the back-end from users with malicious intent. Are things like rate limiting, max query depth, etc. possible with Dgraph?
Using an authentication/authorization & other middleware proxy
To have more control over all the above, I could imagine it could be nice to write a small proxy that just forwards the requests to the actual Dgraph / Slash / GraphQL back-end. It could include authentication/authorization middleware, rate limiting, and other security checks. Would anyone deem this (un)necessary?
I believe that by default all GraphQL implementations out there expose their schema.
Slash, or the GraphQL endpoint, won’t expose your DQL schema tho (if you use it).
I’m not sure if you can hide it, let’s ping @gja and @graphql
Yes, the whole purpose of GraphQL is to be friendly with the front-end.
That’s an interesting concept. I think we should have something like that, “hidden” parts of the schema accessible by specific users B2B or admins. Maybe the @auth feature does that, but I don’t think that it hides the queries/mutations/subscriptions from the public schema.
Continuing, you could rely on the business logic in the auth directive feature. I think it is okay for that to be public; hiding should only need to be possible if you have really sensitive models.
I’m not sure, I hope Tejas can help here.
I personally totally agree. As GraphQL requests are HTTP requests, a proxy would be really nice. I would recommend Traefik for the job (not for gRPC if you wanna use it, the gRPC support in Traefik is horrible - but in general it is an amazing reverse proxy).
Putting a proxy in front would give extra control over the GraphQL endpoint and hide the others. Not sure about the other engineers, but I totally agree with using it.
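To make the proxy idea a bit more concrete, here is a minimal sketch of a max query depth check that such a proxy could run before forwarding a request. It is not a Dgraph or Slash feature; the function names and the default limit of 5 are made up for illustration. It only counts curly braces, so input objects inside arguments (e.g. filter: { ... }) count as nesting too; a real proxy would more likely use a proper GraphQL parser.

```python
def query_depth(query: str) -> int:
    """Return the deepest curly-brace nesting level found in a GraphQL query.

    Naive on purpose: braces inside string literals are skipped, but braces
    in argument objects are counted, so the result can overestimate.
    """
    depth = 0
    deepest = 0
    in_string = False
    escaped = False
    for ch in query:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "}":
            depth -= 1
    return deepest


def is_allowed(query: str, max_depth: int = 5) -> bool:
    """Hypothetical gate: forward the query only if it is shallow enough."""
    return query_depth(query) <= max_depth


if __name__ == "__main__":
    print(is_allowed("{ queryUser { todos { title } } }"))
```

The proxy would call is_allowed on the query string of each incoming request and answer with an error instead of forwarding when it returns False.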
Great Questions. To add to what @MichelDiz responded
GraphQL exposing its schema is part of the spec itself, under introspection queries. It shouldn’t be a problem to expose the methods, but it is very important that you lock down things that should not be called with the @auth directive.
We do have rate limiting on our roadmap, but I can’t give you an exact date for support in Dgraph or Slash GraphQL.
What you are suggesting is definitely possible. Instead of running your own proxy, I’d suggest using a CDN like CloudFlare in front of Slash GraphQL, which could handle things like rate limiting for you on their free plan.
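For anyone who does end up running their own proxy rather than a CDN, rate limiting there could look roughly like the token bucket below. This is a generic sketch, not something Dgraph, Slash, or CloudFlare provides in this form; the class name and parameters are illustrative, and the injectable clock is only there to make the behaviour easy to check.

```python
import time


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.now = now
        self.last = now()

    def allow(self) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        current = self.now()
        elapsed = current - self.last
        self.last = current
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=1.0, capacity=2)
    print([bucket.allow() for _ in range(3)])
```

In a proxy you would typically keep one bucket per client (for example in a dict keyed by IP address or API key) and reject the request when allow() returns False.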
@gja, I know that while building my own GraphQL layer before Dgraph, I could disable introspection at the server level. This was a setting that could be flipped for a production system for example.
In development, Apollo Server enables GraphQL Playground on the same URL as the GraphQL server itself (e.g. http://localhost:4000/graphql ) and automatically serves the GUI to web browsers. When NODE_ENV is set to production , GraphQL Playground (as well as introspection) is disabled as a production best-practice.
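If the GraphQL server itself has no such switch, a proxy in front of it could approximate one. The sketch below is an assumption about how that might be done, not a Dgraph or Apollo API: it flags any query that mentions the introspection fields __schema or __type, while leaving __typename alone because clients commonly rely on it. It is a plain text match, so it will also flag those names when they appear inside string arguments.

```python
import re

# Matches the introspection entry points __schema and __type as whole words.
# __typename does not match because "type" is followed by more word characters.
_INTROSPECTION_FIELD = re.compile(r"\b__(schema|type)\b")


def is_introspection(query: str) -> bool:
    """Return True if the query text appears to use schema introspection."""
    return bool(_INTROSPECTION_FIELD.search(query))


if __name__ == "__main__":
    print(is_introspection("{ __schema { types { name } } }"))
    print(is_introspection("{ queryTodo { __typename title } }"))
```

Keep in mind that this only hides the schema; it does not protect the data, which is what the @auth rules are for.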
Thanks a lot for your answers, that’s definitely answering pretty much everything! I’m going to mention @gja in here because he also replied to my question, with similar answers, also very useful (thanks again), and I have one last question.
I’m way more confident about exposing the schema now. But I’m new to Dgraph. In all the environments I have worked in before, we added GraphQL as a front-end entrypoint to our back-end services. The back-end services all had their own databases and exposed data via REST or gRPC. GraphQL talked to the back-end services and, in that way, really specifically gave access to only public data - stuff that you want to expose (the rest just doesn’t go into the schema).
With Dgraph / Slash it seems like there is only one way of accessing your data: through GraphQL queries that end up at Dgraph automatically. Let’s say I have some data that is not relevant to the end-user, like a user activity log; some stuff that only admins, or perhaps even systems, need to see. Would you store that in Dgraph? Or would you for example spin up another database next to it and, in it, store data that relates to entities in Dgraph (e.g. by pointing at their uids)?
For example, let’s say I’m building a small application - a to do list. It has users, organizations, and to dos. In a “classic” example you could have a few back-end REST APIs here that pull data out of a MySQL database, and a public GraphQL gateway to serve the data. The data is in the end stored in MySQL and if I want to do some data processing, I can also access that MySQL db directly from a data processing script/application. Would (could) I in the case of Dgraph / Slash be pulling the data directly out of Dgraph (I mean, in a perfect world perhaps I would use data streams, but just for example’s sake)? Would love to hear your thoughts on this.
Long story short, I think it boils down to: would you use Dgraph as a drop-in replacement for MySQL (and also have for example data processing jobs query Dgraph)? Or would you have Dgraph replace just parts (the publicly exposed, highly relational data) of the MySQL database, and run for example another database next to it?
Not exactly, you can use DQL too. But yeah, GraphQL is the main lang in Slash and it works as if it were a native DB lang for Dgraph.
You can do that: you can store it outside the GraphQL context in the same DB. Dgraph’s GraphQL will only have access to the graph model you have created with the GraphQL schema. So everything else that isn’t in that schema won’t be accessible by the user at all.
You can do it, maybe using Kafka and forcing UID allocation in the other Dgraph clusters.
But the problem here is that those GraphDBs would be isolated. So DQL and GraphQL queries should be performed separately. You won’t have “a query to rule them all”.
Yes, via DQL. With DQL you can have WAY better control over your Graph. There are bulk mutations (upserts), var blocks (with which you can do complex filtering), and so on. That’s equivalent to what you are asking.
@minitauros, I think you are exactly where I was when I started. What is DQL? How is that different from GraphQL? Why multiple endpoints? How to do authentication? How to hide sensitive data with authorization? And so many more…
I cannot stress enough the importance of this post:
That will help break some of the icy waters as you find these new terms such as DQL thrown about with not much clarification. Sorry, it is hard for us sometimes to remember what we didn’t know at one time ourselves.
Yes! I would, I did, and I will again! I replaced a growing namespace of 500+ databases with a single Dgraph Slash GraphQL instance.
In the perfect Dgraph setup, you would probably have GraphQL @auth rules publicly allowing the public data or a user’s own data, such as their private to dos. This way you can actually build the administration of your app into the same front end as your app. This takes away much of the need for management at the back end. You could run GraphQL queries/mutations directly through Slash UI or any other GraphQL tool for ad hoc administration access. I myself have written a few admin scripts that we use to clean up data, and most of them work in pure GraphQL. The greatest thing about Dgraph is that GraphQL is not an API layer on top; it is a core function of the database.
So what do you need DQL for? DQL is similar to the GraphQL syntax, but it has been extended for use cases such as upserts, RDF triple mutations, and variable-passing blocks (stuff that GraphQL cannot do yet by spec). You may down the road run into something that cannot be done yet in GraphQL, and then I would recommend being ready to learn DQL. Until then, I would just stick to GraphQL. One example is that I needed to change a username stored in a field marked with @id. That was not possible in GraphQL, but a pretty simple feat in DQL.
DQL is really just a way to bypass the auth rules of GraphQL and use the extended functionality. It is powerful and IMO should not be public facing or tied to a front end like React. I would be very careful to keep the DQL endpoint a secret. With the poor man’s auth on the DQL endpoint, you don’t have to worry about that as much anymore, and you still have access to it when you need it. Pretty sweet stuff if you ask me.
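As a rough illustration of the username change mentioned above, here is a sketch of a DQL upsert sent over Dgraph's raw HTTP interface. It assumes a GraphQL type named User with a username field, which Dgraph would store as the predicate User.username; the type, the field, the helper name, and the base URL are all hypothetical. The request is only built, not sent, and whatever authentication your DQL endpoint requires is left out.

```python
import json
import urllib.request


def build_rename_request(base_url: str, old_name: str, new_name: str) -> urllib.request.Request:
    """Build (but do not send) an upsert that renames one user.

    Assumes the predicate is User.username (hypothetical GraphQL type User).
    json.dumps is used to produce quoted, escaped string literals.
    """
    old_lit = json.dumps(old_name)
    new_lit = json.dumps(new_name)
    body = (
        "upsert {\n"
        "  query {\n"
        f"    u as var(func: eq(User.username, {old_lit}))\n"
        "  }\n"
        "  mutation {\n"
        "    set {\n"
        f"      uid(u) <User.username> {new_lit} .\n"
        "    }\n"
        "  }\n"
        "}\n"
    )
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/mutate?commitNow=true",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/rdf"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_rename_request("http://localhost:8080", "alice", "alice2")
    print(req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the returned request to urllib.request.urlopen would actually send it. Since DQL bypasses the GraphQL auth rules, a script like this belongs on a trusted machine, never in the front end.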
I think the main understanding that will answer this question in full is that even if someone knows about the schema, if they do not meet the auth rules for those parts of the schema they are disallowed from doing anything with that. They can only create, read, update, and delete what you say they can in the rules. Before you go to production with any Slash GraphQL instance, make sure you have locked down your auth rules correctly and have tested them.
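For readers who have not seen an auth rule yet, the sketch below shows roughly what one looks like, wrapped in a Python string so it can sit next to the other examples. The Todo type and its fields are invented for illustration, and it assumes a USER value is supplied from the JWT claims; the JWT configuration itself is left out. Only a query rule is shown; add, update, and delete would each need their own rule.

```python
# Hypothetical Dgraph GraphQL schema: a Todo can only be queried by its owner.
# $USER is assumed to come from the caller's JWT claims.
TODO_SCHEMA = '''
type Todo @auth(
  query: { rule: """
    query ($USER: String!) {
      queryTodo(filter: { owner: { eq: $USER } }) {
        id
      }
    }"""
  }
) {
  id: ID!
  title: String!
  owner: String! @search(by: [hash])
}
'''

if __name__ == "__main__":
    print(TODO_SCHEMA)
```

With a rule like this in place, someone who discovers the Todo type through introspection still gets back only the to dos whose owner matches their own token.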