I submitted a ticket regarding this issue a while ago. Unfortunately, it hasn’t been resolved yet, and so far it seems there is no solution for it either.
The issue
We have a couple of interfaces in our schema with query/mutation rules applied. These work fine for a while, but every now and then the public accessibility of, e.g., an interface query seems to disappear for no reason (apparently it has something to do with the Ingress layers of the Dgraph infrastructure). So, we have something roughly like the sketch below:
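(Simplified sketch only, not our actual schema; the field names follow the example query further down in this thread, and the RBAC-style @auth rule is just a stand-in for our real query/mutation rules.)

interface TestInterface @auth(
  # Placeholder rule; the real rules are more involved.
  query: { rule: "{$ROLE: { eq: \"ADMIN\" } }" }
) {
  id: ID!
  foo: String
  bar: String
}

type TestType implements TestInterface {
  baz: String
}

# (The Dgraph.Authorization line that JWT-based @auth rules need is omitted from this sketch.)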
@Poolshark Thanks for the report. I couldn’t find the issue you originally filed on this; can you share it?
Also, I’m trying to understand the behavior. Are you saying that if the auth rules were applied to a concrete type (not an interface) then this works as expected? If so, I doubt ingress configuration is to blame (but I’ve seen stranger things).
Please add a bit more detail and I’ll be happy to work through the issue with you.
I believe it has not so much to do with the @auth rules themselves as with public access. We cannot set interfaces to public or non-public, since that always depends on the types that implement them. So if I understood correctly, an interface will always be publicly accessible (unless @generate is set accordingly), and therefore its fields must be protected via the @auth directive (unless you don’t need protection at all).
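Just to illustrate what I mean by setting @generate accordingly: switching off the generated operations for an interface would look roughly like this (a sketch, based on my reading of the @generate docs):

interface TestInterface @generate(
  # Turn off the auto-generated get/query/aggregate operations and subscriptions.
  query: { get: false, query: false, aggregate: false },
  subscription: false
) {
  id: ID!
  foo: String
  bar: String
}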
What happens in our case is that the interface query itself is not publicly accessible anymore. To speak in terms of my previous example, I cannot run
query InterFaceQuery {
  queryTestInterface {
    id
    foo
    bar
  }
}
from my application anymore, unless I attach the Client Key to the header. Using the Client Key allows me to run queries/mutations even when a specific type is not set to public.
The weird thing is that this behaviour seems to appear suddenly, without us making any changes! I previously suspected that changing the public rules messes with the interfaces, but that is not the case. I’ve also suspected dead nodes connected to the interface, but no luck there either.
The bad thing is that once it happens, it stays like this until I completely delete everything in the database and re-deploy the schema.
Hey @Poolshark, thanks for getting back to me. Would it be possible to get a minimal schema example from you that represents the issue? I’m working toward reproducing it and I don’t want to blindly guess at your schema.
Also, can you give me the ID of the issue you raised earlier? I’ll look to see if any internal tickets refer to it.
As I mentioned earlier, it will be hard to create a minimal example from our schema, but I’ll try to figure something out. Also, I’ll check whether there is a chance to let you test on one of our Shared Clusters - maybe you have access to logs on AWS or something like that.
I started a test on a local cluster: it executes a GraphQL query against a generated interface API every 10 seconds. The schema contains just a plain interface, with no @auth or @generate directives anywhere.
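For reference, the test setup is roughly the following (a sketch; the names are illustrative and not taken from your schema):

# Bare interface plus one implementing type, no @auth or @generate anywhere.
interface TestInterface {
  id: ID!
  foo: String
}

type TestThing implements TestInterface {
  bar: String
}

# The generated queryTestInterface query is polled every 10 seconds.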
Do you have any guidance on approximately when the interface query would become unavailable?
Thank you so much for the effort! I’m actually feeling a bit bad that I can’t contribute more than saying that “it is not working” …
Regarding your question: unfortunately, I have no clue whether it is a matter of time before the interface query starts to be blocked, or a matter of how and where the interface is implemented.
For us, the problem started to appear mainly when we altered the schema. That’s why I initially suspected that I had created some dead nodes which then caused the issue. The only remaining hunch I have is that maybe we created some kind of predicate mismatch - this, at least for me, would explain why completely purging everything on the server solves the issue.
I’m super busy at the moment but I will ask my team if there is an opportunity to test directly on our Shared Cluster where we currently experience the issue.
@matthewmcneely any news from your end regarding the interface issue?
I had some follow-up from @rarvikar on the ticket, but he only said that “it’s been looked into”.
Since the Shared Cluster stopped working completely, I cannot tell whether the problem still persists. Good news for the Dedicated Cluster, though - so far we have not experienced the issue there!
I ran the test (querying a generated, interface-based GraphQL query endpoint) for three days on a cluster and could not reproduce what you reported. Head-scratcher.