I’ve seen comments saying that Subscriptions are implemented using a 1s (adjustable) polling query. I’m curious whether this is a stop-gap or a permanent design? It seems slow and/or computationally wasteful, depending on the polling settings. My intuition is that a filter layer on updates would drastically reduce CPU cycles and allow for instantaneous broadcasts.
While 1 second works for many app use cases, it’s not hard to conceive of cases where it’s an eternity, or where the polling cycles become a burden at scale. I would guess this design doesn’t scale effectively for online gaming, trading, etc.
Hey @CosmicPangolin1, you are right that Subscriptions are implemented using polling. We also do some smart batching there: if different clients have a subscription for the same query, we start only one polling goroutine, which broadcasts updates to all of those clients.
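A minimal sketch of that batching idea (all names here are illustrative, not Dgraph’s actual internals): subscriptions are deduplicated by query, one goroutine polls per unique query, and results are broadcast only when they actually change.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
	"time"
)

// runQuery stands in for executing the GraphQL query against the DB.
var runQuery = func(query string) string { return "result" }

// Poller fans one polling loop out to every client subscribed to the same query.
type Poller struct {
	mu     sync.Mutex
	groups map[string]*group // keyed by query text
}

type group struct {
	subscribers []chan string
	lastHash    [32]byte
}

func NewPoller() *Poller {
	return &Poller{groups: make(map[string]*group)}
}

// Subscribe registers a client; the first subscriber for a query starts
// the single polling goroutine that all later subscribers share.
func (p *Poller) Subscribe(query string, interval time.Duration) <-chan string {
	ch := make(chan string, 1)
	p.mu.Lock()
	defer p.mu.Unlock()
	g, ok := p.groups[query]
	if !ok {
		g = &group{}
		p.groups[query] = g
		go p.poll(query, g, interval)
	}
	g.subscribers = append(g.subscribers, ch)
	return ch
}

func (p *Poller) poll(query string, g *group, interval time.Duration) {
	for range time.Tick(interval) {
		result := runQuery(query)
		h := sha256.Sum256([]byte(result))
		p.mu.Lock()
		if h != g.lastHash { // broadcast only when the result changed
			g.lastHash = h
			for _, ch := range g.subscribers {
				select {
				case ch <- result:
				default: // drop if a slow client hasn't drained the last update
				}
			}
		}
		p.mu.Unlock()
	}
}

func main() {
	p := NewPoller()
	a := p.Subscribe(`{ users { name } }`, time.Second)
	b := p.Subscribe(`{ users { name } }`, time.Second) // reuses the same goroutine
	fmt.Println(<-a, <-b)
}
```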
In general, Dgraph has very good query throughput, but if this doesn’t work for you, we are open to making improvements. We did consider broadcasting updates from mutations to ongoing subscriptions when we designed subscriptions, but one of the problems there is that it’s hard to work out which subscriptions to re-run on a given update. Say you added, updated, or deleted a user: any subscription that has a filter on the User type would need to be re-run. This means each mutation potentially triggers a lot of queries, which doesn’t scale well either (a rough sketch of that fan-out is below).
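To make the scaling concern concrete, here’s a hedged sketch (hypothetical types, not Dgraph code) of what push-based invalidation would entail: every mutation has to be matched against every live subscription’s filters, and each match forces a full query re-run.

```go
package main

import "fmt"

// Subscription is a live subscription and the GraphQL types its filters touch.
type Subscription struct {
	ID    int
	Types map[string]bool // e.g. {"User": true}
}

// Mutation records which type a write touched.
type Mutation struct{ Type string }

// reRun stands in for executing the subscription's full query again.
func reRun(s Subscription) { fmt.Println("re-running subscription", s.ID) }

// onMutation shows the fan-out: one write to a popular type forces a
// re-run of every subscription filtering on that type, so cost grows as
// O(mutations x matching subscriptions).
func onMutation(m Mutation, subs []Subscription) {
	for _, s := range subs {
		if s.Types[m.Type] {
			reRun(s)
		}
	}
}

func main() {
	subs := []Subscription{
		{ID: 1, Types: map[string]bool{"User": true}},
		{ID: 2, Types: map[string]bool{"User": true, "Post": true}},
		{ID: 3, Types: map[string]bool{"Post": true}},
	}
	// A single user update triggers two full query re-runs here; with
	// thousands of subscriptions on a hot type, this dwarfs polling.
	onMutation(Mutation{Type: "User"}, subs)
}
```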
The polling interval can be configured to suit your use case using a flag. We plan to make it configurable per subscription query as well.
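For illustration, wiring up such an interval flag might look like this (the flag name here is hypothetical; check your Dgraph version’s docs for the actual one):

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

func main() {
	// Hypothetical flag; Dgraph's real flag name may differ by version.
	pollInterval := flag.Duration("subscription-poll-interval", time.Second,
		"how often subscription queries are re-run")
	flag.Parse()
	fmt.Println("polling subscriptions every", *pollInterval)
}
```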
Dgraph’s way of doing subscriptions, i.e. polling, makes sense in such a scenario.
If a user likes a post but dislikes it immediately, then notifying the author of the post right away that someone liked and then disliked their post would be heartbreaking.
Thanks for the explanation. I can see how queries with nested filters are much more complicated to handle than subscriptions to tables or collections of data, where it’s easy to send the marginal update along to the subscriber (as Firestore does, for example). This doesn’t negatively affect me ATM, but I suspect that, combined with returning the full query result on every update, it eventually will, since I’m designing around a lot of real-time communications data. I’ll keep thinking about the problem.
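For flat collections, the marginal-update bookkeeping I mean is roughly this (a toy sketch, nothing to do with Dgraph’s or Firestore’s actual internals); nested filters break it because a change deep in the graph can alter membership without any top-level row changing identity:

```go
package main

import "fmt"

// diff computes the marginal update between two poll results keyed by ID,
// so subscribers receive only what was added or removed rather than the
// full query result.
func diff(prev, curr map[string]string) (added, removed []string) {
	for id := range curr {
		if _, ok := prev[id]; !ok {
			added = append(added, id)
		}
	}
	for id := range prev {
		if _, ok := curr[id]; !ok {
			removed = append(removed, id)
		}
	}
	return added, removed
}

func main() {
	prev := map[string]string{"u1": "Ann", "u2": "Bob"}
	curr := map[string]string{"u2": "Bob", "u3": "Cat"}
	added, removed := diff(prev, curr)
	fmt.Println("added:", added, "removed:", removed)
}
```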
If and when we hit the limits of the current approach, we can make it more sophisticated by using streams within the DB to know when a response is likely to change. For now, this approach seems simple and workable.
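If we ever go that route, one rough sketch of the stream idea (illustrative names only, not a committed design): consume a change stream from the DB, mark subscriptions whose filters overlap the changed predicates as dirty, and re-run only those on the next tick.

```go
package main

import "fmt"

// Change is one event from a hypothetical in-DB change stream.
type Change struct{ Predicate string }

// sub pairs a subscription with the predicates its query depends on.
type sub struct {
	id    int
	deps  map[string]bool
	dirty bool
}

// markDirty does cheap set lookups per change instead of running queries.
func markDirty(c Change, subs []*sub) {
	for _, s := range subs {
		if s.deps[c.Predicate] {
			s.dirty = true
		}
	}
}

func main() {
	subs := []*sub{
		{id: 1, deps: map[string]bool{"User.name": true}},
		{id: 2, deps: map[string]bool{"Post.title": true}},
	}

	// Changes arriving from the DB between two poll ticks.
	for _, c := range []Change{{Predicate: "User.name"}} {
		markDirty(c, subs)
	}

	// On the next tick, only dirty subscriptions are re-run; clean ones cost nothing.
	for _, s := range subs {
		if s.dirty {
			fmt.Println("re-running subscription", s.id)
			s.dirty = false
		}
	}
}
```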