Subscriptions #613
I spent some time over the past few days rigging up a subscription system using graphql-ruby, so here are some of my preliminary thoughts.

Problems

The first thing I ran into when setting up subscriptions was essentially: "Here's a GraphQL document with a subscription query; what is the user subscribing to?" In other words, this is the process of mapping the fields in the subscription query back to events in the system. Given this GraphQL document, for example:

subscription {
  somethingHappened(id: "...") { stuff }
  somethingElseHappened(id: "...") { moreStuff }
}

I need to somehow know that the user just subscribed to somethingHappened and somethingElseHappened. I looked at this problem from a couple of angles: statically analyzing the document, or doing a "dry run" of the query and recording which subscription fields get resolved.
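For example, the static angle can get surprisingly far with graphql-ruby's parser alone: read the root-level fields of the subscription operation (and their literal arguments) straight off the AST, without executing anything. A rough sketch, assuming the document above is in query_string and ignoring variables and fragments:

require "graphql"

document = GraphQL.parse(query_string)

subscribed_events = document.definitions
  .select { |defn| defn.is_a?(GraphQL::Language::Nodes::OperationDefinition) && defn.operation_type == "subscription" }
  .flat_map(&:selections)
  .map { |field|
    # Each root-level field of the subscription operation is one event the client wants to hear about.
    args = field.arguments.each_with_object({}) { |arg, memo| memo[arg.name] = arg.value }
    { event: field.name, arguments: args }
  }

# => [{ event: "somethingHappened",     arguments: { "id" => "..." } },
#     { event: "somethingElseHappened", arguments: { "id" => "..." } }]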
The dry-run approach kinda sucks in a couple of ways, though.
Here's another problem that I ran into pretty quickly: the arguments that the user supplied to the subscription might need to be transformed (quite a bit) before they give me information that I can use as a hook into my underlying event system. Just as an example, the user might be passing in global node IDs, but those global IDs are really only a concept inside of my GraphQL layer. My subscription layer is publishing messages using database IDs or other information, not node IDs.

A third problem I ran into: what if the user puts multiple subscriptions into the GraphQL document? If I'm only firing the event for one of them, what should happen to the other fields in the response?

And the final problem I ran into: if the subscription is in response to an event, how do I get event-specific data into the field resolver?

Solution

Pulling all of this together, it seems that the subscriptions feature should have characteristics that address each of the problems above.
So perhaps an API for creating a subscription could look like this:

# Much like mutations, subscriptions seem like they'll have a decent amount of boilerplate, so
# that can all get pulled into its own define API.
NewUserJoinedChatSubscription = GraphQL::Subscription.define do
# Name is used to derive the name of the auto-generated object type
name "NewUserJoinedChat"
# Specify the arguments for the subscription field. Unlike mutations, these are first-class
# arguments, rather than being wrapped inside an input object type.
argument :channelId, !types.ID
argument :friendsOnly, types.Boolean, default_value: true
# This is an optional, special resolver. It's executed on the initial execution of the subscription
# query, so that you can return any data that helps you figure out how to tie into your underlying
# event source.
# Default behavior is to just return the arguments that were passed in
metadata_resolve -> (root_object, arguments, context) {
{
channel_id: Node.item_id(arguments[:channelId]),
user_id: context[:access_token].user_id,
friends_only: arguments[:friendsOnly]
}
}
# Specify the fields that are on the auto-generated object type
return_field :newUser, UserType
return_field :channel, ChannelType
# Finally, specify a resolver
resolve -> (root_object, arguments, context) {
# normal resolve that we all know and love
}
end

Once you've defined a subscription using that API, you attach it to your root subscription type as a field, the same way you do with a mutation. The real magic happens when you execute a subscription query: instead of getting your normal response, you get back an array of objects that return metadata about the user's subscriptions:

subscription_query = <<~GRAPHQL
subscription {
somethingHappened(id: "...") { ...f1 }
}
fragment f1 on SomethingHappenedPayload {
stuff
otherStuff
}
GRAPHQL
result = Schema.execute(subscription_query, variables: {}, context: {})
# => Array
# Here's how I imagine you could interact with result:
result.each do |item|
# Get the name of the subscription
item.subscription.name
# => "SomethingHappened"
# Get the metadata from the `metadata_resolve`. By default, it just returns the arguments:
item.metadata
# => { "id" => "..." }
# Callable that you can use to execute just this particular part of the query
# To inject event-specific data, you can override the root_value, or you can add more stuff
# to the query context:
item.execute(root_value: some_event, extra_context: { "event" => some_event })
# => { "data" => { "somethingHappened" => { ... } }, "errors" => [] }
end

I think this would be enough to implement a subscription system for almost any transport. You can take the metadata you get from the initial run of the query and use it to figure out which events to listen to. Then, whenever an event happens, you can execute the corresponding query and send the results to the client through whatever transport mechanism you like (sketched at the end of this comment).

The nice thing about the callable approach is that it's probably the easier thing to build, and I'm guessing it's suitable for all websocket-like use cases. The downside is that you might want your subscriptions to be durable (i.e., maybe you shovel them into a database), or maybe an entirely different process is going to execute the query and transmit the results. So an alternative to the callable approach would be to produce an already-sliced GraphQL document and a set of variables, which could be easily serialized; sometime later, you could just execute that sliced query with those variables.

Those are my preliminary thoughts. I'd be interested to hear what others think. |
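To make that flow concrete, here's a rough sketch of how a transport layer might consume the proposed API above. The .field attachment just mirrors how mutations are attached today, and EventBus / websocket are hypothetical stand-ins for your event source and transport:

# Attach the subscription to the root subscription type, like a mutation:
SubscriptionType = GraphQL::ObjectType.define do
  name "Subscription"
  field :newUserJoinedChat, field: NewUserJoinedChatSubscription.field
end

# On the initial request, register each subscribed field with the event system:
result = Schema.execute(subscription_query, variables: {}, context: { access_token: token })
result.each do |item|
  EventBus.subscribe(item.subscription.name, item.metadata) do |event|
    # When the event fires, re-run just this slice of the query with the event injected,
    # then push the payload down whatever transport the client is connected to.
    payload = item.execute(root_value: event, extra_context: { "event" => event })
    websocket.send(payload.to_json)
  end
end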
Wow, thanks for the detailed post! Having spent some time on subscriptions myself, it's nice to hear a lot of the same challenges. Maybe that means we're on the right track 😆

First, here are a few ways I've dealt with some of the questions posed above:
How do those sound to you? Second, here are some questions that I haven't found a passable solution for:
Finally, I think it's very important to maintain this simple API:

render json: MySchema.execute(query_str, ...)

Otherwise, must a user inspect incoming queries and determine whether they're subscriptions or not? |
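For reference, that's just the usual one-line graphql-ruby controller action; ideally subscriptions wouldn't force anything fancier than this. A minimal sketch, with hypothetical context plumbing and variables assumed to arrive as a hash:

class GraphqlController < ApplicationController
  def execute
    render json: MySchema.execute(
      params[:query],
      variables: params[:variables] || {},
      context: { current_user: current_user }
    )
  end
end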
The only sad thing about it is that you get "can't return null" errors if your subscription fields are non-null.
So... I realized that I was thinking about it very differently. Because for us, normal queries are POST'd to
Maybe the real feature here (that I'm after) is the ability to specify, during the call to Schema.execute, which fields of the document to actually run.

But I could definitely see a different way of doing it if we built some tools for manipulating GraphQL documents (à la #455). Given a document that has an operation X, where that operation queries fields A, B, and C, you'd get three documents back (A', B', and C'), each containing only the variable declarations, fragments, and field selections needed to query one of those fields. Input:

query X($inputB: SomeInput!) {
someAlias: fieldA { selections }
fieldB(input: $inputB) { ...b }
}
fragment b on BType { moreSelections }

Slice it up:

GraphQL.slice_document(document, operation: "X")

Get two results:

query X_someAlias {
someAlias: fieldA { selections }
}

query X_fieldB($inputB: SomeInput!) {
fieldB(input: $inputB) { ...b }
}
fragment b on BType { moreSelections }

Then you could take a subscription query, slice it into the individual fields, and register each one as a subscription. And maybe this is enough information that you don't even have to execute it. |
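A naive version of that slicing could probably be built on graphql-ruby's AST today (the GraphQL.slice_document call above is still hypothetical). This sketch keeps every fragment and skips variable handling entirely, just to show the shape of the idea:

document = GraphQL.parse(query_string)
operation = document.definitions
  .find { |defn| defn.is_a?(GraphQL::Language::Nodes::OperationDefinition) && defn.name == "X" }
fragments = document.definitions.grep(GraphQL::Language::Nodes::FragmentDefinition)

sliced_documents = operation.selections.map do |field|
  label = field.alias || field.name
  # A real implementation would also copy the operation's variable declarations (pruned to the
  # ones this field uses) into the header, and drop fragments the field doesn't reference.
  header = "#{operation.operation_type} #{operation.name}_#{label}"
  ([header + " { " + field.to_query_string + " }"] + fragments.map(&:to_query_string)).join("\n")
end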
😖 maybe the same library improvement for skipping fields in the response can be applied in this case (instead of returning null)
👍 That makes sense, that's how I sketched it out over ActionCable a while back too. But now I'm looking at "what if you can't upgrade to Rails 5" (or what if ActionCable gives you the heebie jeebies), so you want to use a third-party pub/sub provider (we use Pusher). |
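For that Pusher-style setup, delivery is just publishing the re-executed result to a channel the client already subscribed to on the Pusher side. A minimal sketch using the pusher gem, where stored_subscription and the channel naming are hypothetical:

require "pusher"

def deliver(stored_subscription, event)
  payload = Schema.execute(
    stored_subscription.query_string,
    variables: stored_subscription.variables,
    root_value: event,
    context: { event: event }
  )
  # The client was handed this channel name when the GraphQL subscription was created.
  Pusher.trigger(stored_subscription.channel, "update", payload.to_h)
end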
A couple of thoughts as I work over at #672:
I'm still trying to figure out which parts of ^^ above I should try to support in the framework vs. leave to the application. |
I might be able to tease apart storage from transport ... hmmm |
I've posted the example I use for developing here: https://gist.github.com/rmosolgo/ba31acf93f07f8007d99ba365a662d8f I'll try to keep the gist up-to-date as I iterate 😬 |
Some other thoughts: if delivering a subscription means "unfreezing" query data & re-running the query, we can't currently support dynamic filtering. I've advised people to pass a filter in when executing the query. I think the solution to this is to allow adding filters after initializing the query; that way, filters could be added when the subscription is re-run for delivery.

brb 😬 |
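For context, the "pass a filter in at execution time" advice looks roughly like this today; PermissionFilter and its metadata key are hypothetical, but the only:/except: options are graphql-ruby's existing visibility filters. The open question above is how to attach such a filter again when the subscription is re-run outside the original request:

# Hides any schema member the viewer isn't allowed to see (used with `except:`).
class PermissionFilter
  def initialize(viewer)
    @viewer = viewer
  end

  # graphql-ruby calls this with each schema member; returning true hides the member.
  def call(schema_member, ctx)
    required_role = schema_member.metadata[:required_role]
    required_role && !@viewer.has_role?(required_role)
  end
end

MySchema.execute(
  query_string,
  except: PermissionFilter.new(current_user),
  context: { viewer: current_user }
)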
Liking everything in here! @rmosolgo do you have a rough estimate of when you think you'll release it (even if it's a beta)? We're building a new project right now and exploring the options we have in GraphQL. One of them is whether to do polling or subscriptions. Subscriptions are our long-term goal, but we thought we'd try something out relatively soon. If subscriptions are relatively close, we would hold off on trying anything right now :) |
Hi @mull, thanks for taking a look! My guess is end of August at the latest. There are a few more things to finish up first. Besides that, there are a few other folks implementing subscriptions on their own, so those might be released even sooner! |
@rmosolgo thanks for the answer! We love the library and will be purchasing the pro license in a few hours ❤️ |
There are some considerable issues to deal with when it comes to even a small bit of scaling. Think of 1000 connected users from 5 different platforms, each platform using a different subscription query. Will you render the document 5 times or 1000? What happens if 500 subscribe to server A and 500 subscribe to server B?

Absinthe (in Elixir) is just finishing up subscriptions, which will be out in their next release. It may help to learn how other implementations handled the issues you've come across (as well as bring to light new ones you haven't yet thought of). Sangria (in Scala) solved things in a similar way; you can see their explanation of the issues they ran into here:

Just some food for thought. Thanks for all your work on this gem! |
Thanks for the link, @mgwidmann! I love looking into other implementations; there are some really smart folks with good ideas there 😄 I'll share some of my thoughts on these specific topics:
My priorities are:
So, the default implementation will be to handle each subscription separately: load the document, run the subscription, transport the result, x1000. One thing that concerns me is that many Rails apps use a single master database that would get hammered by that kind of volume.

For someone with this kind of volume, another scaling approach might be to use a cache for data fetching. This way, the database client (GraphQL) has complete flexibility, but the master database isn't swamped with queries. What do you think of that approach? Are there other ways to be safe-by-default but still have good efficiency?
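A concrete sketch of that caching idea: resolvers read through a shared cache, so re-running thousands of similar subscription queries mostly hits the cache instead of the master database. Post/PostType and the 30-second TTL here are hypothetical:

QueryType = GraphQL::ObjectType.define do
  name "Query"
  field :post, PostType do
    argument :id, !types.ID
    resolve -> (obj, args, ctx) {
      # Read-through cache: only the first miss hits the database; the TTL bounds
      # how stale a subscription payload can get.
      Rails.cache.fetch(["post", args[:id]], expires_in: 30.seconds) do
        Post.find(args[:id])
      end
    }
  end
end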
Storage in memory is not sufficient! You need to manage state with a shared database. For this reason, the API will have two parts: a storage part and a transport part. |
@rmosolgo I hear your concerns, especially with regard to database load.

I've talked several times with one of the team members on the Elixir GraphQL implementation, and he said they had to implement functionality similar to how batching works inside a single document, but across multiple documents as a subscription is fulfilled. For example, if you have a post w/ an author, and 1000 subscriptions where half of them select the author and half don't, the author and post will only be fetched once, not 500 or 1000 times. They do this once per server, so if you have 10 servers, you'll only fetch the post 10 times and the author 10 times.
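One way to approximate that per-server deduplication in Ruby would be to group active subscriptions by their query and variables before delivering, so each distinct document runs once per event rather than once per subscriber. A rough sketch with hypothetical storage, assuming the payload doesn't depend on viewer-specific context:

def deliver_event(event, active_subscriptions)
  active_subscriptions
    .group_by { |sub| [sub.query_string, sub.variables] }
    .each do |(query_string, variables), subscribers|
      # Execute once per distinct (query, variables) pair...
      result = Schema.execute(query_string, variables: variables, root_value: event, context: { event: event })
      # ...then fan the same result out to every subscriber in the group.
      subscribers.each { |sub| sub.transport.deliver(result.to_h) }
    end
end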
There are problems with the shared-database approach too. Subscriptions are only relevant as long as a user is connected to some sort of live transport, such as a websocket. When they're disconnected, those subscriptions need to be cleaned up or they'll leak. Regardless of where you put the state (database or redis or wherever), you cannot rely on the machine with the active connection to be the one to clean it up. Servers frequently go down for numerous reasons, and if one goes down without running the "cleanup" code, those subscriptions are leaked while clients immediately reconnect to another server that's up, duplicating the subscription. Absinthe (and I suspect Sangria as well) keeps subscriptions in memory, since they're tied to the websocket being live; that way they're cleaned up when the websocket goes down, even if the cleanup code never gets a chance to run.

Also, as a general rule, I'd warn against any direct coupling of GraphQL to any 3rd-party service such as a database or redis, just because GraphQL is supposed to be storage agnostic. I'd suggest just relying on a pub/sub system w/ an adapter interface, then building out several projects to be able to use things like redis/memcache/etc. as the pub/sub. |
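That adapter interface could be as small as a duck type the library calls and applications implement however they like; here's a hypothetical Redis-backed example (not the library's actual API):

require "json"

# Anything that responds to #subscribe / #publish / #unsubscribe could back subscriptions.
class RedisPubSubAdapter
  def initialize(redis)
    @redis = redis
  end

  def subscribe(topic, subscription_id, payload)
    # Track the subscription so it can be cleaned up when the connection goes away.
    @redis.hset("graphql:subscriptions:#{topic}", subscription_id, payload.to_json)
  end

  def publish(topic, event)
    @redis.publish("graphql:events:#{topic}", event.to_json)
  end

  def unsubscribe(topic, subscription_id)
    @redis.hdel("graphql:subscriptions:#{topic}", subscription_id)
  end
end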
👌 I think the current implementation would support using
Same, the current implementation leaves storage & transport to the application, so you could use whatever implementation you want (eg ActionCable, Pusher) |
@rmosolgo Do you have another date estimate for the release, or is it still August 25? |
yep 😬 working through an example implementation on ActionCable now :) |
As I've had a few more subscriptions crop up in my app, I've noticed that a common use case is:
The operation usually has a unique ID, which is unknowable until the mutation is finished. However, it's possible that status updates happen between the time when the client receives the mutation response and when it creates the subscription. So, somehow, you want to get this as close to atomic as possible.

One way it could be achieved is to include a timestamp in the result of the mutation, showing when the last status update happened. Then, at subscription creation, you provide both the ID and the timestamp; if new status updates have happened since that timestamp, the subscription is triggered immediately.

Just thought that use case might help when thinking through the feature. |
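Server-side, that might look something like the sketch below: on subscribe, compare the client's timestamp with the operation's stored status updates and immediately replay anything that was missed (Operation, status_updates, and the subscriber/registration plumbing are all hypothetical):

def subscribe_to_operation_status(operation_id:, last_seen_at:, subscriber:)
  operation = Operation.find(operation_id)
  register_subscription(subscriber, topic: "operation_status:#{operation_id}")

  # Close the race between "mutation response received" and "subscription created":
  # replay any status updates the client missed in between.
  operation.status_updates.where("created_at > ?", last_seen_at).find_each do |update|
    subscriber.deliver(update)
  end
end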
Great point, a similar question of atomicity was also discussed during the spec formalization: graphql/graphql-spec#283 (in the end, it was left vague). Your idea of putting the timestamp in the ID is really cool! That way, when you subscribe, the server can replay updates between that timestamp and the current time. I'd love to try implementing it sometime 😄 |
@rmosolgo Is subscription support now available? Should I use the 1.7.x branch for this? |
Sorry, I ended up writing |
I merged #672, should be released this week with 1.7.0 |
I'd be a little wary about using timestamps between systems. Timestamps and distributed systems never end well, as different machines' clocks are almost certainly not showing the same value at any given time. Phoenix's pubsub/presence system uses CRDTs (which is some crazy math proof) instead of using timestamps like you described. This guy explains some of the problems and how CRDTs fix it. |
Right now we accept a subscription root type, but we don't provide any special support for registering, managing, or delivering subscriptions. With the RFC well on its way, what should be added to this library to support a third-party subscription backend?
I don't think this library should include a subscription backend, except maybe an in-memory one for prototyping. What could we support here that would actually be widely useful? ActionCable is cool, but I haven't heard of anyone actually running it in production yet (and not many folks are on Rails 5 to begin with). Using Redis Pub/Sub directly would be cool, but I think delivering those subscriptions would require a Websocket server outside the Rails server. Another option is Pusher, which I currently use to deliver live updates and which manages subscription status & delivery "for you".
So, certainly to start, I'd like to figure out what's missing in order to support subscriptions over those various backends. (Or, perhaps it's impossible to generalize about them!)