Errors thrown when iterating over subscription source event streams (AsyncIterables) should be caught #4001
I agree that the spec is agnostic, and that it would be useful for graphql-js to be consistent and provide explanatory errors. I think the spec should also be improved. For context, it seems from #918 that prior to that PR, all subscribe errors threw, and that the argument was made there that explanatory errors would be helpful in some cases. The parts of the PR that I skimmed through don't seem to indicate why explanatory errors to the client would not be helpful with iteration errors; my suspicion is that the PR was attacking the low-hanging fruit, and the authors/reviewers there would not necessarily object to even more explanatory errors. :) |
I think the next step would be to raise this topic at a working group meeting. @aseemk are you interested in championing this there? (I am potentially dangerously assuming that this hasn’t happened already…) |
graphql/graphql-spec#1099 has editorial changes to the event stream that I am not sure are 100% clear on this point. The way forward I think still goes through a discussion at a WG meeting. |
This was discussed at the November 2024 WG: My interpretation of the conclusions:
|
It's already possible (I think?) for users to wrap the async iterables that they return from |
Fwiw, the original poster @aseemk suggested this but was trying to avoid a per-subscription solution.
To your preference:
@benjie would you be able to elaborate more on your reasoning for this preference? More specifically, I think the proposal would be to return a final I think it's important to consider another failure mode. What is the What might be an example of a "request error" of this type? Well, it would not happen with failure of variable coercion, because the variables have already been coerced, that's one of the main differences between
Gaming out what might be driving your preference, maybe you are suggesting that even if that is the case now, it would be prudent to reserve the ability for services processing the response stream to distinguish between these two types of events, and so we should preserve the distinct failure modes. I think that's fair, but I would love to hear more about your exact motivation. I would say that the way this should be handled should not be on a per-subscription basis, but just by response stream processing services like |
My concern was that completing a stream successfully (but with the final payload having errors) and completing it with error (due to an underlying stream error) implicitly have different meanings, and was concerned that a final payload with just All that said, in the event of an error in a single subscription stream across a multiplexed protocol (such as |
Here's a go at making this change on top of Lee's editorial changes: |
One important note here is that internal errors will still result in the stream closing with an error - this should still not terminate the entire multiplex, only the individual stream within it. |
Just thinking of cases where internal errors span across the whole GraphQL instance - but couldn't think of any. That being said - I agree. graphql-ws should change to use the "error" message type for internal errors too. |
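To make the multiplexing point concrete, here is a minimal sketch of a transport loop that reports a failing stream to the client without touching its siblings. The `Message` shape is only loosely modelled on the `next` / `error` / `complete` message types mentioned in this thread. `pump` and `send` are hypothetical names, not graphql-ws APIs.

```typescript
// Hypothetical wire messages, keyed by the operation id.
type Message =
  | { id: string; type: 'next'; payload: unknown }
  | { id: string; type: 'error'; payload: { message: string }[] }
  | { id: string; type: 'complete' };

// Forwards one subscription's response stream to the shared connection.
// A throw during iteration ends only this operation: an "error" message is
// sent for its id and the connection itself is left open.
async function pump(
  id: string,
  stream: AsyncIterable<unknown>,
  send: (message: Message) => void,
): Promise<void> {
  try {
    for await (const payload of stream) {
      send({ id, type: 'next', payload });
    }
    send({ id, type: 'complete' });
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    send({ id, type: 'error', payload: [{ message }] });
  }
}
```

Running two `pump` calls over the same `send` function, one with a stream that throws mid-way, should leave the healthy operation free to reach `complete`.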
Tried a quick implementation at #4302 |
I like the way #4302 looks. Even if/when it lands I'll proceed with #4001 (comment). There will be a distinction, errors from #4302 will be a part of the |
@enisdenjo not sure I follow. Is there an |
@martinbonnin graphql-sse follows SSE semantics where all events are just data transmitted over the wire. The only reason why That being said, "Single connection mode", on the other hand, would need the P.S. I spoke too fast in my previous comment and have updated it in place to not confuse future readers. |
SSE interface NextMessage {
  event: 'next';
  data: {
    id: '<unique-operation-id>';
    payload: ExecutionResult;
  };
}
Because the payload is always PS: not saying this is a big deal but trying to understand if it's worth keeping this difference with graphql-ws. I think I'd happily drop graphql-ws |
This is also an ExecutionResult, but an erroneous one:
{
  "errors": [{ "message": "whatever" }]
}
Yes, you're on the right track. I was also thinking about it yesterday and maybe having |
Makes complete sense and it's also what is pushed by the current spec (see latest editorial PR): A response stream can:
But if a transport cannot support the spec because it doesn't have 3. then it's a problem. I'd personally vote for simplicity and only use |
Good point. If others deem it necessary, I'd be happy to add the However, #4302 is a "problem" then. A "problem" because the iterator won't throw, and will instead catch exceptions and emit them as |
FWIW, I'm currently in the "opposite" team: the "let's simplify the spec by removing the need for That might be a bigger lift, but since we're on the topic now, it might be a good time to do it.
Agreed. And this is fine I think?
I came to the conclusion that the This is a loosely held opinion though and if anyone has use cases for |
This is the same as what From a consumer point of view, an error yielded deliberately by the server in the process of performing its GraphQL duties ( |
@benjie that's all fine, yielding
@martinbonnin I am on that team too! :D I love the idea that a GraphQL stream emits data until completed. And the data can contain an error. |
Yes, in the case of expected (normal, handled, non-internal) errors, then under https://github.com/graphql/graphql-spec/pull/1127/files the errors would be a regular payload like any other and would come through |
I agree, but that won't be the case if #4302 lands. Whatever the underlying error is, internal or not, #4302 will put it in the The way I see it, we have two options at the moment:
|
A: If you have an internal error: that's a bug. B: If you don't represent an internal error as an error: that's a bug. So if B ever occurs, fix A, and you've fixed both bugs 😉 |
FWIW #4302 would remove the need for
I'd say whatever we decide here when iterating the event stream should also apply to processing a single event. Re: what to decide, I think I'd prefer removing the |
Context
Hi there. We're using `graphql-js` and serving subscriptions over WebSocket via `graphql-ws` (as recommended by Apollo for both server and client).
In our subscriptions' `subscribe` methods, we always return an `AsyncIterable` pretty much right away. We typically do this either by defining our methods via async generator functions (`async function*`), or by calling `graphql-redis-subscriptions`'s `asyncIterator` method. Our `subscribe` methods effectively never throw an error just providing an `AsyncIterable`.
However, we occasionally hit errors actually streaming subscription events, when `graphql-js` calls our `AsyncIterable`'s `next()` method. E.g. Redis could be momentarily down, or an upstream producer/generator could fail/throw. So we sometimes `throw` errors during iteration. And importantly, this can happen mid-stream.
Problem
`graphql-js` does not try/catch/handle errors when iterating over an `AsyncIterable`: graphql-js/src/execution/mapAsyncIterable.ts, lines 38 to 40 in 2aedf25
There's even a test case today that explicitly expects these errors to be re-thrown: graphql-js/src/execution/__tests__/subscribe-test.ts, lines 1043 to 1047 in 8a95335
`graphql-ws` doesn't try/catch/handle errors thrown during iteration either: https://github.com/enisdenjo/graphql-ws/blob/e4a75cc59012cad019fa3711287073a4aef9ed05/src/server.ts#L813-L815
As a result, when occasional errors happen like this, the entire underlying WebSocket connection is closed.
This is obviously not good! 😅 This interrupts every other subscription the client may be subscribed to at that moment, adds reconnection overhead, drops events, etc. And if we're experiencing some downtime on a specific subscription/source stream, this'll result in repeat disconnect-reconnect thrash, because the client also has no signal on which subscription has failed!!
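To illustrate the failure mode without pulling in either library, here is a minimal, self-contained sketch. `flakySource` and `mapWithoutCatch` are hypothetical stand-ins, not the real `graphql-js` code: the first plays the role of a source event stream that fails mid-stream, the second a mapping step that has no try/catch around iteration.

```typescript
// Hypothetical source event stream: one good event, then a mid-stream failure.
async function* flakySource(): AsyncGenerator<number> {
  yield 1;
  throw new Error('Redis is down');
}

// Simplified stand-in for a mapping step that does not guard iteration.
// Any error thrown by the source's next() propagates to the consumer as-is.
async function* mapWithoutCatch<T, U>(
  source: AsyncIterable<T>,
  mapFn: (value: T) => U,
): AsyncGenerator<U> {
  for await (const value of source) {
    yield mapFn(value);
  }
}

// Records what a consumer (e.g. a transport layer) would observe.
async function observe(): Promise<{ events: string[]; thrown: string | null }> {
  const events: string[] = [];
  let thrown: string | null = null;
  try {
    const mapped = mapWithoutCatch(flakySource(), (n) => `event ${n}`);
    for await (const payload of mapped) {
      events.push(payload);
    }
  } catch (err) {
    thrown = err instanceof Error ? err.message : String(err);
  }
  return { events, thrown };
}
```

The consumer receives the first event and then a raw exception rather than a payload, so whatever sits above it has to decide what to do with that throw.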
Inconsistency
You could argue that `graphql-ws` should try/catch these errors and send back an `error` message itself. The author of `graphql-ws` believes this is the domain of `graphql-js`, though (enisdenjo/graphql-ws#333), and I agree.
That's because `graphql-js` already try/catches and handles errors both earlier in the execution of a subscription and later:
- Errors producing an `AsyncIterable` in the first place (the synchronous result of calling the subscription's `subscribe` method, AKA producing a source event stream in the spec) are caught, and returned as a `{data: null, errors: ...}` result: graphql-js/src/execution/execute.ts, lines 1784 to 1793 in 2aedf25
- Errors mapping iteration results to response events (the result of calling the subscription's `resolve` method) are caught, and sent back to the client as a `{value: {data: null, errors: ...}, done: false}` event: graphql-js/src/execution/execute.ts, lines 1726 to 1735 in 2aedf25
So it's only iterating over the `AsyncIterable` (the "middle" step of execution) where `graphql-js` doesn't catch errors and convert them to `{data: null, errors: ...}` objects.
This seems neither consistent nor desirable, right?
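For discussion, here is a rough sketch of what handling the "middle" step could look like. It follows one possible behaviour (emit a final `{data: null, errors: ...}` payload, then complete) and is not the actual `graphql-js` implementation. `mapCatchingSourceErrors`, `Payload` and `ErrorLike` are hypothetical names.

```typescript
interface ErrorLike {
  message: string;
}

interface Payload<T> {
  data: T | null;
  errors?: ErrorLike[];
}

// Maps source events to payloads, converting a throw from the source's
// next() into one final error payload instead of re-throwing it.
async function* mapCatchingSourceErrors<T, U>(
  source: AsyncIterable<T>,
  mapFn: (value: T) => U,
): AsyncGenerator<Payload<U>> {
  const iterator = source[Symbol.asyncIterator]();
  try {
    while (true) {
      let step: IteratorResult<T>;
      try {
        step = await iterator.next();
      } catch (err) {
        // Only iteration errors are handled here; mapFn errors are not.
        const message = err instanceof Error ? err.message : String(err);
        yield { data: null, errors: [{ message }] };
        return;
      }
      if (step.done) {
        return;
      }
      yield { data: mapFn(step.value) };
    }
  } finally {
    // Give the source a chance to clean up if the consumer stops early.
    await iterator.return?.();
  }
}
```

Whether the stream should then complete normally or terminate with an error is a separate question, and this sketch takes no position on it.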
Alternatives
We can change our code to:
- `AsyncIterable` never throw in `next()` (try/catch every iteration ourselves)
- `{data, errors}`
- `resolve` method just to unwrap this type (even if we have no need for custom resolving otherwise)
- `resolve` method `throw` any `errors` or `return data` if no errors
Doing this would obviously be pretty manual, though, and we'd have to do it for every subscription we have.
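A minimal sketch of that per-subscription workaround, assuming a `{data, errors}` envelope. The names `Envelope`, `neverThrow` and `unwrap` are hypothetical and not part of `graphql-js`.

```typescript
// Envelope yielded by the wrapped stream instead of a raw event.
interface Envelope<T> {
  data?: T;
  errors?: Error[];
}

// Wraps a source stream so that iteration itself never throws: a failure
// becomes a final envelope carrying the error, after which the stream ends.
async function* neverThrow<T>(source: AsyncIterable<T>): AsyncGenerator<Envelope<T>> {
  try {
    for await (const value of source) {
      yield { data: value };
    }
  } catch (err) {
    yield { errors: [err instanceof Error ? err : new Error(String(err))] };
  }
}

// Intended for use as the field's resolve step: re-throws the carried error
// at a point where errors are already caught, or returns the data.
function unwrap<T>(envelope: Envelope<T>): T {
  if (envelope.errors && envelope.errors.length > 0) {
    throw envelope.errors[0];
  }
  return envelope.data as T;
}
```

Each subscription field would then return `neverThrow(source)` from `subscribe` and use `unwrap` as its `resolve`. That repetition is the manual cost described above.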
Relation to spec
Given the explicit test case, I wasn't sure at first if this was an intentional implementation/interpretation of the spec.
I'm not clear from reading the spec, and it looks like at least one other person wasn't either: graphql/graphql-spec#995.
But I think my own interpretation is that the spec doesn't explicitly say to re-throw errors. It just doesn't say what to do.
And I believe that `graphql-js` is inconsistent in its handling of errors, as shown above. The spec also doesn't seem to clearly specify how to handle errors creating source event streams, yet `graphql-js` (nicely) handles them.
I hope you'll consider handling errors iterating over source event streams too! Thank you.