-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Improve http.Client connection handling #1581
base: main
Are you sure you want to change the base?
Conversation
// closed the socket. In that case just call the on_close callback. | ||
char errmsg[] = "TLS TCP socket already closed?"; | ||
log_debug(errmsg); | ||
on_close->$class->__asyn__(on_close, self); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We initially thought about raising an exception here instead, but http.Client
calls net.TLSConnection.write()
asynchronously. And I guess others would too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you should blindly accept async write as the correct thing. It's not a complete design as-is...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, for close, I'm not sure we even need to close on_close. Like why do we end up here with a -1 stream?
If you call TLSConnection.close()
10 times in a row, would we expect the on_close()
callback to also be called 10 times? I think it makes more sense to treat it as a declarative thing, so if TLSConnection.close()
is called on a closed connection, the goal of having the connection closed is already reached. We don't actually transition, i.e. we don't go from established -> closed, and so on_close will not be called!?
The stream must have gone to -1 at some earlier point at which the on_error() callback should be called, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling on_close
even though we're already closed (stream == -1
) allows for chaining the callbacks in reconnect()
:
def reconnect():
close(_connect)
76acc0d
to
136a7f7
Compare
base/src/http.act
Outdated
@@ -460,6 +460,7 @@ actor Client(cap: net.TCPConnectCap, scheme: str, address: str, port: ?int, tls_ | |||
var close_connection: bool = True | |||
var tcp_conn: ?net.TCPConnection = None | |||
var tls_conn: ?net.TLSConnection = None | |||
var reconnecting: bool = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to connecting
and initialize to True
since we are in fact not connected when we start up. This way we can buffer outgoing requests already from the start.
Separate to this, we should add arguments for configuring that sort of buffering. I think per default we want to allow for some queueing up of outgoing requests but it should definitely be bounded, perhaps both in number of requests as well as reconnection attempts and in time.
For a more "proper" and robust error handling, I think applications will want to crank down the amount of queued up request.
HTTP 1.0 servers close the connection automatically after sending the response. We updated the http.Client actor such that the users can use the high-level API like making several requests in a row without having to worry about reconnects. http.Client will buffer the requests and flush the buffer once it (re)connects the socket. Writing to a closed socket is bad and guaranteed to fail, so the user (in this case the http.Client actor) should be notified ASAP before making further writes. Rather than using the on_error callback from the underlying TCPConnection and TLSConnection actors we raise an exception from their write() methods. This implies the calls to write() are synchronous, but since we're not waiting on network I/O it should be fine?!
136a7f7
to
1178aa0
Compare
@@ -500,10 +511,10 @@ actor Client(cap: net.TCPConnectCap, scheme: str, address: str, port: ?int, tls_ | |||
r, buf = parse_response(buf, _log) | |||
if r is not None: | |||
if "connection" in r.headers and r.headers["connection"] == "close": | |||
close_connection = True | |||
_conn_close() | |||
# Is this really the right thing to do here? If the client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a TODO entry here. We should inspect our queue, i.e. _on_response list to see if there is more stuff.
I'm also thinking like we can maybe have some basic heuristics, like when we create a http.Client the first time we obviously connect directly then we expect to run at least one query. Maybe we should not reconnect after that (in case of HTTP / 1.0 or not having persistent in later HTTP versions) until there is a second query. Now if we see a second query we can assume "there are multiple requests" and thus reconnect directly after the 2nd and subsequent requests. WDYT?
base/src/net.ext.c
Outdated
if (strstr(errmsg, "bad file descriptor")) { | ||
// This can happen if the socket is closed, se we raise an exception | ||
// and let the caller retry | ||
$RAISE(((B_BaseException)B_RuntimeErrorG_new(to$str(errmsg)))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are generally doing Python-like-exception stuff. It's not a specific design, i.e. we can deviate where we think there is good reason but if we have no idea we might as well follow Python. I think Python uses OSError for most of these, not RunTimeError.
Add some basic heuristics, like when we create a http.Client the first time we obviously connect directly then we expect to run at least one query. Maybe we should not reconnect after that (in case of HTTP / 1.0 or not having persistent in later HTTP versions) until there is a second query. Now if we see a second query we can assume "there are multiple requests" and thus reconnect directly after the 2nd and subsequent requests.
By grouping the socket errors under ConnectionError we can be more precise in handling the error in consumers of TCP connection actors.
@@ -344,6 +344,9 @@ class MemoryError (Exception): | |||
class OSError (Exception): | |||
pass | |||
|
|||
class ConnectionError (OSError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn't actually looked much at how Python does things beyond looking at OSError. I see now that it subclasses things quite nicely. Should we not just copy that whole thing verbatim? Like it feels like a better starting point than starting from scratch. In particular, in this case for the error you later raise, I think it's a BrokenPipeError.
We probably want to consider changing the __init__
method of our OSError so we can take errno and stuff like that, again just like in Python. Without default args it'll be a little fiddly, but that's OK I guess.
If the (HTTPS) server closes the socket on us after a response, the stream becomes invalid. Let's communicate that to the user (http.Client) by
calling on_error. The user (http.Client) can automatically reconnect and resend the failed requests without any intervention from the actors further up the stack.The HTTP(S) server closing the socket is perfectly valid behavior for HTTP/1.0. Users of http.Client do not have to do anything, they just get the response as expected.