Using transport's socket with low level add_reader/add_writer #372
Comments
It appears that people use sockets returned from `transport.get_extra_info('socket')` with low-level APIs such as add_writer and remove_writer. If the returned socket fileno is the same as the one that transport is using, libuv will crash, since one fileno can't point to two different handles (uv_poll_t and uv_tcp_t). See also python/asyncio#372
Aren't they voiding the warranty by using get_extra_info('socket') that way?
I'd be in favor of raising an error rather than enabling bad behavior.
Yes, that's exactly what they do.
For that we'll need a weak-dict of all transports. This will break backwards compatibility though.
I guess we should first figure out what aiohttp should do instead of this. Doesn't the selector know whether the FD is already registered? Couldn't we use that? I'm not too concerned about backwards compatibility with such a corner case.
My original solution of duplicating the socket should do the trick. I also think that we should implement sendfile in asyncio, but that's off-topic.
It knows; the problem is that there can be only one reader or writer per fd. Subsequent calls to add_writer or add_reader simply override the old writer/reader. So if a protocol sets up a writer, and then the application uses its socket and sets up another writer, the transport's writer callback will never be called. The situation is different when we duplicate the socket: the transport keeps using its own socket, and the app can do whatever it wants with its copy. The question is whether we want asyncio users to duplicate the transport's socket themselves if they want to use add_reader/add_writer on it, or whether we should fix asyncio to do that. For now I fixed uvloop to make a duplicate, so that it doesn't crash.
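A minimal sketch of the override behaviour described above, assuming a Unix selector-based event loop. The callback names `transport_reader` and `app_reader` are hypothetical stand-ins for a transport's internal callback and an application's callback; no real transport is involved here.

```python
import asyncio
import socket


async def main():
    loop = asyncio.get_running_loop()
    left, right = socket.socketpair()
    left.setblocking(False)
    right.setblocking(False)

    calls = []
    done = loop.create_future()

    def transport_reader():
        # Stand-in for the callback a transport would have registered.
        calls.append("transport")

    def app_reader():
        # Stand-in for the callback an application registers later.
        calls.append("app")
        loop.remove_reader(left.fileno())
        if not done.done():
            done.set_result(None)

    loop.add_reader(left.fileno(), transport_reader)
    # Second registration on the same fd replaces the first one.
    loop.add_reader(left.fileno(), app_reader)

    right.send(b"x")  # make `left` readable
    await asyncio.wait_for(done, timeout=1)
    left.close()
    right.close()
    return calls


calls = asyncio.run(main())
```

Only the second callback ends up in `calls`; the first one is silently dropped, which is what would starve a transport's own reader or writer.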
OK, so aiohttp should dup the socket itself.
The question about add_reader/writer then is whether the "override previous
reader/writer" behavior is used elsewhere. It seems bad to depend on it
(you should really remove the previous reader/writer first) but maybe it is
used as a feature? E.g. when we want to change the callback?
OK, makes sense. So asyncio will raise an error when you add_reader/add_writer on a fd that is used by some transport. Should we do this only in debug mode? I think that keeping a WeakDict should be relatively cheap, so we can have this check always enabled.
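As a rough illustration of the weak-dict check discussed here, the sketch below uses a toy loop object. `GuardedLoop`, `FakeTransport`, and `register_transport` are hypothetical names invented for this example; this is not asyncio's actual implementation.

```python
import gc
import weakref


class FakeTransport:
    """Hypothetical stand-in for a transport that owns a file descriptor."""

    def __init__(self, fd):
        self.fd = fd


class GuardedLoop:
    """Toy loop that refuses add_writer on an fd owned by a live transport."""

    def __init__(self):
        # Weak values: the registry must not keep transports alive.
        self._transports = weakref.WeakValueDictionary()
        self._writers = {}

    def register_transport(self, transport):
        self._transports[transport.fd] = transport

    def add_writer(self, fd, callback):
        if fd in self._transports:
            raise RuntimeError(f"fd {fd} is used by a transport")
        # Same semantics as discussed: a later call overrides the old writer.
        self._writers[fd] = callback


loop = GuardedLoop()
transport = FakeTransport(7)
loop.register_transport(transport)

try:
    loop.add_writer(7, lambda: None)
    blocked = False
except RuntimeError:
    blocked = True

# Once the transport is garbage collected, the fd is free to use again.
del transport
gc.collect()
loop.add_writer(7, lambda: None)
allowed_after_gc = 7 in loop._writers
```

The weak mapping means the check costs one dictionary lookup per `add_writer` call and never prevents a transport from being collected.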
Yes, I think people depend on this when they want to change the callback. I really don't think we can change this behaviour at this point.
What makes this thing nasty is that it's hard to catch with unit tests. For instance, aiohttp implements sendfile on a non-blocking socket, and most of the time the file is sent on the first sendfile call. However, that call can return a So when we start to raise an error in add_writer, this new behaviour might only be discovered once the code is in production.
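To make the testing problem concrete, here is a hedged sketch of a sendfile loop on a non-blocking socket. It is not aiohttp's code; `sendfile_nonblocking` is a hypothetical helper, and it assumes a Unix platform where `os.sendfile` is available. The `add_writer` branch only runs when the kernel cannot accept more data immediately, which small test payloads rarely trigger.

```python
import asyncio
import os
import socket
import tempfile


async def sendfile_nonblocking(loop, sock, fileobj, size):
    """Send `size` bytes of `fileobj` over a non-blocking socket."""
    offset = 0
    while offset < size:
        try:
            sent = os.sendfile(sock.fileno(), fileobj.fileno(),
                               offset, size - offset)
        except (BlockingIOError, InterruptedError):
            # Rarely-hit path: wait until the socket is writable again.
            fut = loop.create_future()

            def wake():
                if not fut.done():
                    fut.set_result(None)

            loop.add_writer(sock.fileno(), wake)
            try:
                await fut
            finally:
                loop.remove_writer(sock.fileno())
            continue
        if sent == 0:
            break  # end of file reached early
        offset += sent
    return offset


async def main():
    loop = asyncio.get_running_loop()
    out, inp = socket.socketpair()
    out.setblocking(False)
    inp.setblocking(False)
    payload = os.urandom(256 * 1024)

    async def drain():
        received = bytearray()
        while len(received) < len(payload):
            data = await loop.sock_recv(inp, 65536)
            if not data:
                break
            received += data
        return bytes(received)

    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        sent, received = await asyncio.gather(
            sendfile_nonblocking(loop, out, f, len(payload)), drain())
    out.close()
    inp.close()
    return sent == len(payload) and received == payload


ok = asyncio.run(main())
```

Here the socket belongs to the application, so `add_writer` is safe. The same loop run against a transport's socket is the pattern under discussion.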
Yeah, the behavior of add_reader/writer is specified clearly in PEP 3156...
I hate weakdict with a vengeance but I don't see another way. I think it's
fine to always check for this (but double-check the performance a bit if
you can).
Agreed that async app behavior is hard to debug/test... :-(
aiohttp is fixed. Maybe weakdict is not 100% necessary.
No, transports should be garbage collected; we can't have strong references to them. I've seen tons of code which uses asyncio/streams that doesn't call Maybe we can replace weakdict with a set of fds: each transport would add its fd when connected/bound, and remove it in
Closing this one since the relevant PR has been merged.
Turns out people use transports' sockets with the `add_reader`/`add_writer` low-level APIs. I discovered this while debugging a crash of a webapp deployed with uvloop. The reason for that crash is how aiohttp implements sendfile: they use `transport.get_extra_info('socket')` to get the underlying socket, and then they use `loop.add_writer` and `loop.remove_writer` on that socket.

The crash in uvloop is caused by how file descriptors are stored internally by libuv. I'll make a workaround for that by using `os.dup()` to return a duplicate socket from `transport.get_extra_info('socket')`.

However, I think we should do this in asyncio too. The thing is that when you use `add_writer` and `remove_writer` on the transport's socket, you're messing with the internal state of the transport. For instance, a transport might be in the middle of writing data, and calling `remove_writer` will cause the whole program to hang.

I see two options:

1. Return a duplicate socket from `transport.get_extra_info()`, hiding the actual socket that transport is attached to.
2. Raise an error if `add|remove_writer|reader` is used on a transport's socket.

I'm more inclined to do option 1.
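For application code, the duplication approach might look like the sketch below. It assumes a Unix platform and a Python version where the object returned by `get_extra_info('socket')` exposes `dup()`; the socket pair and variable names are only scaffolding for the example.

```python
import asyncio
import socket


async def main():
    loop = asyncio.get_running_loop()
    ours, peer = socket.socketpair()

    # Let asyncio build a transport around one end of the pair.
    reader, writer = await asyncio.open_connection(sock=ours)
    transport_sock = writer.get_extra_info("socket")

    # Work on a private duplicate instead of the transport's own fd.
    dup = transport_sock.dup()
    dup.setblocking(False)
    distinct = dup.fileno() != transport_sock.fileno()

    writable = loop.create_future()

    def on_writable():
        loop.remove_writer(dup.fileno())
        if not writable.done():
            writable.set_result(True)

    # Registered under the duplicate's fd, so the transport's own
    # reader/writer registrations are left untouched.
    loop.add_writer(dup.fileno(), on_writable)
    became_writable = await asyncio.wait_for(writable, timeout=1)

    dup.close()
    writer.close()
    await writer.wait_closed()
    peer.close()
    return distinct, became_writable


result = asyncio.run(main())
```

The duplicate shares the underlying connection but has its own file descriptor number, so the loop tracks it separately from the transport.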
/cc @asvetlov