Skip to content
This repository was archived by the owner on Nov 23, 2017. It is now read-only.

Using transport's socket with low level add_reader/add_writer #372

Closed
1st1 opened this issue Jul 5, 2016 · 10 comments
Closed

Using transport's socket with low level add_reader/add_writer #372

1st1 opened this issue Jul 5, 2016 · 10 comments
Assignees

Comments

@1st1
Copy link
Member

1st1 commented Jul 5, 2016

Turns out people use transports' sockets with add_reader / add_writer low level APIs. I've discovered this while debugging a crash of a webapp deployed with uvloop. The reason of that crash is how aiohttp implements sendfile: they use transport.get_extra_info('socket') to get the underlying socket, and then they use loop.add_writer and loop.remove_writer on that socket.

The crash in uvloop is caused by how file descriptors are stored internally by libuv. I'll make a workaround for that by using os.dup() to return a duplicate socket from transport.get_extra_info('socket').

However, I think we should do this in asyncio too. The thing is that when you use add_writer and remove_writer on the transport's socket, you're messing with the internal state of the transport. For instance, a transport might be in the middle of writing data, and calling remove_writer will cause the whole program to hang.

I see two options:

  1. We always return duplicate sockets from transport.get_extra_info(), hiding the actual socket that transport is attached to.
  2. We find a way to raise an exception if add|remove_witer|reader is used on a transport's socket.

I'm more inclined to do the 1.

/cc @asvetlov

1st1 added a commit to MagicStack/uvloop that referenced this issue Jul 5, 2016
It appears that people use sockets returned from
`transport.get_extra_info('socket')` with low-level APIs such as
add_writer and remove_writer.

If the returned socket fileno is the same as the one that transport
is using, libuv will crash, since one fileno can't point to two
different handles (uv_poll_t and uv_tcp_t).

See also python/asyncio#372
@gvanrossum
Copy link
Member

gvanrossum commented Jul 5, 2016 via email

@1st1
Copy link
Member Author

1st1 commented Jul 5, 2016

Aren't they voiding the warranty by using get_extra_info('socket') that way?

Yes, that's exactly what they do.

I'd be in favor of raising an error rather than enabling bad behavior.

For that we'll need a weak-dict of all transports {transport.fileno(): transport}, and to refactor user-facing add_reader, add_writer, remove_writer and remove_reader to raise an error. I think it's doable both for asyncio and uvloop.

This will break backwards compatibility though.

@gvanrossum
Copy link
Member

I guess we should first figure what aiohttp should do instead of this.

Doesn't the selector know whether the FD is already registered? Couldn't we use that?

I'm not too concerned about backwards compatibility with such a corner case.

@1st1
Copy link
Member Author

1st1 commented Jul 5, 2016

I guess we should first figure what aiohttp should do instead of this.

My original solution with duplicating the socket should do the trick.

I also think that we should implement sendfile in asyncio, but that's an off-topic.

Doesn't the selector know whether the FD is already registered? Couldn't we use that?

It knows, the problem is that there can be only one reader or writer per fd. Subsequent calls to add_writer or add_reader simply override the old writer/reader.

So if a protocol sets up a writer, and then the application uses its socket and sets up another writer, the transport's writer callback will never be called. The situation is different when we duplicate the socket: the transport keeps using its own socket, and the app can do whatever it wants with its copy.

The question is do we want asyncio users to duplicate the transport's socket if they want to use add_reader/add_writer for it, or we should fix asyncio to do that?

For now I fixed uvloop to make a duplicate, so that it doesn't crash.

@gvanrossum
Copy link
Member

gvanrossum commented Jul 5, 2016 via email

@1st1
Copy link
Member Author

1st1 commented Jul 5, 2016

OK, so aiohttp should dup the socket itself.

OK, makes sense. So asyncio will raise an error when you add_reader/add_writer on a fd that is used by some transport. Should we do this only in debug mode? I think that keeping a WeakDict should be relatively cheap, so we can have this check always enabled.

The question about add_reader/writer then is whether the "override previous
reader/writer" behavior is used elsewhere. It seems bad to depend on it
(you should really remove the previous reader/writer first) but maybe it is
used as a feature? E.g. when we want to change the callback?

Yes, I think people depend on this when they want to change the callback. I really don't think we can change this behaviour at this point.

I'm not too concerned about backwards compatibility with such a corner case.

What makes this thing nasty is that it's hard to catch with unittests. For instance, aiohttp implements sendfile on a non-blocking socket, and most of the time the file is sent on the first sendfile call. However, that call can return a BlockingError, in which case they start to use add_writer. I had to use an old version of Mac OS to understand why an asyncio application crashes from time to time.

So when we start to raise an error in add_writer, it might happen so that this new behaviour will only be discovered once the code is in production.

@gvanrossum
Copy link
Member

gvanrossum commented Jul 5, 2016 via email

@1st1 1st1 self-assigned this Jul 6, 2016
@asvetlov
Copy link

aiohttp is fixed.

Maybe weakdict is not 100% necessary.
We can use a strong dict with cleanup on transport.close/transport.abort.
It may require public helper methods in loop for registering/unregistering transport instances though for supporting custom transport classes.

@1st1
Copy link
Member Author

1st1 commented Jul 14, 2016

We can use a strong dict with cleanup on transport.close/transport.abort.

No, transports should be garbage collected, we can't have strong references to them. I've seen tons of code which uses asyncio/streams that doesn't call writer.close().

Maybe we can replace weakdict with a set of fds: each transport would add its fd when connected/bound, and remove it in __del__.

@1st1
Copy link
Member Author

1st1 commented Oct 5, 2016

Closing this one since the relevant PR has been merged.

# for free to subscribe to this conversation on GitHub. Already have an account? #.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants