asyncio: support fork #66197
It looks like asyncio does not support fork() currently: on fork(), the event loops of the parent and the child process share the same "self pipe". Example:

```python
import asyncio, os

loop = asyncio.get_event_loop()
pid = os.fork()
if pid:
    print("parent", loop._csock.fileno(), loop._ssock.fileno())
else:
    print("child", loop._csock.fileno(), loop._ssock.fileno())
```

Output: both processes print the same file descriptor numbers. I'm not sure that it's relevant to use asyncio with fork, but I wanted to open an issue at least as a reminder. I found the "self pipe" + fork issue when reading the issue bpo-12060, while working on a fix for the signal handling race condition on FreeBSD: issue bpo-21645. |
Oh, I wanted to use the atfork module for that, but then I saw that it does not exist :-( |
A first step would be to document the issue in the "developer" section of asyncio documentation. Mention that the event loop should be closed and a new event loop should be created. |
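A minimal sketch of that recommended recipe (an assumption on my part: the child simply abandons the inherited loop rather than closing it, since later comments show that close() can affect the parent):

```python
import asyncio
import os

loop = asyncio.get_event_loop()
pid = os.fork()
if pid == 0:
    # Child: abandon the inherited loop and install a fresh one.
    # Deliberately not calling loop.close() here: as discussed below,
    # close() can unregister FDs from the epoll set that is still
    # shared with the parent.
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
```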
It's not that it stops working after fork, right? Should we add a recipe with PID monitoring and self-pipe re-initialization? |
Actually I expect that if you share an event loop across different processes via fork, everything's a mess -- whenever a FD becomes ready, won't both the parent and the child be woken up? Then both would attempt to read from it. One would probably get EWOULDBLOCK (assuming all FDs are actually in non-blocking mode) but it would still be a mess. The specific mess for the self-pipe would be that the race condition it's intended to solve might come back. It's possible that some polling syscall might have some kind of protection against forking, but the Python data structures that map FDs to handlers don't know that, so it would still be a mess. Pretty much the only thing you should expect to be able to do safely after forking is closing the event loop -- and I'm not even 100% sure that that's safe (I don't know what happens to a forked executor). Is there a use case for sharing an event loop across forking? |
I don't think so. I use forking mainly for the following two use-cases:
It would certainly be handy to have the ability to fork while the loop is running and safely close (and reopen) it in the forked child process. But now I can see that it's a non-trivial thing to do properly. Probably it's ~somewhat~ safe to re-initialize epoll (or whatever selector we use), re-open the self-pipe etc., drop all queued callbacks and clear the Task.all_tasks collection, but even then it's easy to miss something. I think we just need to make sure that we have documented that asyncio loops are not fork-safe, and that forking while a loop is running should be avoided at all costs. |
I don't know if asyncio must have built-in support for this use case. The minimum is to document the behaviour, or maybe even suggest a recipe to support it. For example, an asyncio event loop is not thread-safe and we don't want to support that use case either, but I wrote a short documentation section with code snippets to show how to work around this issue: We need a similar section to explain how to use asyncio with the os.fork() function and the multiprocessing module. |
That sounds about right -- it's a doc issue. Let me propose a paragraph: """ (I don't want to have to research what the various polling primitives do on fork(), so I don't want to suggest that it's okay to close the loop in the parent and use it in the child.) A similar note should probably be added to the docs for the selectors module. |
Hi, Actually, closing and creating a new loop in the child doesn't work either, at least on Linux. When, in the child, we call loop.close(), it performs: Both the parent and the child still refer to the same underlying epoll structure, at that moment, and calling remove_reader() has an effect on the parent process too (which will never watch the self-pipe again). I attached a test case that demonstrates the issue (and the workaround, commented). |
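A minimal standalone sketch of the sharing behavior described above (this is not the attached test case, just a Linux-only illustration): an epoll object created before fork() refers to the same kernel epoll instance in both processes, so unregistering a FD in the child also affects the parent.

```python
import os
import select
import socket

r, w = socket.socketpair()
ep = select.epoll()
ep.register(r.fileno(), select.EPOLLIN)

pid = os.fork()
if pid == 0:
    # Child: EPOLL_CTL_DEL on the shared epoll instance...
    ep.unregister(r.fileno())
    os._exit(0)

os.waitpid(pid, 0)
w.send(b"x")
# ...means the parent no longer gets the readiness event.
print(ep.poll(0.5))  # prints [] instead of [(fd, select.EPOLLIN)]
```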
Martin, what is the magic call to make in the child (or in the parent pre-fork???) to disable the epoll object in the child without disabling it in the parent? (Perhaps just closing the selector and letting the unregister calls fail would work?) |
Guido, Currently in my program, I manually remove and then re-add the reader to the loop in the parent process right after the fork(). I also considered a dirty monkey-patching of remove_reader() and remove_writer() which would act as the original versions but without removing the fds from the epoll object (ensuring I don't get bitten by the same behavior for another fd). The easiest fix, I think, is indeed to close the selector without unregistering the fds, but I don't know if doing so would have undesired side effects on platforms other than Linux (resource leaks, or the close call failing, maybe). |
I said something wrong in my previous comment: removing and re-adding the reader callback right after the fork() is obviously subject to a race condition. I'll go for the monkey patching. |
How about not doing anything in the parent but, in the child, closing the selector without unregistering the file descriptors? |
Currently, this is what I do in the child after the fork:
It replaces unregister() by _BaseSelectorImpl.unregister(), so "our" data structures are still cleaned (the dict _fd_to_key, for instance). If a fix for this issue is desired in tulip, the first solution proposed by Guido (closing the selector and let the unregister call fail, see the -trivial- patch attached) is probably good enough. |
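A sketch of that monkey-patch approach, assuming the 3.4-era private selectors internals (`_BaseSelectorImpl`, the `_fd_to_key` dict, and the loop's private `_selector` attribute):

```python
import asyncio
import selectors
import types

def _unregister_without_epoll(self, fileobj):
    # Clean only the Python-side bookkeeping (the _fd_to_key dict) and
    # leave the kernel epoll set, shared with the parent, untouched.
    return selectors._BaseSelectorImpl.unregister(self, fileobj)

# In the child, right after fork():
loop = asyncio.get_event_loop()
sel = loop._selector  # private attribute of the selector event loop
sel.unregister = types.MethodType(_unregister_without_epoll, sel)
```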
I suggest splitting this issue: create a new issue focused on selectors.EpollSelector, which doesn't behave well with forking. If I understood correctly, you can work around this specific issue by forcing the selector to selectors.SelectSelector, for example, right? |
Attached at_fork.patch: detect and handle fork().
I tried to minimize the number of calls to _detect_fork(): only when the self-pipe or the selector is used. I only tried test2.py. More tests using two processes running two event loops should be done, and non-regression tests should be written. The issue bpo-22087 (multiprocessing) looks like a duplicate of this issue. |
close_self_pipe_after_selector.patch only fixes test2.py; it doesn't fix the general case: running the "same" event loop in two different processes. |
Updated patch:
|
Can someone review at_fork-2.patch? Martin: Can you please try my patch? |
I read the patch, it looks good to me for Python 3.5. It will (obviously) not work with Python 3.4 since self._selector won't have an _at_fork() method. I ran the tests on my project with Python 3.5a1 and the patch, and it seems to work as expected: i.e. when I close the loop of the parent process in the child, it does not affect the parent. I don't have a case where the loop of the parent is still used in the child, though. |
The asyncio doc contains: It's not the case for the selectors module. Even if it were possible to implement selector._at_fork() in asyncio, it would make more sense to implement it in the selectors module. @neologix: Would you be ok with adding a *private* _at_fork() method to the selectors classes in Python 3.4 to fix this issue? I know that you are not a fan of fork, me neither, but users like to do crazy things with fork() and then report bugs to asyncio :-) |
In that case, I suggest a small addition to your patch that would do the trick: in unix_events.py, add a `_at_fork()` method (`+ def _at_fork(self):` in the patch). |
Not really: after fork(), you're hosed anyway: should the FDs be removed from all epoll sets automatically? Since a file descriptor is a reference to an open file description, the epoll set is shared between the parent and the child. What would you do with the selector after fork(): register the FDs in a new epoll object? There's no sensible default behavior, and I'd rather avoid polluting the selectors API with this. |
See the patch: EpollSelector._at_fork() does nothing to the current epoll object; it creates a new epoll object and registers the watched FDs on it. Hum, I should maybe close the old epoll object explicitly.
What is wrong with the proposed patch?
If possible, I would prefer to implement "at fork" in the selectors module. |
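For illustration, a sketch of what such an EpollSelector._at_fork() could look like, assuming the 3.4-era selectors internals (`self._epoll`, `self._fd_to_key`); this is not the attached patch itself:

```python
import select
import selectors

class ForkableEpollSelector(selectors.EpollSelector):
    def _at_fork(self):
        # Re-create the epoll object in the child so it stops sharing
        # the kernel epoll instance with the parent, then re-register
        # the watched FDs on the new instance.
        old = self._epoll
        self._epoll = select.epoll()
        for key in self._fd_to_key.values():
            events = 0
            if key.events & selectors.EVENT_READ:
                events |= select.EPOLLIN
            if key.events & selectors.EVENT_WRITE:
                events |= select.EPOLLOUT
            self._epoll.register(key.fd, events)
        # Closing the child's reference to the old epoll FD does not
        # affect the parent, which keeps its own reference.
        old.close()
```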
The goal of the patch is to create a duplicate selector (a new epoll() structure with the same watched fds as the original epoll). It makes it possible to remove fds watched in the child's loop without impacting the parent process. Actually, it's true that with the current implementation of the selectors module (using get_map()), we can achieve the same result as with Victor's patch without touching the selectors module. I attached a patch doing that, also working with Python 3.4.

I thought about this at_fork() mechanism a bit more and I'm not sure what we want to achieve with it. In my opinion, most of the time, we will want to recycle the loop in the child process (close it and create a new one) because we will not want the tasks and callbacks scheduled on the loop running in both the parent and the child (it would probably result in double writes on sockets, or double reads, for instance). With the current implementation of asyncio, I can't recycle the loop for a single reason: closing the loop calls _close_self_pipe(), which unregisters the pipe from the selector (hence breaking the loop in the parent). Since the self pipe is an object internal to the loop, I think it's safe to close the pipes without unregistering them from the selector. It is at least true with epoll() according to the documentation quoted by neologix, but I hope that we can expect it to be true on other Unix platforms too. |
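A sketch of the get_map()-based variant described above, which stays outside the selectors module (assuming a selector-based loop and the private `loop._selector` attribute):

```python
def rebuild_selector_after_fork(loop):
    # Build a brand-new selector of the same type, carrying over the
    # registrations via the public get_map() API; the inherited selector
    # (shared with the parent) is simply abandoned, not closed.
    old = loop._selector
    new = type(old)()
    for key in old.get_map().values():
        new.register(key.fileobj, key.events, key.data)
    loop._selector = new
```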
How do other event loops handle fork? Twisted, Tornado, libuv, libev, gevent? |
It looks like using fork() while an event loop is running isn't recommended in any of the above. If I understand the code correctly, libev & gevent reinitialize loops in the forked process (essentially, you have a new loop). I think we have the following options:
I'm torn between 2 & 3. Guido, Victor, Martin, what do you think? |
The only solution to safely fork a process is to fix loop.close() so that it does not unregister the self-pipe FDs from the selector. So the idea is (I guess it's the 5th option):
Guido, do you still think that raising a "RuntimeError" in a child process is the best option? |
If all the tasks are cancelled and the loop's internal structures (callback queues, etc.) are re-initialized, the loop itself could be reused. However, it's indeed not enough: resources created by other parts of the application are still shared between the parent and the child. |
I don't actually know if the 5th option is possible. My strong requirement is that no matter what the child process does, the parent should still be able to continue using the loop. IMO it's better to leak a FD in the child than to close a resource owned by the parent. Within those constraints I'm okay with various solutions. |
Surely this is too late for 3.5? |
I'm not 100% convinced that asyncio must support fork, so it's too late :-) Anyway, we don't care, asyncio will be under provisional status for one more cycle (3.5) :-p |
I've re-marked it as "normal" priority and moved it to 3.6. Not my problem anymore! :D |
A note about this issue should really be added to the documentation - on OS X, it fails with the rather nonsensical "OSError: [Errno 9] Bad file descriptor", making this very hard to debug. I don't have any specific requirement for fork support in asyncio, as it's trivial to move loop creation after the fork, but having to run the interpreter through GDB to diagnose the problem is not a good state of affairs. |
Python 3.7 got a new os.register_at_fork() function. I don't know if it could help: Can we close this issue? Sorry, I lost track of this issue and I see no activity since the end of 2015... |
The most reasonable IMHO would be for it to mark the event loop "broken" (or closed?) in the child, to forbid any further use. By the way, on Python 3 (which is pretty much required by asyncio), I really suggest using the "forkserver" method of multiprocessing, it removes a ton of hassle with inheritance through forking. |
Hum, the problem is that Python cannot guess whether the event loop will be used in the parent or in the child process after the fork. |
Possible answer: have a global WeakSet of event loops. In the child fork handler, iterate over all event loops and "break" those that have already been started. |
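A sketch of that idea using the os.register_at_fork() hook mentioned above (the "break" step is hypothetical; asyncio exposes no official API for it):

```python
import os
import weakref

_all_loops = weakref.WeakSet()

def track(loop):
    # Call this whenever an event loop is created.
    _all_loops.add(loop)

def _break_inherited_loops():
    # Runs in the child after fork(): forbid reuse of any loop that the
    # parent had already started, without touching shared resources.
    for loop in _all_loops:
        if loop.is_running():
            loop._closed = True  # hypothetical "broken" marker

os.register_at_fork(after_in_child=_break_inherited_loops)
```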
We can do this but only in debug mode. |
A compromise for the short term would be to detect fork in debug mode |
I'd prefer to fix it properly in 3.7. |
I'm not sure why it would be debug-only. You usually don't fork() often, and you don't have many event loops around, so the feature sounds cheap. In any case, I'm not directly affected by this issue, I'm merely suggesting options. |
I think you're right. If it's low or zero overhead we can have the check always enabled. |
I'm not sure about possible use cases that might conflict with this approach, but using a separate event loop for each pid seems very reasonable to me, as follows:

```python
import asyncio
import os

_default_policy = asyncio.get_event_loop_policy()
_pid_loop = {}

class MultiprocessingPolicy(asyncio.AbstractEventLoopPolicy):
    def get_event_loop(self):
        # Look up (or lazily create) the loop owned by the current
        # process, so a forked child never reuses its parent's loop.
        pid = os.getpid()
        loop = _pid_loop.get(pid)
        if loop is None:
            loop = self.new_event_loop()
            _pid_loop[pid] = loop
        return loop

    def new_event_loop(self):
        return _default_policy.new_event_loop()

asyncio.set_event_loop_policy(MultiprocessingPolicy())
```
|
Give up, document that fork() is not supported and close the issue :-) IMHO it's not worth it. |
I'll revisit this later. |
There has been no activity for 2 years. Asyncio is mature now. It seems like users have learnt how to work around this issue, or don't use fork() with asyncio. I'm closing the issue. |
As far as I can tell, the problems of fork() and asyncio are not documented. It would be useful at least to have it mentioned in the docs that forking a running event loop is not supported. |
Well, fork() is more generally discouraged especially in networking applications. If you're using multiprocessing, the "forkserver" method is a reliable and safe alternative. That said, I agree it would be useful to document. Perhaps you would like to open a separate issue for that? |
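A short example of the "forkserver" start method recommended above:

```python
import multiprocessing as mp

def work(x):
    return x * 2

if __name__ == "__main__":
    # Workers are forked from a clean server process, so they never
    # inherit a running event loop from the main process.
    ctx = mp.get_context("forkserver")
    with ctx.Pool(2) as pool:
        print(pool.map(work, [1, 2, 3]))
```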
sounds fine |
Most of this was found while debugging cao behavior when out of disk space. A number of extremely strange race conditions could happen when game startup fails in unusual ways, as well as in a subsequent attempt to shutdown the server. This commit tries to handle these a bit better. Unfortunately, the best solution I found for the big one (a race condition after fork, but before execvpe overwrites python) is simply to delay so that execvpe takes hold; fork + asyncio is not well behaved. See: python/cpython#66197 As far as I can tell, the promised solutions of safely closing the event loop in a child never materialized.