Program aborts when Python's garbage collector gets called from another thread and attempts to traverse an unsendable pyclass instance. #3688
Comments
Hmm. This is unfortunate, but not entirely a surprise. At least we crash safely.
I disagree that this deduction is incorrect. From PyO3's perspective this is true; the data is being read on another thread, which violates the … One option could be to make …
I was afraid you'd say that! Unfortunate indeed.
I suppose that's fair. However, the issue only arises in the fairly niche case where a GC call from another thread happens to occur while there are unsendable GC-integrated objects in a reference cycle waiting to be collected, so I'm not sure whether it would be a worthy motivation for removing functionality that works fine in most cases. But maybe it is? I did just think of another possible solution; see this github.dev link. Apparently the …
One other option I see is, instead of an error, to make unsendable pyclasses "invisible" to the GC when it is running on a different thread, i.e. turn |
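The "invisible to the GC on a different thread" idea can be modelled in plain Python. This is only a rough sketch of the concept, not PyO3's implementation: `UnsendableModel`, its `traverse` method and the `traverse_on_new_thread` helper are hypothetical names invented for illustration, and the `visit` callback stands in for the visitor that a real `__traverse__` receives.

```python
import threading


class UnsendableModel:
    """Hypothetical stand-in for an unsendable, GC-integrated object."""

    def __init__(self, children):
        # Remember which thread created the object (assumption: this mirrors
        # the kind of ownership check discussed in this thread).
        self.owner = threading.get_ident()
        self.children = list(children)

    def traverse(self, visit):
        # On a foreign thread, report no references instead of raising:
        # the object simply looks like a leaf to the collector.
        if threading.get_ident() != self.owner:
            return 0
        for child in self.children:
            visit(child)
        return len(self.children)


def traverse_on_new_thread(obj):
    """Run obj.traverse on a fresh thread and return what was visited."""
    seen = []
    worker = threading.Thread(target=lambda: obj.traverse(seen.append))
    worker.start()
    worker.join()
    return seen


obj = UnsendableModel(["a", "b"])
same_thread = []
obj.traverse(same_thread.append)            # visits both children
other_thread = traverse_on_new_thread(obj)  # visits nothing
```

The trade-off in this model is that a cycle running through such an object cannot be detected by a collection that happens on a foreign thread; it would have to wait for a collection on the owning thread.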
I think making them opaque to other threads is quite a reasonable option; we can also document this caveat as part of the offering of … That said, I think it's possible that these things might still get collected by another thread running a GC collection, e.g. if the unsendable class itself does not directly contain the cycle but is referenced from an object that does participate in a cycle. Then, when the cycle gets collected, the unsendable class gets dropped by the wrong thread. IIRC we leak and warn in this situation already, as per #3176, so I think this edge case is OK but unfortunate. (The only solution I can see to mitigate that would be to have a per-thread queue so that unsendable classes could post themselves to their owning thread instead of leaking, but I'm not sure that it's worth the complexity.)
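For illustration only, the per-thread queue mentioned in the parenthetical above might look roughly like the following Python model. Everything here (`OwnerQueues`, `release`, `drain`) is a hypothetical design sketch, not an existing PyO3 or CPython API, and it assumes the owning thread periodically calls `drain()`.

```python
import queue
import threading


class OwnerQueues:
    """Hypothetical registry: one queue of pending finalizers per thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queues = {}

    def _queue_for(self, thread_id):
        with self._lock:
            return self._queues.setdefault(thread_id, queue.SimpleQueue())

    def release(self, owner_id, finalizer):
        """Finalize now if on the owning thread, otherwise post to the owner.

        Returns True if the finalizer ran immediately.
        """
        if threading.get_ident() == owner_id:
            finalizer()
            return True
        self._queue_for(owner_id).put(finalizer)
        return False

    def drain(self):
        """Run every finalizer posted for the calling thread; return the count."""
        pending = self._queue_for(threading.get_ident())
        ran = 0
        while True:
            try:
                finalizer = pending.get_nowait()
            except queue.Empty:
                return ran
            finalizer()
            ran += 1


queues = OwnerQueues()
owner = threading.get_ident()
dropped_on = []


def record_drop():
    dropped_on.append(threading.get_ident())


# A foreign thread tries to drop the object: the drop is deferred, not run.
worker = threading.Thread(target=queues.release, args=(owner, record_drop))
worker.start()
worker.join()
deferred = list(dropped_on)

# The owning thread later drains its queue and performs the drop itself.
drained = queues.drain()
```

Note that the registry is itself a piece of global state, and an object would still effectively leak if its owning thread never drains its queue or has already exited.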
Will prepare a PR to turn
I think we should definitely try to reduce global state in PyO3; we already have quite too much and I would like to avoid adding more. If something like this is desired, I would prefer to have that in downstream code which actually knows how threading is used.
Agreed very much so on that point 👍
Wow that was fast, thank you for your effort! I like that solution and implementation, great work guys.
So this means you tested your PoC using the proposed change and it worked as expected?
I have created a repository providing a full breakdown and minimal reproducible example of the error
at https://github.com/JRRudy1/pyo3_gc_error. I will provide a summary below, but please check out
the repository instead as I put a lot of effort into clearly presenting and investigating the issue.
In summary, I have discovered an error, or perhaps an undocumented limitation, in the way PyO3 handles thread-checking for "unsendable" `pyclass` instances as they are being traversed by Python's garbage collector (GC). In particular, this occurs when garbage collection is triggered from a separate thread, and the pyclasses integrate with the GC by implementing the `__traverse__` magic method. The error (or limitation) results in a hard abort, and is particularly problematic since it cannot be caught from Python using a `try`/`except` block.

The conditions and sequence of events leading to the error can be summarized as:
- … `__traverse__`/`__clear__` to break it
- … `gc.collect` from Python or `GcCollect` from C)
- … not the original thread and incorrectly deduces that the object was sent between threads
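The cross-thread part of this sequence can be observed without PyO3. The sketch below is a pure-Python approximation, not the reproduction from the linked repository: `Node` and `run_demo` are hypothetical names, and a `__del__` finalizer stands in for the work a GC-integrated pyclass would do. It shows that an object created on one thread is processed by the cyclic GC on whichever thread calls `gc.collect`.

```python
import gc
import threading


class Node:
    """Hypothetical pure-Python stand-in for a GC-integrated object."""

    def __init__(self, log):
        self.created_on = threading.get_ident()
        self.log = log
        self.cycle = self  # self-reference: only the cyclic GC can free this

    def __del__(self):
        # Record where the object was created and where it was finalized.
        self.log.append((self.created_on, threading.get_ident()))


def run_demo():
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections from interfering with the demo
    try:
        log = []
        Node(log)  # unreachable right away, but kept alive by its own cycle
        worker = threading.Thread(target=gc.collect)
        worker.start()
        worker.join()
        return log
    finally:
        if was_enabled:
            gc.enable()


log = run_demo()
created_on, finalized_on = log[0]
print("finalized on a different thread:", created_on != finalized_on)
```

In this model the finalizer merely records a thread id; for an unsendable pyclass, the equivalent situation is the GC touching the object from a thread other than the one that created it, which is what the thread check reacts to.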
I have gotten reasonably familiar with PyO3's internals and may be interested in working on this,
but I would need some guidance from an "expert" with a more nuanced understanding of the
possible implications. It is possible that the limitation cannot be safely fixed, and the only solution
is to improve the error message and add a warning to the documentation.
As mentioned above, please visit https://github.com/JRRudy1/pyo3_gc_error for more information.