Tracking issue for no-gil/freethreaded work #4265

Closed
14 tasks done
alex opened this issue Jun 20, 2024 · 66 comments · Fixed by open-spaced-repetition/fsrs-rs-python#10
Comments

@alex
Contributor

alex commented Jun 20, 2024

We didn't have a dedicated issue for this, so now there's one.

TODO:

  • Add a cfg for no-gil, but only allowed behind an experimental feature
  • ffi-check passing with a no-GIL build
  • Adopt new owned-reference-friendly C APIs
    • PyDict_GetItemRef
    • PyList_GetItemRef
    • PyDict_Next
    • PyWeakref_GetRef
    • PyImport_AddModuleRef
  • Identify places that assume a Python<'_> indicates only a single thread is executing:
    • pyo3::sync::GILOnceCell
    • PyClassBorrowChecker
    • GILProtected
    • PyErrState::normalize
    • ...
  • A way for extensions to declare that the Py_mod_gil slot should be set
  • pyo3_ffi datetime bindings are not thread safe (?)
@ngoldbaum
Contributor

As a tiny piece of this and to try to learn the library better, I'm working on adding wrappers for the new GetItemRef C API functions in the 3.13 stable API. These are needed to be fully safe for free-threaded python and are nice to have anyway on older versions because strong references are easier to reason about.
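To illustrate why strong references are easier to reason about than borrowed ones, here is a minimal std-only Rust sketch. It does not use PyO3 or the C API: `SharedDict`, `get_item_ref`, and `demo_strong_reference` are hypothetical names, and `Arc<String>` merely stands in for a reference-counted Python object. The point is that the reference count is bumped while the container is still locked, so the caller's reference stays valid even if another thread removes the entry afterwards.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Toy stand-in for a dict shared between threads (not a PyO3 type).
/// `Arc<String>` plays the role of a reference-counted Python object.
struct SharedDict {
    items: Mutex<HashMap<String, Arc<String>>>,
}

impl SharedDict {
    fn new() -> Self {
        SharedDict { items: Mutex::new(HashMap::new()) }
    }

    fn set_item(&self, key: &str, value: &str) {
        let mut items = self.items.lock().unwrap();
        items.insert(key.to_string(), Arc::new(value.to_string()));
    }

    fn del_item(&self, key: &str) {
        self.items.lock().unwrap().remove(key);
    }

    /// Similar in spirit to a `GetItemRef`-style API: the count is
    /// incremented while the lock is held, so the caller receives a
    /// strong reference that outlives any later removal of the entry.
    fn get_item_ref(&self, key: &str) -> Option<Arc<String>> {
        self.items.lock().unwrap().get(key).cloned()
    }
}

/// Returns (is the key still present after deletion?, value read through
/// the strong reference obtained before the deletion).
fn demo_strong_reference() -> (bool, String) {
    let dict = Arc::new(SharedDict::new());
    dict.set_item("answer", "forty-two");

    // Take a strong reference first.
    let value = dict.get_item_ref("answer").expect("key was just inserted");

    // Another thread deletes the entry while we still hold `value`.
    let other = Arc::clone(&dict);
    thread::spawn(move || other.del_item("answer"))
        .join()
        .expect("deleter thread panicked");

    let still_present = dict.get_item_ref("answer").is_some();
    (still_present, value.as_str().to_owned())
}
```

With a borrowed reference there would be no such guarantee: without a GIL, the deleting thread could free the object between the lookup and its first use.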

@ngoldbaum
Contributor

ngoldbaum commented Jul 30, 2024

Just to update the current state of things: pyo3 builds against the free-threaded build if you do:

UNSAFE_PYO3_BUILD_FREE_THREADED=1 cargo build

If you use pyenv, you'll also need to locally delete or modify the .python-version file.

This very quickly crashes inside of mimalloc internals, ultimately inside of Py_InitializeEx:

  * frame #0: 0x000000010135af60 libpython3.13t.dylib`chacha_block + 448
    frame #1: 0x000000010134f7e8 libpython3.13t.dylib`_mi_os_get_aligned_hint + 172
    frame #2: 0x000000010135cd68 libpython3.13t.dylib`unix_mmap_prim + 136
    frame #3: 0x0000000101355e80 libpython3.13t.dylib`_mi_prim_alloc + 220
    frame #4: 0x000000010134fb30 libpython3.13t.dylib`mi_os_prim_alloc + 68
    frame #5: 0x0000000101348560 libpython3.13t.dylib`_mi_os_alloc_aligned + 352
    frame #6: 0x0000000101349a9c libpython3.13t.dylib`mi_reserve_os_memory_ex + 80
    frame #7: 0x0000000101347ee8 libpython3.13t.dylib`_mi_arena_alloc_aligned + 392
    frame #8: 0x000000010135bd00 libpython3.13t.dylib`mi_segment_alloc + 468
    frame #9: 0x0000000101354950 libpython3.13t.dylib`mi_segments_page_alloc + 1468
    frame #10: 0x000000010135ab94 libpython3.13t.dylib`mi_page_fresh_alloc + 56
    frame #11: 0x0000000101351f5c libpython3.13t.dylib`mi_find_page + 528
    frame #12: 0x0000000101344070 libpython3.13t.dylib`_mi_malloc_generic + 208
    frame #13: 0x000000010144e3d8 libpython3.13t.dylib`gc_alloc + 284
    frame #14: 0x000000010144e268 libpython3.13t.dylib`_PyObject_GC_New + 96
    frame #15: 0x000000010132042c libpython3.13t.dylib`PyDict_New + 84
    frame #16: 0x00000001013b3d24 libpython3.13t.dylib`_PyUnicode_InitGlobalObjects + 236
    frame #17: 0x000000010147e4dc libpython3.13t.dylib`pycore_interp_init + 72
    frame #18: 0x000000010147bb88 libpython3.13t.dylib`Py_InitializeFromConfig + 1360
    frame #19: 0x000000010147bc9c libpython3.13t.dylib`Py_InitializeEx + 144
    frame #20: 0x000000010006e5f0 pyo3-1eb544a7db3e1a47`pyo3::gil::prepare_freethreaded_python::_$u7b$$u7b$closure$u7d$$u7d$::h922d8fd5db1fd90c((null)={closure_env#0} @ 0x00000001710665c7, (null)=0x0000000171066640) at gil.rs:69:13
    frame #21: 0x0000000100047a4c pyo3-1eb544a7db3e1a47`std::sync::once::Once::call_once_force::_$u7b$$u7b$closure$u7d$$u7d$::h15baf6dd1f7316ea(p=0x0000000171066640) at once.rs:208:40
    frame #22: 0x000000010031a770 pyo3-1eb544a7db3e1a47`std::sys::sync::once::queue::Once::call::heacc08786c6d7dfa at queue.rs:183:21 [opt]
    frame #23: 0x00000001000478b4 pyo3-1eb544a7db3e1a47`std::sync::once::Once::call_once_force::h7b8eb88c3a02f292(self=0x00000001004dce70, f={closure_env#0} @ 0x000000017106671f) at once.rs:208:9
    frame #24: 0x00000001001d68b4 pyo3-1eb544a7db3e1a47`pyo3::gil::prepare_freethreaded_python::h316cd04b406e24c0 at gil.rs:66:5
    frame #25: 0x00000001001d6924 pyo3-1eb544a7db3e1a47`pyo3::gil::GILGuard::acquire::h2127069d9988a593 at gil.rs:174:21
    frame #26: 0x0000000100053000 pyo3-1eb544a7db3e1a47`pyo3::marker::Python::with_gil::h38979cd5e69873c3(f={closure_env#0} @ 0x000000017106679f) at marker.rs:403:21
    frame #27: 0x00000001001c6548 pyo3-1eb544a7db3e1a47`pyo3::conversions::std::array::tests::test_extract_non_iterable_to_array::h3c220ef1fe379cdf at array.rs:226:9
    frame #28: 0x000000010004a1b4 pyo3-1eb544a7db3e1a47`pyo3::conversions::std::array::tests::test_extract_non_iterable_to_array::_$u7b$$u7b$closure$u7d$$u7d$::h7de4fa3687a88518((null)=0x00000001710667fe) at array.rs:225:44

Just to make sure all of this is reproducible and we have some feedback on CI, I think I'm going to add a free-threaded CI job marked with continue-on-error with a test run that crashes like this.

@davidhewitt
Member

That sounds great to me, thanks!

ngoldbaum added a commit to ngoldbaum/pyo3 that referenced this issue Jul 31, 2024
ngoldbaum added a commit to ngoldbaum/pyo3 that referenced this issue Aug 1, 2024
github-merge-queue bot pushed a commit that referenced this issue Aug 1, 2024
* Update dict.get_item binding to use PyDict_GetItemRef

Refs #4265

* test: add test for dict.get_item error path

* fix: fix logic error in dict.get_item bindings

* update: apply david's review suggestions for dict.get_item bindings

* update: create ffi::compat to store compatibility shims

* update: move PyDict_GetItemRef bindings to spot in order from dictobject.h

* build: fix build warning with --no-default-features

* doc: expand release note fragments

* fix: fix clippy warnings

* respond to review comments

* Apply suggestion from @mejrs

* refactor so cfg is applied to functions

* properly set cfgs

* fix clippy lints

* Apply @davidhewitt's suggestion

* deal with upstream deprecation of new_bound
@alex
Contributor Author

alex commented Aug 2, 2024

I added a new checkbox for "Adopt new owned-reference-friendly C APIs". If we have a list of all the ones we need, I can make those sub-checkboxes.

@ngoldbaum
Contributor

> If we have a list of all the ones we need, I can make those sub-checkboxes.

I think PyDict_GetItemRef and PyList_GetItemRef are the most important ones. There's a listing of the remaining ones in the HOWTO guide for free-threading in the CPython docs: https://docs.python.org/3.13/howto/free-threading-extensions.html#borrowed-references

I also had a chat with @davidhewitt today and in addition to GILOnceCell, he pointed to GILProtected and PyCell as spots that make strong assumptions about the GIL.

Our first idea is to make GILProtected a no-op on Py_GIL_DISABLED builds (although we'll need to see if that has major fallout on user code) and as a first pass PyCell needs atomic increments and decrements to avoid data races in the free-threaded build.
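As a rough illustration of what "atomic increments and decrements" could look like for a borrow flag, here is a std-only sketch. `AtomicBorrowFlag` and its methods are hypothetical names chosen for this example, not PyO3's actual borrow checker, and the encoding (0 = unborrowed, `usize::MAX` = mutably borrowed, anything else = number of shared borrows) is an assumption.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Sentinel value meaning "exclusively (mutably) borrowed".
const MUT_BORROWED: usize = usize::MAX;

/// Hypothetical thread-safe borrow flag:
/// 0 = unborrowed, MUT_BORROWED = mutably borrowed,
/// any other value = number of active shared borrows.
pub struct AtomicBorrowFlag(AtomicUsize);

impl AtomicBorrowFlag {
    pub const fn new() -> Self {
        AtomicBorrowFlag(AtomicUsize::new(0))
    }

    /// Try to take a shared borrow. Fails if mutably borrowed, or if the
    /// counter would collide with the sentinel.
    pub fn try_borrow(&self) -> bool {
        let mut current = self.0.load(Ordering::Relaxed);
        loop {
            if current == MUT_BORROWED || current == MUT_BORROWED - 1 {
                return false;
            }
            // Compare-and-swap so two threads cannot both act on a stale
            // counter value; retry with the freshly observed value on failure.
            match self.0.compare_exchange_weak(
                current,
                current + 1,
                Ordering::Acquire,
                Ordering::Relaxed,
            ) {
                Ok(_) => return true,
                Err(actual) => current = actual,
            }
        }
    }

    /// Release one shared borrow previously taken with `try_borrow`.
    pub fn release_borrow(&self) {
        let previous = self.0.fetch_sub(1, Ordering::Release);
        debug_assert!(previous != 0 && previous != MUT_BORROWED);
    }

    /// Try to take the exclusive borrow. Succeeds only from the unborrowed state.
    pub fn try_borrow_mut(&self) -> bool {
        self.0
            .compare_exchange(0, MUT_BORROWED, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    /// Release the exclusive borrow previously taken with `try_borrow_mut`.
    pub fn release_borrow_mut(&self) {
        self.0.store(0, Ordering::Release);
    }
}
```

A plain `Cell<usize>` flag would be fine while the GIL serializes access, but on a free-threaded build two threads could read the same value and both believe they hold the only borrow; the compare-exchange loop is what rules that out here.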

In addition we need to use pyo3_ffi_check to update the assumptions the FFI bindings make about the free-threaded ABI. Doing this should hopefully fix some of the most egregious build issues. I am planning to work on that step next week.

I looked at adding a failing CI job, but that won't work right now because if you run the tests on a free-threaded build with --no-fail-fast, the tests will eventually deadlock. As far as I can see, cargo has no option to automatically kill hung tests that run longer than a configurable timeout. You can do it manually pretty easily with a macro, but I'd prefer not to do that and instead hold off on adding CI until the tests are runnable without deadlocks. Hopefully that won't be too long :)
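For reference, the manual timeout approach mentioned above might look roughly like the following std-only sketch. It is written as a plain function rather than a macro, and `run_with_timeout` is a hypothetical helper name, not something that exists in PyO3 or cargo. Note that it cannot actually kill a hung thread; it only stops waiting for it and reports the failure.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `test_body` on a worker thread and returns an
/// error if it has not finished within `timeout`. A deadlocked worker is left
/// detached rather than killed, because Rust has no safe way to stop a thread.
pub fn run_with_timeout<F>(timeout: Duration, test_body: F) -> Result<(), String>
where
    F: FnOnce() + Send + 'static,
{
    let (done_tx, done_rx) = mpsc::channel::<()>();

    let worker = thread::spawn(move || {
        test_body();
        // Ignore the error: the receiver may have given up already.
        let _ = done_tx.send(());
    });

    match done_rx.recv_timeout(timeout) {
        // The body finished in time; join to reap the thread.
        Ok(()) => worker
            .join()
            .map_err(|_| String::from("test thread panicked")),
        // Nothing arrived in time: treat it as a hang.
        Err(mpsc::RecvTimeoutError::Timeout) => {
            Err(format!("test did not finish within {timeout:?}"))
        }
        // The sender was dropped without sending, so the body panicked.
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            Err(String::from("test thread panicked before finishing"))
        }
    }
}
```

A test would call it as `run_with_timeout(Duration::from_secs(60), || { /* test body */ }).unwrap()`, so a deadlock turns into an ordinary test failure instead of a hung CI job.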

@alex
Contributor Author

alex commented Aug 2, 2024

Ok, updated the tracking list.

PyCell no longer exists, should that be something else?

@ngoldbaum
Contributor

I'm still learning the library and it shows...

I think David meant Bound in our discussion and he just got mixed up with the old API after a long day. I'll let him clarify.

@alex
Contributor Author

alex commented Aug 2, 2024

My guess is it's a reference to PyClassBorrowChecker, which manages the various borrow flags. But I'll let David say for sure.

@davidhewitt
Member

My mistake, yes, we removed the PyCell name along with the GIL Refs API 👍

@ngoldbaum
Contributor

See #4421 which updates the FFI bindings for the free-threaded build. That's enough to get the tests to pass without deadlocking, so I added a CI config as well.

@alex
Contributor Author

alex commented Aug 6, 2024

Added a checkbox for ffi-check being green.

@ngoldbaum
Contributor

Ohhh, I get it, it's because LazyTypeObject depends on GILOnceCell. I bet if we finish #4512 this will go away.
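For context on why a once-cell that relies on the GIL is a concern without one, here is a std-only sketch of lazy initialization that stays correct when several threads race to initialize. It uses `std::sync::OnceLock` purely as an illustration; `lazy_type_name`, `init_from_many_threads`, and the `"MyClass"` value are hypothetical, and this is not how LazyTypeObject or GILOnceCell are implemented.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

/// Counts how many times the initializer closure actually ran.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Lazily initialized value, standing in for something like a cached type object.
static TYPE_NAME: OnceLock<String> = OnceLock::new();

/// Returns the cached value, initializing it on first use.
/// `OnceLock::get_or_init` guarantees that only one initializer is executed,
/// even if many threads call it concurrently.
fn lazy_type_name() -> &'static str {
    TYPE_NAME.get_or_init(|| {
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
        String::from("MyClass")
    })
}

/// Spawns `n` threads that all race to initialize the cell and returns
/// how many times the initializer ran in total.
pub fn init_from_many_threads(n: usize) -> usize {
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(|| lazy_type_name().len()))
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    INIT_CALLS.load(Ordering::SeqCst)
}
```

A cell that instead assumes "holding a `Python<'_>` token means no other thread is running" gets this serialization for free from the GIL, which is presumably why anything built on top of it needs another look on free-threaded builds.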

@davidhewitt
Member

Good point, I will try to fix that PR up next time I type a line of code!

@davidhewitt
Member

Ah, just realised #4584 - I've added a checkbox for PyErrState::normalize

@aniketmaurya

aniketmaurya commented Oct 8, 2024

really looking forward to this!

@davidhewitt
Member

#4298 might imply append_to_inittab! is not thread safe, though I think given this is already broken I don't mind missing that fix from 0.23.

@davidhewitt
Member

As per python/cpython#125243 (comment) I've added a bullet to the top for datetime bindings.

@alex
Contributor Author

alex commented Oct 25, 2024

Are we good for PyDict_Next to be checked off?

@ngoldbaum
Contributor

Yes, we should be. The concern about thread safety in the datetime bindings should also be fixed by #4623.

@alex
Contributor Author

alex commented Oct 25, 2024

God help us, we're close.

Has anyone here done a top to bottom perusal of pyo3 for other potential concerns?

@ngoldbaum
Contributor

> Has anyone here done a top to bottom perusal of pyo3 for other potential concerns?

Not me. I did just grep the codebase for Cell and UnsafeCell uses, and I think all the remaining ones are safe? The fact that PyAny uses an UnsafeCell to wrap a PyObject * pointer is OK, right?

I'm also hoping that finishing up #4566 will elucidate any remaining issues in tests and docs. My plan is to help finish that up next week.

@alex
Contributor Author

alex commented Oct 25, 2024 via email

@gwenhe

gwenhe commented Nov 11, 2024

My ORM depends on pydantic, so this is very much needed. When can it be solved?

@davidhewitt
Member

See #4651. We will be releasing initial support very soon; some challenges at home have delayed the work. Thank you for your patience.

@ChristopherRabotin

FYI, if you ended up here because your GitHub Action reported that your Python version (e.g. 3.8) was not compatible with a GIL-free build of Python, check the requires-python setting in your pyproject.toml file and make sure it's 3.9 or above.

@ngoldbaum
Contributor

We shipped free-threaded support in PyO3 0.23 and forgot to close this. Issues related to free-threading should go in their own new issues.
