[3.12] gh-113993: Make interned strings mortal (GH-120520, GH-121364, GH-121903, GH-122303) #123065

Merged
merged 14 commits into from
Sep 27, 2024

Conversation

encukou
Member

@encukou encukou commented Aug 16, 2024

This backports several PRs for gh-113993, making interned strings mortal so they can be garbage-collected when no longer needed.

  • Allow interned strings to be mortal, and fix related issues (gh-113993: Allow interned strings to be mortal, and fix related issues #120520)

    • Add an InternalDocs file describing how interning should work and how to use it.

    • Add internal functions to explicitly request what kind of interning is done:

      • _PyUnicode_InternMortal
      • _PyUnicode_InternImmortal
      • _PyUnicode_InternStatic
    • Switch uses of PyUnicode_InternInPlace to those.

    • Disallow using _Py_SetImmortal on strings directly.
      You should use _PyUnicode_InternImmortal instead:

      • Strings should be interned before immortalization, otherwise you're possibly
        immortalizing a copy rather than the interned string.
      • _Py_SetImmortal doesn't handle the SSTATE_INTERNED_MORTAL to
        SSTATE_INTERNED_IMMORTAL update, and those flags can't be changed in
        backports, as they are now part of public API and version-specific ABI.
    • Add private _only_immortal argument for sys.getunicodeinternedsize, used in refleak test machinery.

    • Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:

      • _Py_ID
      • _Py_STR (including the empty string)
      • one-character latin-1 singletons

      Now, when you intern a singleton, that exact singleton will be interned.

    • Add a _Py_LATIN1_CHR macro, use it instead of _Py_ID/_Py_STR for one-character latin-1 singletons everywhere (including Clinic).

    • Intern _Py_STR singletons at startup.

    • Beef up the tests. Cover internal details (marked with @cpython_only).

    • Add lots of assertions

  • Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (gh-113993: Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API #121364)

    • Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs

    • Document immortality in some functions that take const char *

    These are PyUnicode_InternFromString,
    PyDict_SetItemString, PyObject_SetAttrString,
    PyObject_DelAttrString,
    and the PyModule_Add convenience functions.

    Always point out a non-immortalizing alternative.

    • Don't immortalize user-provided attr names in _ctypes
  • Immortalize names in code objects to avoid crash (gh-121863: Immortalize names in code objects to avoid crash #121903)

  • Intern latin-1 one-byte strings at startup (gh-122291: Intern latin-1 one-byte strings at startup #122303)

There are some 3.12-specific changes, mainly to allow statically allocated strings in deepfreeze. (In 3.13, deepfreeze switched to the general _Py_ID/_Py_STR.)
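
As a rough Python-level illustration of the interning behavior these changes concern (not part of the backport itself), the sketch below interns two equal strings built at run time and checks that `sys.intern` hands back one canonical object. The helper `build` and the sample text are made up for the example. Whether the canonical object is mortal or immortal is not directly visible from this code; it only shows the identity guarantee that interning provides.

```python
import sys


def build(parts):
    """Join parts at run time so the result is not a compile-time constant."""
    return "".join(parts)


# Two equal strings created separately.
a = build(["mortal_", "interned_", "example"])
b = build(["mortal_", "interned_", "example"])

# sys.intern returns the canonical interned string for each value.
canonical_a = sys.intern(a)
canonical_b = sys.intern(b)

# Both calls yield the very same object.
same_object = canonical_a is canonical_b
print(same_object)
```

With mortal interning, an entry like this one can be released once the last reference to the canonical string goes away, instead of living until interpreter shutdown.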

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>


📚 Documentation preview 📚: https://cpython-previews--123065.org.readthedocs.build/

Issue: #113993

encukou and others added 7 commits August 16, 2024 13:36
…related issues (pythonGH-120520)

* Add an InternalDocs file describing how interning should work and how to use it.

* Add internal functions to *explicitly* request what kind of interning is done:
  - `_PyUnicode_InternMortal`
  - `_PyUnicode_InternImmortal`
  - `_PyUnicode_InternStatic`

* Switch uses of `PyUnicode_InternInPlace` to those.

* Disallow using `_Py_SetImmortal` on strings directly.
  You should use `_PyUnicode_InternImmortal` instead:
  - Strings should be interned before immortalization, otherwise you're possibly
    immortalizing a copy rather than the interned string.
  - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to
    `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in
    backports, as they are now part of public API and version-specific ABI.

* Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery.

* Make sure the statically allocated string singletons are unique. This means these sets are now disjoint:
  - `_Py_ID`
  - `_Py_STR` (including the empty string)
  - one-character latin-1 singletons

  Now, when you intern a singleton, that exact singleton will be interned.

* Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic).

* Intern `_Py_STR` singletons at startup.

* Beef up the tests. Cover internal details (marked with `@cpython_only`).

* Add lots of assertions

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
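
A hedged sketch of how the private `_only_immortal` argument mentioned above might be queried from Python. It assumes the argument is accepted as a keyword, which is not stated in this conversation; the wrapper `interned_size` is a hypothetical helper, and it returns `None` on interpreters that lack the function or the argument.

```python
import sys


def interned_size(only_immortal=False):
    """Return the interned-string count, or None if it cannot be queried here."""
    func = getattr(sys, "getunicodeinternedsize", None)
    if func is None:
        # Not CPython, or a version without this function.
        return None
    if only_immortal:
        try:
            # Assumption: the private argument is passed by keyword.
            return func(_only_immortal=True)
        except TypeError:
            # This build does not accept the private argument.
            return None
    return func()


print(interned_size(), interned_size(only_immortal=True))
```

Since the argument is private and meant for the refleak test machinery, application code should not rely on it.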
… keep immortalizing in other API (pythonGH-121364)

* Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs

* Document immortality in some functions that take `const char *`

These are PyUnicode_InternFromString,
PyDict_SetItemString, PyObject_SetAttrString,
PyObject_DelAttrString,
and the PyModule_Add convenience functions.

Always point out a non-immortalizing alternative.

* Don't immortalize user-provided attr names in _ctypes
(cherry picked from commit b4aedb2)
@encukou encukou added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Aug 21, 2024
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @encukou for commit 2640dc8 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Aug 21, 2024
@encukou
Member Author

encukou commented Aug 22, 2024

The buildbot failures look unrelated.

@encukou
Member Author

encukou commented Sep 23, 2024

@Yhg1s, please review this backport. Sorry about the size!

@Yhg1s Yhg1s merged commit 49f6beb into python:3.12 Sep 27, 2024
29 checks passed
@encukou encukou deleted the mortal-interns-3.12 branch September 27, 2024 23:31
@mgorny
Contributor

mgorny commented Oct 2, 2024

FYI, this is at least missing a backport of 281ffb6, and therefore causing Rust packages to crash on assertions. I've filed #124887.
