Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[subinterpreters] Global C variables are a problem #81057

Closed
ericsnowcurrently opened this issue May 10, 2019 · 23 comments
Closed

[subinterpreters] Global C variables are a problem #81057

ericsnowcurrently opened this issue May 10, 2019 · 23 comments
Assignees
Labels
3.8 (EOL) end of life topic-subinterpreters type-bug An unexpected behavior, bug, or error

Comments

@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented May 10, 2019

BPO 36876
Nosy @vsajip, @vstinner, @tiran, @phsilva, @ericsnowcurrently, @soltysh, @pablogsal, @sweeneyde
PRs
  • bpo-36876: Avoid static locals. #13372
  • bpo-36876: Use a consistent variable name for kwlist. #13531
  • bpo-36876: Make some static string literal arrays constant. #15760
  • bpo-36876: Add a tool that identifies unsupported global C variables. #15877
  • bpo-36876: Skip test_check_c_globals for now. #16017
  • bpo-36876: Fix the globals checker tool. #16058
  • bpo-36876: Moved Parser/listnode.c statics to interpreter state. #16328
  • bpo-36876: Re-organize the c-analyzer tool code. #16841
  • bpo-36876: Fix the C analyzer tool. #22841
  • bpo-36876: Small adjustments to the C-analyzer tool. #23045
  • bpo-36876: [c-analyzer tool] Tighten up the results and output. #23431
  • bpo-36876: [c-analyzer tool] Add a "capi" subcommand to the c-analyzer tool. #23918
  • bpo-36876: [c-analyzer tool] Additional CLI updates for "capi" command. #23929
  • bpo-36876: Update the c-analyzer whitelist. #31225
  • bpo-36876: Minor cleanup to c-analyzer "ignored" data.' #31239
  • bpo-36876: Make sure the c-analyzer is checking all the source files.' #31264
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ericsnowcurrently'
    closed_at = None
    created_at = <Date 2019-05-10.19:01:42.087>
    labels = ['expert-subinterpreters', 'type-bug', '3.8']
    title = '[subinterpreters] Global C variables are a problem'
    updated_at = <Date 2022-02-10.23:14:23.025>
    user = 'https://github.com/ericsnowcurrently'

    bugs.python.org fields:

    activity = <Date 2022-02-10.23:14:23.025>
    actor = 'eric.snow'
    assignee = 'eric.snow'
    closed = False
    closed_date = None
    closer = None
    components = ['Subinterpreters']
    creation = <Date 2019-05-10.19:01:42.087>
    creator = 'eric.snow'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 36876
    keywords = ['patch']
    message_count = 22.0
    messages = ['342119', '342125', '352013', '352032', '352063', '352085', '352208', '354928', '356181', '357351', '357352', '357353', '368910', '379390', '379999', '381511', '383698', '383778', '393785', '412886', '412957', '413029']
    nosy_count = 8.0
    nosy_names = ['vinay.sajip', 'vstinner', 'christian.heimes', 'phsilva', 'eric.snow', 'maciej.szulik', 'pablogsal', 'Dennis Sweeney']
    pr_nums = ['13372', '13531', '15760', '15877', '16017', '16058', '16328', '16841', '22841', '23045', '23431', '23918', '23929', '31225', '31239', '31264']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue36876'
    versions = ['Python 3.8']

    @ericsnowcurrently
    Copy link
    Member Author

    We still have a bunch of "global" C variables (static globals, static locals, maybe thread-local storage) in our code-base that break the isolation between interpreters. I added Tools/c-globals/check-c-globals.py a while back to help identify such variables, however more have crept in. I also did not take static locals or thread-locals into account.

    To address the above, we should do the following:

    1. update check-c-globals.py to identify static locals (and thread-locals)
    2. deal with any identified globals
    3. add check-c-globals.py to "make check"
    4. (if "make check" isn't already there), ensure check-c-globals.py is run at some point in CI

    Separately, we should move fields out of _PyRuntimeState into PyInterpreterState wherever possible. That can also be done at step 2 if it's not too much work.

    @ericsnowcurrently ericsnowcurrently added interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.8 (EOL) end of life labels May 10, 2019
    @ericsnowcurrently ericsnowcurrently self-assigned this May 10, 2019
    @ericsnowcurrently ericsnowcurrently added the type-bug An unexpected behavior, bug, or error label May 10, 2019
    @ericsnowcurrently
    Copy link
    Member Author

    Also, Tools/c-globals/ignored-globals.txt is a bit out of date (some vars have been removed, renamed, or moved to another file). That should get cleaned up. It might also make sense to update check-c-globals.py to verify that all variables in ignored-globals.txt actually exist.

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset ee536b2 by Eric Snow in branch 'master':
    bpo-36876: Add a tool that identifies unsupported global C variables. (bpo-15877)
    ee536b2

    @db3l
    Copy link
    Contributor

    db3l commented Sep 11, 2019

    The new test_check_c_globals.ActualChecks test is failing with an "unexpected success" on the bolen-ubuntu buildbot (under Ubuntu 18.04.3). I can reproduce the failure in a manually built tree.

    @ericsnowcurrently
    Copy link
    Member Author

    @db3l, I'll take a look right away.

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 64535fc by Eric Snow in branch 'master':
    bpo-36876: Skip test_check_c_globals for now. (gh-16017)
    64535fc

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 088b63e by Eric Snow in branch 'master':
    bpo-36876: Fix the globals checker tool. (gh-16058)
    088b63e

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset e4c431e by Eric Snow in branch 'master':
    bpo-36876: Re-organize the c-analyzer tool code. (gh-16841)
    e4c431e

    @vsajip
    Copy link
    Member

    vsajip commented Nov 7, 2019

    New changeset 9def81a by Vinay Sajip in branch 'master':
    bpo-36876: Moved Parser/listnode.c statics to interpreter state. (GH-16328)
    9def81a

    @ericsnowcurrently
    Copy link
    Member Author

    Thanks, Vinay!

    @ericsnowcurrently
    Copy link
    Member Author

    FYI, others have been tackling this in separate issues (e.g. Victor, anyone relative to PEP-384).

    @ericsnowcurrently
    Copy link
    Member Author

    And I *am* still working on the c-analyzer + a test to that runs it, so we can keep from adding more globals. :)

    @vstinner vstinner added topic-subinterpreters and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) labels May 15, 2020
    @vstinner vstinner changed the title Global C variables are a problem. [subinterpreters] Global C variables are a problem May 15, 2020
    @vstinner
    Copy link
    Member

    More and more C extensions are converted to multiphase initialization API (PEP-489) and their global variables are moved into a new module state.

    bpo-39465 will make _Py_IDENTIFIER compatible with subinterpreters.

    See also bpo-40521 for caches like free lists and Unicode interned strings.

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 345cd37 by Eric Snow in branch 'master':
    bpo-36876: Fix the C analyzer tool. (GH-22841)
    345cd37

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 4fe7209 by Eric Snow in branch 'master':
    bpo-36876: Small adjustments to the C-analyzer tool. (GH-23045)
    4fe7209

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 9f02b47 by Eric Snow in branch 'master':
    bpo-36876: [c-analyzer tool] Tighten up the results and output. (GH-23431)
    9f02b47

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 7ec59d8 by Eric Snow in branch 'master':
    bpo-36876: [c-analyzer tool] Add a "capi" subcommand to the c-analyzer tool. (gh-23918)
    7ec59d8

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 5ae9be6 by Eric Snow in branch 'master':
    bpo-36876: [c-analyzer tool] Additional CLI updates for "capi" command. (gh-23929)
    5ae9be6

    @sweeneyde
    Copy link
    Member

    I'm getting the following FutureWarning on a certain regular expression. I think it just needs "[]" to become "\[\]".

    0:05:36 load avg: 0.00 [ 53/427] test_check_c_globals
    ...\Tools\c-analyzer\c_common\tables.py:236: FutureWarning: Possible nested set at position 12
    _COLSPEC_RE = re.compile(textwrap.dedent(r'''

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 77bab59 by Eric Snow in branch 'main':
    bpo-36876: Update the c-analyzer whitelist. (gh-31225)
    77bab59

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset cb68788 by Eric Snow in branch 'main':
    bpo-36876: Minor cleanup to c-analyzer "ignored" data.' (gh-31239)
    cb68788

    @ericsnowcurrently
    Copy link
    Member Author

    New changeset 80e4f26 by Eric Snow in branch 'main':
    bpo-36876: Make sure the c-analyzer is checking all the source files.' (gh-31264)
    80e4f26

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    ericsnowcurrently added a commit that referenced this issue Dec 9, 2022
    An earlier commit only defined check_ticks_per_second() when HAVE_TIMES is defined. However, we also need it when HAVE_CLOCK is defined. This primarily affects Windows.
    
    #81057
    @vstinner
    Copy link
    Member

    vstinner commented Dec 9, 2022

    I still don't understand how moving global variables into _PyRuntime does solve any problem. What are the advantages of moving a variable declared as "static" in a C file into _PyRuntime? I don't get the rationale.

    For me, it just sounds more complicated to get _PyRuntime.threads._condattr_monotonic.ptr instead of condattr_monotonic. It's annoying to have to repeat all conditions to decide if a variable should be declared or not in _PyRuntimeState, since they are tons of variables, and many are platform specific. A side effect is that any C file using _PyRuntime should now include headers files required to decide if platform specific variables should be declared or not.

    I already noticed this problem when I made global variables per-interpreter: move them into PyInterpreterState. See the long list of direct includes in pycore_interp.h:

    #include "pycore_atomic.h"        // _Py_atomic_address
    #include "pycore_ast_state.h"     // struct ast_state
    #include "pycore_code.h"          // struct callable_cache
    #include "pycore_context.h"       // struct _Py_context_state
    #include "pycore_dict.h"          // struct _Py_dict_state
    #include "pycore_exceptions.h"    // struct _Py_exc_state
    #include "pycore_floatobject.h"   // struct _Py_float_state
    #include "pycore_genobject.h"     // struct _Py_async_gen_state
    #include "pycore_gil.h"           // struct _gil_runtime_state
    #include "pycore_gc.h"            // struct _gc_runtime_state
    #include "pycore_list.h"          // struct _Py_list_state
    #include "pycore_tuple.h"         // struct _Py_tuple_state
    #include "pycore_typeobject.h"    // struct type_cache
    #include "pycore_unicodeobject.h" // struct _Py_unicode_state
    #include "pycore_warnings.h"      // struct _warnings_runtime_state
    

    But they are also many indirect includes, like <windows.h> which is quite big and has many side effects :-(

    ericsnowcurrently added a commit that referenced this issue Dec 9, 2022
    ericsnowcurrently added a commit that referenced this issue Dec 9, 2022
    ericsnowcurrently added a commit that referenced this issue Dec 12, 2022
    We can't move it to _PyRuntimeState because the symbol is exposed in the stable ABI. We'll have to sort that out before a per-interpreter GIL, but it shouldn't be too hard.
    
    #81057
    carljm added a commit to carljm/cpython that referenced this issue Dec 14, 2022
    * main: (103 commits)
      pythongh-100248: Add missing `ssl_shutdown_timeout` parameter in `asyncio` docs (python#100249)
      Assorted minor fixes for specialization stats. (pythonGH-100219)
      pythongh-100176: venv: Remove redundant compat code for Python <= 3.2 (python#100177)
      pythonGH-100222: Redefine _Py_CODEUNIT as a union to clarify structure of code unit. (pythonGH-100223)
      pythongh-99955: undef ERROR and SUCCESS before redefining (fixes sanitizer warning) (python#100215)
      pythonGH-100206: use versionadded for the addition of sysconfig.get_default_scheme (python#100207)
      pythongh-81057: Move _Py_RefTotal to the "Ignored Globals" List (pythongh-100203)
      pythongh-81057: Move Signal-Related Globals to _PyRuntimeState (pythongh-100085)
      pythongh-81057: Move faulthandler Globals to _PyRuntimeState (pythongh-100152)
      pythongh-81057: Move tracemalloc Globals to _PyRuntimeState (pythongh-100151)
      pythonGH-100143: Improve collecting pystats for parts of runs (pythonGH-100144)
      pythongh-99955: standardize return values of functions in compiler's code-gen (python#100010)
      pythongh-79218: Define `MS_WIN64` macro for Mingw-w64 64bit on Windows (pythonGH-100137)
      Fix: typo (Indention) (pythonGH-99904)
      pythongh-96715 Remove redundant NULL check in `profile_trampoline` function (python#96716)
      pythongh-100176: remove incorrect version compatibility check from argument clinic (python#100190)
      clarify the 4300-digit limit on int-str conversion (python#100175)
      pythongh-70393: Clarify mention of "middle" scope (python#98839)
      pythongh-99688: Fix outdated tests in test_unary (python#99712)
      pythongh-100174: [Enum] Correct PowersOfThree example. (pythonGH-100178)
      ...
    carljm added a commit to carljm/cpython that referenced this issue Dec 16, 2022
    * main:
      Improve stats presentation for calls. (pythonGH-100274)
      Better stats for `LOAD_ATTR` and `STORE_ATTR` (pythonGH-100295)
      pythongh-81057: Move the Cached Parser Dummy Name to _PyRuntimeState (python#100277)
      Document that zipfile's pwd parameter is a `bytes` object (python#100209)
      pythongh-99767: mark `PyTypeObject.tp_watched` as internal use only in table (python#100271)
      Fix typo in introduction.rst (python#100266)
      pythongh-78997: AttributeError if loading fails in LibraryLoader.__getattr__
      pythonGH-100234: Set a default value for random.expovariate() (pythonGH-100235)
      Remove uninformative itertools recipe (pythonGH-100253)
      pythonGH-99767: update PyTypeObject docs for type watchers (pythonGH-99928)
      Move stats for the method cache into the `Py_STAT` machinery (pythonGH-100255)
      pythonGH-100222: fix typo _py_set_opocde -> _py_set_opcode (pythonGH-100259)
      pythonGH-100000: Cleanup and polish various watchers code (pythonGH-99998)
      pythongh-90111: Minor Cleanup for Runtime-Global Objects (pythongh-100254)
    shihai1991 added a commit to shihai1991/cpython that referenced this issue Dec 18, 2022
    * origin/main: (1306 commits)
      Correct CVE-2020-10735 documentation (python#100306)
      pythongh-100272: Fix JSON serialization of OrderedDict (pythonGH-100273)
      pythongh-93649: Split tracemalloc tests from _testcapimodule.c (python#99551)
      Docs: Use `PY_VERSION_HEX` for version comparison (python#100179)
      pythongh-97909: Fix markup for `PyMethodDef` members (python#100089)
      pythongh-99240: Reset pointer to NULL when the pointed memory is freed in argument parsing (python#99890)
      pythongh-99240: Reset pointer to NULL when the pointed memory is freed in argument parsing (python#99890)
      pythonGH-98831: Add DECREF_INPUTS(), expanding to DECREF() each stack input (python#100205)
      pythongh-78707: deprecate passing >1 argument to `PurePath.[is_]relative_to()` (pythonGH-94469)
      pythongh-99540: Constant hash for _PyNone_Type to aid reproducibility (pythonGH-99541)
      pythongh-100039: enhance __signature__ to work with str and callables (pythonGH-100168)
      pythongh-99830: asyncio: Document returns of remove_{reader,writer} (python#100302)
      "Compound statement" docs: Fix with-statement step indexing (python#100286)
      pythonGH-90043: Handle NaNs in COMPARE_OP_FLOAT_JUMP (pythonGH-100278)
      Improve stats presentation for calls. (pythonGH-100274)
      Better stats for `LOAD_ATTR` and `STORE_ATTR` (pythonGH-100295)
      pythongh-81057: Move the Cached Parser Dummy Name to _PyRuntimeState (python#100277)
      Document that zipfile's pwd parameter is a `bytes` object (python#100209)
      pythongh-99767: mark `PyTypeObject.tp_watched` as internal use only in table (python#100271)
      Fix typo in introduction.rst (python#100266)
      ...
    ericsnowcurrently added a commit that referenced this issue Mar 9, 2023
    …2505)
    
    distutils was removed in November. However, the c-analyzer relies on it. To solve that here, we vendor the parts the tool needs so it can be run against 3.12+. (Also see gh-92584.)
    
    Note that we may end up removing this code later in favor of a solution in common with the peg_generator tool (which also relies on distutils).  At the least, the copy here makes sure the c-analyzer tool works on 3.12+ in the meantime.
    ericsnowcurrently added a commit that referenced this issue Mar 14, 2023
    …02506)
    
    This will keep us from adding new unsupported (i.e. non-const) C global variables, which would break interpreter isolation.
    
    FYI, historically it is very uncommon for new global variables to get added. Furthermore, it is rare for new code to break the c-analyzer. So the check should almost always pass unnoticed.
    
    Note that I've removed test_check_c_globals. A test wasn't a great fit conceptually and was super slow on debug builds. A CI check is a better fit.
    
    This also resolves gh-100237.
    
    #81057
    carljm added a commit to carljm/cpython that referenced this issue Mar 14, 2023
    * main: (50 commits)
      pythongh-102674: Remove _specialization_stats from Lib/opcode.py (python#102685)
      pythongh-102660: Handle m_copy Specially for the sys and builtins Modules (pythongh-102661)
      pythongh-102354: change python3 to python in docs examples (python#102696)
      pythongh-81057: Add a CI Check for New Unsupported C Global Variables (pythongh-102506)
      pythonGH-94851: check unicode consistency of static strings in debug mode (python#102684)
      pythongh-100315: clarification to `__slots__` docs. (python#102621)
      pythonGH-100227: cleanup initialization of global interned dict (python#102682)
      doc: Remove a duplicate 'versionchanged' in library/asyncio-task (pythongh-102677)
      pythongh-102013: Add PyUnstable_GC_VisitObjects (python#102014)
      pythonGH-102670: Use sumprod() to simplify, speed up, and improve accuracy of statistics functions (pythonGH-102649)
      pythongh-102627: Replace address pointing toward malicious web page (python#102630)
      pythongh-98831: Use DECREF_INPUTS() more (python#102409)
      pythongh-101659: Avoid Allocation for Shared Exceptions in the _xxsubinterpreters Module (pythongh-102659)
      pythongh-101524: Fix the ChannelID tp_name (pythongh-102655)
      pythongh-102069: Fix `__weakref__` descriptor generation for custom dataclasses (python#102075)
      pythongh-98169 dataclasses.astuple support DefaultDict (python#98170)
      pythongh-102650: Remove duplicate include directives from multiple source files (python#102651)
      pythonGH-100987: Don't cache references to the names and consts array in `_PyEval_EvalFrameDefault`. (python#102640)
      pythongh-87092: refactor assemble() to a number of separate functions, which do not need the compiler struct (python#102562)
      pythongh-102192: Replace PyErr_Fetch/Restore etc by more efficient alternatives (python#102631)
      ...
    Fidget-Spinner pushed a commit to Fidget-Spinner/cpython that referenced this issue Mar 27, 2023
    …pythongh-102506)
    
    This will keep us from adding new unsupported (i.e. non-const) C global variables, which would break interpreter isolation.
    
    FYI, historically it is very uncommon for new global variables to get added. Furthermore, it is rare for new code to break the c-analyzer. So the check should almost always pass unnoticed.
    
    Note that I've removed test_check_c_globals. A test wasn't a great fit conceptually and was super slow on debug builds. A CI check is a better fit.
    
    This also resolves pythongh-100237.
    
    python#81057
    warsaw pushed a commit to warsaw/cpython that referenced this issue Apr 11, 2023
    …pythongh-102506)
    
    This will keep us from adding new unsupported (i.e. non-const) C global variables, which would break interpreter isolation.
    
    FYI, historically it is very uncommon for new global variables to get added. Furthermore, it is rare for new code to break the c-analyzer. So the check should almost always pass unnoticed.
    
    Note that I've removed test_check_c_globals. A test wasn't a great fit conceptually and was super slow on debug builds. A CI check is a better fit.
    
    This also resolves pythongh-100237.
    
    python#81057
    @github-project-automation github-project-automation bot moved this from Todo to Done in Subinterpreters May 3, 2023
    # for free to join this conversation on GitHub. Already have an account? # to comment
    Labels
    3.8 (EOL) end of life topic-subinterpreters type-bug An unexpected behavior, bug, or error
    Projects
    Status: Done
    Development

    No branches or pull requests

    5 participants