Skip to content

gh-94673: Add Per-Interpreter Storage for Static Builtin Types #94995

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed

Conversation

ericsnowcurrently
Copy link
Member

@ericsnowcurrently ericsnowcurrently commented Jul 19, 2022

This is a precursor to storing tp_subclasses (and tp_weaklist) on the interpreter state for static builtin types. At a high level, we add the following:

  • _PyStaticType_InitBuiltin()
  • PyInterpreterState.types.builtins

We also shuffle some code around to be able to use _PyStaticType_InitBuiltin(), especially in Objects/structseq.c.

One thing to note is that we add a new "tp_" field: PyTypeObject.tp_static_builtin_index. If adding another field to PyTypeObject is too costly then we could conditionally re-purpose some other field (e.g. tp_subclasses once we don't use it for static types).

@ericsnowcurrently
Copy link
Member Author

For context, I have a branch on top of this one that takes care of tp_subclasses: main...ericsnowcurrently:shareable-static-types-tp_subclasses.

@kumaraditya303
Copy link
Contributor

Thanks for working on this!

Mostly LGTM, a few changes worth looking into:

  • The index should be reset after each finalization
  • Maybe avoid embedding the type cache in type state struct as it may negatively impact performance?
  • Maybe prefix tp_static_builtin_index with an underscore?

@ericsnowcurrently
Copy link
Member Author

Maybe avoid embedding the type cache in type state struct as it may negatively impact performance?

There should be no performance impact since it's still a single static offset from the pointer we were already dereferencing.

Maybe prefix tp_static_builtin_index with an underscore?

None of the other fields have a leading underscore, even the ones that are documented as internal.

That does remind me, I should also add the new field to the docs. (Or maybe I'll ditch the new field and not worry about it. 🙂)

@@ -65,6 +65,7 @@ lookup_maybe_method(PyObject *self, PyObject *attr, int *unbound);
static int
slot_tp_setattro(PyObject *self, PyObject *name, PyObject *value);


Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@ericsnowcurrently
Copy link
Member Author

The test_embed failure is because _PyStaticType_Dealloc() bails out if the type still has subclasses (it checks tp_subclasses). This implies that we're still leaking objects out of Py_Finalize(). I'll see what can be done.

@ericsnowcurrently
Copy link
Member Author

FYI, I tried printing some diagnostic info in _PyStaticType_Dealloc() for types with non-empty tp_subclasses.

minimal:

leaked dict: 1
    13 collections.defaultdict
leaked object: 26
    37 dict
     8 itertools.accumulate
    10 itertools.chain
     9 itertools.combinations
     9 itertools.combinations_with_replacement
     7 itertools.compress
     8 itertools.count
     8 itertools.cycle
     8 itertools.dropwhile
     7 itertools.filterfalse
     8 itertools.groupby
     8 itertools.islice
     6 itertools.pairwise
     9 itertools.permutations
     9 itertools.product
     9 itertools.repeat
     7 itertools.starmap
     8 itertools.takewhile
     8 itertools.zip_longest
     7 itertools._grouper
     8 itertools._tee
     5 itertools._tee_dataobject
    43 collections.deque
     8 _collections._deque_iterator
     8 _collections._deque_reverse_iterator
     9 _collections._tuplegetter

This boils down to static type subclasses in extension modules, since they don't get finalized (ever). Maybe we should ignore static types in tp_subclasses...

a more complex case

./python -E -c 'import sys ; from sysconfig import get_platform ; print("%s-%d.%d" % (get_platform(), *sys.version_info[:2])) (from make -j8):

leaked ZeroDivisionError: 2
     3 DivisionByZero
      (refcount: 2)
     2 DivisionUndefined
      (refcount: 2)
leaked TypeError: 1
     3 FloatOperation
      (refcount: 2)
leaked RuntimeError: 2
     3 RunFailedError
    15 ChannelError
leaked OSError: 2
     3 herror
     3 gaierror
leaked ArithmeticError: 2
     8 ZeroDivisionError
    30 DecimalException
leaked Exception: 12
    22 ArithmeticError
    17 OSError
    14 RuntimeError
     6 TypeError
     4 error
     3 InvalidStateError
     3 ArgumentError
     3 error
     3 RecursingInfinitelyError
     3 error
     3 OSSAudioError
     3 error
leaked BaseException: 2
    58 Exception
     3 CancelledError
leaked tuple: 2
     8 datetime.IsoCalendarDate
     2 DecimalTuple
leaked staticmethod: 1
     4 abstractstaticmethod
leaked property: 1
     4 abstractproperty
leaked list: 2
     3 MyList
     8 xxsubtype.spamlist
leaked dict: 3
    13 collections.defaultdict
     4 StgDict
     6 xxsubtype.spamdict
leaked classmethod: 1
     4 abstractclassmethod
leaked type: 7
    17 ABCMeta
    12 _ctypes.PyCStructType
    12 _ctypes.UnionType
    11 _ctypes.PyCPointerType
    10 _ctypes.PyCArrayType
    10 _ctypes.PyCSimpleType
    10 _ctypes.PyCFuncPtrType
leaked object: 91
    49 type
    13 classmethod
    42 dict
    41 list
    20 property
    14 staticmethod
    29 tuple
    59 BaseException
     9 ModuleSpec
     5 BuiltinImporter
     9 _LoaderBasics
    10 FileLoader
    16 _abc._abc_data
     2 ABC
     6 Iterable
     7 Sized
     6 Container
     8 itertools.accumulate
     9 itertools.combinations
     9 itertools.combinations_with_replacement
     8 itertools.cycle
     8 itertools.dropwhile
     8 itertools.takewhile
     8 itertools.islice
     7 itertools.starmap
    10 itertools.chain
     7 itertools.compress
     7 itertools.filterfalse
     8 itertools.count
     8 itertools.zip_longest
     6 itertools.pairwise
     9 itertools.permutations
     9 itertools.product
     9 itertools.repeat
     8 itertools.groupby
     7 itertools._grouper
     8 itertools._tee
     5 itertools._tee_dataobject
    43 collections.deque
     8 _collections._deque_iterator
     8 _collections._deque_reverse_iterator
     9 _collections._tuplegetter
     4 _IterationGuard
     5 WeakSet
    39 _socket.socket
    17 _struct.Struct
     7 _struct.unpack_iterator
     8 _asyncio.FutureIter
     5 TaskStepMethWrapper
     3 _RunningLoopHolder
    31 _asyncio.Future
     4 CArgObject
     2 _ctypes.CThunkObject
    21 _ctypes._CData
     9 _ctypes.CField
     4 _ctypes.DictRemover
     2 _ctypes.StructParam_Type
    76 _curses.window
    38 datetime.date
    29 datetime.time
    36 datetime.timedelta
    13 datetime.tzinfo
   100 decimal.Decimal
    85 decimal.Context
    21 decimal.SignalDictMixin
     5 decimal.ContextManager
     8 Number
     4 _elementtree._element_iterator
    10 xml.etree.ElementTree.TreeBuilder
    35 xml.etree.ElementTree.Element
    11 xml.etree.ElementTree.XMLParser
    17 _multiprocessing.SemLock
     2 _pickle.Pdata
     8 _pickle.PicklerMemoProxy
     8 _pickle.UnpicklerMemoProxy
    12 _pickle.Pickler
     9 _pickle.Unpickler
     9 matmulType
     4 ipowType
     7 awaitType
     3 GenericAlias
     3 Generic
     9 MethInstance
     9 MethClass
     9 MethStatic
     4 _testcapi.HeapCType
     4 _testcapi.ContainerNoGC
    11 MethodDescriptorBase
    18 _xxsubinterpreters.ChannelID
    26 ossaudiodev.oss_audio_device
    13 ossaudiodev.oss_mixer_device

@ericsnowcurrently
Copy link
Member Author

To solve the test_embed failure, I had to stop ignoring types in _PyStaticType_Dealloc() when tp_subclasses was not cleared. This means the subclasses (from non-PEP 489 extension modules) are leaking. However, they were already leaking.

@ericsnowcurrently
Copy link
Member Author

I'm going to split this PR up.

# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants