From 26ffdec7f825b3ffacd7d68033a9087328171da9 Mon Sep 17 00:00:00 2001 From: nstarman Date: Wed, 26 Apr 2023 15:25:40 -0400 Subject: [PATCH 01/10] public api Signed-off-by: nstarman --- APE_public.rst | 284 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 284 insertions(+) create mode 100644 APE_public.rst diff --git a/APE_public.rst b/APE_public.rst new file mode 100644 index 00000000..c1253441 --- /dev/null +++ b/APE_public.rst @@ -0,0 +1,284 @@ +Astropy's Public API Policy +--------------------------- + +author: Nathaniel Starkman + +date-created: 2013 November 5 + +date-last-revised: 2013 December 17 + +date-accepted: 2014 January 20 + +type: + +status: Discussion + + +Abstract +-------- + +This APE establishes Astropy's policy for communicating what are public versus +internal interfaces. This policy is implemented in Astropy core and coordinated +packages, and highly recommended -- though not mandatory -- in affiliated +packages. + + +Detailed description +-------------------- + +Pure Python has no language-level means to establish and enforce separations +between public versus internal interfaces -- all code objects can be accessed. +This is in contrast to languages like C++ and Java, which have language-level +constructs to establish public versus private interfaces. Differentiating +between public and internal interfaces is important for everyone, users and +developers alike, to know what interfaces are stable and supported, and which +are not. Consequently, Python has developed conventions for communicating what +is public versus internal. Nominally there are three means, (discussed more +formally later): + +1. **Underscore Prefix**. By convention, names that begin with an underscore + are considered internal. +2. ``__all__``. The dunder attribute ``__all__`` of a module is a list of + names that are considered public. +3. **Documentation**. Public interfaces are documented and internal interfaces + are not, or are specifically noted. + +Astropy partially implements all of these conventions: some private classes and +methods are prefixed with an underscore, while some are not; some modules have +an ``__all__`` attribute, and some do not; some (debatably) internal interfaces +are in the public documentation, like ``astropy.utils``, and some are not. +However, Astropy does not have a uniform approach to public vs internal, which +leads to confusion for users and developers alike. This longstanding issue has +steadily become more pressing with the development of assistive code tools, like +auto-fill in IDEs or more recently with generative coding tools, like GitHub's +CoPilot, which can suggest code to users. This can be problematic if that code +is internal and users do not realize. If we do not have a uniform and systematic +approach to communicating what is public vs internal, then we cannot expect +users and especially their tools to know what is public vs internal. + +Having established that we need a uniform and systematic approach to public vs +internal interfaces, we now decide what that approach should be. The approach +should fulfill a few criteria: + +1. It should be unambiguous and easy to understand. +2. It should be straightforward to implement and maintain, so that it is not a + burden on developers. +3. It should align closely with established conventions, so that it is familiar + to users, developers, and hopefully detected by assistive tools. + +We start by considering established pactice, for criteria 3, and how to ensure +that it satisfies the first two criteria and can be implemented in a uniform and +systematic way. There are three important documents that describe established +practice as it relates to Astropy: + +1. `PEP 8 `_ : + the Python style guide. +2. `scipy + `_: + the largest scientific Python project. +3. `Python's typing guide + `_: + the official guide for how to statically analyze Python code, which is import + for assistive tools. + +Starting with PEP 8, it states that :: + + Documented interfaces are considered public, unless the documentation + explicitly declares them to be provisional or internal interfaces exempt + from the usual backwards compatibility guarantees. All undocumented + interfaces should be assumed to be internal. + + To better support introspection, modules should explicitly declare the names + in their public API using the ``__all__`` attribute. Setting ``__all__`` to + an empty list indicates that the module has no public API. + + Even with ``__all__`` set appropriately, internal interfaces (packages, + modules, classes, functions, attributes or other names) should still be + prefixed with a single leading underscore. + + An interface is also considered internal if any containing namespace + (package, module or class) is considered internal. + + Imported names should always be considered an implementation detail. Other + modules must not rely on indirect access to such imported names unless they + are an explicitly documented part of the containing module’s API, such as + os.path or a package’s __init__ module that exposes functionality from + submodules. + + +This is a more detailed and nuanced description of the three means discussed +above. It is also the most authoritative source, as it is the official Python +style guide. However, it is not as unambiguous as we require: which takes +precedence, the ``__all__`` attribute or the underscore prefix? + +Scipy tries to resolve this ambiguity by stating that :: + + - Methods / functions / classes and module attributes whose names begin with + a leading underscore are private. + - If a class name begins with a leading underscore, none of its members are + public, whether or not they begin with a leading underscore. + - If a module name in a package begins with a leading underscore none of its + members are public, whether or not they begin with a leading underscore. + - If a module or package defines ``__all__``, that authoritatively defines + the public interface. + - If a module or package doesn’t define ``__all__``, then all names that + don’t start with a leading underscore are public. + +This is a good solution, and is consistent with PEP 8, but for our goal of +complete unambiguity and uniformity it falls short on two counts: first, it +does not mention the documentation; second, PEP 8 recommends "modules should +explicitly declare the names in their public API using the ``__all__`` +attribute. Setting ``__all__`` to an empty list indicates that the module has no +public API." SciPy allows modules to lack an ``__all__`` attribute, meaning a +user and their tools must understand the nuances of the previous rules. Having +an ``__all__`` attribute in every module is simpler, unambiguous, and better for +introspection by both users and automated systems. + +Finally, let's consider what `Python typing guide +` +adds. :: + + - Symbols whose names begin with an underscore (but are not dunder names) are + considered private. + + - Imported symbols are considered private by default. If they use the + ``import A as A`` (a redundant module alias), from ``X import A as A`` + (a redundant symbol alias), or ``from . import A`` forms, symbol ``A`` is + not private unless the name begins with an underscore. If a file + ``__init__.py`` uses form ``from .A import X``, symbol ``A`` is treated + likewise. If a wildcard import (of the form ``from X import *``) is used, + all symbols referenced by the wildcard are not private. + + - A module can expose an ``__all__`` symbol at the module level that provides + a list of names that are considered part of the interface. This overrides + all other rules above, allowing imported symbols or symbols whose names + begin with an underscore to be included in the interface. + + - Local variables within a function (including nested functions) are always + considered private. + +This is consistent with PEP 8 and SciPy, but it does not mention the +documentation. This typing recommendations reinforces the idea that the +``__all__`` attribute is the authoritative source of what is public vs internal. +Importantly, the ``__all__`` attribute is a module-level attribute, so it only +applies to the module in which it is defined. This means that a module can +define an ``__all__`` attribute but if the module itself is not public, then +anything in the ``__all__`` attribute cannot be publicly accessed from outside +the module. For example, consider the following module:: + + src/ + __init__/ + __all__ = [] + + _foo/ + __all__ = ['bar'] + def bar(): + pass + +In this case, ``bar`` is public in ``_foo``, but ``src._foo`` is not public, so +``src._foo.bar`` is not public either. + +We propose the following: + +1. That Astropy adopts all the PEP 8 rules on public versus internal interfaces + (the only one I'm not sure about is adding prefixes to private classes and + functions.) +2. That Astropy disambiguates these rules by adopting the subsequent rules for + its API: +3. That Astropy ensures its documentation is consistent with its code, the + latter being the authoritative source. + + - That Astropy explains these rules in its documentation +4. That Astropy also adopts these rules for coordinated packages, and recommends + that affiliated packages follow these rules as well. + +Rules: + +1. A symbol is public if all containing namespaces (packages, modules, classes, + functions, attributes or other names) are public. A symbol is private if any + containing namespace is private. +2. All modules must have an ``__all__`` attribute, even if it is empty. The + ``__all__`` attribute defines the public and private interface of the module + in which it is defined. Anything in ``__all__`` is public, including + underscore-prefixed symbols. Anything not in ``__all__`` is private in that + module. The exception to this rule is implicit namespace packages, which + cannot have a top-level ``__all__`` attribute because they do not have a + top-level ``__init__.py`` file. In these cases, the public interface is + defined by the following rule. +3. A symbol is public if it does not start with an underscore, and private + otherwise. The only exception to this rule is dunder symbols (``__<...>__``), + which follow their rules not determined in this APE. + + +Branches and pull requests +-------------------------- + +N/A + + +Implementation +-------------- + +This APE will be implemented in 3 phases: + +1. **Add** ``__all__``: An ``__all__`` dunder attribute will be added to all + modules that do not have it. +2. **Update the documentation** to reflect the new rules. + + - Add a scipy-like section to the developer documentation. + - Make sure the developer documentation is consistent with the new rules. + - Fix links to always point to the public interface, not the internal + interface. + - Clearly state if a documented object is actually private. +3. **Add prefixes**: 1. Add prefixes to all modules that are not public. 2. Add + prefixes to all classes, functions, and attributes that are not public. + :note:`I'm less enthusiastic on this point.` + +The 3rd phase will be broken into the two listed parts. + + +Backward compatibility +---------------------- + +This APE breaks backward compatibility in two ways: + +1. Changes ``__all__`` in many modules. Many modules define an ``__all__`` but + include symbols that are intended to be private because they should be + imported from other modules (generally the top of the module). This will not + break code that directly imports the symbols (as Python does not use + ``__all__`` for this purpose), but it will break code that uses ``from module + import *``. +2. Adds prefixes to many objects that are not public. This can be done in a + backward compatible way by adding a ``__getattr__`` method to the module that + raises an warning for any object that is not public. This will allow existing + code to continue to work, because it will force people to fix their code to + use the public interface. + + +Alternatives +------------ + +**We do nothing:** + +This is the status quo. It is not a good option because it is not solve the +issue. The aforementioned problems of not i) knowing what is stable and +supported, and what is not, remain. + +**We allow** ``__all__`` **to be optional:** + +Not great. + +The only time this might be good is when a mudule has dynamic symbols from a +`PEP 562 `_ module-level ``__getattr__`` +method. However if it truly dynamic then it cannot be statically analyzed, which +is undesirable for other reasons. We haven't encountered this situation in +Astropy yet, so I don't think it's a good reason to allow ``__all__`` to be +optional. + + +Decision rationale +------------------ + + From a45127331e5a1d87fb56250c9a32289f08c6315d Mon Sep 17 00:00:00 2001 From: nstarman Date: Sun, 30 Jul 2023 17:26:51 -0700 Subject: [PATCH 02/10] update from meeting discussion Signed-off-by: nstarman --- APE_public.rst | 324 ++++++++++++++++++++++++++++++++++--------------- 1 file changed, 228 insertions(+), 96 deletions(-) diff --git a/APE_public.rst b/APE_public.rst index c1253441..1767e0ce 100644 --- a/APE_public.rst +++ b/APE_public.rst @@ -18,14 +18,25 @@ status: Discussion Abstract -------- -This APE establishes Astropy's policy for communicating what are public versus -internal interfaces. This policy is implemented in Astropy core and coordinated -packages, and highly recommended -- though not mandatory -- in affiliated -packages. +This APE establishes Astropy's policy for public versus internal modules and +their contents. -Detailed description --------------------- +Scope +----- + +A library's API is the set of publicly accessible "surface": the modules, +classes, functions, and other symbols provided to the library's users. This APE +is narrowly focused on the module level of the API -- what modules are public +and what modules are internal -- and the objects within those modules -- what +are public and this is communicated. This APE does not address any other +questions of API design, e.g. public versus internal methods in classes. + + +.. _description_of_the_problem: + +Description of the Problem +-------------------------- Pure Python has no language-level means to establish and enforce separations between public versus internal interfaces -- all code objects can be accessed. @@ -44,31 +55,77 @@ formally later): 3. **Documentation**. Public interfaces are documented and internal interfaces are not, or are specifically noted. -Astropy partially implements all of these conventions: some private classes and -methods are prefixed with an underscore, while some are not; some modules have -an ``__all__`` attribute, and some do not; some (debatably) internal interfaces -are in the public documentation, like ``astropy.utils``, and some are not. -However, Astropy does not have a uniform approach to public vs internal, which -leads to confusion for users and developers alike. This longstanding issue has -steadily become more pressing with the development of assistive code tools, like -auto-fill in IDEs or more recently with generative coding tools, like GitHub's -CoPilot, which can suggest code to users. This can be problematic if that code -is internal and users do not realize. If we do not have a uniform and systematic -approach to communicating what is public vs internal, then we cannot expect -users and especially their tools to know what is public vs internal. +These 3 means can be ambiguous and contradictory. Let's explore this point with +an example. Consider an example library with the following layout:: + + src/ + __init__/ + __all__ = ["foo", "_bar"] + + foo/ + __all__ = ["_func"] + def func(): + pass + + def _func(): + pass + + _bar/ + __all__ = ['bunc'] + def bunc(): + pass + + def _bunc(): + pass + + baz/ + __all__ = ['qux'] + + def qux(): + pass + +- Is ``foo`` public? Probably, what what if it's not documented? +- Is ``func`` public? It is not in ``foo.__all__``. +- Is ``_func``public? It is prefixed with an underscore, but in ``foo.__all__``. +- Is ``_bar`` public? It is prefixed with an underscore, but in ``src.__all__``. +- Is ``bunc`` public? It is in ``_bar.__all__``, but what if ``_bar`` is private? +- Is ``_bunc`` public? Probably not, but what if it's documented? +- Is ``baz`` public? It does not start with an underscore, but is not in ``src.__all__``. +- Is ``qux`` public? It does not start with an underscore and is in + ``baz.__all__``, but what if ``baz`` is private? + + +Astropy partially implements all three of the conventions: some private classes +and functions are prefixed with an underscore, while some are not; some modules +have an ``__all__`` attribute, and some do not; some (debatably) internal +interfaces are in the public documentation, like ``astropy.utils``, and some are +not. However, Astropy does not have a uniform approach to differentiating and +communicating public vs internal symbols, which leads to confusion for users and +developers alike. This longstanding issue has steadily become more pressing with +the development of assistive code tools, like auto-fill in IDEs or more recently +with generative coding tools, like GitHub's CoPilot, which suggest code to users +based on a corpus of code examples. When users unintentionally use internal +code, code tools will then suggest that internal code to other users. If we do +not have a uniform and systematic approach to communicating what is public vs +internal, then we cannot expect users and especially their tools to know what is +public vs internal. + + +Solution Criteria +----------------- Having established that we need a uniform and systematic approach to public vs internal interfaces, we now decide what that approach should be. The approach should fulfill a few criteria: -1. It should be unambiguous and easy to understand. -2. It should be straightforward to implement and maintain, so that it is not a - burden on developers. -3. It should align closely with established conventions, so that it is familiar +1. It should align closely with established conventions, so that it is familiar to users, developers, and hopefully detected by assistive tools. +2. It should be unambiguous and easy to understand. +3. It should be straightforward to implement and maintain, so that it is not a + burden on developers. -We start by considering established pactice, for criteria 3, and how to ensure -that it satisfies the first two criteria and can be implemented in a uniform and +We start by considering established practice, for criteria 1, and how to ensure +that it satisfies the latter two criteria and can be implemented in a uniform and systematic way. There are three important documents that describe established practice as it relates to Astropy: @@ -79,10 +136,12 @@ practice as it relates to Astropy: the largest scientific Python project. 3. `Python's typing guide `_: - the official guide for how to statically analyze Python code, which is import + the official guide for how to statically analyze Python code, which is important for assistive tools. -Starting with PEP 8, it states that :: +Starting with `PEP 8 +`_, it states +that :: Documented interfaces are considered public, unless the documentation explicitly declares them to be provisional or internal interfaces exempt @@ -107,12 +166,15 @@ Starting with PEP 8, it states that :: submodules. -This is a more detailed and nuanced description of the three means discussed -above. It is also the most authoritative source, as it is the official Python -style guide. However, it is not as unambiguous as we require: which takes -precedence, the ``__all__`` attribute or the underscore prefix? +This is a more detailed and nuanced description of the three means mentioned in +:ref:`description_of_the_problem`. It is also the most authoritative source, as +it is the official Python style guide. However, it is not as unambiguous as we +require: which takes precedence, the ``__all__`` attribute or the underscore +prefix? -Scipy tries to resolve this ambiguity by stating that :: +`Scipy +`_ +tries to resolve this ambiguity by stating that :: - Methods / functions / classes and module attributes whose names begin with a leading underscore are private. @@ -141,74 +203,125 @@ adds. :: - Symbols whose names begin with an underscore (but are not dunder names) are considered private. - - - Imported symbols are considered private by default. If they use the - ``import A as A`` (a redundant module alias), from ``X import A as A`` - (a redundant symbol alias), or ``from . import A`` forms, symbol ``A`` is - not private unless the name begins with an underscore. If a file - ``__init__.py`` uses form ``from .A import X``, symbol ``A`` is treated - likewise. If a wildcard import (of the form ``from X import *``) is used, - all symbols referenced by the wildcard are not private. - A module can expose an ``__all__`` symbol at the module level that provides a list of names that are considered part of the interface. This overrides all other rules above, allowing imported symbols or symbols whose names begin with an underscore to be included in the interface. - - - Local variables within a function (including nested functions) are always - considered private. This is consistent with PEP 8 and SciPy, but it does not mention the -documentation. This typing recommendations reinforces the idea that the -``__all__`` attribute is the authoritative source of what is public vs internal. +documentation. The Python typing guide reinforces the idea that the ``__all__`` +attribute is the authoritative source of what is public vs internal. Importantly, the ``__all__`` attribute is a module-level attribute, so it only -applies to the module in which it is defined. This means that a module can -define an ``__all__`` attribute but if the module itself is not public, then -anything in the ``__all__`` attribute cannot be publicly accessed from outside -the module. For example, consider the following module:: - - src/ - __init__/ - __all__ = [] +applies to the module in which it is defined. We will refer to this as "locally" +public. This means that a module can define an ``__all__`` attribute but if the +module itself is not public, then anything in the ``__all__`` attribute cannot +be publicly accessed from outside the module. - _foo/ - __all__ = ['bar'] - def bar(): - pass -In this case, ``bar`` is public in ``_foo``, but ``src._foo`` is not public, so -``src._foo.bar`` is not public either. +Solution +-------- We propose the following: -1. That Astropy adopts all the PEP 8 rules on public versus internal interfaces - (the only one I'm not sure about is adding prefixes to private classes and - functions.) -2. That Astropy disambiguates these rules by adopting the subsequent rules for - its API: +1. That Astropy adopts the PEP 8 rules on public versus internal interfaces +2. That Astropy disambiguates these rules similarly to SciPy, by adopting the + subsequent rules for its API: 3. That Astropy ensures its documentation is consistent with its code, the latter being the authoritative source. +4. That Astropy strongly recommends these rules for coordinated packages, and + enourange affiliated packages to follow these rules as well. - - That Astropy explains these rules in its documentation -4. That Astropy also adopts these rules for coordinated packages, and recommends - that affiliated packages follow these rules as well. +**Rules for Public Interfaces:** -Rules: +1. A symbol is public if it is *locally* public in it's containing + namespace, and each containing namespace is *locally* public to the top-level + namespace -- i.e. all containing namespaces are public. + A symbol is private if any containing namespace is private. -1. A symbol is public if all containing namespaces (packages, modules, classes, - functions, attributes or other names) are public. A symbol is private if any - containing namespace is private. 2. All modules must have an ``__all__`` attribute, even if it is empty. The ``__all__`` attribute defines the public and private interface of the module - in which it is defined. Anything in ``__all__`` is public, including - underscore-prefixed symbols. Anything not in ``__all__`` is private in that - module. The exception to this rule is implicit namespace packages, which - cannot have a top-level ``__all__`` attribute because they do not have a - top-level ``__init__.py`` file. In these cases, the public interface is - defined by the following rule. -3. A symbol is public if it does not start with an underscore, and private - otherwise. The only exception to this rule is dunder symbols (``__<...>__``), - which follow their rules not determined in this APE. + in which it is defined. Anything in ``__all__`` is *locally* public, + including underscore-prefixed symbols. Anything not in ``__all__`` is private + in that module. The exception to this rule are modules that cannot have an + ``__all__`` attribute, for example implicit namespace packages. + In these cases, the public interface is defined by the sub-rule. + + - A symbol is *locally* public if it does not start with an underscore, and + private otherwise. The only exception to this rule is dunder symbols + (``__<...>__``), the rules for which are not determined in this APE. + +3. Public symbols must not be prefixed with an underscore. + + - Private symbols need not be prefixed with an underscore, but it is often + recommended. An example of when it is not needed are symbols defined in + private modules, making the prefix redundant. + +4. Public symbols must be documented. + + - If a private symbol is documented (which is not recommended), it must be + obviously, explicitly, (and preferably repeatedly) noted as private. + + +Let's consider an example:: + + src/ + __init__.py:: + __all__ = ["foo", "spam"] + + foo.py:: + __all__ = ["func"] + def func(): + pass + + def _func(): + pass + + _bar.py:: + __all__ = ['bunc'] + def bunc(): + pass + + def _bunc(): + pass + + baz.py:: + __all__ = ['qux'] + + def qux(): + pass + + spam/ # implicit namespace package + + ham.py:: + __all__ = ['eggs'] + + def eggs(): + pass + +In this case, ``bar`` is public in ``foo``, and ``foo`` is public in ``src``, +so ``src.foo.func`` is public. This is public. In contrast, while ``bunc`` is +public in ``_bar``, ``src._bar`` is not public, so ``src._bar.bunc`` is not +public either. This is "locally" public, as in private. Both ``_func`` and +``_bunc`` are private. + +======= ========= ========== ========== ========================= +Symbol Status +======= ========= ========== ========== ========================= +``src`` public + ``.foo`` public + ``.func`` public + ``._func`` private + ``._bar`` public + ``.bunc`` private (locally public) + ``._bunc`` private + ``.baz`` private + ``.bunc`` private (locally public) + ``._bunc`` private + ``spam`` public + ``.ham`` public + ``.eggs`` public +======= ========= ========== ========== ========================= Branches and pull requests @@ -220,22 +333,41 @@ N/A Implementation -------------- -This APE will be implemented in 3 phases: - -1. **Add** ``__all__``: An ``__all__`` dunder attribute will be added to all - modules that do not have it. -2. **Update the documentation** to reflect the new rules. - - - Add a scipy-like section to the developer documentation. - - Make sure the developer documentation is consistent with the new rules. +The process of adopting this APE will change Astropy's API. This APE formally +establishes Astropy's API, so what is meant by "change"? There is already a *de +facto* API, which is what users *think* is the API. This is determined +primarily by the documentation, but also by tab-completion within an interactive +session. In the implementation of this APE, we aim to minimize changes to the +*de facto* API. This will be accomplished in the following 4 phases: + +1. **snapshot the documentation**. As of the adoption of the APE a snapshot of + the documentation will be saved. This will be used to determine what is + currently public and what is not. Until the APE's adoption is complete, this + snapshot is authoritative, e.g. for deciding what must undergo a deprecation + process to be changed. + +2. **Add / update** ``__all__``. The ``__all__`` in each module will be updated + to reflect phase 1. Any modules' missing ``__all__`` will have one added. + +3. **Update the documentation** and **implement deprecations**. + + - Make sure every public symbol is in the documentation, unless it is being + deprecated. + - Deprecate any public symbol that is not intended to be public, but was made + public as part of steps 1 and 2. + - Add a scipy-like section to the developer documentation explaining the + public vs internal rules. - Fix links to always point to the public interface, not the internal interface. + - Add a reminder to the maintainer checklist that public symbols must be + in ``__all__`` and documented. - Clearly state if a documented object is actually private. -3. **Add prefixes**: 1. Add prefixes to all modules that are not public. 2. Add - prefixes to all classes, functions, and attributes that are not public. - :note:`I'm less enthusiastic on this point.` -The 3rd phase will be broken into the two listed parts. +4. **Add prefixes**: Add prefixes to the top-most private symbols. For modules + this makes all their contents private. + +In Astropy core each phase will be be implemented on a per-subpackage basis with +individual pull requests. Backward compatibility @@ -247,13 +379,13 @@ This APE breaks backward compatibility in two ways: include symbols that are intended to be private because they should be imported from other modules (generally the top of the module). This will not break code that directly imports the symbols (as Python does not use - ``__all__`` for this purpose), but it will break code that uses ``from module - import *``. + ``__all__`` for this purpose), but it will break code that expects private + symbols and uses ``import *``. 2. Adds prefixes to many objects that are not public. This can be done in a backward compatible way by adding a ``__getattr__`` method to the module that raises an warning for any object that is not public. This will allow existing - code to continue to work, because it will force people to fix their code to - use the public interface. + code to continue to work, while encouraging people to fix their code to use + the public interface. Alternatives @@ -267,7 +399,7 @@ supported, and what is not, remain. **We allow** ``__all__`` **to be optional:** -Not great. +This is not great. The only time this might be good is when a mudule has dynamic symbols from a `PEP 562 `_ module-level ``__getattr__`` From 171f7087b59ef128ce9a7a626f2338abe4a90d7f Mon Sep 17 00:00:00 2001 From: nstarman Date: Sun, 30 Jul 2023 17:36:18 -0700 Subject: [PATCH 03/10] Add a pre-commit phase Signed-off-by: nstarman --- APE_public.rst | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/APE_public.rst b/APE_public.rst index 1767e0ce..0064533b 100644 --- a/APE_public.rst +++ b/APE_public.rst @@ -337,14 +337,15 @@ The process of adopting this APE will change Astropy's API. This APE formally establishes Astropy's API, so what is meant by "change"? There is already a *de facto* API, which is what users *think* is the API. This is determined primarily by the documentation, but also by tab-completion within an interactive -session. In the implementation of this APE, we aim to minimize changes to the -*de facto* API. This will be accomplished in the following 4 phases: +session. In the implementation of this APE to establish a *formal* API, we aim +to minimize changes to the *de facto* API. This will be accomplished in the +following 5 phases: 1. **snapshot the documentation**. As of the adoption of the APE a snapshot of the documentation will be saved. This will be used to determine what is currently public and what is not. Until the APE's adoption is complete, this - snapshot is authoritative, e.g. for deciding what must undergo a deprecation - process to be changed. + snapshot is authoritative, e.g. dictating what must be added and removed from + ``__all__``, for deciding what must undergo a deprecation process, etc. 2. **Add / update** ``__all__``. The ``__all__`` in each module will be updated to reflect phase 1. Any modules' missing ``__all__`` will have one added. @@ -359,11 +360,18 @@ session. In the implementation of this APE, we aim to minimize changes to the public vs internal rules. - Fix links to always point to the public interface, not the internal interface. - - Add a reminder to the maintainer checklist that public symbols must be + - Add a reminder to the maintainer checklist that to be public a symbol must be in ``__all__`` and documented. - Clearly state if a documented object is actually private. -4. **Add prefixes**: Add prefixes to the top-most private symbols. For modules +4. **Add a pre-commit bot**. This will ensure that the ``__all__`` attribute is + always present and up-to-date. It will also ensure that the documentation is + always up-to-date with the ``__all__`` attribute. This can be accomplished + with a pre-commit hook that runs a script that checks the ``__all__`` + attribute and, if the module is public, searches in the ``docs/api`` + directory for symbols in ``__all__``. + +5. **Add prefixes**: Add prefixes to the top-most private symbols. For modules this makes all their contents private. In Astropy core each phase will be be implemented on a per-subpackage basis with From bfbc31bf7b0c4a16f440d6658745a68ffb74427e Mon Sep 17 00:00:00 2001 From: nstarman Date: Sun, 30 Jul 2023 17:43:41 -0700 Subject: [PATCH 04/10] cleanup Signed-off-by: nstarman --- APE_public.rst | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/APE_public.rst b/APE_public.rst index 0064533b..956080e7 100644 --- a/APE_public.rst +++ b/APE_public.rst @@ -302,8 +302,8 @@ Let's consider an example:: In this case, ``bar`` is public in ``foo``, and ``foo`` is public in ``src``, so ``src.foo.func`` is public. This is public. In contrast, while ``bunc`` is public in ``_bar``, ``src._bar`` is not public, so ``src._bar.bunc`` is not -public either. This is "locally" public, as in private. Both ``_func`` and -``_bunc`` are private. +public either. This is "locally" public, as in internal. Both ``_func`` and +``_bunc`` are internal. ======= ========= ========== ========== ========================= Symbol Status @@ -311,13 +311,13 @@ Symbol Status ``src`` public ``.foo`` public ``.func`` public - ``._func`` private + ``._func`` internal ``._bar`` public - ``.bunc`` private (locally public) - ``._bunc`` private - ``.baz`` private - ``.bunc`` private (locally public) - ``._bunc`` private + ``.bunc`` internal (locally public) + ``._bunc`` internal + ``.baz`` internal + ``.bunc`` internal (locally public) + ``._bunc`` internal ``spam`` public ``.ham`` public ``.eggs`` public @@ -359,7 +359,8 @@ following 5 phases: - Add a scipy-like section to the developer documentation explaining the public vs internal rules. - Fix links to always point to the public interface, not the internal - interface. + interface. (This will presumably require a similar implementation as used + in ``numpy`` to change the ``__module__`` attribute of many objects.) - Add a reminder to the maintainer checklist that to be public a symbol must be in ``__all__`` and documented. - Clearly state if a documented object is actually private. @@ -416,6 +417,11 @@ is undesirable for other reasons. We haven't encountered this situation in Astropy yet, so I don't think it's a good reason to allow ``__all__`` to be optional. +**We just use the documentation.** + +Good code is self-documenting. What is public versus internal is an important +aspect of the code, and should be communicated in the code itself. + Decision rationale ------------------ From fbeec7e7841072fded62a4eeb781b1c0d790359b Mon Sep 17 00:00:00 2001 From: nstarman Date: Sun, 30 Jul 2023 17:51:03 -0700 Subject: [PATCH 05/10] expand alternatives. Signed-off-by: nstarman --- APE_public.rst | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/APE_public.rst b/APE_public.rst index 956080e7..c592101c 100644 --- a/APE_public.rst +++ b/APE_public.rst @@ -410,17 +410,25 @@ supported, and what is not, remain. This is not great. -The only time this might be good is when a mudule has dynamic symbols from a +The only time this might be good is when a module has dynamic symbols from a `PEP 562 `_ module-level ``__getattr__`` method. However if it truly dynamic then it cannot be statically analyzed, which -is undesirable for other reasons. We haven't encountered this situation in -Astropy yet, so I don't think it's a good reason to allow ``__all__`` to be -optional. +is undesirable for other reasons. If it is not truly dynamic, then the "dynamic" +symbols can be added to ``__all__`` and an ``__init__.pyi`` file used to +communicate the public interface to static analyzers. -**We just use the documentation.** +**We make the documentation the authoritative definition of the API.** Good code is self-documenting. What is public versus internal is an important -aspect of the code, and should be communicated in the code itself. +aspect of the code, and should be communicated in the code itself. It is vitally +important that the code and user-facing documentation are consistent, and this +means that the user-facing documentation must reflect the code. + +**We make tab-completion the authoritative definition of the API.** + +This is not a good option because it it does not work with implicit namespace +packages nor `PEP 562 `_ module-level +``__getattr__`` methods. Decision rationale From 27c4d6c9ed1ac4ebabf2275d0717193308d3b281 Mon Sep 17 00:00:00 2001 From: nstarman Date: Sun, 30 Jul 2023 21:32:29 -0700 Subject: [PATCH 06/10] further cleanup Signed-off-by: nstarman --- APE_public.rst | 95 +++++++++++++++++++++++++------------------------- 1 file changed, 48 insertions(+), 47 deletions(-) diff --git a/APE_public.rst b/APE_public.rst index c592101c..5c96a3b0 100644 --- a/APE_public.rst +++ b/APE_public.rst @@ -25,11 +25,11 @@ their contents. Scope ----- -A library's API is the set of publicly accessible "surface": the modules, +A library's API is the publicly accessible "surface": the set of modules, classes, functions, and other symbols provided to the library's users. This APE is narrowly focused on the module level of the API -- what modules are public -and what modules are internal -- and the objects within those modules -- what -are public and this is communicated. This APE does not address any other +and what modules are internal -- and the objects within those modules -- which +are public and how this is communicated. This APE does not address any other questions of API design, e.g. public versus internal methods in classes. @@ -45,7 +45,7 @@ constructs to establish public versus private interfaces. Differentiating between public and internal interfaces is important for everyone, users and developers alike, to know what interfaces are stable and supported, and which are not. Consequently, Python has developed conventions for communicating what -is public versus internal. Nominally there are three means, (discussed more +is public versus internal. Nominally there are three means (discussed more formally later): 1. **Underscore Prefix**. By convention, names that begin with an underscore @@ -84,9 +84,9 @@ an example. Consider an example library with the following layout:: def qux(): pass -- Is ``foo`` public? Probably, what what if it's not documented? +- Is ``foo`` public? Probably, but what if it's not documented? - Is ``func`` public? It is not in ``foo.__all__``. -- Is ``_func``public? It is prefixed with an underscore, but in ``foo.__all__``. +- Is ``_func`` public? It is prefixed with an underscore, but in ``foo.__all__``. - Is ``_bar`` public? It is prefixed with an underscore, but in ``src.__all__``. - Is ``bunc`` public? It is in ``_bar.__all__``, but what if ``_bar`` is private? - Is ``_bunc`` public? Probably not, but what if it's documented? @@ -118,20 +118,20 @@ Having established that we need a uniform and systematic approach to public vs internal interfaces, we now decide what that approach should be. The approach should fulfill a few criteria: -1. It should align closely with established conventions, so that it is familiar - to users, developers, and hopefully detected by assistive tools. -2. It should be unambiguous and easy to understand. -3. It should be straightforward to implement and maintain, so that it is not a +1. Align closely with established conventions, so that it is familiar + to users and developers, and hopefully detected by assistive tools. +2. Be unambiguous and easy to understand. +3. Be straightforward to implement and maintain, so that it is not a burden on developers. We start by considering established practice, for criteria 1, and how to ensure -that it satisfies the latter two criteria and can be implemented in a uniform and +that we satisfy the latter two criteria and can be implemented in a uniform and systematic way. There are three important documents that describe established practice as it relates to Astropy: 1. `PEP 8 `_ : the Python style guide. -2. `scipy +2. `SciPy `_: the largest scientific Python project. 3. `Python's typing guide @@ -141,7 +141,7 @@ practice as it relates to Astropy: Starting with `PEP 8 `_, it states -that :: +that:: Documented interfaces are considered public, unless the documentation explicitly declares them to be provisional or internal interfaces exempt @@ -162,7 +162,7 @@ that :: Imported names should always be considered an implementation detail. Other modules must not rely on indirect access to such imported names unless they are an explicitly documented part of the containing module’s API, such as - os.path or a package’s __init__ module that exposes functionality from + ``os.path`` or a package’s ``__init__`` module that exposes functionality from submodules. @@ -172,7 +172,7 @@ it is the official Python style guide. However, it is not as unambiguous as we require: which takes precedence, the ``__all__`` attribute or the underscore prefix? -`Scipy +`SciPy `_ tries to resolve this ambiguity by stating that :: @@ -188,7 +188,7 @@ tries to resolve this ambiguity by stating that :: don’t start with a leading underscore are public. This is a good solution, and is consistent with PEP 8, but for our goal of -complete unambiguity and uniformity it falls short on two counts: first, it +complete unambiguity and uniformity it falls short on two counts: first, it does not mention the documentation; second, PEP 8 recommends "modules should explicitly declare the names in their public API using the ``__all__`` attribute. Setting ``__all__`` to an empty list indicates that the module has no @@ -199,7 +199,7 @@ introspection by both users and automated systems. Finally, let's consider what `Python typing guide ` -adds. :: +adds.:: - Symbols whose names begin with an underscore (but are not dunder names) are considered private. @@ -225,42 +225,42 @@ Solution We propose the following: 1. That Astropy adopts the PEP 8 rules on public versus internal interfaces -2. That Astropy disambiguates these rules similarly to SciPy, by adopting the +2. That Astropy disambiguate these rules similarly to SciPy, by adopting the subsequent rules for its API: 3. That Astropy ensures its documentation is consistent with its code, the latter being the authoritative source. 4. That Astropy strongly recommends these rules for coordinated packages, and - enourange affiliated packages to follow these rules as well. + encourage affiliated packages to follow these rules as well. **Rules for Public Interfaces:** -1. A symbol is public if it is *locally* public in it's containing - namespace, and each containing namespace is *locally* public to the top-level - namespace -- i.e. all containing namespaces are public. - A symbol is private if any containing namespace is private. +1. A symbol is public if all containing namespaces are public. + A symbol is internal if any containing namespace is internal. 2. All modules must have an ``__all__`` attribute, even if it is empty. The - ``__all__`` attribute defines the public and private interface of the module - in which it is defined. Anything in ``__all__`` is *locally* public, - including underscore-prefixed symbols. Anything not in ``__all__`` is private - in that module. The exception to this rule are modules that cannot have an - ``__all__`` attribute, for example implicit namespace packages. - In these cases, the public interface is defined by the sub-rule. + ``__all__`` attribute specifies the public and internal interface of the + module in which it is defined. Anything in ``__all__`` is *locally* public, + including underscore-prefixed symbols. Anything not in ``__all__`` is + internal in that module. The exception to this rule are modules that cannot + have an ``__all__`` attribute, for example implicit namespace packages, which + lack an ``__init__.py``. In these cases, the public interface is defined by + the sub-rules: - A symbol is *locally* public if it does not start with an underscore, and - private otherwise. The only exception to this rule is dunder symbols - (``__<...>__``), the rules for which are not determined in this APE. + internal otherwise, unless it is a dunder symbol (``__<...>__``). + - Dunder symbols (``__<...>__``) have their own public vs internal rules, + which are not determined in this APE. 3. Public symbols must not be prefixed with an underscore. - - Private symbols need not be prefixed with an underscore, but it is often + - Internal symbols need not be prefixed with an underscore, but it is often recommended. An example of when it is not needed are symbols defined in - private modules, making the prefix redundant. + internal modules, making the prefix redundant. 4. Public symbols must be documented. - - If a private symbol is documented (which is not recommended), it must be - obviously, explicitly, (and preferably repeatedly) noted as private. + - If an internal symbol is documented (which is not recommended), it must be + obviously, explicitly, (and preferably repeatedly) noted as internal. Let's consider an example:: @@ -301,7 +301,7 @@ Let's consider an example:: In this case, ``bar`` is public in ``foo``, and ``foo`` is public in ``src``, so ``src.foo.func`` is public. This is public. In contrast, while ``bunc`` is -public in ``_bar``, ``src._bar`` is not public, so ``src._bar.bunc`` is not +public in ``_bar``, ``src._bar`` is not public, so ``src._bar.bunc`` is not public either. This is "locally" public, as in internal. Both ``_func`` and ``_bunc`` are internal. @@ -341,7 +341,7 @@ session. In the implementation of this APE to establish a *formal* API, we aim to minimize changes to the *de facto* API. This will be accomplished in the following 5 phases: -1. **snapshot the documentation**. As of the adoption of the APE a snapshot of +1. **Snapshot the documentation**. As of the adoption of the APE a snapshot of the documentation will be saved. This will be used to determine what is currently public and what is not. Until the APE's adoption is complete, this snapshot is authoritative, e.g. dictating what must be added and removed from @@ -356,11 +356,11 @@ following 5 phases: deprecated. - Deprecate any public symbol that is not intended to be public, but was made public as part of steps 1 and 2. - - Add a scipy-like section to the developer documentation explaining the + - Add a SciPy-like section to the developer documentation explaining the public vs internal rules. - Fix links to always point to the public interface, not the internal interface. (This will presumably require a similar implementation as used - in ``numpy`` to change the ``__module__`` attribute of many objects.) + in NumPy to change the ``__module__`` attribute of many objects.) - Add a reminder to the maintainer checklist that to be public a symbol must be in ``__all__`` and documented. - Clearly state if a documented object is actually private. @@ -375,7 +375,7 @@ following 5 phases: 5. **Add prefixes**: Add prefixes to the top-most private symbols. For modules this makes all their contents private. -In Astropy core each phase will be be implemented on a per-subpackage basis with +In Astropy core each phase will be implemented on a per-sub-package basis with individual pull requests. @@ -392,7 +392,7 @@ This APE breaks backward compatibility in two ways: symbols and uses ``import *``. 2. Adds prefixes to many objects that are not public. This can be done in a backward compatible way by adding a ``__getattr__`` method to the module that - raises an warning for any object that is not public. This will allow existing + raises a warning for any object that is not public. This will allow existing code to continue to work, while encouraging people to fix their code to use the public interface. @@ -412,21 +412,22 @@ This is not great. The only time this might be good is when a module has dynamic symbols from a `PEP 562 `_ module-level ``__getattr__`` -method. However if it truly dynamic then it cannot be statically analyzed, which +method. However, if it truly dynamic then it cannot be statically analyzed, which is undesirable for other reasons. If it is not truly dynamic, then the "dynamic" symbols can be added to ``__all__`` and an ``__init__.pyi`` file used to communicate the public interface to static analyzers. **We make the documentation the authoritative definition of the API.** -Good code is self-documenting. What is public versus internal is an important -aspect of the code, and should be communicated in the code itself. It is vitally -important that the code and user-facing documentation are consistent, and this -means that the user-facing documentation must reflect the code. +Good code is well documented. Beyond code comments and docstrings, what is +public versus internal is an important aspect of the code, and should be +communicated in the code itself. It is vitally important that the code and +user-facing documentation are consistent, and the user-facing documentation +should reflect the code. **We make tab-completion the authoritative definition of the API.** -This is not a good option because it it does not work with implicit namespace +This is not a good option because it does not work with implicit namespace packages nor `PEP 562 `_ module-level ``__getattr__`` methods. From 6c1416c046430a5ba9c48e17b6577f136b145032 Mon Sep 17 00:00:00 2001 From: nstarman Date: Mon, 31 Jul 2023 10:28:57 -0700 Subject: [PATCH 07/10] rename file to APE number Signed-off-by: nstarman --- APE_public.rst => APE22.rst | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename APE_public.rst => APE22.rst (100%) diff --git a/APE_public.rst b/APE22.rst similarity index 100% rename from APE_public.rst rename to APE22.rst From 1741eb3e70563f3ff888caff2522d38a409f99ff Mon Sep 17 00:00:00 2001 From: Nathaniel Starkman Date: Mon, 14 Aug 2023 17:04:08 -0400 Subject: [PATCH 08/10] Apply suggestions from code review --- APE22.rst | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/APE22.rst b/APE22.rst index 5c96a3b0..4d007593 100644 --- a/APE22.rst +++ b/APE22.rst @@ -1,4 +1,4 @@ -Astropy's Public API Policy +Astropy's API Policy on Modules and their Contents --------------------------- author: Nathaniel Starkman @@ -30,7 +30,8 @@ classes, functions, and other symbols provided to the library's users. This APE is narrowly focused on the module level of the API -- what modules are public and what modules are internal -- and the objects within those modules -- which are public and how this is communicated. This APE does not address any other -questions of API design, e.g. public versus internal methods in classes. +questions of API design, e.g. public versus internal methods in classes, leaving +these for future APEs or future revisions to this APE. .. _description_of_the_problem: @@ -390,7 +391,7 @@ This APE breaks backward compatibility in two ways: break code that directly imports the symbols (as Python does not use ``__all__`` for this purpose), but it will break code that expects private symbols and uses ``import *``. -2. Adds prefixes to many objects that are not public. This can be done in a +2. Adds prefixes to many symbols that are not public. This can be done in a backward compatible way by adding a ``__getattr__`` method to the module that raises a warning for any object that is not public. This will allow existing code to continue to work, while encouraging people to fix their code to use @@ -419,7 +420,7 @@ communicate the public interface to static analyzers. **We make the documentation the authoritative definition of the API.** -Good code is well documented. Beyond code comments and docstrings, what is +Good code is well documented in the code. Beyond code comments and docstrings, what is public versus internal is an important aspect of the code, and should be communicated in the code itself. It is vitally important that the code and user-facing documentation are consistent, and the user-facing documentation From 5eca03752b293e779cc6383a56f506142c32968f Mon Sep 17 00:00:00 2001 From: Nathaniel Starkman Date: Wed, 6 Dec 2023 17:14:05 -0500 Subject: [PATCH 09/10] Apply suggestions from code review MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Clément Robert --- APE22.rst | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/APE22.rst b/APE22.rst index 4d007593..018c1894 100644 --- a/APE22.rst +++ b/APE22.rst @@ -41,7 +41,7 @@ Description of the Problem Pure Python has no language-level means to establish and enforce separations between public versus internal interfaces -- all code objects can be accessed. -This is in contrast to languages like C++ and Java, which have language-level +This is in contrast to languages like C++ or Java, which have language-level constructs to establish public versus private interfaces. Differentiating between public and internal interfaces is important for everyone, users and developers alike, to know what interfaces are stable and supported, and which @@ -49,7 +49,7 @@ are not. Consequently, Python has developed conventions for communicating what is public versus internal. Nominally there are three means (discussed more formally later): -1. **Underscore Prefix**. By convention, names that begin with an underscore +1. **Underscore Prefix**. By convention, names that begin with a single underscore are considered internal. 2. ``__all__``. The dunder attribute ``__all__`` of a module is a list of names that are considered public. @@ -125,7 +125,7 @@ should fulfill a few criteria: 3. Be straightforward to implement and maintain, so that it is not a burden on developers. -We start by considering established practice, for criteria 1, and how to ensure +We start by considering established practice, for criterion 1, and how to ensure that we satisfy the latter two criteria and can be implemented in a uniform and systematic way. There are three important documents that describe established practice as it relates to Astropy: @@ -261,7 +261,7 @@ We propose the following: 4. Public symbols must be documented. - If an internal symbol is documented (which is not recommended), it must be - obviously, explicitly, (and preferably repeatedly) noted as internal. + unambiguously, explicitly, (and preferably repeatedly) noted as internal. Let's consider an example:: @@ -301,7 +301,7 @@ Let's consider an example:: pass In this case, ``bar`` is public in ``foo``, and ``foo`` is public in ``src``, -so ``src.foo.func`` is public. This is public. In contrast, while ``bunc`` is +so ``src.foo.func`` is public. In contrast, while ``bunc`` is public in ``_bar``, ``src._bar`` is not public, so ``src._bar.bunc`` is not public either. This is "locally" public, as in internal. Both ``_func`` and ``_bunc`` are internal. @@ -366,7 +366,7 @@ following 5 phases: in ``__all__`` and documented. - Clearly state if a documented object is actually private. -4. **Add a pre-commit bot**. This will ensure that the ``__all__`` attribute is +4. **Add a pre-commit hook**. This will ensure that the ``__all__`` attribute is always present and up-to-date. It will also ensure that the documentation is always up-to-date with the ``__all__`` attribute. This can be accomplished with a pre-commit hook that runs a script that checks the ``__all__`` @@ -403,8 +403,8 @@ Alternatives **We do nothing:** -This is the status quo. It is not a good option because it is not solve the -issue. The aforementioned problems of not i) knowing what is stable and +This is the status quo. It is not a good option because it does not solve the +issue. The aforementioned problems of not knowing what is stable and supported, and what is not, remain. **We allow** ``__all__`` **to be optional:** @@ -413,7 +413,7 @@ This is not great. The only time this might be good is when a module has dynamic symbols from a `PEP 562 `_ module-level ``__getattr__`` -method. However, if it truly dynamic then it cannot be statically analyzed, which +method. However, if it is truly dynamic then it cannot be statically analyzed, which is undesirable for other reasons. If it is not truly dynamic, then the "dynamic" symbols can be added to ``__all__`` and an ``__init__.pyi`` file used to communicate the public interface to static analyzers. From 436baa21dbb2e8cd2e4bfd207f231167b86ec69d Mon Sep 17 00:00:00 2001 From: nstarman Date: Tue, 26 Dec 2023 17:12:32 -0800 Subject: [PATCH 10/10] iff Signed-off-by: nstarman --- APE22.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/APE22.rst b/APE22.rst index 018c1894..9106ba67 100644 --- a/APE22.rst +++ b/APE22.rst @@ -235,7 +235,7 @@ We propose the following: **Rules for Public Interfaces:** -1. A symbol is public if all containing namespaces are public. +1. A symbol is public if and only if all containing namespaces are public. A symbol is internal if any containing namespace is internal. 2. All modules must have an ``__all__`` attribute, even if it is empty. The