gh-130587: Add hand-written docs for non-OP tokens #130588

Open
wants to merge 13 commits into base: main

Conversation

@encukou (Member) commented Feb 26, 2025

  • Add hand-written docs for non-OP tokens

  • Make the automation (generate_token.py) check that the hand-written docs are present, and only generate docs for the OP tokens

  • Switch to list-table for the OP tokens, to make their docs more compact

  • Add ENDMARKER to the grammar docs where it appears (top-level components)

  • Add forgotten versionchanged entry for EXCLAMATION

  • Remove docs for NT_OFFSET


📚 Documentation preview 📚: https://cpython-previews--130588.org.readthedocs.build/

@encukou (Member, Author) commented Feb 26, 2025

@lysnikolaou, does this look reasonable to you?

Comment on lines +66 to +67
A generic token value returned by the :mod:`tokenize` module for
:ref:`operators <operators>` and :ref:`delimiters <delimiters>`.
Member

Suggested change:

    - A generic token value returned by the :mod:`tokenize` module for
    - :ref:`operators <operators>` and :ref:`delimiters <delimiters>`.
    + A generic token value that indicates an
    + :ref:`operator <operators>` or :ref:`delimiter <delimiters>`.

Member

Let's include the information that this is only returned by the tokenize module. It's not used internally in the tokenizer at all. Maybe add it as an implementation detail?

Member Author

The tokenize docs have some more information; how much should be repeated here?

Member

Maybe add an .. impl-detail:: like you did for ENCODING?
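
(For illustration only, not part of the PR: a quick sketch of how the tokenize module reports operators and delimiters with the generic OP type, with the specific token available separately via exact_type.)

    import io
    import tokenize

    # Operators and delimiters come back with the generic OP type;
    # the specific token is exposed separately as exact_type.
    for tok in tokenize.generate_tokens(io.StringIO("1 + 2\n").readline):
        if tok.type == tokenize.OP:
            print(tok.string, tokenize.tok_name[tok.exact_type])  # "+ PLUS"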

Comment on lines 149 to 150
Such tokens are produced instead of regular :data:`COMMENT` tokens only when
:func:`ast.parse` is invoked with ``type_comments=True``.
Member

Does compile() with the AST flag not produce them?

Member

I think it does.
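
(An aside, not from the PR diff: a minimal sketch showing that type comments are retained both by ast.parse() with type_comments=True and by compile() with the type-comments flag; the docs wording is what is under discussion above.)

    import ast

    source = "x = []  # type: list[int]\n"

    # ast.parse() with type_comments=True keeps the comment on the AST node
    tree = ast.parse(source, type_comments=True)
    print(tree.body[0].type_comment)  # list[int]

    # compile() with the type-comments flag (plus PyCF_ONLY_AST) behaves the same
    flags = ast.PyCF_ONLY_AST | ast.PyCF_TYPE_COMMENTS
    tree2 = compile(source, "<string>", "exec", flags=flags)
    print(tree2.body[0].type_comment)  # list[int]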

@@ -69,7 +69,7 @@ All input read from non-interactive files has the same form:
.. grammar-snippet::
:group: python-grammar

-   file_input: (NEWLINE | `statement`)*
+   file_input: (NEWLINE | `statement`)* ENDMARKER
Member

I'm not sure how these changes are related to this PR?

Member Author

The ENDMARKER docs now refer here, and it would be confusing if the marker were missing.

The actual grammar does use ENDMARKER.
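
(For reference, not part of the diff: the token stream itself always ends with ENDMARKER, which is what the grammar rule above now shows. A minimal check using the standard tokenize module:)

    import io
    import tokenize

    tokens = list(tokenize.generate_tokens(io.StringIO("pass\n").readline))
    # The stream for any complete input ends with an ENDMARKER token
    print(tokenize.tok_name[tokens[-1].type])  # ENDMARKER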

Comment on lines +236 to +240
tokendef_re = re.compile(r'.. data:: (\w+)')
for line in fileobj:
    if match := tokendef_re.fullmatch(line.strip()):
        if match[1].isupper():
            has_handwritten_doc.add(match[1])
@AA-Turner (Member) commented Feb 26, 2025

I think we can avoid re:

Suggested change:

    - tokendef_re = re.compile(r'.. data:: (\w+)')
    - for line in fileobj:
    -     if match := tokendef_re.fullmatch(line.strip()):
    -         if match[1].isupper():
    -             has_handwritten_doc.add(match[1])
    + for line in fileobj:
    +     if not line.startswith('.. data:: '):
    +         continue
    +     tok_name = line.removeprefix('.. data:: ').rstrip()
    +     if tok_name.isidentifier() and tok_name.isupper():
    +         has_handwritten_doc.add(tok_name)

Member Author

Yes, but at the cost of duplicating the prefix. Is that a good tradeoff?

Member

Perhaps, though clarity may be lost in removing duplication:

Suggested change:

    - tokendef_re = re.compile(r'.. data:: (\w+)')
    - for line in fileobj:
    -     if match := tokendef_re.fullmatch(line.strip()):
    -         if match[1].isupper():
    -             has_handwritten_doc.add(match[1])
    + for line in map(str.rstrip, fileobj):
    +     name = line.removeprefix('.. data:: ')
    +     # Token names must be uppercase and made up of alphanumerics or '_'
    +     if line != name and name.isidentifier() and name.isupper():
    +         has_handwritten_doc.add(name)
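
(A small side-by-side sketch, not part of the PR, using a hypothetical input line to show that the regex variant and the string-method variant pick out the same token name:)

    import re

    line = ".. data:: NEWLINE\n"  # hypothetical doc line

    # Regex variant from the PR
    tokendef_re = re.compile(r'.. data:: (\w+)')
    match = tokendef_re.fullmatch(line.strip())
    print(match[1])  # NEWLINE

    # String-method variant from the review suggestion
    name = line.rstrip().removeprefix('.. data:: ')
    print(name)  # NEWLINE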

@lysnikolaou (Member) left a comment

This looks like a good improvement to me! Thanks @encukou!

I've left some inline comments regarding some specifics in the docs.

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
@encukou (Member, Author) commented Feb 27, 2025

Thank you for the reviews! I addressed some; I'll continue next week.

Labels: awaiting core review, docs (Documentation in the Doc dir), skip news
Projects: Status: Todo
4 participants