gh-130587: Add hand-written docs for non-OP tokens #130588
base: main
Conversation
Co-authored-by: Blaise Pabon <blaise@gmail.com>
The rendered docs are at https://cpython-previews--130588.org.readthedocs.build/en/130588/library/token.html

@lysnikolaou, does this look reasonable to you?
A generic token value returned by the :mod:`tokenize` module for
:ref:`operators <operators>` and :ref:`delimiters <delimiters>`.
Suggested change:
-A generic token value returned by the :mod:`tokenize` module for
-:ref:`operators <operators>` and :ref:`delimiters <delimiters>`.
+A generic token value that indicates an
+:ref:`operator <operators>` or :ref:`delimiter <delimiters>`.
Let's include the information that this is only returned by the `tokenize` module. It's not used internally in the tokenizer at all. Maybe add it as an implementation detail?
The `tokenize` docs have some more information; how much should be repeated here?
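For context, a quick sketch of what the `tokenize` docs describe (all names here are real stdlib APIs): the module reports operators and delimiters as generic `OP` tokens, with the specific kind still available via `exact_type`.

```python
import io
import token
import tokenize

# Tokenize a small expression; operators come back as generic OP tokens.
toks = list(tokenize.generate_tokens(io.StringIO("a + b\n").readline))

plus = toks[1]  # NAME 'a' is toks[0]; the '+' operator is toks[1]
assert plus.type == token.OP
# The specific operator is still recoverable via exact_type.
assert plus.exact_type == token.PLUS
assert token.tok_name[plus.exact_type] == "PLUS"
```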
Maybe add an `.. impl-detail::` like you did for `ENCODING`?
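A sketch of what such a note might look like in `Doc/library/token.rst` (the wording is illustrative, not taken from the PR):

```rst
.. data:: OP

   A generic token value that indicates an
   :ref:`operator <operators>` or :ref:`delimiter <delimiters>`.

   .. impl-detail::

      This value is only reported by the :mod:`tokenize` module;
      the tokenizer itself does not use it internally.
```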
Doc/library/token.rst
Outdated
Such tokens are produced instead of regular :data:`COMMENT` tokens only when
:func:`ast.parse` is invoked with ``type_comments=True``.
Does `compile()` with the AST flag not produce them?
I think it does.
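A quick check with documented `ast` APIs suggests both entry points honor type comments:

```python
import ast

src = "x = 1  # type: int"

# ast.parse() with type_comments=True attaches the comment to the node.
tree = ast.parse(src, type_comments=True)
assert tree.body[0].type_comment == "int"

# compile() with the AST + type-comments flags behaves the same way.
mod = compile(src, "<string>", "exec",
              flags=ast.PyCF_ONLY_AST | ast.PyCF_TYPE_COMMENTS)
assert mod.body[0].type_comment == "int"
```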
@@ -69,7 +69,7 @@ All input read from non-interactive files has the same form:
 .. grammar-snippet::
    :group: python-grammar

-   file_input: (NEWLINE | `statement`)*
+   file_input: (NEWLINE | `statement`)* ENDMARKER
I'm not sure how these changes are related to this PR?
The ENDMARKER docs now refer here, and it would be curious if the marker were missing.
The actual grammar does use ENDMARKER.
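This is easy to confirm with the stdlib tokenizer, which always emits `ENDMARKER` as the final token:

```python
import io
import token
import tokenize

toks = list(tokenize.generate_tokens(io.StringIO("x = 1\n").readline))

# The stream ends with ENDMARKER, matching the grammar rule
#   file_input: (NEWLINE | statement)* ENDMARKER
assert toks[-1].type == token.ENDMARKER
```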
tokendef_re = re.compile(r'.. data:: (\w+)')
for line in fileobj:
    if match := tokendef_re.fullmatch(line.strip()):
        if match[1].isupper():
            has_handwritten_doc.add(match[1])
I think we can avoid `re`:
Suggested change:
-tokendef_re = re.compile(r'.. data:: (\w+)')
-for line in fileobj:
-    if match := tokendef_re.fullmatch(line.strip()):
-        if match[1].isupper():
-            has_handwritten_doc.add(match[1])
+for line in fileobj:
+    if not line.startswith('.. data:: '):
+        continue
+    tok_name = line.removeprefix('.. data:: ').rstrip()
+    if tok_name.isidentifier() and tok_name.isupper():
+        has_handwritten_doc.add(tok_name)
Yes, but at the cost of duplicating the prefix. Is that a good tradeoff?
Perhaps, though clarity may be lost in removing duplication:

Suggested change:
-tokendef_re = re.compile(r'.. data:: (\w+)')
-for line in fileobj:
-    if match := tokendef_re.fullmatch(line.strip()):
-        if match[1].isupper():
-            has_handwritten_doc.add(match[1])
+for line in map(str.rstrip, fileobj):
+    name = line.removeprefix('.. data:: ')
+    # Token names must be uppercase and made up of alphanumerics or '_'
+    if line != name and name.isidentifier() and name.isupper():
+        has_handwritten_doc.add(name)
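For what it's worth, the two approaches agree on a small synthetic input (the sample text below is made up for illustration, standing in for `Doc/library/token.rst`; note that the PR's regex leaves the dots unescaped, so under `fullmatch` they match any character):

```python
import io
import re

# Hypothetical sample standing in for Doc/library/token.rst.
sample = (
    ".. data:: ENDMARKER\n"
    "   Token value for the end of input.\n"
    ".. data:: lowercase\n"
    ".. data:: OP\n"
)

# Regex approach, as in the PR.
tokendef_re = re.compile(r'.. data:: (\w+)')
found_re = set()
for line in io.StringIO(sample):
    if match := tokendef_re.fullmatch(line.strip()):
        if match[1].isupper():
            found_re.add(match[1])

# String-method approach, as in the review suggestion.
found_str = set()
for line in map(str.rstrip, io.StringIO(sample)):
    name = line.removeprefix('.. data:: ')
    if line != name and name.isidentifier() and name.isupper():
        found_str.add(name)

assert found_re == found_str == {"ENDMARKER", "OP"}
```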
This looks like a good improvement to me! Thanks @encukou!
I've left some inline comments regarding some specifics in the docs.
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Thank you for the reviews! I addressed some; I'll continue next week.
Add hand-written docs for non-OP tokens

- Make the automation (`generate_token.py`) check that the hand-written docs are present, and only generate docs for the OP tokens
- Switch to `list-table` for the OP tokens, to make their docs more compact
- Add `ENDMARKER` to the grammar docs where it appears (toplevel components)
- Add forgotten `versionchanged` entry for `EXCLAMATION`
- Remove docs for `NT_OFFSET`

📚 Documentation preview 📚: https://cpython-previews--130588.org.readthedocs.build/