Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Use empty range when there's "gap" in token source #11032

Merged
merged 1 commit into from
Apr 19, 2024
Merged

Conversation

dhruvmanila
Copy link
Member

Summary

This fixes a bug where the parser would panic when there is a "gap" in
the token source.

What's a gap?

The reason it's <= instead of just == is because there could be whitespaces between
the two tokens. For example:

#     last token end
#     | current token (newline) start
#     v v
def foo \n
#      ^
#      assume there's trailing whitespace here

Or, there could tokens that are considered "trivia" and thus aren't emitted by the token
source. These are comments and non-logical newlines. For example:

#     last token end
#     v
def foo # comment\n
#                ^ current token (newline) start

In either of the above cases, there's a "gap" between the end of the last token and start
of the current token.

Test Plan

Add test cases and update the snapshots.

Copy link
Contributor

github-actions bot commented Apr 19, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+0 -1 violations, +0 -0 fixes in 1 projects; 43 projects unchanged)

python/typeshed (+0 -1 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select E,F,FA,I,PYI,RUF,UP,W

- stdlib/pathlib.pyi:111:89: E999 SyntaxError: Expected ':', found newline

Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
E999 1 0 1 0 0

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

Comment on lines +264 to +268
// It's possible during error recovery that the parsing didn't consume any tokens. In that
// case, `last_token_end` still points to the end of the previous token but `start` is the
// start of the current token. Calling `TextRange::new(start, self.last_token_end)` would
// panic in that case because `start > end`. This path "detects" this case and creates an
// empty range instead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: i think this is specific to error recovery and instead is true for all "empty" nodes that don't consist of any tokens?

If that's the case, then I think we can remove the missing_node_range method that was only added to handle empty node ranges.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think missing_node_range is still useful in places where you don't have the start value or rather the start value itself is the current token start.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
bug Something isn't working parser Related to the parser
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants