Enable token-based rules on source with syntax errors #11950

dhruvmanila · 2024-06-20T11:46:30Z

Summary

This PR updates the linter, specifically the token-based rules, to work on the tokens that come after a syntax error.

For context, the token-based rules currently only diagnose the tokens up to the first lexical error. This PR builds error resilience by introducing a `TokenIterWithContext`, which tracks the nesting level and tries to keep it in sync with what the lexer is seeing. This isn't 100% accurate: if the parser recovered from an unclosed parenthesis in the middle of the line, the context won't reduce the nesting level until it sees the newline token at the end of the line.
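To illustrate the idea, here is a minimal sketch of an iterator that tracks nesting and resets it on a logical newline. It is not the actual implementation: the `TokenKind` enum is reduced to a handful of variants, the iterator yields kinds rather than tokens, and the `nesting_trace` helper is hypothetical. Only the `nesting` field, the reset on `Newline`, and the `in_parenthesized_context` name come from this PR.

```rust
/// Simplified stand-in for the parser's `TokenKind` (assumption: the real
/// enum has many more variants, including the other bracket kinds).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TokenKind {
    Lpar,
    Rpar,
    Name,
    Newline,
    NonLogicalNewline,
}

/// Wraps a token iterator and tracks how deeply nested in brackets it is.
pub struct TokenIterWithContext<I> {
    inner: I,
    nesting: u32,
}

impl<I: Iterator<Item = TokenKind>> TokenIterWithContext<I> {
    pub fn new(inner: I) -> Self {
        Self { inner, nesting: 0 }
    }

    /// Returns `true` while the iterator believes it is inside brackets.
    pub fn in_parenthesized_context(&self) -> bool {
        self.nesting > 0
    }
}

impl<I: Iterator<Item = TokenKind>> Iterator for TokenIterWithContext<I> {
    type Item = TokenKind;

    fn next(&mut self) -> Option<TokenKind> {
        let kind = self.inner.next()?;
        match kind {
            TokenKind::Lpar => self.nesting += 1,
            TokenKind::Rpar => self.nesting = self.nesting.saturating_sub(1),
            // Inside brackets the lexer emits `NonLogicalNewline`, so seeing a
            // logical `Newline` here means the parser recovered from an
            // unclosed bracket. Reset the context to match.
            TokenKind::Newline if self.nesting > 0 => self.nesting = 0,
            _ => {}
        }
        Some(kind)
    }
}

/// Hypothetical helper: records, after each token is consumed, whether the
/// iterator considers itself to be inside brackets.
pub fn nesting_trace(tokens: &[TokenKind]) -> Vec<bool> {
    let mut iter = TokenIterWithContext::new(tokens.iter().copied());
    let mut trace = Vec::new();
    while iter.next().is_some() {
        trace.push(iter.in_parenthesized_context());
    }
    trace
}
```

With an unclosed `(` followed by a logical newline, the trace stays `true` for the rest of that line and only drops to `false` at the `Newline` token, which is the inaccuracy described above.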

resolves: #11915

Test Plan

Add test cases for a bunch of rules that are affected by this change.
Run the fuzzer for a long time, making sure to fix any other bugs.

dhruvmanila · 2024-06-28T05:04:43Z

crates/ruff_linter/src/checkers/tokens.rs

-        for token in tokens.up_to_first_unknown() {
+        for token in tokens {
            pylint::rules::invalid_string_characters(


This rule looks at string tokens, and the lexer doesn't emit them if the string is unterminated. So, we might get away with not doing anything in this case for now.

dhruvmanila · 2024-06-28T05:11:16Z

crates/ruff_linter/src/doc_lines.rs

 impl<'a> DocLines<'a> {
    fn new(tokens: &'a Tokens) -> Self {
        Self {
-            inner: tokens.up_to_first_unknown().iter(),
+            inner: tokens.iter(),
            prev: TextSize::default(),
        }


This extracts a specific set of comments so it doesn't require any specific update.

dhruvmanila · 2024-06-28T05:14:00Z

crates/ruff_linter/src/rules/flake8_commas/rules/trailing_commas.rs

-            _ => {
+            kind => {
+                if matches!(kind, TokenKind::Newline if fstrings > 0) {
+                    // The parser recovered from an unterminated f-string.
+                    fstrings = 0;
+                }


I think this should work because the newline tokens within f-strings are actually `NonLogicalNewline`. I'll move this into `TokenIterWithContext`. I'll test this a lot because f-strings are complex.
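The reasoning can be sketched in isolation: if newlines inside an f-string are `NonLogicalNewline`, then a logical `Newline` seen while the counter is non-zero signals recovery from an unterminated f-string. The snippet below is an illustration only. The reduced `TokenKind` enum, the `FStringStart` and `FStringEnd` variants as used here, and the `fstring_depths` helper are assumptions; only the `fstrings` counter and the reset on `Newline` appear in the diff above.

```rust
/// Simplified stand-in for the parser's `TokenKind` (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TokenKind {
    FStringStart,
    FStringEnd,
    Newline,
    NonLogicalNewline,
    Other,
}

/// Hypothetical helper: returns the f-string depth observed after each token.
pub fn fstring_depths(tokens: &[TokenKind]) -> Vec<u32> {
    let mut fstrings = 0u32;
    let mut depths = Vec::with_capacity(tokens.len());
    for &kind in tokens {
        match kind {
            TokenKind::FStringStart => fstrings += 1,
            TokenKind::FStringEnd => fstrings = fstrings.saturating_sub(1),
            // A logical newline cannot occur inside a well-formed f-string,
            // so the parser must have recovered from an unterminated one.
            TokenKind::Newline if fstrings > 0 => fstrings = 0,
            _ => {}
        }
        depths.push(fstrings);
    }
    depths
}
```

A `NonLogicalNewline` leaves the counter untouched, so multi-line f-strings that are properly terminated are not affected by the reset.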

dhruvmanila · 2024-06-28T05:58:55Z

crates/ruff_linter/src/directives.rs

+    for token in tokens {
        match token.kind() {
-            TokenKind::EndOfFile => {
-                break;
-            }
-


The token stream doesn't contain the EndOfFile token.

github-actions · 2024-06-28T06:24:40Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check encountered linter errors. (no lint changes; 1 project error)

demisto/content (error)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
  - 'ignore' -> 'lint.ignore'
  - 'select' -> 'lint.select'
  - 'unfixable' -> 'lint.unfixable'
  - 'per-file-ignores' -> 'lint.per-file-ignores'
warning: `PGH001` has been remapped to `S307`.
warning: `PGH002` has been remapped to `G010`.
warning: `PLR1701` has been remapped to `SIM101`.
ruff failed
  Cause: Selection of deprecated rule `E999` is not allowed when preview is enabled.

Formatter (stable)

ℹ️ ecosystem check encountered format errors. (no format changes; 1 project error)

openai/openai-cookbook (error)

warning: Detected debug build without --no-cache.
error: Failed to parse examples/gpt_actions_library/.gpt_action_getting_started.ipynb:11:1:1: Expected an expression
error: Failed to parse examples/gpt_actions_library/gpt_action_bigquery.ipynb:13:1:1: Expected an expression

Formatter (preview)

ℹ️ ecosystem check encountered format errors. (no format changes; 2 project errors)

demisto/content (error)

ruff format --preview --exclude Packs/ThreatQ/Integrations/ThreatQ/ThreatQ.py

warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
  - 'ignore' -> 'lint.ignore'
  - 'select' -> 'lint.select'
  - 'unfixable' -> 'lint.unfixable'
  - 'per-file-ignores' -> 'lint.per-file-ignores'
warning: `PGH001` has been remapped to `S307`.
warning: `PGH002` has been remapped to `G010`.
warning: `PLR1701` has been remapped to `SIM101`.
ruff failed
  Cause: Selection of deprecated rule `E999` is not allowed when preview is enabled.

openai/openai-cookbook (error)

ruff format --preview

warning: Detected debug build without --no-cache.
error: Failed to parse examples/gpt_actions_library/.gpt_action_getting_started.ipynb:11:1:1: Expected an expression
error: Failed to parse examples/gpt_actions_library/gpt_action_bigquery.ipynb:13:1:1: Expected an expression

This PR reverts #12016 with a small change where the error location points to the continuation character only. Earlier, it would also highlight the whitespace that came before it.

The motivation for this change is to avoid a panic in #11950. For example:

```py
\)
```

Playground: https://play.ruff.rs/87711071-1b54-45a3-b45a-81a336a1ea61

The range of the `Unknown` token and the `Rpar` token is the same. Once #11950 is enabled, the indexer would panic. It won't panic in the stable version because we stop at the first `Unknown` token.
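As a rough illustration of why the stable version is unaffected, the following sketch shows what stopping at the first `Unknown` token could look like. The method name `up_to_first_unknown` appears in the diffs of #11950, but its real implementation is not shown there. The free function over a slice of a reduced `TokenKind` enum is an assumption made for this example.

```rust
/// Simplified stand-in for the parser's `TokenKind` (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TokenKind {
    Name,
    Rpar,
    Unknown,
    Newline,
}

/// Returns the prefix of `tokens` that precedes the first `Unknown` token,
/// or the whole slice if there is none.
pub fn up_to_first_unknown(tokens: &[TokenKind]) -> &[TokenKind] {
    let end = tokens
        .iter()
        .position(|kind| *kind == TokenKind::Unknown)
        .unwrap_or(tokens.len());
    &tokens[..end]
}
```

Any consumer that iterates over this prefix never reaches the `Rpar` that follows the `Unknown` token, so it never sees the two tokens with the same range.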

codspeed-hq · 2024-07-01T11:23:26Z

CodSpeed Performance Report

Merging #11950 will not alter performance

Comparing `dhruv/token-rules-with-syntax-errors` (2e932e3) with `main` (88a4cc4)

Summary

✅ 30 untouched benchmarks

MichaReiser

This is nice!

MichaReiser · 2024-07-02T05:55:38Z

crates/ruff_linter/src/rules/pycodestyle/rules/blank_lines.rs

-                    if !line_is_comment_only {
-                        self.max_preceding_blank_lines = BlankLines::Zero;
-                    }
+            if kind.is_any_newline() && !self.tokens.in_parenthesized_context() {


This is an improvement even without the error recoverability :)

MichaReiser · 2024-07-02T05:57:17Z

crates/ruff_python_parser/src/lib.rs

+            TokenKind::Newline if self.nesting > 0 => {
+                self.nesting = 0;
+            }


That's simpler than I expected. Nice

## Summary

This PR updates Ruff to **not** generate auto-fixes if the source code contains syntax errors as determined by the parser.

The main motivation behind this is to avoid an infinite autofix loop when the token-based rules are run over any source with syntax errors in #11950.

Even after this change, it's not certain that there won't be an infinite autofix loop, because the logic might be incorrect. For example, #12094 and #12136.

This requires updating the test infrastructure to not validate the fix availability status when the source contains syntax errors. This is required because otherwise the fuzzer might fail, as it uses the test function to run the linter and validate the source code.

resolves: #11455

## Test Plan

`cargo insta test`
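The behavior described in the summary of #12134 can be sketched as a post-processing step that drops fixes when the parser reported errors. This is a hypothetical illustration, not Ruff's code: the `Fix` and `Diagnostic` types, the rule codes, and the `strip_fixes_on_syntax_error` function are invented for the example, and the actual change may avoid generating fixes in a different place.

```rust
/// Hypothetical fix type, reduced to a replacement string.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Fix {
    pub replacement: String,
}

/// Hypothetical diagnostic type with an optional fix.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Diagnostic {
    pub code: &'static str,
    pub fix: Option<Fix>,
}

/// Keeps every diagnostic but removes its fix when the source has syntax
/// errors, so that no auto-fix is applied to unparsable code.
pub fn strip_fixes_on_syntax_error(
    mut diagnostics: Vec<Diagnostic>,
    has_syntax_errors: bool,
) -> Vec<Diagnostic> {
    if has_syntax_errors {
        for diagnostic in &mut diagnostics {
            diagnostic.fix = None;
        }
    }
    diagnostics
}
```

The diagnostics themselves are still reported in this sketch. Only the fixes are withheld, which matches the stated goal of avoiding autofix loops rather than hiding violations.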

dhruvmanila added the `rule` (Implementing or modifying a lint rule) label Jun 20, 2024

dhruvmanila force-pushed the dhruv/token-rules-with-syntax-errors branch from b7134c9 to 7db979b June 20, 2024 12:16

dhruvmanila commented Jun 28, 2024

dhruvmanila force-pushed the dhruv/token-rules-with-syntax-errors branch from 7db979b to 1961406 June 28, 2024 05:58

dhruvmanila commented Jun 28, 2024

dhruvmanila mentioned this pull request Jun 28, 2024

Revert "Use correct range to highlight line continuation error" #12089

Merged

dhruvmanila force-pushed the dhruv/token-rules-with-syntax-errors branch from 6e96839 to f3bbacd June 28, 2024 11:26

dhruvmanila changed the base branch from main to dhruv/revert June 28, 2024 11:26

dhruvmanila force-pushed the dhruv/token-rules-with-syntax-errors branch from f3bbacd to 7b42997 June 28, 2024 11:27

Base automatically changed from dhruv/revert to main June 28, 2024 12:40

dhruvmanila force-pushed the dhruv/token-rules-with-syntax-errors branch from 7b42997 to eeb24b1 June 28, 2024 13:51

dhruvmanila mentioned this pull request Jul 1, 2024

Ruff applies auto-fix in files with syntax errors #11455

Closed

dhruvmanila changed the base branch from main to dhruv/disable-autofix July 1, 2024 11:18

dhruvmanila force-pushed the dhruv/token-rules-with-syntax-errors branch from eeb24b1 to 4019ca4 July 1, 2024 11:18

dhruvmanila force-pushed the dhruv/token-rules-with-syntax-errors branch from 4019ca4 to 27f494e July 1, 2024 11:40

dhruvmanila mentioned this pull request Jul 1, 2024

Disable auto-fix when source has syntax errors #12134

Merged

dhruvmanila force-pushed the dhruv/disable-autofix branch from 3390bf0 to b58e87b July 1, 2024 13:33

dhruvmanila force-pushed the dhruv/token-rules-with-syntax-errors branch 2 times, most recently from 6bb916f to 85baab7 July 1, 2024 14:38

dhruvmanila marked this pull request as ready for review July 1, 2024 14:39

dhruvmanila requested a review from MichaReiser as a code owner July 1, 2024 14:39

MichaReiser approved these changes Jul 2, 2024

Base automatically changed from dhruv/disable-autofix to main July 2, 2024 08:52

Enable token-based rules on source with syntax errors

2e932e3

dhruvmanila force-pushed the dhruv/token-rules-with-syntax-errors branch from 85baab7 to 2e932e3 July 2, 2024 08:53

dhruvmanila enabled auto-merge (squash) July 2, 2024 08:56

dhruvmanila merged commit 8f40928 into main Jul 2, 2024
19 checks passed

dhruvmanila deleted the dhruv/token-rules-with-syntax-errors branch July 2, 2024 08:57

BrewTestBot mentioned this pull request Jul 5, 2024

ruff 0.5.1 Homebrew/homebrew-core#176475

Merged
