Simplify nested character classes and expression character classes #574

Closed
RunDevelopment opened this issue Sep 6, 2023 · 4 comments · Fixed by #595
Labels
enhancement, new rule

Comments

@RunDevelopment
Collaborator

Motivation
The v flag added set operations to character classes. These set operations can be simplified in some cases.

Description
The rule should apply the following simplifications:

  1. [[abc]] -> [abc] and [^[abc]] -> [^abc] and [[^\w&&\d]] -> [^\w&&\d]: remove trivially nested (expression) character classes.
  2. [[abc][def]ghi] -> [abcdefghi]: always inline non-negated character classes.
  3. [a&&[^b]] -> [a--b] and [[^b]&&a] -> [a--b]: use subtraction
  4. [a--[^b]] -> [a&&b]: use intersection
  5. [[^a]&&[^b]] -> [^ab]: De Morgan
RunDevelopment added the enhancement and new rule labels on Sep 6, 2023

@ota-meshi
Owner

Thank you for the rule suggestions! I haven't really used nested character classes yet, but I think the rule is very useful!

@RunDevelopment
Collaborator Author

I haven't really used nested character classes yet

Same. So far I've only used them when writing tests for v flag regexes. These simplifications are just a suggestion.

I'm looking forward to using the v flag in practice though. I had to write quite a few regexes like /(?!\s)[\w\x80-\uFFFF]/ in the past, and I can now rewrite them, e.g. as /[[\w\P{ASCII}]--\s]/v.

We might even want to make a rule for that. If a regex has the v flag, then transform (?=[x])[y] => [y&&x] and (?![x])[y] => [y--x]. Let me make an issue for that.

@ota-meshi
Owner

While working on the no-useless-character-class rule (#593), I noticed that it already supports two of these patterns:

  • /[[abc]]/v -> /[abc]/v
  • /[[^\w&&\d]]/v -> /[^\w&&\d]/v

However, the following patterns are not supported yet. I'm wondering whether we should support them in the no-useless-character-class rule or create a separate rule.

  • /[^[abc]]/v -> /[^abc]/v
  • /[[abc][def]ghi]/v -> /[abcdefghi]/v

https://github.com/ota-meshi/eslint-plugin-regexp/pull/593/files#diff-eb1d64f6019f074be64747d51d62981aaddc46eb17bf309a9f11da0789c5fdabR51-R54


By the way, I think the following rewrites would be better handled by a separate rule, because they are more of an optimization problem than a useless-nesting problem.

  • /[a&&[^b]]/v -> /[a--b]/v
  • /[a--[^b]]/v -> /[a&&b]/v
  • /[[^a]&&[^b]]/v -> /[^ab]/v

@ota-meshi
Owner

However, the following patterns are not supported yet. I'm wondering whether we should support them in the no-useless-character-class rule or create a separate rule.

  • /[^[abc]]/v -> /[^abc]/v
  • /[[abc][def]ghi]/v -> /[abcdefghi]/v

I thought about it for a moment and decided that the rule should support them, so I'll change the rule.