Enable some repetitions for \A and \Z #5349

Merged


NVnavkumar
Collaborator

Fixes #4800, which was also partially fixed by PR #5319 when \Z was re-enabled.

This PR allows \A and \Z to be used with some repetition quantifiers, extending GPU support for these escape sequences in regular expressions. It enables:

  • + applied to \A or \Z
  • {n}, {n,}, or {n,m} where n > 0

NOTE:

  • \A* and \A{...} can be transpiled to \A?; however, cuDF does not yet support \A?, so *, ?, and {0,} all still fall back to the CPU.
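
The rules above can be summarized in a small sketch. It is an illustration only, not the plugin's transpiler: `AnchorRepetitionSketch`, `Quantifier`, `Bounded`, and `simplifyAnchor` are hypothetical names, and reducing a supported repetition to the bare anchor is an assumption based on the fact that a zero-width assertion repeated one or more times matches at the same positions as a single one. The actual GPU pattern emitted, especially for \Z, may differ.

```scala
object AnchorRepetitionSketch {
  // Hypothetical, simplified quantifier model (not the plugin's real AST).
  sealed trait Quantifier
  case object Plus extends Quantifier      // +
  case object Star extends Quantifier      // *
  case object Optional extends Quantifier  // ?
  // {n} is Bounded(n, Some(n)), {n,} is Bounded(n, None), {n,m} is Bounded(n, Some(m))
  final case class Bounded(min: Int, max: Option[Int]) extends Quantifier

  // Returns Some(simplified) when the repetition requires at least one
  // occurrence of the anchor, and None when the anchor may occur zero times
  // (the cases described above as still falling back to the CPU).
  def simplifyAnchor(anchor: String, q: Quantifier): Option[String] = q match {
    case Plus                       => Some(anchor) // \A+ behaves like \A (assumed rewrite)
    case Bounded(min, _) if min > 0 => Some(anchor) // {n}, {n,}, {n,m} with n > 0
    case _                          => None         // *, ?, {0,...}: not supported yet
  }
}
```

For example, `simplifyAnchor("\\A", Plus)` yields the bare anchor, while `simplifyAnchor("\\A", Star)` yields `None`, signalling a CPU fallback in this sketch.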

Signed-off-by: Navin Kumar <navink@nvidia.com>
@NVnavkumar NVnavkumar self-assigned this Apr 28, 2022
@sameerz sameerz added the feature request New feature or request label Apr 28, 2022
@sameerz sameerz added this to the Apr 18 - Apr 29 milestone Apr 28, 2022
@NVnavkumar
Collaborator Author

build

@NVnavkumar NVnavkumar requested a review from andygrove May 2, 2022 22:25
Comment on lines +865 to +866
case (RegexEscaped('A'), '+') |
(RegexSequence(ListBuffer(RegexEscaped('A'))), '+') =>
Contributor

This pattern is repeated a few times. I wonder if it is worth introducing a utility function that can simplify expressions to remove redundant list buffers?
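
A minimal sketch of what such a helper might look like. The AST below is a stripped-down stand-in that reuses the names `RegexEscaped` and `RegexSequence` from the diff; the real classes in the plugin have more members, and `RegexSimplifySketch`, `unwrap`, and `isRepeatedStartAnchor` are hypothetical names introduced only for illustration.

```scala
import scala.collection.mutable.ListBuffer

object RegexSimplifySketch {
  // Simplified stand-ins for the plugin's AST nodes (assumption, not the real definitions).
  sealed trait RegexAST
  final case class RegexEscaped(c: Char) extends RegexAST
  final case class RegexSequence(parts: ListBuffer[RegexAST]) extends RegexAST

  // Strip sequences that wrap exactly one node, so callers only need to
  // match on the inner expression.
  @scala.annotation.tailrec
  def unwrap(ast: RegexAST): RegexAST = ast match {
    case RegexSequence(parts) if parts.length == 1 => unwrap(parts.head)
    case other                                     => other
  }

  // With unwrap, the two alternatives in the diff collapse into one case.
  def isRepeatedStartAnchor(base: RegexAST, quantifier: Char): Boolean =
    (unwrap(base), quantifier) match {
      case (RegexEscaped('A'), '+') => true
      case _                        => false
    }
}
```

With a helper along these lines, a bare `RegexEscaped('A')` and a `RegexSequence` wrapping only `RegexEscaped('A')` would take the same branch without the duplicated pattern.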

Contributor

@andygrove andygrove left a comment

LGTM. I left one suggestion but not critical for this PR.

@NVnavkumar NVnavkumar merged commit 1e3a9a3 into NVIDIA:branch-22.06 May 3, 2022
Development

Successfully merging this pull request may close these issues.

[FEA] Enable support for more regular expressions with \A and \Z