-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Add support for POSIX predefined character classes #5692
Add support for POSIX predefined character classes #5692
Conversation
Signed-off-by: Anthony Chang <antchang@nvidia.com>
build |
@anthony-chang @andygrove We are in burndown for 22.06. Is this meant to target 22.06, or should it be retargeted towards 22.08? |
Good point. @anthony-chang can you retarget for 22.08 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please retarget for 22.08
Signed-off-by: Anthony Chang <antchang@nvidia.com>
This is blocked until the fix in #5610 is merged |
@anthony-chang this has now been merged |
…six-character-classes
…six-character-classes Signed-off-by: Anthony Chang <antchang@nvidia.com>
…six-character-classes Signed-off-by: Anthony Chang <antchang@nvidia.com>
6be0c4a
to
7a2a5a1
Compare
build |
…six-character-classes Signed-off-by: Anthony Chang <antchang@nvidia.com>
tests/src/test/scala/com/nvidia/spark/rapids/RegularExpressionTranspilerSuite.scala
Outdated
Show resolved
Hide resolved
doTranspileTest("\\p{Alnum}", "[a-zA-Z0-9]") | ||
doTranspileTest("\\p{Punct}", "[!\"#$%&'()*+,\\-./:;<=>?@\\^_`{|}~\\[\\]]") | ||
doTranspileTest("\\p{Print}", "[a-zA-Z0-9!\"#$%&'()*+,\\-./:;<=>?@\\^_`{|}~\\[\\]\u0020]") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it intentional that only a subset are tested here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this one is mostly a sanity check since the \p{Print}
transpilation recursively uses the definition of the 4 \p{}
classes above it.
I tested the full set in the integration tests.
Signed-off-by: Anthony Chang <antchang@nvidia.com>
build |
Signed-off-by: Anthony Chang <antchang@nvidia.com>
build |
I've removed |
Signed-off-by: Anthony Chang <antchang@nvidia.com>
build |
Closes #4413
This PR adds support for
\p{...}
for these character classes defined here under "POSIX character classes (US-ASCII only)"Signed-off-by: Anthony Chang antchang@nvidia.com