-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Work around CLR unicode bug with classification of U+2028 #9519
Work around CLR unicode bug with classification of U+2028 #9519
Conversation
LGTM |
CC @ellismg |
@gafter future-stabilization is basically an ask mode branch. This doesn't intuitively seem like an ask mode issue. Am I missing a case? |
Why did you assign the issue to 2.0 (Preview)? Isn't that the milestone for this preview? |
@gafter ah okay. That's my mistake. |
// U+2028 "LINE SEPARATOR" should be classified by CharUnicodeInfo.GetUnicodeCategory as | ||
// UnicodeCategory.LineSeparator but due to a bug is incorrectly categorized as | ||
// UnicodeCategory.Format. See https://github.com/dotnet/coreclr/issues/3542 | ||
replaceWith = "\\u2028"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of calls to CharUnicodeInfo.GetUnicodeCategory in our codebase and some of those sites can still suffer from the bug in this API. For example (in src\Compilers\Core\Portable\UnicodeCharacterUtilities.cs):
UnicodeCategory cat = CharUnicodeInfo.GetUnicodeCategory(ch);
return IsLetterChar(cat)
|| IsDecimalDigitChar(cat)
|| IsConnectingChar(cat)
|| IsCombiningChar(cat)
|| IsFormattingChar(cat);
Instead of patching this particular call site in this manner, we should consider creating a wrapper for CharUnicodeInfo.GetUnicodeCategory, which would make necessary adjustments to the value it returns and simply change all call sites to use that wrapper.
Moving to the future branch. |
Fixes #9484
@dotnet/roslyn-compiler Please review