Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

ICU-20575 fix broken default locale mapping for C.UTF-8 #634

Merged
merged 1 commit into from
Apr 25, 2019

Conversation

srl295
Copy link
Member

@srl295 srl295 commented Apr 25, 2019

  1. LANG=C.UTF-8 would not map to en_US_POSIX but to invalid language c
  2. LANG=aa@BB (and similar) would map to aa__BB__BB
Checklist

@srl295 srl295 requested review from markusicu and jefgen April 25, 2019 17:47
@srl295 srl295 self-assigned this Apr 25, 2019
@srl295
Copy link
Member Author

srl295 commented Apr 25, 2019

Test case: env LANG=C.UTF-8 icuinfo should return:

    <param name="locale.default">en_US_POSIX</param>
    <param name="locale.default.bcp47">en-US-u-va-posix</param>

Before this bug, they return c and und respectively.

@markusicu
Copy link
Member

I really don't like these mappings at this level. There has to be a way to handle "C.UTF-8" in the getDefaultLocale() code.

@jefgen
Copy link
Member

jefgen commented Apr 25, 2019

FWIW, I believe this exacerbated the issue with the DateTimePatternGenerator (ICU-20558), as "C.UTF-8" would be treated as a default locale of root, rather than "en_US_POSIX".

@srl295
Copy link
Member Author

srl295 commented Apr 25, 2019

I really don't like these mappings at this level. There has to be a way to handle "C.UTF-8" in the getDefaultLocale() code.

Markus, makes sense. I'll see what i can do.

@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • icu4c/source/common/putil.cpp is now changed in the branch
  • icu4c/source/common/uloc.cpp is no longer changed in the branch

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • icu4c/source/common/putil.cpp is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@srl295
Copy link
Member Author

srl295 commented Apr 25, 2019

@markusicu PTAL at the updated approach. There was another bug, where aa@bb turned into aa_BB_BB (doubled…). I think C@PREEURO never worked (even back to 4.8). Not that important, but at least we should not overwrite the variant twice and be consistent. (I've updated the bug to mention this)

@markusicu
Copy link
Member

Thanks for fixing it all via putil.cpp!

Is the PR description still accurate?

@srl295
Copy link
Member Author

srl295 commented Apr 25, 2019

@markusicu welcome. the PR is accurate now (and the commit message already was)

// Over-allocate in case we replace "@" with "__".
char *correctedPOSIXLocale = static_cast<char *>(uprv_malloc(uprv_strlen(posixID) + 1 + 1));
// Over-allocate in case we replace "@" with "__" or "C" with "en_US_POSIX"
char *correctedPOSIXLocale = static_cast<char *>(uprv_malloc(uprv_strlen(posixID) + 11 + 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI +10 would suffice because en_US_POSIX has length 11 which is 10 longer than C.
Unless I am mis-counting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right.

Regression was in 1afef30
PR unicode-org#418 [ICU-20187]

- We dropped the mapping from "C" in uloc_canonicalize,
  but then putil did not handle cases where a codepage was
  set (such as C.UTF-8).

- Add an additional check in uprv_getDefaultLocaleID() for
  locales that end up as "C" or "POSIX" after removing codepage
  suffix.

- Also fix regression where aa@bb would become aa__BB__BB
  (incorrectly doubled __BB)
@jira-pull-request-webhook
Copy link

Notice: the branch changed across the force-push!

  • icu4c/source/common/putil.cpp is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

Copy link
Member

@markusicu markusicu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@srl295 srl295 merged commit 075cefb into unicode-org:master Apr 25, 2019
@srl295 srl295 deleted the fix-ICU-20575 branch April 25, 2019 21:59
srl295 added a commit to srl295/node that referenced this pull request Apr 26, 2019
- Floating patch for ICU 64.x
- includes test case

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575
Backport of: unicode-org/icu#634
Fixes: nodejs#27418
targos pushed a commit to nodejs/node that referenced this pull request Apr 27, 2019
- Floating patch for ICU 64.x
- includes test case

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575
Backport of: unicode-org/icu#634
Fixes: #27418

PR-URL: #27435
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Michaël Zasso <targos@protonmail.com>
targos pushed a commit to nodejs/node that referenced this pull request Apr 27, 2019
- Floating patch for ICU 64.x
- includes test case

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575
Backport of: unicode-org/icu#634
Fixes: #27418

PR-URL: #27435
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Michaël Zasso <targos@protonmail.com>
BethGriggs pushed a commit to nodejs/node that referenced this pull request May 10, 2019
- Floating patch for ICU 64.x
- includes test case

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575
Backport of: unicode-org/icu#634
Fixes: #27418

PR-URL: #27435
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Michaël Zasso <targos@protonmail.com>
MylesBorins pushed a commit to nodejs/node that referenced this pull request May 16, 2019
- Floating patch for ICU 64.x
- includes test case

ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575
Backport of: unicode-org/icu#634
Fixes: #27418

PR-URL: #27435
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Michaël Zasso <targos@protonmail.com>
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants