Update to Unicode 14.0 #89614
The "Alphabetic" property in Unicode 14 grew too big for the bitset representation, panicking "cannot pack 264 into 8 bits". However, we were already choosing the skiplist for that anyway, so this doesn't need to be a hard failure. That panic is now a returned `Err`, and then in `emit_codepoints` we automatically defer to skiplist.
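The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in `unicode-table-generator`: the names `pack`, `Encoding`, and `choose_encoding` are hypothetical, and the byte sizes are made-up inputs. It only shows the shape of the change, where a packing failure becomes an `Err` and the caller then picks the skiplist.

```rust
/// Hypothetical stand-in for the bitset packing step: returns `Err`
/// instead of panicking when `value` does not fit in `bits` bits.
fn pack(value: usize, bits: u32) -> Result<u8, String> {
    if bits <= 8 && value < (1usize << bits) {
        Ok(value as u8)
    } else {
        Err(format!("cannot pack {} into {} bits", value, bits))
    }
}

/// The two table encodings mentioned in the pull request.
#[derive(Debug, PartialEq)]
enum Encoding {
    Bitset,
    Skiplist,
}

/// Pick the smaller encoding. If the bitset could not be built at all
/// (`Err`), defer to the skiplist rather than failing hard.
fn choose_encoding(bitset_size: Result<usize, String>, skiplist_size: usize) -> Encoding {
    match bitset_size {
        Ok(size) if size <= skiplist_size => Encoding::Bitset,
        _ => Encoding::Skiplist,
    }
}
```

With these definitions, `pack(264, 8)` yields an `Err` carrying the same message as the old panic, and passing that error to `choose_encoding` selects `Encoding::Skiplist`.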
(rust-highfive has picked a reviewer for you, use r? to override) |
For comparison, here's the tool output with version 13 data:
And here's with version 14 data:
|
Sorry! I know for next time. Thank you for all the work you're doing upgrading unicode! |
There are some Unicode crates around; should they be updated too? |
@klensy sure, other crates will have to be updated individually. e.g. |
@bors r+ |
📌 Commit 459a7e3 has been approved by |
Update to Unicode 14.0

The Unicode Standard [announced Version 14.0](https://home.unicode.org/announcing-the-unicode-standard-version-14-0/) on September 14, 2021, and this pull request updates the generated tables in `core` accordingly.

This did require a little prep-work in `unicode-table-generator`. First, rust-lang#81358 had modified the generated file instead of the tool, so that change is now reflected in the tool as well. Next, I found that the "Alphabetic" property in version 14 was panicking when generating a bitset, "cannot pack 264 into 8 bits". We've been using the skiplist for that anyway, so I changed this to fail gracefully. Finally, I confirmed that the tool still created the exact same tables for 13 before moving to 14.
…laumeGomez

Rollup of 6 pull requests

Successful merges:

- rust-lang#75644 (Add 'core::array::from_fn' and 'core::array::try_from_fn')
- rust-lang#87528 (stack overflow handler specific openbsd change.)
- rust-lang#88436 (std: Stabilize command_access)
- rust-lang#89614 (Update to Unicode 14.0)
- rust-lang#89664 (Add documentation to boxed conversions)
- rust-lang#89700 (Fix invalid HTML generation for higher bounds)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
Maybe a tidy check should be added that compares the output of the tool to the checked-in file to make sure they're the same? |
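A comparison like the one suggested here could look roughly like this. It is a sketch only: `check_in_sync` is a hypothetical helper, not an existing tidy function, and it assumes the tool's output and the checked-in file are both available as strings.

```rust
/// Hypothetical helper: compare freshly generated output with the
/// checked-in file line by line and report the first mismatch.
fn check_in_sync(generated: &str, checked_in: &str) -> Result<(), String> {
    let mut gen_lines = generated.lines();
    let mut file_lines = checked_in.lines();
    let mut line_no = 1;
    loop {
        match (gen_lines.next(), file_lines.next()) {
            // Both inputs ended at the same point: they match.
            (None, None) => return Ok(()),
            // Lines agree: keep going.
            (a, b) if a == b => line_no += 1,
            // Lines differ, or one input ended early.
            (a, b) => {
                return Err(format!(
                    "line {}: tool emits {:?}, file has {:?}",
                    line_no, a, b
                ))
            }
        }
    }
}
```

Reporting the first differing line keeps the failure message short while still pointing at where the generated file was edited by hand.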
Ideally there's some kind of `#![autogen_code]` metadata and rust-analyzer warns me I am altering autogenerated code.
On Sat, 9 Oct 2021 at 20:05, bors wrote:
> Merged #89614 into master.
|
@Mark-Simulacrum would mingw-check be a good place to check the generated code? Or maybe a rust-highfive mention? |
That's a good idea; a rust-highfive ping should be quick to set up and sufficient since the file is not changed often. |
Highfive definitely seems like a good idea, someone can file a PR adding that rule somewhere here - https://github.com/rust-lang/highfive/blob/master/highfive/configs/rust-lang/rust.json#L130 I think running the script on each CI to check it's still in sync might be a bit much, though, seems not worth it to me. If we want something more than highfive, I would just fail CI if the file doesn't match a hardcoded hash (e.g., in tidy). Easy to update when needed and fast, as well as potentially more broadly useful (we have several such files, I think). |
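The hardcoded-hash idea could be sketched as below. Everything here is an assumption for illustration: `fnv1a_64` and `check_hash` are hypothetical names, and FNV-1a is used only because it fits in a few lines without dependencies. An actual tidy check might well prefer a cryptographic hash.

```rust
/// 64-bit FNV-1a hash of a byte slice (chosen for brevity, not security).
fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    bytes
        .iter()
        .fold(OFFSET_BASIS, |hash, &b| (hash ^ u64::from(b)).wrapping_mul(PRIME))
}

/// Hypothetical check: fail if the file contents no longer match the
/// hash recorded when the file was last regenerated.
fn check_hash(contents: &[u8], expected: u64) -> Result<(), String> {
    let actual = fnv1a_64(contents);
    if actual == expected {
        Ok(())
    } else {
        Err(format!(
            "generated file changed: expected hash {:#018x}, found {:#018x}; \
             rerun the generator and update the recorded hash",
            expected, actual
        ))
    }
}
```

Updating the check after a legitimate regeneration is then a one-line change to the recorded constant, which matches the "easy to update when needed and fast" goal stated above.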
Pkgsrc changes:
* Adapt a couple of patches

Upstream changes:

Version 1.57.0 (2021-12-02)
==========================

Language
--------

- [Macro attributes may follow `#[derive]` and will see the original (pre-`cfg`) input.][87220]
- [Accept curly-brace macros in expressions, like `m!{ .. }.method()` and `m!{ .. }?`.][88690]
- [Allow panicking in constant evaluation.][89508]

Compiler
--------

- [Create more accurate debuginfo for vtables.][89597]
- [Add `armv6k-nintendo-3ds` at Tier 3\*.][88529]
- [Add `armv7-unknown-linux-uclibceabihf` at Tier 3\*.][88952]
- [Add `m68k-unknown-linux-gnu` at Tier 3\*.][88321]
- [Add SOLID targets at Tier 3\*:][86191] `aarch64-kmc-solid_asp3`, `armv7a-kmc-solid_asp3-eabi`, `armv7a-kmc-solid_asp3-eabihf`

\* Refer to Rust's [platform support page][platform-support-doc] for more information on Rust's tiered platform support.

Libraries
---------

- [Avoid allocations and copying in `Vec::leak`][89337]
- [Add `#[repr(i8)]` to `Ordering`][89507]
- [Optimize `File::read_to_end` and `read_to_string`][89582]
- [Update to Unicode 14.0][89614]
- [Many more functions are marked `#[must_use]`][89692], producing a warning when ignoring their return value. This helps catch mistakes such as expecting a function to mutate a value in place rather than return a new value.

Stabilised APIs
---------------

- [`[T; N]::as_mut_slice`][`array::as_mut_slice`]
- [`[T; N]::as_slice`][`array::as_slice`]
- [`collections::TryReserveError`]
- [`HashMap::try_reserve`]
- [`HashSet::try_reserve`]
- [`String::try_reserve`]
- [`String::try_reserve_exact`]
- [`Vec::try_reserve`]
- [`Vec::try_reserve_exact`]
- [`VecDeque::try_reserve`]
- [`VecDeque::try_reserve_exact`]
- [`Iterator::map_while`]
- [`iter::MapWhile`]
- [`proc_macro::is_available`]
- [`Command::get_program`]
- [`Command::get_args`]
- [`Command::get_envs`]
- [`Command::get_current_dir`]
- [`CommandArgs`]
- [`CommandEnvs`]

These APIs are now usable in const contexts:

- [`hint::unreachable_unchecked`]

Cargo
-----

- [Stabilize custom profiles][cargo/9943]

Compatibility notes
-------------------

Internal changes
----------------

These changes provide no direct user facing benefits, but represent significant improvements to the internals and overall performance of rustc and related tools.

- [Added an experimental backend for codegen with `libgccjit`.][87260]

[86191]: rust-lang/rust#86191
[87220]: rust-lang/rust#87220
[87260]: rust-lang/rust#87260
[88243]: rust-lang/rust#88243
[88321]: rust-lang/rust#88321
[88529]: rust-lang/rust#88529
[88690]: rust-lang/rust#88690
[88952]: rust-lang/rust#88952
[89337]: rust-lang/rust#89337
[89507]: rust-lang/rust#89507
[89508]: rust-lang/rust#89508
[89582]: rust-lang/rust#89582
[89597]: rust-lang/rust#89597
[89614]: rust-lang/rust#89614
[89692]: rust-lang/rust#89692
[cargo/9943]: rust-lang/cargo#9943
[`array::as_mut_slice`]: https://doc.rust-lang.org/std/primitive.array.html#method.as_mut_slice
[`array::as_slice`]: https://doc.rust-lang.org/std/primitive.array.html#method.as_slice
[`collections::TryReserveError`]: https://doc.rust-lang.org/std/collections/struct.TryReserveError.html
[`HashMap::try_reserve`]: https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.try_reserve
[`HashSet::try_reserve`]: https://doc.rust-lang.org/std/collections/hash_set/struct.HashSet.html#method.try_reserve
[`String::try_reserve`]: https://doc.rust-lang.org/alloc/string/struct.String.html#method.try_reserve
[`String::try_reserve_exact`]: https://doc.rust-lang.org/alloc/string/struct.String.html#method.try_reserve_exact
[`Vec::try_reserve`]: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.try_reserve
[`Vec::try_reserve_exact`]: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.try_reserve_exact
[`VecDeque::try_reserve`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.try_reserve
[`VecDeque::try_reserve_exact`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.try_reserve_exact
[`Iterator::map_while`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.map_while
[`iter::MapWhile`]: https://doc.rust-lang.org/std/iter/struct.MapWhile.html
[`proc_macro::is_available`]: https://doc.rust-lang.org/proc_macro/fn.is_available.html
[`Command::get_program`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_program
[`Command::get_args`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_args
[`Command::get_envs`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_envs
[`Command::get_current_dir`]: https://doc.rust-lang.org/std/process/struct.Command.html#method.get_current_dir
[`CommandArgs`]: https://doc.rust-lang.org/std/process/struct.CommandArgs.html
[`CommandEnvs`]: https://doc.rust-lang.org/std/process/struct.CommandEnvs.html
Pkgsrc changes:
* Adjust line numbers in a number of patches
* remove the --disable-dist-src option, so that we produce the rust-src rust component, which we upload to LOCALSRC to allow the rust-src package to build, which is needed for rust-analyzer.
* Cargo checksum for vendor/cc no longer needs patching; checksum for vendor/libc updated

Upstream changes: Version 1.57.0 (2021-12-02)
The Unicode Standard announced Version 14.0 on September 14, 2021, and this pull request updates the generated tables in `core` accordingly.

This did require a little prep-work in `unicode-table-generator`. First, #81358 had modified the generated file instead of the tool, so that change is now reflected in the tool as well. Next, I found that the "Alphabetic" property in version 14 was panicking when generating a bitset, "cannot pack 264 into 8 bits". We've been using the skiplist for that anyway, so I changed this to fail gracefully. Finally, I confirmed that the tool still created the exact same tables for 13 before moving to 14.