adding wcwidth for char in libcore #15224
Comments
It can be a function/method on
There's a
The comment suggests that this is mainly a legacy thing, and so (in my mind) isn't a high priority.
Add libunicode; move unicode functions from core

- created new crate, libunicode, below libstd
- split `Char` trait into `Char` (libcore) and `UnicodeChar` (libunicode)
  - Unicode-aware functions now live in libunicode
    - `is_alphabetic`, `is_XID_start`, `is_XID_continue`, `is_lowercase`, `is_uppercase`, `is_whitespace`, `is_alphanumeric`, `is_control`, `is_digit`, `to_uppercase`, `to_lowercase`
  - added `width` method in UnicodeChar trait
    - determines printed width of character in columns, or None if it is a non-NULL control character
    - takes a boolean argument indicating whether the present context is CJK or not (characters with 'A'mbiguous widths are double-wide in CJK contexts, single-wide otherwise)
- split `StrSlice` into `StrSlice` (libcore) and `UnicodeStrSlice` (libunicode)
  - functionality formerly in `StrSlice` that relied upon Unicode functionality from `Char` is now in `UnicodeStrSlice`
    - `words`, `is_whitespace`, `is_alphanumeric`, `trim`, `trim_left`, `trim_right`
  - also moved `Words` type alias into libunicode because `words` method is in `UnicodeStrSlice`
- unified Unicode tables from libcollections, libcore, and libregex into libunicode
- updated `unicode.py` in `src/etc` to generate aforementioned tables
- generated new tables based on latest Unicode data
- added `UnicodeChar` and `UnicodeStrSlice` traits to prelude
- libunicode is now the collection point for the `std::char` module, combining the libunicode functionality with the `Char` functionality from libcore
  - thus, moved doc comment for `char` from `core::char` to `unicode::char`
- libcollections remains the collection point for `std::str`

The Unicode-aware functions that previously lived in the `Char` and `StrSlice` traits are no longer available to programs that only use libcore. To regain use of these methods, include the libunicode crate and `use` the `UnicodeChar` and/or `UnicodeStrSlice` traits:

    extern crate unicode;
    use unicode::UnicodeChar;
    use unicode::UnicodeStrSlice;
    use unicode::Words; // if you want to use the words() method

NOTE: this does *not* impact programs that use libstd, since UnicodeChar and UnicodeStrSlice have been added to the prelude.

closes #15224

[breaking-change]
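As a rough illustration of the `width` semantics described in the commit message above, here is a minimal sketch in present-day Rust. It is not the libunicode implementation: the function name `char_width` is hypothetical, and the ranges are a tiny hand-picked sample standing in for the generated Unicode tables.

```rust
/// Hypothetical sketch of the `width` semantics from the commit message.
/// The ranges below are a small hand-picked sample, NOT the generated tables.
fn char_width(c: char, is_cjk: bool) -> Option<usize> {
    match c {
        // NULL is the one control character that gets a width (zero columns).
        '\0' => Some(0),
        // Every other control character has no printed width.
        c if c.is_control() => None,
        // Combining diacritical marks take no column of their own.
        '\u{0300}'..='\u{036F}' => Some(0),
        // CJK Unified Ideographs are double-wide in any context.
        '\u{4E00}'..='\u{9FFF}' => Some(2),
        // A few East Asian 'A'mbiguous characters: double-wide only in CJK contexts.
        '\u{00A1}' | '\u{00A7}' | '\u{00B0}' => Some(if is_cjk { 2 } else { 1 }),
        // Everything else is treated as single-width in this sketch.
        _ => Some(1),
    }
}
```

Returning `Option` lets a caller tell "occupies zero columns" (`Some(0)`) apart from "has no meaningful width" (`None`), which is the distinction the commit message draws for non-NULL control characters.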
Replace `x` with `it`

I kept some usages of `x`:

* `x`s that are used together with `y`, `z`, ...
* `x`s that shadow `it`. I use `it` for iterators outside of r-a, so there were some cases where I used `it` and `x` together.
* `x` in test fixtures. Many of those `x` usages were not mine, so I thought it better to keep them as is.

I tried to remove the rest, but since there were too many `x`s I might have missed some of them, or changed some that I didn't want to change.
It would be nice to have a `wcwidth`-alike, presumably living in core::unicode and exposed as a char method. I've got a working local implementation of this that automatically generates the search tables for 0- and double-width characters from the latest unicode data (this does not need to be done at build time, only when the unicode charsets are updated).

If this is a desirable feature, can you give me some guidance as to

- … (`wcwidth` is pretty C-ish and maybe not so Rustic).
- … (`src/etc/unicode_width`?)
- … `wcwidth_cjk`-equivalent function, whose behavior is slightly different on certain spacing characters, as described in http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c (see comment above the `mk_wcwidth_cjk` function).

Any other thoughts are, of course, appreciated.
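The "search tables for 0- and double-width characters" mentioned above could plausibly be sorted range tables searched by bisection, which is the approach the linked `wcwidth.c` takes. The following is only a sketch under that assumption, not the author's local implementation: the names `ZERO_WIDTH`, `DOUBLE_WIDTH`, `in_table`, and `table_width` are hypothetical, and the table entries are a few sample ranges rather than generated data.

```rust
use std::cmp::Ordering;

/// Inclusive code point ranges, sorted and non-overlapping.
/// Hypothetical sample entries; a real table would be generated from Unicode data.
const ZERO_WIDTH: &[(u32, u32)] = &[(0x0300, 0x036F), (0x200B, 0x200F)];
const DOUBLE_WIDTH: &[(u32, u32)] = &[(0x1100, 0x115F), (0x4E00, 0x9FFF), (0xAC00, 0xD7A3)];

/// Binary search: does `cp` fall inside any range of `table`?
fn in_table(cp: u32, table: &[(u32, u32)]) -> bool {
    table
        .binary_search_by(|&(lo, hi)| {
            if hi < cp {
                Ordering::Less // this range lies entirely below cp
            } else if lo > cp {
                Ordering::Greater // this range lies entirely above cp
            } else {
                Ordering::Equal // cp is inside this range
            }
        })
        .is_ok()
}

/// Column width of a non-control character according to the sample tables.
fn table_width(c: char) -> usize {
    let cp = c as u32;
    if in_table(cp, ZERO_WIDTH) {
        0
    } else if in_table(cp, DOUBLE_WIDTH) {
        2
    } else {
        1
    }
}
```

Because the tables are plain sorted data, they only need regenerating when the Unicode data changes, and each lookup costs a logarithmic number of comparisons in the table length.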