Treat args/env as lossy UTF-8 #12283
pub fn as_bytes_no_nul<'a>(&'a self) -> &'a [u8] {
    if self.buf.is_null() { fail!("CString is null!"); }
    unsafe {
        cast::transmute((self.buf, self.len()))
Transmute a `raw::Slice` to avoid accidentally getting the order mixed up.
Good idea. I was copying `as_bytes()`, but this is a good excuse to fix that.
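The concern in this thread is that transmuting an anonymous `(pointer, length)` tuple silently depends on field order. `raw::Slice` belongs to the Rust of that era. As a rough sketch in current Rust, the same safety property can be had from `std::slice::from_raw_parts`, which takes the pointer and the length as separately typed arguments. The helper name `bytes_no_nul` is hypothetical and not part of this PR, and `CStr::to_bytes` already provides this behaviour; the point is only to illustrate the typed constructor.

```rust
use std::ffi::CStr;
use std::slice;

/// Hypothetical helper (not part of this PR): borrow the bytes of a C string
/// without its trailing NUL terminator.
fn bytes_no_nul(s: &CStr) -> &[u8] {
    // Pointer and length are distinct, typed values, so they cannot be
    // swapped by accident the way the fields of a transmuted tuple can.
    let ptr = s.as_ptr().cast::<u8>();
    let len = s.to_bytes().len();
    // SAFETY: `ptr` points at `len` initialized bytes owned by `s`, and the
    // returned slice borrows `s`, so it cannot outlive the buffer.
    unsafe { slice::from_raw_parts(ptr, len) }
}
```

Swapping the two arguments to `from_raw_parts` is a type error, whereas a transmuted tuple with its fields in the wrong order compiles and misbehaves at run time.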
os::args() was using str::raw::from_c_str(), which would assert if the C-string wasn't valid UTF-8. Switch to using from_utf8_lossy() instead, and add a separate function os::args_as_bytes() that returns the ~[u8] byte-vectors instead.
Parse the environment by default with from_utf8_lossy. Also provide byte-vector equivalents (e.g. os::env_as_bytes()). Unfortunately, setenv() can't have a byte-vector equivalent because of Windows support, unless we want to define a setenv_bytes() that fails under Windows for non-UTF8 (or non-UTF16).
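For readers on current Rust, the lossy-by-default behaviour described in these two commit messages can be sketched as follows. This is an illustration under assumptions, not the code in this PR: `os::args()`, `os::env()` and `str::from_utf8_lossy()` are the APIs of the time, while the sketch uses today's `String::from_utf8_lossy`, `std::env::args_os` and `std::env::vars_os`. The helper names `lossy`, `args_lossy` and `env_lossy` are hypothetical.

```rust
use std::env;

/// Hypothetical helper: decode raw bytes as UTF-8, substituting U+FFFD
/// (the replacement character) for every invalid sequence instead of failing.
fn lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// Lossy string view of the current process arguments.
fn args_lossy() -> Vec<String> {
    env::args_os()
        .map(|arg| arg.to_string_lossy().into_owned())
        .collect()
}

/// Lossy string view of the current environment as (name, value) pairs.
fn env_lossy() -> Vec<(String, String)> {
    env::vars_os()
        .map(|(k, v)| {
            (
                k.to_string_lossy().into_owned(),
                v.to_string_lossy().into_owned(),
            )
        })
        .collect()
}
```

Valid input passes through unchanged, so `lossy("café".as_bytes())` yields `"café"`, while a stray `0xFF` byte becomes a single U+FFFD. Callers that need the exact bytes should keep the `OsString` values rather than the lossy strings, which mirrors the split between the string and byte-vector functions in this PR.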
New version pushed
Generally I like this approach. Why
The character
Change `os::args()` and `os::env()` to use `str::from_utf8_lossy()`. Add new functions `os::args_as_bytes()` and `os::env_as_bytes()` to retrieve the args/env as byte vectors instead. The existing methods were left returning strings because I expect that the common use-case is to want string handling. Fixes #7188.
Minor refactor format-args * Move all linting logic into a single format implementations struct This should help with the future format-args improvements. **NOTE TO REVIEWERS**: use "hide whitespace" in the GitHub diff; most of the code has shifted, but a relatively low number of lines were actually modified. Following up from rust-lang#12274 r? `@xFrednet` --- changelog: none