Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

rustc: Change byte literals to fixed-size arrays #18480

Closed
wants to merge 1 commit into from

Conversation

alexcrichton
Copy link
Member

This commit alters the type of b"foo" from &'static [u8] to
&'static [u8, ..3] and is an implementation of RFC 339.

This is a breaking change because not all operations are always compatible with
fixed-size arrays currently when compared with slices. As seen in the diff, if a
fixed-size array is the left hand size of an equality then the operator may not
resolve. Breakage may require some shuffling or explicitly converting to a slice
via .as_slice() or [].

[breaking-change]
Closes #18465

This commit alters the type of `b"foo"` from `&'static [u8]` to
`&'static [u8, ..3]` and is an implementation of RFC 339.

This is a breaking change because not all operations are always compatible with
fixed-size arrays currently when compared with slices. As seen in the diff, if a
fixed-size array is the left hand size of an equality then the operator may not
resolve. Breakage may require some shuffling or explicitly converting to a slice
via `.as_slice()` or `[]`.

[breaking-change]
Closes rust-lang#18465
@alexcrichton
Copy link
Member Author

Hm, this is bouncing primarily because this doesn't compile:

const FOO: &'static [u8, ..3] = [1, 2, 3];

fn main() {
    let a = [1u8, 2, 3].as_slice();
    match a {
        FOO => {}
        _ => {}
    }
}

@jakub-, do you know if this would be an easy change to make to checking matches?

@ghost
Copy link

ghost commented Nov 7, 2014

@alexcrichton Just to say, I didn't miss your comment, will investigate soon!

At the first glance, for the above to work it would require an implicit coercion which we try to stay away from in patterns, I think. We definitely need more principled thinking here before 1.0, though!

@alexcrichton
Copy link
Member Author

I'll reopen this when I have a chance to fix this.

@geofft
Copy link
Contributor

geofft commented Feb 11, 2015

@alexcrichton Can you help me understand what is the conceptual issue with that code snippet not compiling? Is it that there would be no way to match a &[u8] against a bytestring literal, because match patterns are patterns, not full expressions (so you can't even insist on people calling .as_slice())? The buildbot links no longer work so I can't see the actual tests that failed.

What are the implications of allowing this specific coercion in match patterns? Given that they have to be constants, the specific case of allowing either sized or unsized arrays to match an unsized array seems like it ought to be fine (for any array type), but I'm sure there's complexity here that I'm missing.

I'm interested in this feature (as I understand it, it's required for C-ABI dynamic libraries to export string constants / static pointers to strings without trickery) so I'd like to know what needs to be done to get this change into 1.0.

It would also be fine for my use case if there were special syntax to make a bytestring (or string!) literal sized, but they remained unsized by default; it would certainly be a readability improvement over using an external macro.

@alexcrichton
Copy link
Member Author

@geofft the problem was getting this code to compile:

const FOO: &'static [u8; 3] = [1, 2, 3];

fn main() {
    let a = [1u8, 2, 3].as_slice();
    match a {
        FOO => {}
        _ => {}
    }
}

This would prevent match arms using b"foo" as a pattern, for example.

@petrochenkov
Copy link
Contributor

@alexcrichton
I tried to implement this (again) last week and a straightforward workaround succesfully compiled:

// From libstd/path.rs
 fn parse_single_component(comp: &[u8]) -> Option<Component> {
+    const DOT: &'static [u8] = b".";
+    const DOTDOT: &'static [u8] = b"..";
+    const EMPTY: &'static [u8] = b"";
     match comp {
-        b"." => Some(Component::CurDir),
-        b".." => Some(Component::ParentDir),
-        b"" => None,
+        DOT => Some(Component::CurDir),
+        DOTDOT => Some(Component::ParentDir),
+        EMPTY => None,
         _ => Some(Component::Normal(unsafe { u8_slice_as_os_str(comp) }))
     }
 }

The true problem is in the LLVM error #17233 , which apparently is't related to b"..." being a DST, as @huonw assumed, because it appears exactly when I change b"..." to be a non-DST and try to compile libstd. I'd really like to have this LLVM issue fixed, but have no idea how to do it myself. (Any chances that #21744 will resolve it?)

@alexcrichton
Copy link
Member Author

I suspect this is likely a problem in our own compiler vs LLVM, and I wouldn't personally take this patch with the require workaround you listed unfortunately (although it should in theory be easy to fix, I just don't know where to do so)

@petrochenkov
Copy link
Contributor

@geofft
It is implemented now, the PR was merged today.

@geofft
Copy link
Contributor

geofft commented Mar 18, 2015

oh awesome! I'll play around with it when the next nightly hits. From the discussion on #22838, sounds like you went with the approach of having bytestring literals in patterns work as either &[u8] or &[u8; N]? That sounds perfect.

@alexcrichton alexcrichton deleted the issue-18465 branch March 18, 2015 16:41
@geofft
Copy link
Contributor

geofft commented Mar 25, 2015

@petrochenkov Thanks a bunch, I can now use bytestring literals in static variables. Double-casting as *const u8 as *const libc::c_char is a little awkward but totally manageable.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Byte string literals should have a type of a fixed size
4 participants