Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Tracking issue for mem::unreachable #43751

Closed
tbu- opened this issue Aug 8, 2017 · 74 comments
Closed

Tracking issue for mem::unreachable #43751

tbu- opened this issue Aug 8, 2017 · 74 comments
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@tbu-
Copy link
Contributor

tbu- commented Aug 8, 2017

No description provided.

@sfackler sfackler added B-unstable Blocker: Implemented in the nightly compiler and unstable. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Aug 8, 2017
@scottmcm
Copy link
Member

scottmcm commented Aug 9, 2017

Bikeshed: I find it surprising that safe unreachable!() and unsafe unreachable() do very different things. Also, this seems to have nothing to do with memory, so mem is an odd location for it.

@durka
Copy link
Contributor

durka commented Aug 9, 2017

Another possible name/location for discussion: raw::assume_unreachable(). Would require changing the raw module's mandate. We could also leave it under intrinsics, but stabilize it (you can use std::intrinsics::transmute; today).

@notriddle
Copy link
Contributor

notriddle commented Aug 9, 2017

This does not belong in mem, and it should not have the same name as the unreachable!() macro.

Why not just call it std::intrinsics::undefined_behaviour()?

@kennytm
Copy link
Member

kennytm commented Aug 9, 2017

There's a suggestion calling it unchecked_unreachable.

@durka
Copy link
Contributor

durka commented Aug 9, 2017

I like @notriddle's suggestion. Does what it says on the tin.

@Mark-Simulacrum Mark-Simulacrum added the C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC label Aug 10, 2017
@diwic
Copy link
Contributor

diwic commented Aug 16, 2017

std::raw::eat_my_laundry()

(Sorry, couldn't help myself)

@notriddle
Copy link
Contributor

std::raw::eat_my_laundry()

Except that, in most cases, it won't consume your clothes. Your computer probably doesn't even have the necessary hardware to do anything to your laundry. Undefined behaviour may eat your laundry, because it may do anything; the specification places no constraints on what the result of invoking this intrinsic is. A name like eat_my_laundry fails to express this. Of course, I'm probably taking the suggestion too seriously. 😆

Unlike eat_my_laundry, though, naming the function undefined_behavior is actually a suggestion; it has the advantage of being one less name that people have to remember. And it's not like invoking this intrinsic may invoke UB; invoking this intrinsic is (at least from a specification and documentation POV) exactly the same as invoking UB any other way. There's no conceptual difference, so I'm not sure why there should be a naming difference.

@diwic
Copy link
Contributor

diwic commented Aug 17, 2017

@notriddle Yep, I agree that undefined_behavior is the best suggestion so far. Not sure which module it should fit in though.

(Context for those who came to this community after Rust 1.0: Before 1.0 there was a disclaimer on Rust's homepage saying something like it wasn't production ready yet, and could in theory do anything, including eating your laundry. I don't remember the exact wording.)

@earthengine
Copy link

I would say std itself. Since undefined_behavior is a basic concept, not less common than Box or other common concepts.

@kennytm
Copy link
Member

kennytm commented Aug 18, 2017

@earthengine Box is located at std::boxed::Box, not std::Box. std as a module contains no types or functions. (and there's no way this function will enter prelude.)

@earthengine
Copy link

Then make it inside std::intrinsics.

@nagisa
Copy link
Member

nagisa commented Aug 22, 2017

Consider that another intrinsic, abort, is very similar in behaviour and would probably live in the same place as unreachable.

@nagisa
Copy link
Member

nagisa commented Aug 26, 2017

I propose that we simply stabilise the intrinsics module and keep the functions in there unstable, stabilising only unreachable. I cannot see a single better location for this function, unless we invent another module with identical meaning (think: builtins).

This would need some thinking on how to deal with the currently stable intrinsics in there:

  • copy;
  • copy_nonoverlapping;
  • drop_in_place;
  • transmute;
  • write_bytes;

@notriddle
Copy link
Contributor

Would std::abstract be a workable module name? Making it something like std::abstract::assume_unreachable or std::abstract::undefined_behavior. After all, the distinguishing feature of this function is that it has no definition in machine code.

(I thought about calling it abstract_machine, but that's too long.),

@scottmcm
Copy link
Member

For the currently-stable things in intrinsics: Since they're all exported elsewhere with (nearly?) identical signatures, just deprecate them with a message suggesting the ptr or mem ones instead?

@nagisa
Copy link
Member

nagisa commented Aug 28, 2017 via email

@durka
Copy link
Contributor

durka commented Aug 28, 2017

It doesn't actually trip the stability check, you can import them from intrinsics if you want even though they are re-exported elsewhere. So we could avoid creating another module.

@bluss
Copy link
Member

bluss commented Oct 21, 2017

Do likely/unlikely also belong in the same category, that category that doesn't have a home in a current submodule?

likely/unlikely: #26179

@kennytm
Copy link
Member

kennytm commented Oct 21, 2017

std::hint::{likely, unlikely, unreachable, assume, breakpoint, prefetch_read_data}? (Maybe reexport std::sync::atomic::compiler_fence here as well?)

@hanna-kruppe
Copy link
Contributor

(compiler_fence is not a "hint", it has semantic impact)

@tbu-
Copy link
Contributor Author

tbu- commented Oct 21, 2017

assume and unreachable are also not hints, they also have impact.

@bluss
Copy link
Member

bluss commented Oct 21, 2017

I agree. Especially since their impact for being used incorrectly is simply UB.

@bluss
Copy link
Member

bluss commented Oct 21, 2017

std::builtin or std::builtins would be a bit silly, right? We have a lot of things that are "built in". Yet it seems to be the best name I can think of.

  • std::builtin::unreachable()

@kennytm
Copy link
Member

kennytm commented Oct 21, 2017

Or that means likely/unlikely and unreachable/assume shouldn't be grouped into the same module after all (if you want a meaningful name for the module).

@hanna-kruppe
Copy link
Contributor

I'm actually kinda okay with unreachable and assume being in a "hints" module. They're optimizer hints. Misuse can introduce UB, but otherwise they don't affect semantics either. I could understand not wanting to mix these subtly different meanings in one module, but I don't think it's inherently misleading to call them "hints", especially considering the existing "optimizer hint" terminology.

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Oct 21, 2017

After some discussion on IRC and re-reading the earlier discussion, I actually don't see any reason to not stabilize std::intrinsics (it's already effectively stable) and put/leave all these functions in there.

@mark-i-m
Copy link
Member

There's a crate that does that. I don't think it's a good policy in general for --release to remove safety checks.

I think it is the opposite, right?

If one uses unreachable!(), I think that should always be checked (debug or release), but if one uses unreachable_unchecked(), then we are adding (not removing) a safety check. And as @stjepang mentioned, it is really more for ease of debugging than safety.

@SimonSapin
Copy link
Contributor

I think that something that has unchecked in the name should not be checked, even in debug mode. If "checked in debug mode only" functionality is to be added, it should be a separate API (and so not really relevant to this tracking issue).

@mark-i-m
Copy link
Member

@SimonSapin I feel like that would make me never use unreachable_unchecked.

@sfackler
Copy link
Member

You are welcome to use or not use whatever APIs you want.

@durka
Copy link
Contributor

durka commented Apr 18, 2018 via email

@SimonSapin
Copy link
Contributor

@mark-i-m Double negation is hard and I can’t tell what your "that" refers to.

@mark-i-m
Copy link
Member

To clarify, what I mean is this: I generally prefer to leave asserts on in production, but there are a few asserts that I like to turn off in production. For those few asserts, I still want them on when I'm debugging. I consider unreachable_unchecked to be like that.

While I'm debugging, if I don't want to risk UB. When I deploy, I'm confident enough that I'm willing to not have the assert. If I can't make that distinction cleanly, I'm unlike to ever use unreachable_unchecked. (I guess I could always manually do some sort of cfg, but that's ugly)...

@SimonSapin
Copy link
Contributor

It sounds like you want to use https://crates.io/crates/debug_unreachable, and maybe there’s a case to be made to include it in the standard library but it shouldn’t be under the name unreachable_unchecked and probably not in this issue.

@SimonSapin
Copy link
Contributor

… however just copying that code into libstd or libcore probably won’t work since the standard library is always compiled in release mode.

@mark-i-m
Copy link
Member

@SimonSapin That's fair. Thanks :)

@dtolnay
Copy link
Member

dtolnay commented Apr 18, 2018

I think @mark-i-m has raised a great point and we need to be more clear about what use cases we envision for unreachable_unchecked.

I feel like that would make me never use unreachable_unchecked.

I sympathize with this and I struggle to think of reasonable use cases where I would prefer the current unconditionally unchecked unreachable over something like debug_unreachable. @SimonSapin or @sfackler do you have such a use case in mind?

@sfackler
Copy link
Member

How would that be implemented?

@dtolnay
Copy link
Member

dtolnay commented Apr 18, 2018

Possibly the same way CStr::from_ptr is implemented -- we just reserve the right to implement it a particular way when that becomes possible.

@sfackler
Copy link
Member

We can do literally anything we want when undefined behavior happens so that's not a thing we need to explicitly guarantee.

#45920 causes a trap to be generated from unreachable anyway, so you're already going to abort if an unreachable_unchecked is reached.

@dtolnay
Copy link
Member

dtolnay commented Apr 18, 2018

Then it's just a difference between "unconditionally unchecked is the correct behavior for a function with this name, and if we want a debug checked version later we call it something else or recommend a thirdparty crate" vs "we wish we could check in debug mode but the standard library is not set up to be able to do that -- maybe later." You and Simon seem to have been supporting the former but maybe mark-i-m and I would prefer the latter.

Is the trap as easily debuggable as a panic with stacktrace?

@durka
Copy link
Contributor

durka commented Apr 18, 2018 via email

@hanna-kruppe
Copy link
Contributor

Note that #45920 only kicks in at codegen time, optimizations based on the UB can still cause arbitrarily bad or confusing consequences. You might not even ever hit the unreachable -- if it is conditional, for example, the branch with the "unreachable" in it might be eliminated entirely.

@notriddle
Copy link
Contributor

One reason for not panicking specifically is that it unwinds the stack, and the surrounding unsafe code might not be equipped to handle that.

What if it immediately aborted and printed a stack trace in debug mode?

@mark-i-m
Copy link
Member

Is the trap as easily debuggable as a panic with stacktrace?

It's not quite as nice. A trap will cause the program to be killed on a standard Linux setup. If you are running in GDB, then GDB will pause and you can use bt to get a stacktrace. However, if it just happens when you are running (especially if it is non-deterministic), then you will just get killed by the OS without warning.

What if it immediately aborted and printed a stack trace in debug mode?

I think I would be ok with pretty much anything, as long as the program halts and some debugging info is printed.

@durka
Copy link
Contributor

durka commented Apr 18, 2018 via email

@SimonSapin
Copy link
Contributor

I think that I was wrong, and that debug_assertions would do the right thing in libcore since it’s not a function but a macro, so (if I understand things correctly) cfg!(…) in its expansion would be evaluated in the context of the call site.

However I think that for the past three years the crate also hasn’t been doing what everyone thinks it’s doing, since it uses cfg!(ndebug) which isn’t set by Cargo anymore: rust-lang/cargo#1444

@SimonSapin
Copy link
Contributor

I sympathize with this and I struggle to think of reasonable use cases where I would prefer the current unconditionally unchecked unreachable over something like debug_unreachable.

<[T]>::get_unchecked and every other *_unchecked API are unconditionally unchecked. Why do we have those?

@hanna-kruppe
Copy link
Contributor

@durka

Well if it's defined to trap it's not undefined, is it? And those "clever UB optimizations" shouldn't happen in debug mode.

It's undefined, period, codegen can just be configured is slightly mitigating the impact of any unreachables that make it to codegen. And re: "shouldn't happen in debug mode" -- pretty much all existing runtime checks can be configured independently of optimization level (-Coverflow-checks, debug_assertions) and this should too. Moreover, it appears @sfackler was talking more generally than "just" about debug mode. I just want to be clear about what #45920 does and does not achieve -- it's a mitigation, it doesn't tame any UB nor its consequences.

@durka
Copy link
Contributor

durka commented Apr 18, 2018

cfg!(…) in its expansion would be evaluated in the context of the call site.

Yes but the call site is in libcore, which has already been compiled, so it won't do what you expect. That's why there are ridiculous ingenious hacks to make overflow-checks-in-debug-mode work right.

@mark-i-m
Copy link
Member

Well if it's defined to trap it's not undefined, is it? And those "clever UB optimizations" shouldn't happen in debug mode.

Undefined doesn't necessarily have to mean unpredictable. Technically, *(int*)0 is UB, but in practice, most OSes cause it to segfault predictably. UB simply means we don't make any promises. We can still make a best effort, though. On the other hand, it also means we don't have any promises about what happens in debug mode either, just best effort. This makes it seem acceptable to panic in debug mode but not in release mode (assuming we can get it to work).

@tbu-
Copy link
Contributor Author

tbu- commented Apr 18, 2018

Technically, (int)0 is UB, but in practice, most OSes cause it to segfault predictably.

This is not true anymore if you factor compilers into the equation. Actual real-world programs don't segfault reliably when the programmer might think that it'd reliably execute the *(int *)0.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests