Skip to content

core::marker::NoCell in bounds (previously known an Freeze) #3633

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Open
wants to merge 31 commits into
base: master
Choose a base branch
from

Conversation

p-avital
Copy link

@p-avital p-avital commented May 10, 2024

@RalfJung
Copy link
Member

IIRC @joshlf was also asking about exposing Freeze; maybe they can help provide some more motivating examples. Currently there's only really one. (I don't understand the "key of a map" note, and the RFC doesn't explain it in more than 5 words either.)

@clarfonthey
Copy link

The main reason for not stabilising Freeze was because it would now add an additional burden for API contracts: adding interior mutability could be considered a breaking change for some APIs, and thus people might want to add "phantom cells" of UnsafeCell<()> inside their types to ensure that they have this option for the future.

I don't think that just stabilising it for bounds affects this concern. The issue isn't the trait itself being exposed in the library, but APIs having to worry about whether they are Freeze or not as an API contract.

@RalfJung
Copy link
Member

RalfJung commented May 10, 2024

I don't think that just stabilising it for bounds affects this concern. The issue isn't the trait itself being exposed in the library, but APIs having to worry about whether they are Freeze or not as an API contract.

A bound is exactly what one is worried about here? If one writes a function fn myfunc(x: impl Freeze), then if I pass a value I got some some other crate to myfunc this will work off that type is Freeze -- meaning it's a breaking change for that crate to add non-Freeze fields.

This is exactly the same as Send and Sync. How often to people defensively add PhantomData<*mut ()> to make their type !Semd + !Sync?

Fundamentally there are some things the compiler only lets you do with Freeze types, and people naturally want to do these things in generic code, and they need Freeze bounds for that. I'm honestly surprised that we went so long without allowing these bounds.^^

(Or did I misunderstand what you mean? It sounded a bit like you're saying just stabilizing the bound means we don't have to worry about the concern. But upon re-reading I am less sure.)

The main reason for not stabilising Freeze was because it would now add an additional burden for API contracts

Letting people write unsafe impl Freeze for Type is even worse, given its role in the operational semantics, so I think we disagree on what the main reason was. ;)

@joshlf
Copy link
Contributor

joshlf commented May 10, 2024

IIRC @joshlf was also asking about exposing Freeze; maybe they can help provide some more motivating examples. Currently there's only really one. (I don't understand the "key of a map" note, and the RFC doesn't explain it in more than 5 words either.)

Sure!

If you want to dive deeper, look at uses of Immutable in zerocopy's 0.8 alpha docs.

On zerocopy stable, we provide traits like FromBytes. T: FromBytes guarantees that transmute::<[u8; size_of::<T>()], T>(bytes) is sound regardless of the value of bytes. We also want to support reference transmutations (e.g., &[u8; N] to &T). Reference transmutations are necessarily more restrictive than value transmutations since they must ban interior mutation. For example, &[u8; N] to &Cell<U> is unsound because it would allow the referent bytes to be mutated while a &[u8; N] reference exists referencing the same memory. For this reason, T: FromBytes also requires that T contains no UnsafeCells.

However, this is overly-restrictive. The "no UnsafeCells" requirement is only relevant to reference transmutations, but not to value transmutations. E.g., transmute::<[u8; size_of::<T>()], UnsafeCell<T>>(bytes) can be sound (depending on T), but we can never implement FromBytes for UnsafeCell<T>. Our API supports value transmutations in places like FromBytes::read_from and transmute!. APIs like these are made less powerful because of this restriction.

Another restriction is that some authors want to use FromBytes as a bound to justify their own unsafe code blocks. For example, google/zerocopy#251 was motivated by a user who wanted to derive FromBytes on a type containing an UnsafeCell for this purpose (see the "Motivation" section of that issue; cc @korran).

Our solution in the upcoming zerocopy 0.8 is to add a separate Immutable trait that is semantically very similar to Freeze*. We require T: Immutable where reference transmutations are happening, but don't require it where mutable reference transmutations (e.g. &mut [u8; N] to &mut T) or value transmutations are happening.


* Currently, Immutable bans UnsafeCells recursively, even via indirection. We will likely lift this restriction, at which point Immutable will be semantically identical to Freeze.

@clarfonthey
Copy link

Letting people write unsafe impl Freeze for Type is even worse, given its role in the operational semantics, so I think we disagree on what the main reason was. ;)

I should clarify, what I meant here was the adding of stuff like UnsafeCell<()> to make your type !Freeze, so that there's the option to add real interior mutability later without breaking existing APIs that rely on the type being Freeze.

Not the ability to make something Freeze despite interior mutability.

@ehuss ehuss added T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC. labels May 10, 2024
@zachs18
Copy link
Contributor

zachs18 commented May 10, 2024

For another "motivating example", bytemuck is in basically the same position as zerocopy, in that some of the bounds are currently more strict than necessary (discussion). I have a branch with a sketch of bytemuck's API in the presence of core::mem::Freeze. (cc @Lokathor)


One possible alternative that would "work around" the semver issue would be to have Freeze be a "normal" (non-auto) trait that, like Copy, can only be implemented for types where all of their fields are also Freeze. This would require types to "opt-in" to guaranteeing they do not have interior mutability, just like types now have to "opt-in" to being Copyable. I don't necessarily think this option is better than the status quo1, but it is something to consider.

Footnotes

  1. since Freeze can be somewhat observed on stable via static-promotion, and this would either break that until libraries add impl Freeze for MyPODType, or require a second BikeshedRawFreeze actually-auto-trait as a supertrait of Freeze for the compiler to use for determining static-promotion etc.

@RalfJung
Copy link
Member

Yeah that could work -- as you say, the existing Freeze trait has to stay around; we'd also have to make static promotion accept a type that's either AutoFreeze or OptInFreeze for the motivating example to work (which is all about static promotion of data at a generic type).

@RalfJung RalfJung changed the title Stabilization RFC for core::marker::Freeze in bounds [RFC] core::marker::Freeze in bounds May 11, 2024
@BurntSushi
Copy link
Member

Thanks for writing this up. I appreciate the effort that went into it and the desire to push things forward.

But... I think this RFC needs a fair bit of work. After reading it, I'm still left with some very fundamental questions (and I expect these to be answered in the RFC):

  • What is the Freeze trait used for? How will Rust programmers use it?
  • What is an example of a typical use of the Freeze trait in Rust programs?
  • What would happen if we didn't stabilize Freeze?
  • The guide level explanation doesn't say anything about what Freeze is. The guide level section should not be a stub.
  • The reference level explanation should be reasonably self-contained and explain things in a fair bit of detail. I also have to say that I followed the links and they did not help me to understand what Freeze is.
  • There is zero mention of the fact that this is a new marker trait in the drawbacks section. Marker traits present significant annoyances. This needs more discussion and an exploration for why a new marker trait is really truly warranted.
  • I don't understand why we also aren't trying to stabilize unsafe impl Freeze. The RFC just says it's orthogonal, and while that may be true, that isn't a good enough reason to split them apart. The default position should be to stabilize both, and only after some compelling justification to do otherwise should we try to split it apart. If we're splitting them apart, for example, there should be an exploration of the drawbacks of doing that. For example, I sometimes need to write unsafe impl Send for MyType. What happens when I need to do that for Freeze but can't? (And if that can't happen or is unlikely to happen, then that is a very interesting difference from other marker traits that should have discussion about it in the RFC.)

@joshlf
Copy link
Contributor

joshlf commented May 11, 2024

  • I don't understand why we also aren't trying to stabilize unsafe impl Freeze.

IIUC, we couldn't do that today since Freeze promises that a type contains no UnsafeCells. That property is entirely visible to the compiler - either you have no UnsafeCells, in which case the compiler knows your type is Freeze, or you have some, in which case it would be unsound to implement Freeze for your type. We'd need to relax it to simply say "doesn't permit interior mutation" or something to that effect, in which case it'd be more about whether any of your public API permits interior mutation (so it's more of a runtime property and less of a type property).

If we decide to keep it a type property - ie, the absence of UnsafeCells - then we could do impl Freeze rather than unsafe impl Freeze per @zachs18:

One possible alternative that would "work around" the semver issue would be to have Freeze be a "normal" (non-auto) trait that, like Copy, can only be implemented for types where all of their fields are also Freeze. This would require types to "opt-in" to guaranteeing they do not have interior mutability, just like types now have to "opt-in" to being Copyable. I don't necessarily think this option is better than the status quo1, but it is something to consider.

Footnotes

  1. since Freeze can be somewhat observed on stable via static-promotion, and this would either break that until libraries add impl Freeze for MyPODType, or require a second BikeshedRawFreeze actually-auto-trait as a supertrait of Freeze for the compiler to use for determining static-promotion etc.

@RalfJung
Copy link
Member

RalfJung commented May 11, 2024

Freeze is used by the compiler (codegen, const-check) and Miri to determine whether a type allows shared mutation. We definitely don't want to allow people to disallow shared mutation on their UnsafeCell-carrying types -- I don't see when that would ever be useful, it opens so many questions, and it's not what anyone asked for. (That makes Freeze quite different from Send and Sync: Send and Sync are not used by codegen or Miri to do any optimizations, they are basically entirely library concepts with just a bit of involvement in the compiler to handle static. Freeze is very much a language/opsem concept.)

So if we want to expose the Freeze trait that exists, that decides for codegen and Miri and const-checking whether there's any interior mutability, then we can't allow impls.

@ChayimFriedman2
Copy link

@joshlf For the purpose of zerocopy (and bytemuck), what advantages does exposing Freeze brings over a custom trait? You will have to derive it anyway because you have other required traits, right?

@zachs18
Copy link
Contributor

zachs18 commented May 11, 2024

For the purpose of zerocopy (and bytemuck), what advantages does exposing Freeze brings over a custom trait?

IMO the main advantage is that Freeze can be definitively checked by the compiler. Under the current Freeze as an auto trait, no derives or impls are needed (or possible) for it since it is automatically implemented by the compiler when it can be. 1

An additional advantage is that there would be only one trait to encode the invariant. bytemuck and zerocopy (and any other cratr) could each implement their own Immutable trait, but with Freeze in the stdlib this would not be necessary.

You will have to derive it anyway because you have other required traits, right?

For my sketch of bytemuck's API with Freeze, Freeze is simply added as an additional bound on the functions where it is needed, it is not a supertrait or subtrait of any bytemuck trait itself.

Footnotes

  1. Also, even a hypothetical Copy-like "opt-in-but-still-validated-impls" version of Freeze would still be easier to correctly implement since an incorrect implementation simply wouldn't compile (just like impl Copy for struct Wrapper(String) doesn't compile today).

@joshlf
Copy link
Contributor

joshlf commented May 11, 2024

@joshlf For the purpose of zerocopy (and bytemuck), what advantages does exposing Freeze brings over a custom trait? You will have to derive it anyway because you have other required traits, right?

The most important reason is that the compiler can see into the internals of types in the standard library. For example, I filed this extremely silly issue because there's technically no guarantee provided to code outside of the standard library that Box<T> doesn't contain any UnsafeCells. That's obviously true, but in order for zerocopy to uphold its soundness policy, we need an explicit guarantee. Such a guarantee would be useful to basically nobody but us. By contrast, Box already implements Freeze, so if we could use Freeze directly, we wouldn't have to quibble about things like that. Here's another PR that would be at least partially obviated by Freeze.

Besides that, here are some other reasons:

  • All of the normal advantages of not having to own something like this. For example, zerocopy and bytemuck - if we were both to implement the same idea - would duplicate code between the two of us.
  • We have to manually implement Immutable for a huge number of types, all of which is extra code we have to write and maintain (this file contains the keyword Immutable 159 times!). Worse, since it's an unsafe trait, it's also safety proofs we have to write and maintain. Maintaining safety proofs is a giant headache since there's no programmatic way to discover that your safety proof isn't valid anymore since it's just prose, so we do a lot of manual work to make sure our safety proofs are forwards-compatible (this relates to the soundness policy I mentioned before). Often a single safety comment will be blocked for weeks or months while we try to get guarantees landed in the language documentation.
  • The custom derives, while not complicated by custom derive standards, are much uglier than the equivalent in-compiler implementation since the latter has direct access to a more natural representation of a type's fields and their types. We have to hack it by adding clumsy where bounds and such.
  • Proc macros are slow to compile and slow to execute, while the compiler-supported implementation will presumably be much faster
  • The compiler might support fancier reasoning at some point, e.g. permitting UnsafeCell<()>. Supporting this natively in zerocopy would be very difficult. This particular example may be contrived, but my point is that, in general, the compiler can support much richer analysis that would require feats of code gymnastics to pull off outside the compiler.

@p-avital
Copy link
Author

Thanks to everyone taking an interest in this RFC.

First, a quick apology for it starting up so half-assed, I clearly underestimated the task. I started this RFC as a knee-jerk reaction to 1.78 merging the breaking change the RFC mentions without providing a way for const references to generics to exist.

I'm still committed to getting this to move forward, but I have limited availability for the next 2 weeks, so please bear with me as I rewrite this entire thing to the standard I should have started it with.

I would really appreciate getting some early feedback on the direction people around here would like to see this RFC take regarding:

  • Should this RFC target the existing core::marker::Freeze or propose an alternative (let's strawman it as Frieza) with similar purpose?
    • A personal priority I have for this RFC is allowing const references to generics in stable Rust again. Frieza: core::marker::Freeze is my highest priority.
  • Recursivity:
    • core::marker::Freeze only accounts for "local" immutability. This is good enough for certain purposes (such as static promotion), but not for stuff like maps that would like to ensure their keys are totally immutable.
    • Would your application benefit from Frieza being recursive? Would it suffer from it being so?
  • Auto-trait-iness: do you feel happy/neutral/unhappy about Frieza being an auto-trait?
    • If the consensus is that more auto-traits=bad, creating Frieza seems like the right approach to me.
    • Otherwise, we could just expose core::marker::Freeze, and a recursive equivalent could be built either inside or outside core later.

Sorry for my high latency for the beginnings of this RFC. I really appreciate all the feedback you can give, and would like to mention that PRs to this PR's branch are very welcome :)

@RalfJung
Copy link
Member

RalfJung commented May 13, 2024

@p-avital happy to hear that you're not discouraged by all the extra work we're throwing your way. :)

core::marker::Freeze only accounts for "local" immutability. This is good enough for certain purposes (such as static promotion), but not for stuff like maps that would like to ensure their keys are totally immutable.

Even if Freeze was recursive, types can use raw pointers or global static to implement shared mutable state. So I don't think this RFC should try to do anything in that direction -- between promotion of generic consts, zerocopy, and bytemuck we have three users that all only care about "shallow" immutability. A map can use an unsafe trait PartialEqPure or so if a proof is required that partial_eq (and/or other functions) are pure.

@BurntSushi
Copy link
Member

@p-avital It's hard for me to give you good feedback here because I think I lack a fair bit of context. Most of the discussion (and the RFC itself) seems to be very high context here, and it's context that I don't have.

With that said, I can say that adding a new auto/marker trait is a Very Big Deal. It's not just that "auto traits = bad," but that adding a new one comes with very significant trade offs. That doesn't mean that adding one is impossible, but it does mean, IMO, that adding one needs to come with significant benefits. Or that it's the only path forward for some feature that most everyone deems to be essential.

From the discussion, it sounds like there are alternatives. So those alternatives, at minimum, should be explored.

@RalfJung
Copy link
Member

@BurntSushi note that the Freeze marker trait, albeit being unstable, is already a semver hazard. Whenever I have a constant of type T, there are things the compiler will let me do only if T: Freeze. So, while new auto traits are a Very Big Deal, this is not an entirely new auto trait, and it merely extends the existing semver hazard to also affect crates that do not allow creating const values of their type. But any crate that has a const fn new() -> Self already needs to worry about Freeze, in today's Rust.

Here's a demonstration.

@BurntSushi
Copy link
Member

@RalfJung Yikes. OK. Thank you for pointing that out. That's exactly the kind of context I was missing.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Apr 2, 2025
@RalfJung
Copy link
Member

RalfJung commented Apr 3, 2025

One, plan to stabilize unsafe impl NoCell for .., discuss what the contract would be for that unsafe impl, and discuss the other implications of this.

I am not a fan of this approach due to interactions with the aliasing model. This means either we need to eventually introduce a separate trait that controls whether and where shared references are exempt from immutability (since that needs to be properly and stably guaranteed at some point, this would have to be another stable trait), or we need to figure out how user-defined unsafe impl NoCell for ... affect the aliasing model. In the current implementation in Miri, such an impl would affect surrounding enum and union and make them not interior mutable any more, but it would not affect surrounding struct and tuples. It would also newly create the situation where a type can be NoCell but a recursive traversal will hit UnsafeCell; it's unclear how to make sense of that and all existing users of Freeze in the compiler would need to be very carefully audited (I know at least one existing user where just allowing user-defined unsafe impl NoCell for ... directly leads to unsoundness in the sense that Miri would say there is no UB but we'd generate LLVM IR with UB). Turns out I was wrong about what Miri does... the point is this was all written assuming that there can never ever be an UnsafeCell inside a NoCell type and I am not convinced it's worth breaking that assumption.

I don't think we should give users the power to have T: NoCell for a type that actually contains an UnsafeCell.

A ForceNoCell<T> that contains a T has the same problem; we should not have this.

Three, plan to not change PhantomData and introduce a new PhantomOwnedEmbedded or similar with the new behavior.

This is my preferred approach, except that I find the naming a bit unfortunate. PhantomBox or PhantomDataIndirect for the indirect case and PhantomData for the embedded case are better names IMO. But if we want to stay fully backwards compatible, of course that will not work.

FWIW I find the "embedded" term confusing here, it sounds like you are talking about embedded systems.

@Sky9x
Copy link

Sky9x commented Apr 4, 2025

Exactly. I think ideally, we should also have PhantomNotSend and PhantomNotSync, like we have PhantomPinned

I think this would be better expressed as impl !Send for Foo {}, impl !Sync for Foo {}, etc

@Sky9x
Copy link

Sky9x commented Apr 4, 2025

I don't think we should give users the power to have T: NoCell for a type that actually contains an UnsafeCell.

These impls are very concerning as they violate the assumption that Freeze is structural, an assumption that the compiler relies on for determining if a static must be placed in read only or mutable static memory. Such a check needs to be exact: a type either contains by-value UnsafeCell or it doesn't. Adding unsafe impl Freeze for Foo {} if Foo contains eg. Cell is a massive footgun as such a stray impl could easily cause hard-to-diagnose segfaults when the program attempts to mutate inside the cell.

Users should not be able to violate such an assumption, even with unsafe code. freeze_impls should never be stable, nor should any kind of PhantomCell etc.

@RalfJung
Copy link
Member

RalfJung commented Apr 4, 2025

nor should any kind of PhantomCell etc.

PhantomCell is fine. ForceNoCell is a problem.

Thank you @teor2345

Co-authored-by: teor <teor@riseup.net>
@Kyykek
Copy link

Kyykek commented Apr 4, 2025

These impls are very concerning as they violate the assumption that Freeze is structural, an assumption that the compiler relies on for determining if a static must be placed in read only or mutable static memory. Such a check needs to be exact: a type either contains by-value UnsafeCell or it doesn't. Adding unsafe impl Freeze for Foo {} if Foo contains eg. Cell is a massive footgun as such a stray impl could easily cause hard-to-diagnose segfaults when the program attempts to mutate inside the cell.

I'm not sure if this was already mentioned anywhere but that is an oversimplification for generics, since the compiler has no info about the type's contents or implemented traits.
An example where a manual implementation could be useful is with an unsafe or sealed trait that has an associated type with some (safety) invariant that would imply that the associated type is Freeze under some condition (e.g. if Self: Freeze) . Then, a generic struct G<T: Tr>(T::Assoc) containing that associated type could spell out something like unsafe impl<T: Freeze> Freeze for G<T>, like what can be done with e.g. Send. A concrete example that would benefit is generic sized arrays based on type numbers, which have to be implemented using recursion and associated types (though this is niche).
Furthermore, we can already bypass non-Freeze-ness in unsafe code by transmuting to a sufficiently sized and aligned MaybeUninit<impl Freeze> type, promoting it, then casting back to the original type; this is already UB.
Also, whether or not a type contains a cell or not is always provable after monomorphization (perhaps through a second, internal trait), even if there were unsound Freeze implementations.
I'm not claiming that it is a good idea to allow manual implementation, just that there are potential uses for this.

@RalfJung
Copy link
Member

RalfJung commented Apr 4, 2025

I'm not sure if this was already mentioned anywhere but that is an oversimplification for generics, since the compiler has no info about the type's contents or implemented traits.

Generics are fully instantiated when compiling code, and at that point we rely on Freeze being just a way to check for the presence of UnsafeCell.

NoCell does not have to take over that role of Freeze, but if it does not, we may need another stable Freeze-like trait in the future if we want to properly specify our aliasing model.

I'm not denying there are uses for manual impls of this trait, but it's a highly non-trivial feature and at this point I do not think we have strong enough motivation for it.

@Kyykek
Copy link

Kyykek commented Apr 4, 2025

perfectly reasonable, and also we can get around this by having a second version of the type that has the extra bound :)

@tmandry
Copy link
Member

tmandry commented Apr 4, 2025

I spawned a Zulip thread for bikeshedding of new possible PhantomData names, like PhantomEmbedded and PhantomIndirect mentioned in scottmcm's comment: Naming for new PhantomData-like types.

@traviscross
Copy link
Contributor

Another option, maybe, would be to 1) fix PhantomData as planned, and 2) stabilize Unique<T: ?Sized>.

pub struct Unique<T: ?Sized> {
    pointer: NonNull<T>,
    _marker: PhantomData<T>,
}
unsafe impl<T: ?Sized> NoCell for Unique<T> {}

Then we could tell people to write PhantomData<Unique<T>>, which seems entirely intuitive, and is pleasingly compositional.

@tmandry
Copy link
Member

tmandry commented Apr 4, 2025

Another option, maybe, would be to 1) fix PhantomData as planned, and 2) stabilize Unique<T: ?Sized>.

Yeah I would be on board for this. In fact I brought it up in the lang team meeting the other day and someone said it was intentionally unstable, but didn't know the reason. If we can in fact stabilize Unique<T> (or Owned<T> as was brought up in the zulip thread) that would be great.

@RalfJung
Copy link
Member

RalfJung commented Apr 5, 2025

Historically, Unique was meant to be a type that equips pointers with noalias annotations. However, we don't have a satisfying model of what exactly the requirements for that would be. Tree Borrows has experimental support for making Unique special but it's not working well; more work would be needed. So I think that's why it is unstable.

Or we could decide that Unique definitely does not have any noalias implications.

Would Unique<T> be Send/Sync if T is? I think it should.

This would be yet another pointer type in core, so likely all the NonNull APIs would want to be duplicated for Unique, which is a lot.

@Kixunil
Copy link

Kixunil commented Apr 5, 2025

Or we could decide that Unique definitely does not have any noalias implications.

Would be a shame to lose those optimizations wouldn't it?

Would Unique<T> be Send/Sync if T is? I think it should.

It already is.

@RalfJung
Copy link
Member

RalfJung commented Apr 7, 2025

Would be a shame to lose those optimizations wouldn't it?

More like, losing optimization potential -- currently, Unique does not get any aliasing optimizations.
Also see #3712: even for Box we have no evidence of noalias being particularly useful for performance.

@traviscross
Copy link
Contributor

traviscross commented Apr 7, 2025

Also see:

(With thanks to @RalfJung for the link.)

@RalfJung
Copy link
Member

RalfJung commented Apr 9, 2025

@Nadrieril brings up an interesting data point: in this paper, they found a bug (Bug #30) where someone used a type Opaque = PhantomData<UnsafeCell<*mut T>>; for FFI purposes, expecting &Opaque to get read-write permissions. IOW, this is unsafe code authors relying for soundness on PhantomData<T>: !NoCell when T: !NoCell.

So, by using PhantomData as name for the "indirect" case, we risk soundness issues like that. By using PhantomData as name for the "embedded" case, we risk breaking some code (crater found 250 regressions).

@traviscross
Copy link
Contributor

@rustbot labels -I-lang-nominated

We discussed this today and proposed that this probably needs a design meeting:

@rustbot rustbot removed the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label Apr 16, 2025
@nikomatsakis
Copy link
Contributor

@RalfJung I don't quite follow this last example. I was thinking that there are kind of 3 levels here -- I'm wondering if you agree with this characterization. Something like

  • "materialized" -- T: NoCell excludes zero-sized types and things, so PhantomData<T>: NoCell for all T. This matches (AIUI) the behavior of Freeze today.
  • "phantom" -- PhantomData<T>: NoCell, which is what we proposed, and what is consistent with other auto traits
  • "logical' -- you can unsafe impl NoCell for T which is promising that you do not modify bytes when an &T may be live (I think)

I understand you to be opposed to 'logical', I believe because of its interaction with opsem. I guess that you don't want to have to think about what traits are implemented to understand whether some memory is part of a cell and hence determine whether something is UB. Is that correct? (I guess in this case it would be library UB, though?)

I don't know your take on what I called "phantom". I can't tell whether that last example is indicating a case where "phantom" would be helpful or not or if it's not relevant at all because of the pointer indirection (*mut T).

@Jules-Bertholet
Copy link
Contributor

What we were specifically discussing today were cases where the "pointer-type with the relevant characteristics regarding lifetime" would be Box<T>. The problem is that for no_std crates, Box<T> isn't an option, and so we'd be leaving people with no good options for that.

I have an old internals thread about this missing raw pointer type.

@riking
Copy link

riking commented Apr 16, 2025

... I'd suggest that the RFC propose option 1, that we stabilize unsafe impl NoCell for .., as the primary solution.

  • "logical' -- you can unsafe impl NoCell for T which is promising that you do not modify bytes when an &T may be live (I think)

I think this is incompatible with NoCell being a precise representation of the rules that code generation cares about, unless you have a setup where it's UB to call UnsafeCell::get nested inside a unsafe impl NoCell type. And the current RFC text and discussion makes no mention of what the added Undefined Behavior should look like for incorrectly impl NoCell.

If we said there would be a compiler lint ("using this type is immediate Undefined Behavior") against unsafe impl NoCell if the type always has an active UnsafeCell no matter which enum/union variants are selected: would that still be useful?

I think, no, that is a whole-program proof and a severe semver hazard. The safety proof of the unsafe impl would be that no values are ever constructed with the Cell-containing enum variants, which is a whole-program property.

I am extremely doubtful of any valid uses for unsafe impl NoCell on a type that (through struct fields, single-variant enums, or single-variant unions only) embeds an UnsafeCell and suspect such an impl would be always wrong.

NoCell being a precise representation of the rules that code generation cares about -- that there can be no data races if a &T is held -- is what makes it useful for safety proofs. Allowing unsafe impl with no UB + doesn't affect codegen would destroy that correspondence. Allowing unsafe impl with UB is ludicrously error-prone as far as I can tell right now. I do not think allowing it is a good idea.

@Jules-Bertholet
Copy link
Contributor

NoCell being a precise representation of the rules that code generation cares about -- that there can be no data races if a &T is held -- is what makes it useful for safety proofs.

No, it merely needs to be an underapproximation of what codegen cares about. This leaves crate authors freedom to evolve their crate by delaying a commitment to NoCell.

@RalfJung
Copy link
Member

RalfJung commented Apr 17, 2025

@nikomatsakis

I don't quite follow this last example

You mean this? Which part is confusing? All I am saying is: there are cases where PhantomEmbedded is needed for soundness (PantomIndirect would be unsound), and at least one such case has been found in the wild by that paper.

I was thinking that there are kind of 3 levels here

I can't follow what you mean by these levels. Are these different user-defined types, using different fields to encode different things they need from the language? Or are these different proposals for how the language works? The three items are too terse for me to understand what each case actually refers to.

I understand you to be opposed to 'logical', I believe because of its interaction with opsem. I guess that you don't want to have to think about what traits are implemented to understand whether some memory is part of a cell and hence determine whether something is UB. Is that correct?

Roughly, yes. What we currently do in the aliasing model to find out which parts of the data behind a shared reference allow mutation is the following:

  • Recursively traverse product types (tuples, structs, closure environments, ...)
    • When we hit UnsafeCell, the inside of it is mutable
    • When we hit a union or enum, stop the recursion. This is immutable if and only of if it is NoCell
    • Everything else is immutable

If we allow unsafe impl NoCell, we'll want to change this, though I am not entirely sure how. Does such an impl always "win" over all the inner types? I.e., we could have

  • Recursively traverse product types (tuples, structs, closure environments, ...)
    • If this type is NoCell, stop the recursion, everything here is immutable
    • When we hit UnsafeCell, the inside of it is mutable
    • When we hit a union or enum, the inside is mutable (since we already ensured !NoCell)
    • Everything else is immutable

I find this a bit harder to think about, but maybe that's just a matter of habit. It also means this unsafe impl NoCell is really quite unsafe -- if any of the types you use for your fields uses interior mutability now or in a future semver-compatible version, it will be UB to call those operations based on an &YourType reference. It makes me quite uneasy, and while I'm certainly willing to consider it in the future, I don't think we have strong enough motivation for it right now. All we need is PhantomEmbedded and PhantomIndirect (one of which should be called PhantomData, and IMO that should be PhantomEmbedded, but the main point is having both of those types).

(I guess in this case it would be library UB, though?)

Everything the aliasing model does must be language UB. Are we talking about the same thing?

I don't know your take on what I called "phantom".

I don't understand what you are proposing here so I can't answer this question.^^ PhantomData<T>: NoCell sounds like you mean this to be true for all T but that was the main point of "materialized", no? I think you'll need to use more than those 22 characters to describe this option. :)

@riking

The safety proof of the unsafe impl would be that no values are ever constructed with the Cell-containing enum variants, which is a whole-program property.

No, the safety requirement would be that no mutation happens. Just constructing the Cell should be entirely fine.

Stacked Borrows is not happy with this, but that's a (hard to fix) Stacked Borrows bug which should not affect our decision here. Tree Borrows fixes this bug, so there are plausible models that do not have this problem.

@Jules-Bertholet

No, it merely needs to be an underapproximation of what codegen cares about. This leaves crate authors freedom to evolve their crate by delaying a commitment to NoCell.

Agreed. However. if unsafe impl NoCell is something users can write and is intended to have opsem consequences (bring back noalias to shared references), then opsem must treat NoCell as the exact truth, not some underapproximation.

If it does not have opsem consequences, we have the very strange situation where we'll put const promoted data into immutable memory for types where opsem may actually permit mutation through shared references. I think we should not do that -- it would basically be more of rust-lang/unsafe-code-guidelines#493.


## `core::marker::PhantomCell`

This ZST is proposed as a means for maintainers to reliably opt out of `NoCell` without constraining currently `!NoCell` ZSTs to remain so.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any situation where PhantomCell is a necessity, other than to prevent future API breakage beforehand? IIUC it's always required to use UnsafeCell to create internal mutability, thus it's still unsound to add a _marker: PhantomCell and expect to do anything creative with this struct (like mutating other fields behind &), yes?

If so, NoCell's behavior sounds like a diverge from soundness guarantees and optimization invariants (Freeze). People who write PhantomCell would be only to opt-out their structs from const promotions, and they won't expect a surprising de-optimization of all other code using &Struct. Also in this way, user has no reason to override Freeze impl, making it unnecessary to be exposed to users at all.

@Jules-Bertholet
Copy link
Contributor

Another wrinkle with unsafe impl NoCell, if it has opsem consequences, is that such impls cannot be conditional on lifetimes.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. I-lang-radar Items that are on lang's radar and will need eventual work or consideration. proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.