From 113c8bd08f600b577ec2294d944ad4d56ea33dbc Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 21 Dec 2023 17:22:52 +0000 Subject: [PATCH 1/7] Update reference & deshadowing. This changes the RFC in two significant ways: * As requested widely, it now proposes that we implement Receiver for NonNull and Weak. This requires us, for the first time, to add explicit code to spot potentially shadowed methods and avoid such shadowing. This is described. * It expands the Reference section to describe changes to the probing algorithm, which are now a little more extensive than the previous version of the RFC described, because we now search two different chains - one for types into which the receiver can be converted, and another chain for locations to search for possible methods. --- text/3519-arbitrary-self-types-v2.md | 89 ++++++++++++++++++++++++---- 1 file changed, 79 insertions(+), 10 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index e6c8d7553b0..4c5ffc4d573 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -228,15 +228,46 @@ where (See [alternatives](#no-blanket-implementation) for discussion of the tradeoffs here.) -It is also implemented for `&T`, `&mut T`, `*const T` and `*mut T`. +It is also implemented for `&T`, `&mut T`, `Weak`, `NonNull`, `*const T` and `*mut T`. -## Compiler changes +## Compiler changes: method probing -The existing Rust [reference section for method calls describes the algorithm for assembling method call candidates](https://doc.rust-lang.org/reference/expressions/method-call-expr.html). This algorithm changes in one simple way: instead of dereferencing types (using the `Deref`) trait, we use the new `Receiver` trait to determine the next step. +The existing Rust [reference section for method calls describes the algorithm for assembling method call candidates](https://doc.rust-lang.org/reference/expressions/method-call-expr.html), and there's more detail in the [rustc dev guide](https://rustc-dev-guide.rust-lang.org/method-lookup.html). -(Note that the existing algorithm isn't quite as simple as following the chain of `Deref`. In particular, `&T` and `&mut T` are considered as candidates too at each step; this RFC does not change that.) +The key part of the first page is this: -Because a blanket implementation is provided for users of the `Deref` trait and for `&T`/`&mut T`, the net behavior is similar. But this provides the opportunity for types which can't implement `Deref` to act as method receivers. +> Then, for each candidate type `T`, search for a visible method with a receiver of that type in the following places: +> - `T`'s inherent methods (methods implemented directly on `T`). +> Any of the methods provided by a visible trait implemented by `T`. + +This changes. + +The list of candidate types is assembled in exactly the same way, but we now search for a visible method with a receiver of that type in _more_ places. + +Specifically, instead of using the list of candidate types assembled using the `Deref` trait, we search a list assembled using the `Receiver` trait. As `Receiver` is implemented for all types that implement `Deref`, this is a longer list. + +It's particularly important to emphasize that the list of candidate receiver types _does not change_ - that's still assembled using the `Deref` trait just as now. But, a wider set of locations is searched for methods with those receiver types. + +For instance, `Weak` implements `Receiver` but not `Deref`. Imagine you have `let t: Weak = /* obtain */; t.some_method();`. We will now search `impl SomeStruct {}` blocks for an implementation of `fn some_method(self: Weak)`, `fn some_method(self: &Weak)`, etc. The possible self types are unchanged - they're still obtained by searching the `Deref` chain for `t` - but we'll look in more places for methods with those valid `self` types. + +## Compiler changes: deshadowing +[compiler-changes-deshadowing]: #compiler-changes-deshadowing + +The major functional change to the compiler is described above, but a couple of extra adjustments are necessary to avoid future compatibility breaks by method shadowing. + +Specifically, that page also states: + +> If this results in multiple possible candidates, then it is an error, and the receiver must be converted to an appropriate receiver type to make the method call. + +This changes. For smart pointer types which implement `Receiver` (such as `NonNull`) the future addition of any method would become an incompatible change, because it would run the risk of this ambiguity if there were a method of the same name within `T`. So, if there are multiple candidates, and if one of those candidates is in a more "nested" level of receiver than the others (that is, further along the chain of `Receiver`), we will choose that candidate and warn instead of producing a fatal error. + +Similarly, + +> Note: the lookup is done for each type in order, which can occasionally lead to surprising results. + +This changes too, for the same reason. We check for matching candidates for `T`, `&T` and `&mut T`, and again, if there's a candidate on an "inner" type (that, is, further along the chain of `Receiver`) we will choose that type in preference to less nested types and emit a warning. + +(The current reference doesn't describe it, but the current algorithm also searches for method receivers of type `*const Self` and handles them explicitly in case the receiver type was `*mut Self`. We do not check for cases where a new `self: *mut Self` method on an outer type might shadow an existing `self: *const SomePtr` method on an inner type. Although this is a theoretical risk, such compatibility breaks should be easy to avoid because `self: *mut Self` are rare. It's not readily possible to apply the same de-shadowing approach to these, because we already intentionally shadow `*const::cast` with `*mut::cast`.) ## Object safety @@ -276,6 +307,11 @@ The existing branches in the compiler for "arbitrary self types" already emit ex } ``` We don't know a use-case for this. There are several cases where this can result in misleading diagnostics. (For instance, if such a method is called with an incorrect type (for example `smart_ptr.a::<&Foo>()` instead of `smart_ptr.a::()`). We could attempt to find and fix all those cases. However, we feel that generic receiver types might risk subtle interactions with method resolutions and other parts of the language. We think it is a safer choice to generate an error on any declaration of a generic `self` type. +- As noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing) we will downgrade an existing error to a warning if there are multiple + method candidates found, if one of those candidates is further along the chain of `Receiver`s than the others. +- As also noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing), we will produce a new warning if a method in an inner type is chosen + in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it + in preference to `self: T` or `self: &T` in the outer type. # Drawbacks [drawbacks]: #drawbacks @@ -297,7 +333,44 @@ Rust standard library smart pointers are designed with this shadowing behavior i Furthermore, the `Deref` trait itself [documents this possible compatibility hazard](https://doc.rust-lang.org/nightly/std/ops/trait.Deref.html#when-to-implement-deref-or-derefmut), and the Rust API Guidelines has [a guideline about avoiding inherent methods on smart pointers](https://rust-lang.github.io/api-guidelines/predictability.html#smart-pointers-do-not-add-inherent-methods-c-smart-ptr). -These method shadowing risks also apply to `Receiver`. This RFC does not make things worse for types that implement `Deref`, it only adds additional flexibility to the `self` parameter type for `T::m`. +This RFC does not make things worse for types that implement `Deref`. + +_However_, this RFC allow types to implement `Receiver`, and in fact does so for `NonNull` and `Weak`. `NonNull` and `Weak` were not designed with method shadowing concerns in mind. This would run the risk of breakage: + +```rust +struct Concrete; + +impl Concrete { + fn wardrobe(self: Weak) { } +} + +fn main() { + let concrete: Weak = /* obtain */; + concrete.wardrobe() +} +``` + +If Rust now adds `Weak::wardrobe(self)`, the above valid code would start to error. + +The same would apply in this slightly different circumstance: + +```rust +struct Concrete; + +impl Concrete { + fn wardrobe(self: &Weak) { } // this is now a reference +} + +fn main() { + let concrete: Weak = /* obtain */; + concrete.wardrobe() +} +``` + +If Rust added `Weak::wardrobe(&self)` we would start to produce an error here. If Rust added `Weak::wardrobe(self)` then it would be +even worse - code would start to call `Weak::wardrobe` where it had previously called `Concrete::wardrobe`. + +The [#compiler-changes-deshadowing](deshadowing section of the compiler changes, above), describes how we avoid this. The compiler will take pains to identify any such ambiguities. If it finds them, it will warn of the situation and then choose the innermost method (in the example above, always `Concrete::wardrobe`). # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -367,10 +440,6 @@ This RFC proposes to implement `Receiver` for `*mut T` and `*const T` within the We prefer the option of specifying behavior in the library using the normal trait, though it's a compatibility break for users of Rust who don't adopt the `core` crate (including compiler tests). -## Implement for `Weak` and `NonNull` - -`Weak` and `NonNull` were not supported by the prior unstable arbitrary self types support, but they share the property that it may be desirable to implement method calls to `T` using them as self types. Unfortunately they also share the property that these types have many Rust methods using `self`, `&self` or `&mut self`. If we added to the set of Rust methods in future, we'd [shadow any such method calls](#method-shadowing). We can't implement `Receiver` for these types unless we come up with a policy that all subsequent additions to these types would instead be associated functions. That would make the future APIs for these types a confusing mismash of methods and associated functions, and the extra user complexity doesn't seem merited. - ## Not do it [not-do-it]: #not-do-it From 9c593663115361fb50e4088312ca1c668fa687cc Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Mon, 12 Feb 2024 16:58:39 +0000 Subject: [PATCH 2/7] Include example lint. --- text/3519-arbitrary-self-types-v2.md | 31 +++++++++++++++++++++++++--- 1 file changed, 28 insertions(+), 3 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 4c5ffc4d573..75b2dc27a44 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -309,9 +309,34 @@ The existing branches in the compiler for "arbitrary self types" already emit ex We don't know a use-case for this. There are several cases where this can result in misleading diagnostics. (For instance, if such a method is called with an incorrect type (for example `smart_ptr.a::<&Foo>()` instead of `smart_ptr.a::()`). We could attempt to find and fix all those cases. However, we feel that generic receiver types might risk subtle interactions with method resolutions and other parts of the language. We think it is a safer choice to generate an error on any declaration of a generic `self` type. - As noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing) we will downgrade an existing error to a warning if there are multiple method candidates found, if one of those candidates is further along the chain of `Receiver`s than the others. -- As also noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing), we will produce a new warning if a method in an inner type is chosen - in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it - in preference to `self: T` or `self: &T` in the outer type. +- As also noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing), we will produce a new warning if a method in an inner type is chosen in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it in preference to `self: T` or `self: &T` in the outer type. An example warning might be: + + ``` + warning[W0666]: ambiguous function call + --> src/main.rs:13:4 + | + 13 | orbit_weak.retrograde(); + | ^^^^^^^^^^^^ + | + = note: you may have intended a call to `Orbit::retrograde` or + to `Weak::retrograde` + = note: this method won't be called + --> src/rc/rc.rs:136:21 + | + 136 | fn retrograde(&self) { + | ^^^^^^^^^^^^^^^^^ + | + = note: because we'll call this method instead + --> src/space/near_earth.rs:357:68 + | + 357 | fn retrograde(self: Weak) { + | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + | + = help: call as a function not a method: + ~ Orbit::retrograde(orbit_weak) + = help: call as a function not a method: + ~ Weak::retrograde(orbit_weak) + ``` # Drawbacks [drawbacks]: #drawbacks From a5e274b3353275d93dfb23a99338b918de663630 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Wed, 14 Feb 2024 18:45:05 +0000 Subject: [PATCH 3/7] Update text/3519-arbitrary-self-types-v2.md Co-authored-by: Mads Marquart --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index 75b2dc27a44..b2f42524570 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -248,7 +248,7 @@ Specifically, instead of using the list of candidate types assembled using the ` It's particularly important to emphasize that the list of candidate receiver types _does not change_ - that's still assembled using the `Deref` trait just as now. But, a wider set of locations is searched for methods with those receiver types. -For instance, `Weak` implements `Receiver` but not `Deref`. Imagine you have `let t: Weak = /* obtain */; t.some_method();`. We will now search `impl SomeStruct {}` blocks for an implementation of `fn some_method(self: Weak)`, `fn some_method(self: &Weak)`, etc. The possible self types are unchanged - they're still obtained by searching the `Deref` chain for `t` - but we'll look in more places for methods with those valid `self` types. +For instance, `Weak` implements `Receiver` but not `Deref`. Imagine you have `let t: Weak = /* obtain */; t.some_method();`. We will now search `impl SomeStruct {}` blocks for an implementation of `fn some_method(self: Weak)`, `fn some_method(self: &Weak)`, etc. The possible self types in the method call expression are unchanged - they're still obtained by searching the `Deref` chain for `t` - but we'll look in more places for methods with those valid `self` types. ## Compiler changes: deshadowing [compiler-changes-deshadowing]: #compiler-changes-deshadowing From 7ea82bd025eaf910d31396ccedf526aec91f087e Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 15 Feb 2024 13:49:14 +0000 Subject: [PATCH 4/7] Move lint to the correct place. --- text/3519-arbitrary-self-types-v2.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index b2f42524570..c4ac97a114e 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -307,10 +307,7 @@ The existing branches in the compiler for "arbitrary self types" already emit ex } ``` We don't know a use-case for this. There are several cases where this can result in misleading diagnostics. (For instance, if such a method is called with an incorrect type (for example `smart_ptr.a::<&Foo>()` instead of `smart_ptr.a::()`). We could attempt to find and fix all those cases. However, we feel that generic receiver types might risk subtle interactions with method resolutions and other parts of the language. We think it is a safer choice to generate an error on any declaration of a generic `self` type. -- As noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing) we will downgrade an existing error to a warning if there are multiple - method candidates found, if one of those candidates is further along the chain of `Receiver`s than the others. -- As also noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing), we will produce a new warning if a method in an inner type is chosen in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it in preference to `self: T` or `self: &T` in the outer type. An example warning might be: - +- As noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing) we will downgrade an existing error to a warning if there are multiple method candidates found, if one of those candidates is further along the chain of `Receiver`s than the others. An example warning might be: ``` warning[W0666]: ambiguous function call --> src/main.rs:13:4 @@ -337,6 +334,7 @@ The existing branches in the compiler for "arbitrary self types" already emit ex = help: call as a function not a method: ~ Weak::retrograde(orbit_weak) ``` +- As also noted in [#compiler-changes-deshadowing](the section about compiler changes for deshadowing), we will produce a new warning if a method in an inner type is chosen in preference to a method in an outer type ("inner" = further along the `Receiver` chain) and the inner type is either `self: &T` or `self: &mut T` and we're choosing it in preference to `self: T` or `self: &T` in the outer type. (The warning would be very similar to the above.) # Drawbacks [drawbacks]: #drawbacks From 01c6ca7e611ef9d627000e8f7636d12d6f8ca3da Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 15 Feb 2024 13:57:15 +0000 Subject: [PATCH 5/7] Add future work section for MaybeUninit etc. --- text/3519-arbitrary-self-types-v2.md | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index c4ac97a114e..fc97d73f907 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -555,6 +555,30 @@ Even if the reader takes the view that all calls into foreign languages are intr A previous PR based on the `Deref` alternative has been proposed before https://github.com/rust-lang/rfcs/pull/2362 and was postponed with the expectation that the lang team would [get back to `arbitrary_self_types` eventually](https://github.com/rust-lang/rfcs/pull/2362#issuecomment-527306157). +# Future work + +We could consider implementing `Receiver` for other types, e.g. [`std::cell`](https://doc.rust-lang.org/std/cell/index.html) types, [`std::sync`](https://doc.rust-lang.org/std/sync/index.html) types, [`std::cmp::Reverse`](https://doc.rust-lang.org/std/cmp/struct.Reverse.html), [`std::num::Wrapping`](https://doc.rust-lang.org/nightly/std/num/struct.Wrapping.html), [`std::mem::MaybeUninit`](https://doc.rust-lang.org/std/mem/union.MaybeUninit.html), [`std::task::Poll`](https://doc.rust-lang.org/nightly/std/task/enum.Poll.html), and so on - possibly even for arrays, `Vec`, `BTreeSet` etc. + +There seems to be no disadvantage to doing this - taking `Vec` as an example, it would only have any effect on the behavior of code if somebody implemented a method taking `Vec` as a receiver. On the other hand, it's hard to imagine use-cases for some of these. It seems best to consider these future possibilities based on whether the end-result seems natural or strange. + +```rust +impl Vexation { + fn do_something_to_vec(self: Vec) { } + fn do_something_to_maybeuninit(self: MaybeUninit) {} +} + +fn main { + let v = Vec::new(); + v.push(Vexation); + v.do_something_to_vec(); // this seems weird and I can't imagine a use-case + + let mut m = MaybeUninit::::uninit(); + m.do_something_to_maybeuninit(); // this seems fine and useful and so maybe we should in future implement Receiver for MaybeUninit +} +``` + +For now, though, we should clearly restrict `Receiver` to those types for which there's a demonstrated need. + # Feature gates This RFC is in an unusual position regarding feature gates. There are two existing gates: From 09f46f7c87fe4f6abc06b1f89058546da72cca14 Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 15 Feb 2024 13:58:33 +0000 Subject: [PATCH 6/7] Code error. --- text/3519-arbitrary-self-types-v2.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index fc97d73f907..feaceacc5d1 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -568,7 +568,7 @@ impl Vexation { } fn main { - let v = Vec::new(); + let mut v = Vec::new(); v.push(Vexation); v.do_something_to_vec(); // this seems weird and I can't imagine a use-case From 6eb91e99181b6e3d94a02884ea0deb425af05aee Mon Sep 17 00:00:00 2001 From: Adrian Taylor Date: Thu, 15 Feb 2024 15:15:45 +0000 Subject: [PATCH 7/7] Suggestions for compiler changes section. --- text/3519-arbitrary-self-types-v2.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/text/3519-arbitrary-self-types-v2.md b/text/3519-arbitrary-self-types-v2.md index feaceacc5d1..b53fc5adbd2 100644 --- a/text/3519-arbitrary-self-types-v2.md +++ b/text/3519-arbitrary-self-types-v2.md @@ -236,17 +236,19 @@ The existing Rust [reference section for method calls describes the algorithm fo The key part of the first page is this: +> The first step is to build a list of **candidate receiver types**. Obtain these by repeatedly dereferencing the receiver expression's type, adding each type encountered to the list, then finally attempting an unsized coercion at the end, and adding the result type if that is successful. Then, for each candidate `T`, add `&T` and `&mut T` to the list immediately after `T`. + > Then, for each candidate type `T`, search for a visible method with a receiver of that type in the following places: > - `T`'s inherent methods (methods implemented directly on `T`). > Any of the methods provided by a visible trait implemented by `T`. -This changes. +We'll call this second list the **candidate methods**. -The list of candidate types is assembled in exactly the same way, but we now search for a visible method with a receiver of that type in _more_ places. +With this RFC, the candidate receiver types are assembled the same way - nothing changes. But, the **candidate methods** are assembled in a different way. Specifically, instead of iterating the candidate receiver types, we assemble a new list of types by following the chain of `Receiver` implementations. As `Receiver` is implemented for all types that implement `Deref`, this may be the same list or a longer list. Aside from following a different trait, the list is assembled the same way, including the insertion of equivalent reference types. -Specifically, instead of using the list of candidate types assembled using the `Deref` trait, we search a list assembled using the `Receiver` trait. As `Receiver` is implemented for all types that implement `Deref`, this is a longer list. +We then search each type for inherent methods or trait methods in the existing fashion - the only change is that we search a potentially longer list of types. -It's particularly important to emphasize that the list of candidate receiver types _does not change_ - that's still assembled using the `Deref` trait just as now. But, a wider set of locations is searched for methods with those receiver types. +It's particularly important to emphasize also that the list of candidate receiver types _does not change_. But, a wider set of locations is searched for methods with those receiver types. For instance, `Weak` implements `Receiver` but not `Deref`. Imagine you have `let t: Weak = /* obtain */; t.some_method();`. We will now search `impl SomeStruct {}` blocks for an implementation of `fn some_method(self: Weak)`, `fn some_method(self: &Weak)`, etc. The possible self types in the method call expression are unchanged - they're still obtained by searching the `Deref` chain for `t` - but we'll look in more places for methods with those valid `self` types.