Tracking issue for RFC 3519: `arbitrary_self_types`
arielb1 opened this issue · 121 comments
This is the tracking issue for RFC 3519: Arbitrary self types v2.
The feature gate for this issue is #![feature(arbitrary_self_types)].
About tracking issues
Tracking issues are used to record the overall progress of implementation. They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions. A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature. Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.
Steps
- Accept RFC.
- Implement.
- Add documentation to the dev guide.
- See the instructions.
- Add documentation to the reference.
- See the instructions.
- Add formatting for new syntax to the style guide.
- See the nightly style procedure.
Unresolved Questions
None.
Related
Implementation history
TODO.
(Below follows content that predated the accepted Arbitrary Self Types v2 RFC.)
- figure out the object safety situation
- figure out the handling of inference variables behind raw pointers
- decide whether we want safe virtual raw pointer methods
Object Safety
See #27941 (comment)
Handling of inference variables
Calling a method on *const _
could now pick impls of the form
impl RandomType {
fn foo(self: *const Self) {}
}
Because method dispatch wants to be "limited", this won't really work, and as with the existing situation on &_
we should be emitting an "the type of this value must be known in this context" error.
This feels like fairly standard inference breakage, but we need to check the impact of this before proceeding.
Safe virtual raw pointer methods
e.g. this is UB, so we might want to force the call <dyn Foo as Foo>::bar
to be unsafe somehow - e.g. by not allowing dyn Foo
to be object safe unless bar
was an unsafe fn
trait Foo {
fn bar(self: *const Self);
}
fn main() {
// creates a raw pointer with a garbage vtable
let foo: *const dyn Foo = unsafe { mem::transmute([0usize, 0x1000usize]) };
// and call it
foo.bar(); // this is UB
}
However, even today you could UB in safe code with mem::size_of_val(foo)
on the above code, so this might not be actually a problem.
More information
There's no reason the self
syntax has to be restricted to &T
, &mut T
and Box<T>
, we should allow for more types there, e.g.
trait MyStuff {
fn do_async_task(self: Rc<Self>);
}
impl MyStuff for () {
fn do_async_task(self: Rc<Self>) {
// ...
}
}
Rc::new(()).do_async_task();
Why would you need this?
Why wouldn't you write an impl like this:
impl MyStuff for Rc<()> {
fn do_async_task(self) {
// ...
}
}
I'd rather define the trait differently. Maybe like this:
trait MyStuff: Rc {
fn do_async_task(self);
}
In this case, Rc would be a trait type. If every generic type implemented a specific trait (this could be implemented automatically for generic types) this seems more understandable to me.
This could only be allowed for trait
methods, right?
For inherent methods, I can't impl Rc<MyType>
, but if impl MyType
can add methods with self: Rc<Self>
, it seems like that would enable weird method shadowing.
This is still pending lang team decisions (I hope there will be at least 1 RFC) but I think it will only be allowed for trait method impls.
You can't implement anything for Rc<YourType>
from a crate that does not own the trait.
So changes needed:
- remove the current error message for trait methods only, but still have a feature gate.
- make sure fn(self: Rc<Self>) doesn't accidentally become object-safe
- make sure method dispatch works for Rc<Self> methods
- add tests
I'll look into this.
Note that this is only supported to work with trait methods (and trait impl methods), aka
trait Foo {
fn foo(self: Rc<Self>);
}
impl Foo for () {
fn foo(self: Rc<Self>) {}
}
and is NOT supposed to work for inherent impl methods:
struct Foo;
impl Foo {
fn foo(self: Rc<Self>) {}
}
I got caught in some more Stylo work that's gonna take a while, so if someone else wants to work on this in the meantime feel free.
Is this supposed to allow any type as long as it involves Self
? Or must it impl Deref<Target=Self>
?
trait MyStuff {
fn a(self: Option<Self>);
fn b(self: Result<Self, Self>);
fn c(self: (Self, Self, Self));
fn d(self: Box<Box<Self>>);
}
impl MyStuff for i32 {
...
}
Some(1).a(); // ok?
Ok(2).b(); // ok?
(3, 4, 5).c(); // ok?
(box box 6).d(); // ok?
I've started working on this issue. You can see my progress on this branch
@arielb1 You seem adamant that this should only be allowed for traits and not structs. Aside from method shadowing, are there other concerns?
inherent impl methods are loaded based on the type. You shouldn't be able to add a method to Rc<YourType>
that is usable without any use
statement.
That's it, if you write something like
trait Foo {
fn bar(self: Rc<Self>);
}
Then it can only be used if the trait Foo
is in-scope. Even if you do something like fn baz(self: u32);
that only works for modules that use
the trait.
If you write an inherent impl, then it can be called without having the trait in-scope, which means we have to be more careful to not allow these sorts of things.
@arielb1 Can you give an example of what we want to avoid? I'm afraid I don't really see what the issue is. A method you define to take &self
will still be callable on Rc<Self>
, the same as if you define it to take self: Rc<Self>
. And the latter only affects Rc<MyStruct>
, not Rc<T>
in general.
I've been trying to figure out how we can support dynamic dispatch with arbitrary self types. Basically we need a way to take a CustomPointer<Trait>
, and do two things: (1) extract the vtable, so we can call the method, and (2) turn it into a CustomPointer<T>
without knowing T
.
(1) is pretty straightforward: call Deref::deref
and extract the vtable from that. For (2), we'll effectively need to do the opposite of how unsized coercions are implemented for ADTs. We don't know T
, but we can coerce to CustomPointer<()>
, assuming CustomPointer<()>
has the same layout as CustomPointer<T>
for all T: Sized
. (Is that true?)
The tough question is, how do we get the type CustomPointer<()>
? It looks simple in this case, but what if CustomPointer
had multiple type parameters and we had a CustomPointer<Trait, Trait>
? Which type parameter do we switch with ()
? In the case of unsized coercions, it's easy, because the type to coerce to is given to us. Here, though, we're on our own.
@arielb1 @nikomatsakis any thoughts?
and is NOT supposed to work for inherent impl methods:
Wait, why do you not want it work for inherent impl methods? Because of scoping? I'm confused. =)
I've been trying to figure out how we can support dynamic dispatch with arbitrary self types.
I do want to support that, but I expected it to be out of scope for this first cut. That is, I expected that if a trait uses anything other than self
, &self
, &mut self
, or self: Box<Self>
it would be considered no longer object safe.
I do want to support that, but I expected it to be out of scope for this first cut.
I know, but I couldn't help looking into it, it's all very interesting to me :)
Wait, why do you not want it work for inherent impl methods? Because of scoping? I'm confused. =)
We need some sort of "orphan rule" to at least prevent people from doing things like this:
struct Foo;
impl Foo {
fn frobnicate<T>(self: Vec<T>, x: Self) { /* ... */ }
}
Because then every crate in the world can call my_vec.frobnicate(...);
without importing anything, so if 2 crates do this there's a conflict when we link them together.
Maybe the best way to solve this would be to require self
to be a "thin pointer to Self" in some way (we can't use Deref
alone because it doesn't allow for raw pointers - but Deref
+ deref of raw pointers, or eventually an UnsafeDeref
trait that reifies that - would be fine).
I think that if we have the deref-back requirement, there's no problem with allowing inherent methods - we just need to change inherent method search a bit to also look at defids of derefs. So that's probably a better idea than restricting to trait methods only.
Note that the CoerceSized
restriction for object safety is orthogonal if we want allocators:
struct Foo;
impl Tr for Foo {
fn frobnicate<A: Allocator+?Sized>(self: RcWithAllocator<Self, A>) { /* ... */ }
}
Where an RcWithAllocator<Self, A>
can be converted to a doubly-fat RcWithAllocator<Tr, Allocator>
.
Because then every crate in the world can call my_vec.frobnicate(...); without importing anything, so if 2 crates do this there's a conflict when we link them together.
Are you saying that there would be a "conflicting symbols for architecture x86_64..." linker error?
Maybe the best way to solve this would be to require self to be a "thin pointer to Self" in some way (we can't use Deref alone because it doesn't allow for raw pointers - but Deref + deref of raw pointers, or eventually an UnsafeDeref trait that reifies that - would be fine).
I'm confused, are you still talking about frobnicate
here, or have you moved on to the vtable stuff?
I'm confused, are you still talking about
frobnicate
here, or have you moved on to the vtable stuff?
The deref-back requirement is supposed to be for everything, not only object-safety. It prevents the problem when one person does
struct MyType;
impl MyType {
fn foo<T>(self: Vec<(MyType, T)>) { /* ... */ }
}
While another person does
struct OurType;
impl OurType {
fn foo<T>(self: Vec<(T, OurType)>) {/* ... */ }
}
And now you have a conflict on Vec<(MyType, OurType)>
. If you include the deref-back requirement, there is no problem with allowing inherent impls.
@arielb1 , you suggest that the point of this is to get around the coherence rules, but I can't see how this wouldn't fall afoul of the same incoherence that the orphan rules are designed to prevent. Can you explain further?
Secondly, the syntax is misleading. Given that fn foo(&mut self)
allows one to write blah.foo()
and have it desugar to foo(&mut blah)
, one would intuitively expect fn foo(self: Rc<Self>)
to allow one to write blah.foo()
and desugar to foo(Rc::new(blah))
... which, indeed, would obviously be pointless, but the discrepancy rankles.
...Oh bleh, in my experiments I'm realizing that fn foo(self: Self)
, fn foo(self: &Self)
, and fn foo(self: &mut Self)
are all surprisingly allowed alternatives to fn foo(self)
, fn foo(&self)
, and fn foo(&mut self)
... and that, astonishingly, fn foo(self: Box<Self>)
is already in the language and functioning in the inconsistent way that I'm grumpy about in the above paragraph. Of course we special-cased Box
... :P
I'm all for supporting extra types for Self; I've been looking forward to it for three years.
@shepmaster feel free to try them out! Now that bors has merged my PR, they're available on nightly behind the arbitrary_self_types
feature gate: https://play.rust-lang.org/?gist=cb47987d3cb3275934eb974df5f8cba3&version=nightly
@bstrie I don't have as good a handle on the coherence rules as @arielb1, but I can't see how arbitrary self types would fall afoul of them. Perhaps you could give an example?
Secondly, the syntax is misleading. Given that fn foo(&mut self) allows one to write blah.foo() and have it desugar to foo(&mut blah), one would intuitively expect fn foo(self: Rc) to allow one to write blah.foo() and desugar to foo(Rc::new(blah))... which, indeed, would obviously be pointless, but the discrepancy rankles.
For the few methods that use non-standard self types, I get that this could be annoying. But they should be used sparingly, only when needed: the self type needs to Deref to Self
anyway, so having the method take &self
or &mut self
would be preferred because it is more general. In particular, taking self: Rc<Self>
should only be done if you actually need to take ownership of the Rc
(to keep the value alive past the end of the method call).
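A minimal sketch of that guideline, assuming a toolchain where self: Rc<Self> is accepted on inherent methods. Task and Queue are hypothetical types invented for illustration and do not come from this thread. The read-only method borrows, while the method that stores the handle takes the Rc by value:

```rust
use std::rc::Rc;

// Hypothetical types, used only to illustrate the guideline.
struct Task {
    name: String,
}

struct Queue {
    pending: Vec<Rc<Task>>,
}

impl Task {
    // Borrowing is enough when the method only reads the value.
    fn name(&self) -> &str {
        &self.name
    }

    // `self: Rc<Self>` is justified here: the queue stores the `Rc`,
    // so the task has to stay alive after this call returns.
    fn enqueue(self: Rc<Self>, queue: &mut Queue) {
        queue.pending.push(self);
    }
}

fn main() {
    let task = Rc::new(Task { name: "build".to_string() });
    let mut queue = Queue { pending: Vec::new() };

    // `&self` methods are reachable through the `Rc` via auto-deref.
    assert_eq!(task.name(), "build");

    // Clone the handle so the caller keeps one and the queue owns another.
    Rc::clone(&task).enqueue(&mut queue);
    assert_eq!(Rc::strong_count(&task), 2);
}
```

The caller decides whether to give up its own handle or clone one first, which is the ownership transfer that a &self method cannot express.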
Secondly, the syntax is misleading. Given that fn foo(&mut self) allows one to write blah.foo() and have it desugar to foo(&mut blah), one would intuitively expect fn foo(self: Rc) to allow one to write blah.foo() and desugar to foo(Rc::new(blah))
I don't find that "inversion principle" intuitive. After all, you can't call fn foo(&mut self)
if your self
isn't mutable. If you have a foo(&mut self)
, you need to pass it something that is basically an &mut Self
. If you have a foo(self: Rc<Self>)
, you need to pass it something that is basically an Rc<Self>
.
More than that, we already have self: Box<Self>
today with no "autoboxing".
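To illustrate the point about self: Box<Self>, here is a small sketch with a hypothetical Job type (not from this thread). The method is only reachable when the receiver is already a Box; a plain value is not boxed automatically:

```rust
// Hypothetical type, used only for illustration.
struct Job(u32);

impl Job {
    // Consumes the box; the receiver must already be a `Box<Job>`.
    fn run(self: Box<Self>) -> u32 {
        self.0 * 2
    }
}

fn main() {
    let boxed = Box::new(Job(21));
    assert_eq!(boxed.run(), 42);

    // No "autoboxing": the line below is rejected by the compiler,
    // because a plain `Job` is not wrapped in a `Box` for you.
    // Job(21).run();
}
```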
Recent Nightlies emit a lot of warnings like this when compiling Servo:
warning[E0619]: the type of this value must be known in this context
--> /Users/simon/projects/servo/components/script/dom/bindings/iterable.rs:81:34
|
81 | dict_return(cx, rval.handle_mut(), true, value.handle())
| ^^^^^^^^^^
|
= note: this will be made into a hard error in a future version of the compiler
First, I had to do git archeology in the rust repo to find that that warning was introduced in bc0439b, which is part of #46837, which links here. The error message should probably include some URL to let people read further details and context, ask questions, etc.
Second, I have no idea what this warning is telling me. Why is this code problematic now, while it wasn't before? What should I do to fix it? The error message should explain some more.
The breakage from this change seems to be larger than it's normally allowed for compatibility warnings.
A crater run wasn't performed, but if even rustc alone hits multiple instances of broken code (#46914), it means that the effect on the ecosystem in general is going to be large.
@SimonSapin I'm working on changing that warning to a future-compatibility lint right now (#46914), which will reference a tracking issue that explains the reason for the change; I still have to finish writing the issue though so right now it doesn't explain anything
I guess we didn't realize there would be so much breakage from that PR. Right now, I'm finding lots of places in rust-lang/rust that produce this warning, and they are only popping up now that I'm turning it into a full-blown lint, which is turned into a hard error by deny(warnings)
.
Can anyone give a summary of what the status is here? Given that we have two accepted, high-priority RFCs mentioning this feature (rust-lang/rfcs#2349 and rust-lang-nursery/futures-rfcs#2), it's concerning that we still don't have even anything resembling an RFC for this. Since I assume (?) that the RFCs in question are going to end up influencing things in libstd I suppose it's not absolutely imperative that we stabilize this anytime soon, but even having an idea of what the design goals of this feature are would be a good start to making sure we don't box ourselves into a corner somehow. It seems unprecedented for a feature to progress so far without even a pre-RFC...
@bstrie there is an RFC PR that @withoutboats has opened. As far as implementation is concerned, everything in that RFC is implemented behind the arbitrary_self_types
feature gate, except that all custom receiver types are considered non-object-safe, and dynamic dispatch hasn't been implemented. I've been working on implementing that, and you can follow this WIP PR for updates.
There are also some things that are implemented that aren't included in that RFC: notably, raw pointer method receivers, and receivers that deref transitively to Self
rather than directly implementing Deref<Target=Self>
, e.g. &Rc<Self>
.
I hope that clears things up a bit. It doesn't completely address your concerns about RFCs being merged that are depending on an unimplemented, undocumented, and perhaps incompletely designed language feature, but hopefully, once I'm finished implementing the object-safety checks and dynamic dispatch, and that RFC gets merged, the foundation for those other RFCs will be less rocky.
@mikeyhew thanks! Up top I've added a link to the RFC so that people can follow along easier. To be clear, do we intend not to support object safety in receivers for this initial implementation pass, or is that just yet-to-be-implemented?
@bstrie the initial implementation pass is finished, and it did not include object-safety. The next pass, which I'm working on right now, will include object-safety.
There are interesting covariant uses of this:
fn foo<'a,E,F: FnOnce() -> Result<&'a mut Cat,E>>(self: F, bar: Bar) -> Result<Dog,E> {
let h = heavy_setup(bar);
let r = self()?.light_manipulations(h);
Ok( heavy_conclusions(r) )
}
Now foo(|| &mut my_mutex.bla.lock()?.deref_mut().blabla)
locks only briefly in the middle, but the API containing foo
need not plan around any particular struct hierarchy or locking scheme, while still enforcing that heavy_setup
be run before heavy_conclusions
.
@burdges that wouldn't be possible, because the self
argument must Deref to Self
. Why does the closure need to be a method receiver and not just a normal argument?
You could create a Thunk<T>
that holds a closure that gets called when it's dereferenced. As far as I know, though, Deref impls aren't supposed to panic, so that might be an antipattern.
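A rough sketch of what such a Thunk<T> could look like, assuming std::cell::OnceCell is available on the toolchain. The field names and the caching behaviour are assumptions made for this example; the comment above only describes the idea. The closure runs on first dereference and its result is cached:

```rust
use std::cell::{Cell, OnceCell};
use std::ops::Deref;

// Hypothetical lazy wrapper: runs `init` on first deref, then caches the result.
struct Thunk<T, F: FnOnce() -> T> {
    init: Cell<Option<F>>,
    value: OnceCell<T>,
}

impl<T, F: FnOnce() -> T> Thunk<T, F> {
    fn new(f: F) -> Self {
        Thunk { init: Cell::new(Some(f)), value: OnceCell::new() }
    }
}

impl<T, F: FnOnce() -> T> Deref for Thunk<T, F> {
    type Target = T;

    fn deref(&self) -> &T {
        self.value.get_or_init(|| {
            // The closure is taken out of the cell, so it runs at most once.
            let f = self.init.take().expect("initializer already taken");
            f()
        })
    }
}

fn main() {
    let calls = Cell::new(0);
    let thunk = Thunk::new(|| {
        calls.set(calls.get() + 1);
        String::from("hello")
    });

    assert_eq!(calls.get(), 0); // nothing has run yet
    assert_eq!(thunk.len(), 5); // first deref runs the closure
    assert_eq!(thunk.len(), 5); // second deref reuses the cached value
    assert_eq!(calls.get(), 1);
}
```

Any panic inside the closure would surface from deref, which is the antipattern concern raised above.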
Yes, a normal argument works just fine of course, and works on stable. I even used it that way by mistake, instead of writing (|| &mut my_mutex.bla.lock()?.deref_mut().blabla).foo()
.
At first blush, this seemingly made sense as a method receiver, but yeah arbitrary self types do sound far off now, and normal arguments might make more sense and/or improve type inference and error messages.
Is this planned to be in 2018 edition?
I'm having some trouble with method resolution when passing through multiple dereferences:
#![feature(pin, arbitrary_self_types)]
use std::marker::Unpin;
use std::ops::{Deref, DerefMut};
#[derive(Copy, Clone)]
pub struct Pin<P> {
pointer: P,
}
impl<P, T: ?Sized> Deref for Pin<P> where
P: Deref<Target = T>,
{
type Target = T;
fn deref(&self) -> &T {
&*self.pointer
}
}
impl<P, T: ?Sized> DerefMut for Pin<P> where
P: DerefMut<Target = T>,
T: Unpin,
{
fn deref_mut(&mut self) -> &mut T {
&mut *self.pointer
}
}
trait Future {
type Output;
fn poll(self: Pin<&mut Self>) -> Self::Output;
}
impl Future for () {
type Output = ();
fn poll(self: Pin<&mut Self>) -> Self::Output { }
}
fn test(pin: Pin<&mut ()>) {
pin.poll()
}
@mikeyhew does this look like a bug or am I doing something wrong?
Am I right that there is no way to declare a vector of Futures yet?
Vec<Box<Future<Output = u32>>>
@withoutboats that looks like a bug. pin.poll()
should work. Future::poll(pin)
does work, so it is indeed a problem with method lookup.
@andreytkachenko as far as I know, the Future
trait from libcore is not object safe yet. So no, that's not possible yet. Once Pin
is considered object safe by the compiler, though, it will be.
@mikeyhew the bug seems connected to #53843 which manifests on stable and isn't tied to this feature
@andreytkachenko to work around the problem, there's currently a type called FutureObj
you can use
@mikeyhew @withoutboats Thank you for your replies, I'll take a look at FutureObj
Now that FutureObj
is removed in nightly, is there another workaround available?
I didn't know FutureObj
was removed already, but it should be obsolete in a matter of weeks
Now that FutureObj is removed in nightly, is there another workaround available?
FutureObj
is now being provided by the futures
crate, because other changes to the API made it unnecessary for it to be in std.
@mikeyhew need futures-preview-0.7
#54383 was just merged, which means that receiver types like Rc<Self>
and Pin<&mut Self>
are now object-safe.
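As a sketch of what object safety for these receivers enables, assuming a toolchain that includes this change: the Widget trait and its two implementors below are hypothetical names chosen for illustration. A method taking self: Rc<Self> is called through Rc<dyn Widget>, so the call is dispatched dynamically:

```rust
use std::rc::Rc;

// Hypothetical trait with an `Rc<Self>` receiver.
trait Widget {
    fn describe(self: Rc<Self>) -> String;
}

struct Button;
struct Label(String);

impl Widget for Button {
    fn describe(self: Rc<Self>) -> String {
        "button".to_string()
    }
}

impl Widget for Label {
    fn describe(self: Rc<Self>) -> String {
        format!("label: {}", self.0)
    }
}

// Each call goes through the vtable of the `Rc<dyn Widget>`.
fn describe_all(widgets: Vec<Rc<dyn Widget>>) -> Vec<String> {
    widgets.into_iter().map(|w| w.describe()).collect()
}

fn main() {
    let widgets: Vec<Rc<dyn Widget>> =
        vec![Rc::new(Button), Rc::new(Label("ok".to_string()))];
    assert_eq!(describe_all(widgets), vec!["button", "label: ok"]);
}
```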
The next step is to make the receiver types in the standard library be usable as method receivers without the arbitrary_self_types feature, while still requiring the feature flag for ones we don't want to stabilize. Since receiver types are checked using the Deref
trait, and Deref
is stable, we need some way to differentiate between receiver types that are "blessed" and those that require the feature flag. I have thought of a few ways to do this:
- use an unstable #[receiver] attribute on the struct definition. The problem with this is that we would only check the outermost struct, and in the case of Pin<P> you could make P be some random type that derefs to Self to circumvent the checks.
- use an unstable Receiver<T> trait, where we check if the receiver type implements Receiver<Self>. T implements Receiver<T>, Rc<T> implements Receiver<U> if T: Receiver<U>, etc. This would be cool, but unfortunately it doesn't work with the current trait system due to overlapping impls.
- use an unstable Receiver trait that is a subtrait of Deref, and check for it during the autoderef loop when we are checking if the receiver transitively derefs to Self. This is the most promising option.
I'm working on a branch that does option 3 right now, and will hopefully have a PR up soon.
@mikeyhew What about using the object safety traits as the limiter, since they are also unstable?
@withoutboats That could work for trait methods, but I'm not sure how that would work with inherent methods, since Self
isn't a type parameter
@mikeyhew thanks, that makes sense. I don't think it's important how it's implemented, but ideally nothing new would show up in the public API, since this is just a hack to let us use std defined pointers on stable. Is it possible to make the Receiver
trait private and only check that its implemented if the arbitrary_self_types
feature flag is not enabled (i.e. on stable)?
Technically it couldn't be private, since it's used in liballoc. But I could use #[doc(hidden)]
if you want
Today I encountered a problem where f(self: Rc<Self>)
or f(self: &Rc<Self>)
for objects of type Rc<dyn Trait>
would be the right solution. All workarounds are somewhat ugly. Aggravating the situation, the methods are part of a public interface, affecting the offshoots of my code base.
For this reason, I am positively surprised at the current progress.
Update: now that #56805 has merged, self: Rc<Self>
and self: Arc<Self>
are usable without a feature flag on nightly. Also due to #56805, any receiver type that works on its own will also work if wrapped in Pin<..>
. This currently requires the pin
feature, but that requirement will be removed once #56939 is merged.
I'm trying to make a GUI library, and I see the need for a self: Rc<RefCell<Self>>
(because I need to mutate existing UI elements). (Strictly speaking, it should be Gc<...>
, but we're not there yet).
UI elements need to be able to store (Weak
?) references to correlated objects.
@njaard that unfortunately doesn't work, even with #![feature(arbitrary_self_types)]
, because RefCell
doesn't implement Deref
. Conceptually, there's nothing wrong with it as a receiver type, but we would have to change the Deref
trait hierarchy so that Deref
and DerefMut
had a common supertrait, instead of DerefMut
requiring Deref
. That would be tough to do at this point, since any change to those traits would have to be backward-compatible.
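Until something like that exists, one possible workaround is to pass the handle as an ordinary parameter instead of as the receiver. The sketch below is only an illustration: the Element type, its fields, and the parameter name this are assumptions for the example, not part of any proposal in this thread.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical UI element that keeps a weak link to its parent.
struct Element {
    label: String,
    parent: Option<Weak<RefCell<Element>>>,
}

impl Element {
    // `self: Rc<RefCell<Self>>` is rejected, so the handle is passed as a
    // plain parameter and the function is called as `Element::set_label(..)`.
    fn set_label(this: &Rc<RefCell<Self>>, label: &str) {
        this.borrow_mut().label = label.to_string();
    }

    // Stores a weak reference to `this` inside `child`.
    fn adopt(this: &Rc<RefCell<Self>>, child: &Rc<RefCell<Self>>) {
        child.borrow_mut().parent = Some(Rc::downgrade(this));
    }
}

fn main() {
    let root = Rc::new(RefCell::new(Element { label: String::new(), parent: None }));
    let child = Rc::new(RefCell::new(Element { label: "child".to_string(), parent: None }));

    Element::set_label(&root, "root");
    Element::adopt(&root, &child);

    let parent = child.borrow().parent.as_ref().and_then(Weak::upgrade).unwrap();
    assert_eq!(parent.borrow().label, "root");
}
```

The cost is losing method-call syntax: callers write Element::set_label(&root, ..) rather than root.set_label(..).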
I have a concern about the design of this feature. Consider:
struct Foo(u32);
impl Foo {
fn foo(Self(x): Self) { println!("OK!"); } // <-- [A]
}
fn main() {
let x = Foo(5);
x.foo();
}
I have written lots of code where I would love to (soon) be able to write self
patterns that destructure self
as in line [A]
above, just like we can destruct any other parameter with similar syntax today. It seems like the current design of this feature requires the self
pattern to be exactly self
. I worry this would be further fossilizing the mistake (IMO) made in #16293 (comment).
Specifying the type for self looks very strange - I have always believed that the self is connected with the impl Self.
Why instead of self: Wrap not write impl Wrapper:
trait Future {
type Output;
fn poll(self: Pin<&mut Self>) -> Self::Output;
}
impl Future for () {
type Output = ();
fn poll(self: Pin<&mut Self>) -> Self::Output { }
}
->
trait Future {
type Output;
fn poll(self) -> Self::Output;
}
impl Future for Pin<&mut ()> {
type Output = ();
fn poll(self) -> Self::Output { }
}
that the self is connected with the impl Self
It is related. All of these are valid in stable Rust:
struct Foo;
impl Foo {
fn a0(self) {}
fn a1(self: Self) {}
fn b0(&self) {}
fn b1(self: &Self) {}
fn c0(&mut self) {}
fn c1(self: &mut Self) {}
fn d0(self: Box<Self>) {}
}
There happen to be shorthand versions for the three most common versions. This feature just expands the other types allowed besides Box
.
There happen to be shorthand versions for the three most common versions. This feature just expands the other types allowed besides
Box
.
OK, I withdraw my objection then. I had no idea we could already do self: &Self
.
@briansmith I think @shepmaster was replying to @chessnokov, not to you. Regarding what you said:
I would love to (soon) be able to write self patterns that destructure self as in line [A] above
The thing is, Rust needs the self
keyword to tell the difference between an ordinary associated function and a method. So even if destructuring function arguments was allowed, it probably wouldn't be allowed for methods. But you can always destructure the argument at the top of your function body, e.g. with let Foo(x) = self;
even if destructuring function arguments was allowed
It is.
fn add_tuple((a, b): (u32, u32)) -> u32 { a + b }
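Both points can be put side by side in a short sketch. Ordinary parameters accept patterns, while a method keeps the self keyword and destructures it in the body. The method names value and value_ref are made up for this example:

```rust
struct Foo(u32);

impl Foo {
    // The receiver stays `self`; destructure it at the top of the body.
    fn value(self) -> u32 {
        let Foo(x) = self;
        x
    }

    // The same works through a reference; here `x` is a `&u32`.
    fn value_ref(&self) -> u32 {
        let Foo(x) = self;
        *x
    }
}

// Non-receiver parameters can be destructured directly in the signature.
fn add_tuple((a, b): (u32, u32)) -> u32 {
    a + b
}

fn main() {
    assert_eq!(Foo(5).value(), 5);
    assert_eq!(Foo(7).value_ref(), 7);
    assert_eq!(add_tuple((1, 2)), 3);
}
```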
Just played around with this a bit and found that whilst a receiver of self: *mut Self
is allowed, self: NonNull<Self>
isn't (in a trait) and produces a compile error.
error[E0307]: invalid method receiver type: std::ptr::NonNull<Self>
--> src/main.rs:38:29
|
38 | unsafe fn initialize(self: NonNull<Self>, arguments: Self::InitializeArguments);
| ^^^^^^^^^^^^^
|
= note: type of `self` must be `Self` or a type that dereferences to it
= help: consider changing to `self`, `&self`, `&mut self`, or `self: Box<Self>`
To my mind, where I know the receiver is a non-null pointer, but not necessarily initialized properly, this is a useful way of writing code as documentation; particularly for a trait, where the implementor can be certain he hasn't been given null.
It might also be worth considering Option<NonNull<Self>>
, too, even if only to warn that it would be better expressed as *mut Self
.
self: NonNull<Self>
isn't
Because NonNull
doesn't implement Deref
. See also #48277.
considering
Option<NonNull<Self>>
Option
also doesn't implement Deref
. AFAICT it cannot, at least not without panicking when it's None
, which seems like a very bad ergonomic idea.
*mut Self
doesn't implement Deref
either.
*mut Self
doesn't implement Deref
either.
Sure, but that was a specific addition (See also #46664):
Maybe the best way to solve this would be to require
self
to be a "thin pointer to Self" in some way (we can't use Deref
alone because it doesn't allow for raw pointers - but Deref
+ deref of raw pointers, or eventually an UnsafeDeref
trait that reifies that - would be fine).
Raw pointers are definitely special-cased in the compiler, while NonNull
is a regular struct and not a language item.
Raw pointers are definitely special-cased in the compiler
Indeed. NonNull<Self>
is something that really should be supported at some point (at least if raw pointer receivers are), but it would require a separate trait from Deref
that returns *const Self::Target
, and we would need an RFC for that
@mikeyhew Another RFC? That's disappointing to learn. I haven't got the desire to go down that route. At least the thought's there.
@shepmaster In the code I am now commonly writing when dealing with low-level pointer stuff, Option<NonNull<T>>
comes up quite a bit. It'd just be nice to have the compiler deal with it by assuming *mut Self
.
This is the trait that I was thinking of. Both NonNull<T>
and Option<NonNull<T>>
could implement it.
trait DerefRaw {
type Target;
fn deref_raw(&self) -> *const Self::Target;
}
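As a sketch of how the two impls mentioned above might look, the trait is repeated here as a local, user-defined trait so the example is self-contained. Nothing like it exists in the standard library, and mapping None to a null pointer is an assumption of this example:

```rust
use std::ptr::{self, NonNull};

// Local copy of the proposed trait; purely hypothetical.
trait DerefRaw {
    type Target;
    fn deref_raw(&self) -> *const Self::Target;
}

impl<T> DerefRaw for NonNull<T> {
    type Target = T;

    fn deref_raw(&self) -> *const T {
        self.as_ptr() as *const T
    }
}

impl<T> DerefRaw for Option<NonNull<T>> {
    type Target = T;

    // `None` becomes a null pointer, so callers must still check for null.
    fn deref_raw(&self) -> *const T {
        match self {
            Some(p) => p.as_ptr() as *const T,
            None => ptr::null(),
        }
    }
}

fn main() {
    let mut n = 7u32;
    let p = NonNull::from(&mut n);

    // Reading through the raw pointer is unsafe; `p` points at a live `u32`.
    assert_eq!(unsafe { *p.deref_raw() }, 7);
    assert!(None::<NonNull<u32>>.deref_raw().is_null());
}
```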
I'm really not sure about doing this for Option<NonNull<T>>
. Isn't the whole point of it that you must check for None/null before accessing the pointer?
Option<NonNull<T>>
is, in many cases, much easier to work with than *mut T
and contains far more expressive power. It also more 'naturally' inevitably gets created by code working with NonNull<T>
; suddenly moving to *mut T
is weird, and, less-Rusty; one can't simply do Some(non_null)
as a result of a function, say. It also allows 'balance' in match
clauses, which are shorter to read and more ergonomic than if
with is_null()
; it does not require the casual reader of code to suddenly understand as much of raw pointers - knowledge of Rust's Option
type is enough; and lastly, it makes up for the absence of is_not_null()
, for which !x.something.is_null()
reads appalling and requires more visual processing (and is easier to miss).
Sorry, I should have clarified. Using Option<NonNull<T>>
in general sounds good. What I'm not sure about is implementing DerefRaw
as proposed above for Option<NonNull<T>>
. Sure, it has an obvious implementation, but converting to a nullable *const T
seems to defeat the point of using Option<NonNull<T>>
in the first place.
Unless we have decided a representation of a dynamic-sized enum
I'm opposed to Option<NonNull<dyn Stuff>>
.
@withoutboats Ohh, good point, I forgot about that. That means there's potentially a real difference between *mut T
and Option<NonNull<T>>
.
@kennytm I think you're confusing Option<NonNull<dyn Trait>>
with Option<dyn Trait>
Are you talking about Option<NonNull<*mut dyn Trait>>
(which is a type that makes sense) or about NonNull<dyn Trait>
(which I don't think makes much sense)?
BTW, from an API design standpoint, one thing that would work would be to have a special ArbitrarySelfTypePtr
trait (bikeshed on name):
trait ArbitrarySelfTypePtr {
type Next: ?Sized;
}
impl<T: Deref + ?Sized> ArbitrarySelfTypePtr for T {
type Next = T::Target;
}
impl<T: ?Sized> ArbitrarySelfTypePtr for *mut T {
type Next = T;
}
impl<T: ?Sized> ArbitrarySelfTypePtr for *const T {
type Next = T;
}
impl<T: ?Sized> ArbitrarySelfTypePtr for NonNull<T> {
type Next = T;
}
// Maybe, if we really want that?
impl<T: ?Sized> ArbitrarySelfTypePtr for Option<T> {
type Next = T;
}
The "ArbitrarySelfTypePtr" exists solely so we can have a "sane" deref chain for inherent methods etc.
@arielb1 I like your suggested ArbitrarySelfTypePtr
trait because it logically documents what is and isn't possible for a self
type. It would make it easy to refer to in compiler errors, eg an arbitrary self type must implement the trait ArbitrarySelfTypePtr
.
BTW, for the avoidance of doubt, I'm not absolutely wedded to the Option<NonNull<T>>
requirement, it's just that it logically seemed it should exist.
Did anyone consider f(self: Weak<Self>)
?
In my point of view this seems reasonable since Arc
and Rc
are already allowed.
That seems unsafe since the self pointer could then be dropped while the method is executing, leaving self
to dangle.
That seems unsafe since the self pointer could then be dropped while the method is executing, leaving
self
to dangle.
Hmm? Am I missing something? (probably)
Isn't it just an ergonomic 'shortcut', being able to write f(self: Weak<Self>)
instead of i.e. f(me: Weak<Self>)
?
Thus it is still required to call self.upgrade()
to access Self
which isn't unsafe.
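For comparison, this is roughly what the non-receiver spelling looks like, with a hypothetical Node type and the parameter named me as in the comment above. The value is only reachable after a successful upgrade(), so a dropped target shows up as None rather than a dangling reference:

```rust
use std::rc::{Rc, Weak};

// Hypothetical type, used only for illustration.
struct Node {
    id: u32,
}

impl Node {
    // `me` is an ordinary parameter, so this is called as `Node::id_if_alive(..)`.
    fn id_if_alive(me: Weak<Self>) -> Option<u32> {
        // `upgrade` returns `None` once the last `Rc` has been dropped.
        me.upgrade().map(|node| node.id)
    }
}

fn main() {
    let node = Rc::new(Node { id: 3 });
    let weak = Rc::downgrade(&node);

    assert_eq!(Node::id_if_alive(weak.clone()), Some(3));

    drop(node);
    assert_eq!(Node::id_if_alive(weak), None);
}
```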
Weak<Self>
is in the same boat as *const/mut Self
and NonNull<Self>
: it's safe to call a method with it, but it doesn't implement Deref
. Right now the unstable arbitrary_self_types
feature only supports receiver types that implement Deref
, plus *const/mut Self
as a special case.
Weak<Self>
is in the same boat as *const/mut Self
and NonNull<Self>
: it's safe to call a method with it, but it doesn't implement Deref
. Right now the unstable arbitrary_self_types
feature only supports receiver types that implement Deref
, plus *const/mut Self
as a special case.
Thank you for the clarification!
Any chance Weak<Self>
could be added to the additional special cases? (caution rust-compiler-newbie speaking) Looking at the current state, it seems that one would need to handle another specialised case in receiver_is_valid or more precisely in Autoderef and it would need to be called like include_raw_pointers. From the naming scheme that feels odd(?) to me, because Weak<Self>
can fail to "Deref
" to Self
(well, as you mentioned, like *const/mut Self
).
Maybe another reason for an ArbitrarySelfType
-likish trait?
Yeah, I think using a separate trait from Deref as @arielb1 suggested would be a good idea. There is a running list of types that would benefit from being method receivers, but can't implement Deref:
- *const/mut Self
- Weak<Self> (both Rc and Arc versions)
- NonNull<Self>
- RefCell<Self>, and by composition, Rc<RefCell<Self>>
- similarly, Mutex<Self>
- Option<Self> and Result<E, Self> (these ones are questionable)
Of the above list, all but Option and Result could produce a valid *const Self and therefore could implement a hypothetical UnsafeDeref trait. However, I'm starting to think that it would be better to have a trait that is specifically used for method lookup, whether or not it is decided that Option<Self> should be usable as a method receiver.
Also, here is a potential name for the trait: MethodReceiver
We can do a similar trick to CoerceUnsized and whitelist MethodReceiver in the compiler for pointers/references, and require that any structs that implement MethodReceiver themselves contain exactly one field that reaches the target type after any number of applications of MethodReceiver.
This differs from CoerceUnsized in that it uses an associated type, like Deref, instead of a parameter, so it's less flexible, but it allows repeated application in the compiler.
Doing it without a method for performing a "deref" operation means the x.foo() -> <_>::foo(x) sugar can be decoupled from any kind of safety concerns.
(That still leaves the x.foo() -> (***x).foo() sugar, but maybe those can be decoupled?)
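As a rough illustration of the "associated type, no deref method" idea, the sketch below defines a hypothetical MethodReceiver trait in user code on stable Rust. The trait, the Handle wrapper and the reaches_in_two_steps helper are all assumptions made for this example; the compiler gives this trait no special meaning, so it only models how repeated application of the associated type walks a chain of receiver types.

```rust
use std::any::TypeId;
use std::rc::Rc;

// Hypothetical marker trait: an associated type like `Deref::Target`,
// but deliberately no method that performs a dereference.
trait MethodReceiver {
    type Target: ?Sized;
}

impl<T: ?Sized> MethodReceiver for Rc<T> {
    type Target = T;
}

impl<T: ?Sized> MethodReceiver for *const T {
    type Target = T;
}

// A user-defined wrapper with exactly one pointer field (hypothetical).
#[allow(dead_code)]
struct Handle<T>(*const T);

impl<T> MethodReceiver for Handle<T> {
    type Target = T;
}

// Applies the associated type twice and reports whether the result is `Final`.
fn reaches_in_two_steps<R, Final>() -> bool
where
    R: MethodReceiver,
    R::Target: MethodReceiver,
    <R::Target as MethodReceiver>::Target: 'static,
    Final: ?Sized + 'static,
{
    TypeId::of::<<R::Target as MethodReceiver>::Target>() == TypeId::of::<Final>()
}

fn main() {
    // Rc<Handle<u32>> -> Handle<u32> -> u32
    assert!(reaches_in_two_steps::<Rc<Handle<u32>>, u32>());
    // Handle<*const str> -> *const str -> str
    assert!(reaches_in_two_steps::<Handle<*const str>, str>());
    assert!(!reaches_in_two_steps::<Rc<Handle<u32>>, u64>());
}
```

Because the trait carries no method, walking the chain never reads through a pointer, which is the decoupling from safety concerns described above.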
Wanted to comment on a small pain point I've encountered using this feature that I haven't seen mentioned yet.
If we have a method receiving a *const T, such as fn len(self: *const Self), we can't call it directly with a mutable pointer, *mut T, although we can just cast it and call it manually with (var as *const T).len().
Might be worth adding a special case for this, just as we can call &self methods with a &mut reference.
Example:
#![feature(arbitrary_self_types)]
pub trait Test {
fn a(self: *const Self);
}
impl Test for u32 {
fn a(self: *const Self) {}
}
pub fn main() {
let mut a = 5;
let a_ptr: *const u32 = &mut a;
// Works fine
a_ptr.a();
let a_mut_ptr: *mut u32 = &mut a;
// Currently works
(a_mut_ptr as *const u32).a();
// Currently results in error "method not found in `*mut u32`"
// Would result in calling `(a_mut_ptr as *const u32).a()`
a_mut_ptr.a();
}
I guess one limitation of using an associated type for MethodReceiver would be that we couldn't provide a blanket impl<T> MethodReceiver<T> for T, which, unless we wanted to try performing some kind of auto-ref on arbitrary-self-accepting methods, would prevent something like this from working (based on the current Deref-based design): Playground
In my use-case there, though, I don't have a problem requiring .as_mut().poison(), because I don't expect it to be as common as .lock().poison(), but I think we should also keep the generic use-case for this trait in mind besides just concrete arbitrary-self-types.
Also, is the expectation that this MethodReceiver trait will be publicly implementable?
If so, it seems like we'd probably end up recommending that types that implement Deref (where there's already an understanding that you don't add inherent methods to avoid shadowing) also implement this trait. The bounds in my example would then become MethodReceiver<Target = Self> + DerefMut<Target = Self>, which seems ok.
If not, I think we shouldn't implement it for any non-fundamental containers, like Mutex, otherwise we'd be making the std implementations of those types special compared to ecosystem ones.
Currently lifetimes cannot be elided when using arbitrary_self_types:
use std::sync::Arc;

struct A(u32);
impl A {
// compile error
pub fn value(self: &Arc<Self>) -> &u32 { &self.0 }
// ^ error: missing lifetime specifier. this function's return type contains a borrowed value, but there is no value for it to be borrowed from
// ok
pub fn value_lifetimed<'a>(self: &'a Arc<Self>) -> &'a u32 { &self.0 }
}
And The Rust Reference says:
https://doc.rust-lang.org/reference/lifetime-elision.html
If the receiver has type &Self or &mut Self, then the lifetime of that reference to Self is assigned to all elided output lifetime parameters.
Should this rule also be expanded to arbitrary self types?
Is that a feature like UFCS(Uniform Function Call Syntax)?
Has there been any discussion about self: &Cell<T>? That seems especially interesting in that &Cell<T> is (as of Rust 1.37) convertible from &mut T, so methods taking &Cell<T> could be useful in existing programs, without any change in data layout. I understand there's a Deref requirement that currently stands in the way, but from the thread above it sounds like people are considering relaxing that requirement somehow.
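For reference, the &mut T to &Cell<T> conversion mentioned here is Cell::from_mut, which is available on stable. The sketch below uses a hypothetical free function bump in place of a self: &Cell<T> method, since that receiver form is not accepted today.

```rust
use std::cell::Cell;

// Hypothetical helper standing in for a method with a `self: &Cell<u32>` receiver.
fn bump(counter: &Cell<u32>) {
    counter.set(counter.get() + 1);
}

fn main() {
    let mut n = 41u32;

    // Borrow the existing integer as a cell; the data layout is unchanged.
    let cell = Cell::from_mut(&mut n);
    bump(cell);
    bump(cell);

    // The mutable borrow has ended, so `n` is directly usable again.
    assert_eq!(n, 43);
}
```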
What could be the reason I cannot get a receiver with an extra trait bound (e.g. Send) to work?
If you remove the + Send bound on all impls, everything works fine.
struct Ptr<T: ?Sized + Send>(Box<T>);
// impls for Deref, CoerceUnsized and DispatchFromDyn
trait Trait: Send { fn ptr_wrapper(self: Ptr<Self>) -> i32; }
impl Trait for i32 { fn ptr_wrapper(self: Ptr<Self>) -> i32 { *self }}
fn main() {
let p = Ptr(Box::new(5)) as Ptr<dyn Trait>;
assert_eq!(p.ptr_wrapper(), 5);
}
error[E0038]: the trait "Trait" cannot be made into an object
Going to retag this as wg-traits; kind of the best wg- label right now, even though we don't really discuss this
Input from T-lang Backlog Bonanza meeting from 2022-02-09: Can anyone write down a summary of the status of this feature?
Should any of the checkboxes in the description be checked off now as done? Do new checkboxes need to be added?
Changing WG-traits to T-types. We discussed this in today's types triage meeting and ultimately decided that there are types decisions to be made here, so the team label feels appropriate. We don't have any concrete plans for this right now though.
Here's a summary of this issue as requested in this comment @pnkfelix.
Disclaimer: I haven't been involved so this may not be exactly right. But I'm interested in taking it forward so this summary was useful to compile anyway.
What works now
- Nightly only, behind feature flag arbitrary_self_types:
- Stabilized already: Rc<Self>, Arc<Self> and Pin<P> may be used as a receiver type (@withoutboats' issue #55786 and @mikeyhew's PR #56805)
Other work done
- @withoutboats raised an RFC (rust-lang/rfcs#2362) which was marked as postponed in August 2019 (rust-lang/rfcs#2362 (comment))
Concerns
This attempts to be a complete summary of all known concerns raised in the discussions (at least, those visible on github). Please shout if I've missed something!
- Major things we can defer till later (i.e. not stabilize yet):
  - Object safety. The algorithm for object safety is quite complex and relies on some object layout assumptions described in DispatchFromDyn. There are not currently any known problems here but it seems wise to be cautious in stabilizing this bit. For example, "before this gets merged, I think we should either remove the part about object safety, or say that the way object-safety is determined will remain unstable for a while, and will probably need a new RFC for a final decision to be made".
  - It only works on types implementing Deref. There have been many requests to call methods on types which can't implement Deref: for instance, RefCell, NonNull, Weak. A summary is here. It's proposed this be solved using a MethodReceiver trait (formerly known as ArbitrarySelfTypePtr). This also helps with some of the C++ interop use cases, which is why I'm interested.
- Smaller issues that we probably need to address in the next bit of stabilization:
  - Calling methods on raw trait pointers with invalid vtables might be UB. Any use of a raw pointer in Rust typically requires unsafe. Calling trait methods (*const dyn Trait, for instance) via arbitrary self types might "use" the vtable, and therefore should be unsafe. (This might also apply to future extension via use of MethodReceiver.)
  - Handling of inference variables. As noted in the initial issue description, *const _ may require some specific error messages just like &_ does.
- Smaller issues which we can defer till later:
  - Lifetime elision isn't yet done. Minor inconvenience? Example, example.
  - Can't call *const T methods on a *mut T. Minor inconvenience.
- Issues which are resolved or aren't relevant to stabilization:
  - Method shadowing. In the early days of this effort there were concerns about method shadowing and a proposal of an additional orphan rule. It was agreed that there are no problems here if we rely on Deref.
  - Name of the feature. Some say it should be "custom method receivers" rather than "arbitrary self types".
Next steps
This seems to decompose nicely into several sequential RFCs.
1. (already done in #56805!) Stabilize support for Rc<Self>, Arc<Self> and Pin<P> using a hard-coded Receiver trait within core. (This initial split was proposed here: #44874 (comment))
2. Extend stable support to truly custom method receivers - any type which implements Deref<Target=T> - but do not yet attempt object safety beyond the hard-coded types which already implement the hidden Receiver trait. (This split was proposed back here #44874 (comment)). We should also ensure that this stabilization does not yet allow calls to *const dyn T or *mut dyn T, or that such calls require unsafe. We should also ensure that error messages relating to inference variables are great. As far as I can see, there are no other unresolved issues or complexities here.
3. Stabilize object safety. This would stabilize DispatchFromDyn or similar.
4. Extend to types which don't or can't support Deref by implementing a MethodReceiver trait.
5. Minor future extensions in any order: (a) Allow calling *const T methods on a *mut T, (b) Solve lifetime elision, (c) Allow calls to *const dyn T and *mut dyn T.
The only blocker to doing step 2 is if we decide that we might not want to rely on Deref for this feature. But that seems pretty certain by this point...?
Calling methods on raw trait pointers with invalid vtables might be UB. Any use of a raw pointer in Rust typically requires unsafe. #44874 (comment). (This might #44874 (comment).)
This is closely related to rust-lang/rfcs#3324. There it was decided that upcasting on raw pointers is safe. This means, at the very least, that the safety invariant for raw dyn pointers says that the vtable is correct. Therefore it would only make sense to also make method calls safe.
This means unsafe code constructing invalid raw dyn pointers is currently on shaky grounds -- we likely want to permit some invalid values for intermediate local use, but it is not yet determined which ones.
It only works on types implementing Deref
One thing that I want to make sure we know we're deciding now is that this will work "by default" with Deref (whereas we could consider wanting to make it opt-in).
This effectively locks us, in the future, into MethodReceiver having a blanket impl<T: Deref> MethodReceiver for T.
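To show what such a blanket impl implies, here is a minimal sketch that models it in user code on stable Rust. MethodReceiver, RawHandle and target_is are hypothetical names for illustration only; the real trait, if it lands, lives in the standard library and may look different.

```rust
use std::any::TypeId;
use std::ops::Deref;
use std::rc::Rc;

// Hypothetical stand-in for the proposed trait.
trait MethodReceiver {
    type Target: ?Sized;
}

// The blanket impl under discussion: every `Deref` type is a receiver "by default".
impl<T: Deref + ?Sized> MethodReceiver for T {
    type Target = <T as Deref>::Target;
}

// A local type that does NOT implement `Deref` can still opt in explicitly.
// Coherence accepts this because the crate that defines `RawHandle` knows
// it has no `Deref` impl.
#[allow(dead_code)]
struct RawHandle<T>(*const T);

impl<T> MethodReceiver for RawHandle<T> {
    type Target = T;
}

// Reports whether `R`'s receiver target is exactly `X`.
fn target_is<R, X>() -> bool
where
    R: MethodReceiver + ?Sized,
    R::Target: 'static,
    X: ?Sized + 'static,
{
    TypeId::of::<R::Target>() == TypeId::of::<X>()
}

fn main() {
    assert!(target_is::<Rc<String>, String>()); // via the blanket impl
    assert!(target_is::<Box<str>, str>()); // via the blanket impl
    assert!(target_is::<RawHandle<u8>, u8>()); // via the explicit impl
}
```

The consequence raised above is visible here: once the blanket impl exists, a type that implements Deref cannot opt out of being a receiver, and it cannot choose a receiver target that differs from its Deref target.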