I'm cross-posting my question from Stack Overflow in the hopes that someone knowledgeable about pinning and futures has a chance to see it:
https://stackoverflow.com/q/56058494/155423
If it's relevant, there's a 250 point bounty on the question.
@Nemo157 provided some information in Discord, but I'm still hoping for a complete, cohesive answer...
nemo157: @shepmaster the key for Map is that a Pin<&mut F> is never observed, so the f field is never considered to have been pinned
shepmaster: But I don't know what you mean by "observed" here
nemo157: The Map type has chosen to propagate the pinning guarantees to its future field, but not to its f field. Which you can also see with the conditional Unpin implementation
shepmaster: Why is it allowed to make that choice about guarantees?
nemo157: Map::new takes an F by value, so it's definitely not pinned there, after that Map has ownership of the F but never allows construction of a pinned reference to it
shepmaster: Couldn't the owner of Map have pinned the entire thing, which means that F has been pinned at some point? And the whole "once pinned forever pinned"?
nemo157: If nothing ever sees a pinned reference to the F then it's not considered pinned, even if the Map is pinned it has ownership of the F and privacy boundaries stop anything else constructing a reference to one of its fields
shepmaster: "If nothing ever sees a pinned reference to the F then it's not considered pinned": is that documented anywhere?
nemo157: I guess it's not explicitly stated anywhere, it implicitly comes from Pin<P> being the type that provides the "pinning invariants", so if you've not seen one you haven't been provided those invariants to rely on. So by taking in Pin<&mut Self> in <Map as Future>::poll, Map has been provided those invariants, but it has chosen to not pass them along to F ever
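To make the chat above concrete, here is a minimal sketch of a Map-like future written against std only. It is a simplified stand-in and not the actual futures implementation: the field names follow the discussion, while NoopWake and poll_map_once are hypothetical helpers added only so that the example can be polled.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Simplified stand-in for a `Map` future (assumption: not the real one).
pub struct Map<Fut, F> {
    future: Fut,  // pinning is propagated: a `Pin<&mut Fut>` is created
    f: Option<F>, // pinning is not propagated: no `Pin<&mut F>` ever exists
}

// The conditional `Unpin` impl only mentions the field that is pinned.
impl<Fut: Unpin, F> Unpin for Map<Fut, F> {}

impl<Fut, F> Map<Fut, F> {
    /// Takes `F` by value, so it is certainly not pinned at this point.
    pub fn new(future: Fut, f: F) -> Self {
        Map { future, f: Some(f) }
    }
}

impl<Fut, F, T> Future for Map<Fut, F>
where
    Fut: Future,
    F: FnOnce(Fut::Output) -> T,
{
    type Output = T;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        // SAFETY: `future` is never moved out of `this`; `Map` has no `Drop`
        // impl and is not `repr(packed)`; `f` is only accessed unpinned.
        let this = unsafe { self.get_unchecked_mut() };
        let future = unsafe { Pin::new_unchecked(&mut this.future) };
        match future.poll(cx) {
            Poll::Pending => Poll::Pending,
            Poll::Ready(output) => {
                // Moving `f` out is fine: it was never considered pinned.
                let f = this.f.take().expect("Map polled after completion");
                Poll::Ready(f(output))
            }
        }
    }
}

/// Hypothetical waker that does nothing, just enough to call `poll`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical driver: polls a `Map` over an already-ready future once.
pub fn poll_map_once() -> Poll<u32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut map = Map::new(std::future::ready(20_u32), |x: u32| x + 1);
    // `Ready<u32>` is `Unpin`, hence so is this `Map`, and `Pin::new` works.
    Pin::new(&mut map).poll(&mut cx)
}
```

The take() in the Ready arm is the "move out of a pinned Map" that the question is about; it is only acceptable because nothing in this module ever produces a pinned reference to f.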
That is what structural pinning is about:
First, I will write P<T> for something like impl Deref<Target = T>, that is, some (smart) pointer type P that Deref::derefs to a T (Pin only "applies" to / makes sense on such (smart) pointers).
Let's say we have:
struct Wrapper<Field> {
    field: Field,
}
Now, the question is whether we can get a Pin<P<Field>> from a Pin<P<Wrapper<Field>>>, by "projecting" our Pin<P<_>> from the Wrapper to its field.
This already requires the basic projection P<Wrapper<Field>> -> P<Field>, which is only even possible for
- shared references: P<T> = &T (this is not a very interesting case given that Pin<P<T>> always derefs to T),
- unique references: P<T> = &mut T.
I will write this as &[mut] T.
So, the question is: can we go from Pin<&[mut] Wrapper<Field>> to Pin<&[mut] Field>?
The point that may still be unclear in the documentation is the following: it is up to the creator of Wrapper to decide!
So there are two possible choices for the library author, regarding each one of the struct fields:
- either there is a structural Pin projection to that field (for instance, when the ::pin_utils::unsafe_pinned! macro is used to define such a projection). Then, for the Pin projection to be sound:
  - the whole struct must only implement Unpin when all the fields for which there is a structural Pin projection implement Unpin,
  - it is unsafe to move such fields out of a Pin<&mut Wrapper<Field>> (or Pin<&mut Self> when Self = Wrapper<Field>); for instance, Option::take() is forbidden,
  - the whole struct may only implement Drop if Drop::drop does not move any of the fields for which there is a structural projection,
  - the struct cannot be #[repr(packed)] (a corollary of the previous item).
  In your given future::Map example, this is the case of the future field of the Map struct.
- or there is no Pin projection to that field. In that case, that field is not considered pinned (by a Pin<&mut Wrapper<Field>>)!
  - thus whether Field is Unpin or not does not matter,
  - it is safe to move such fields out of a Pin<&mut Wrapper<Field>>; for instance, Option::take() is allowed,
  - ::pin_utils::unsafe_unpinned! is safe to use to define a Pin<&mut Wrapper<Field>> -> &mut Field projection,
  - Drop::drop is also allowed to move such fields.
  In your given future::Map example, this is the case of the f field of the Map struct.
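The two choices can be sketched with std alone, writing by hand what the pin_utils macros would generate. This is only an illustration under stated assumptions: the two-field Wrapper, its method names and take_twice are hypothetical, and the SAFETY comments hold only because the type has no Drop impl, is not packed, and keeps its fields private.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical wrapper with one field of each kind.
pub struct Wrapper<A, B> {
    pinned: A,           // structural projection offered
    unpinned: Option<B>, // no pinning projection offered
}

// `Unpin` is bounded only by the structurally pinned field.
impl<A: Unpin, B> Unpin for Wrapper<A, B> {}

impl<A, B> Wrapper<A, B> {
    pub fn new(pinned: A, unpinned: B) -> Self {
        Wrapper { pinned, unpinned: Some(unpinned) }
    }

    /// Structural projection: `Pin<&mut Wrapper>` -> `Pin<&mut A>`.
    pub fn pinned(self: Pin<&mut Self>) -> Pin<&mut A> {
        // SAFETY: `pinned` is never moved out of a pinned `Wrapper`.
        unsafe { self.map_unchecked_mut(|w| &mut w.pinned) }
    }

    /// Non-structural projection: `Pin<&mut Wrapper>` -> `&mut Option<B>`.
    pub fn unpinned(self: Pin<&mut Self>) -> &mut Option<B> {
        // SAFETY: no `Pin<&mut B>` is ever created, so `B` is not pinned.
        unsafe { &mut self.get_unchecked_mut().unpinned }
    }

    /// A take-like operation, allowed because `unpinned` is not pinned.
    pub fn take_unpinned(self: Pin<&mut Self>) -> Option<B> {
        self.unpinned().take()
    }
}

/// Hypothetical demo: moves the unpinned field out of a pinned wrapper.
pub fn take_twice() -> (Option<String>, Option<String>) {
    // `PhantomPinned` makes this `Wrapper` `!Unpin`, so it is truly pinned.
    let mut boxed = Box::pin(Wrapper::new(PhantomPinned, String::from("moved out")));
    let _projected: Pin<&mut PhantomPinned> = boxed.as_mut().pinned();
    let first = boxed.as_mut().take_unpinned();
    let second = boxed.as_mut().take_unpinned();
    (first, second)
}
```

Offering take_unpinned for the pinned field instead would break the second requirement in the first list, which is why the decision has to be made once per field and then respected by every method of the type.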
Also see the relevant module docs.
That's the same documentation that I quote and link to in the question, correct? To some level, that documentation is the reason for the question.
From @Yandros' answer (slightly reworded), the key part missing in the docs is:
> the point that may still be unclear in the documentation is the following: it is up to the creator of Wrapper to decide [...] regarding each one of the struct fields
By my reading of the documentation, once a Pin<&mut Wrapper> has ever been constructed, it's not possible to ever move a value out of the Wrapper:
From the pin module documentation on the Drop guarantee (emphasis mine):
> Concretely, for pinned data you have to maintain the invariant that its memory will not get invalidated from the moment it gets pinned until when drop is called. Memory can be invalidated by deallocation, but also by replacing a Some(v) by None, or calling Vec::set_len to "kill" some elements off of a vector.
And Projections and Structural Pinning (emphasis mine):
> You must not offer any other operations that could lead to data being moved out of the fields when your type is pinned. For example, if the wrapper contains an Option<T> and there is a take-like operation with type fn(Pin<&mut Wrapper<T>>) -> Option<T>, that operation can be used to move a T out of a pinned Wrapper<T> -- which means pinning cannot be structural.
@RalfJung, would you mind pointing me to the section of the documentation that I must have misread that aligns with @Yandros' answer (or explain why the answer is wrong, if that's the unfortunate case)?
I second this reading: the documentation does not mention possible structurally unpinned fields when postulating a summary rule against moving fields out of a pinned value.
It is, yes. Sorry I missed that.
@Yandros is right when they say
We went back-and-forth in writing the docs, and the docs do explicitly say
> For a type like Vec<T>, both possibilities (structural pinning or not) make sense, and the choice is up to the author
But it seems that has not been clear enough.
The issue is that yes, it is up to the creator to make that choice, but the choice has consequences. Namely, if you choose structural pinning, the docs list a "few extra requirements". So if any of these requirements are violated, there's not really a choice any more -- for those fields you cannot do structural pinning. That's why we went away from starting the docs by saying "it is up to the author", because people felt it would be weird to then continue "except when it is not".
Maybe someone else wants to give updating the docs a shot? I can help edit/review. I do like the way @Yandros set things up in their post, putting the choice first and the consequences second. Just when I tried the same in the docs people felt it was unclear.
The alternative is to put the criteria first: if you ever want to create a Pin<&mut Field>, then you must have structural pinning, and hence you have to be careful in your drop. But that seems harder to explain for me?
Pin is a very clever type-level / API trick to forbid some memory moves. But, to be honest, it sometimes feels like it is too clever; I do think that it is the toughest API to design libraries with (luckily very few people should be doing that).
Structural Pin-ning, or more precisely, the lack thereof, just adds to the confusion. For instance, even if it is intuitive that Pin<P<Box<T>>> does not pin T, this pattern becomes less clear for Pin<P<(T, U)>>: "if (T, U) is pinned, then surely both T and U should also be pinned" corresponds to our human intuition (at least it did for me).
The best thing to fight against an intuition (again, at least for me) is a counterexample. I, for instance, appreciated the exploit_ref_cell example.
So, in a similar vein to the Nomicon's Implementing Vec, I think that a series of articles about Pin usage in a library (e.g., implementing an intrusive doubly linked list (with &mut mutation, thus different from ::intrusive_collections')) would help.
The thing that may not have helped is the sentence quoted by @shepmaster and @mzabaluev:
I agree that the part emphasized by @shepmaster would need to mention that it applies to a structurally Pin-ned field (at least that's how I understand it):
- Wrapper would provide / rely on structural Pin-ning of some field: Option<F> (non-unsafe fn(Pin<&'_ mut Wrapper>) -> Option<Pin<&'_ mut F>>),
- The Drop that fixes invariants would be in Wrapper (instead of F),
- There would also be a non-unsafe fn(Pin<&mut Wrapper>) able to clear field by setting it to None (calling F's drop glue, but not Wrapper's)
⇒ Unsound.
I am actually confused because that sentence is not even about projections, it's about drop. It says that when data is pinned, you cannot "invalidate" its storage, which includes switching to a different enum variant (if the pinned data lives inside an enum).
What is the connection to projections?
Normally setting to None would call the drop glue of the old data in there, so that would be okay. But if you ptr::write the None, then indeed you broke the drop guarantee.
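The difference between the two ways of storing None can be observed directly. The following is a small sketch with a hypothetical Noisy type that counts how often its destructor runs; nothing here is pinned, the point is only to show which operation calls the drop glue of the old value.

```rust
use std::cell::Cell;
use std::ptr;

/// Hypothetical type whose destructor bumps a shared counter.
pub struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Plain assignment: the old `Some(..)` is dropped before `None` is stored.
fn clear_by_assignment(slot: &mut Option<Noisy<'_>>) {
    *slot = None;
}

/// `ptr::write` overwrites the slot without running any destructor.
fn clear_by_write(slot: &mut Option<Noisy<'_>>) {
    // SAFETY: `slot` is a valid, aligned, exclusive reference. The old value
    // is leaked, which is memory-safe here but skips its `drop`.
    unsafe { ptr::write(slot, None) }
}

/// Returns (drops seen after assignment, drops seen after `ptr::write`).
pub fn drops_observed() -> (u32, u32) {
    let assigned = Cell::new(0);
    let written = Cell::new(0);
    let mut a = Some(Noisy(&assigned));
    let mut b = Some(Noisy(&written));
    clear_by_assignment(&mut a);
    clear_by_write(&mut b);
    (assigned.get(), written.get())
}
```

For ordinary data the second variant is merely a leak. For pinned data it would be a violation of the drop guarantee, because the storage gets reused without the old value's destructor ever having had a chance to run.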
I truly think that my missing piece is that "projections" [1] are evidently a field-by-field decision. I read the documentation as saying:
> Once a Pin<&mut Wrapper> value exists, no field may ever be moved out of that value.
This is (evidently) not true because once a Pin<&mut Wrapper> value exists, it's now up to Wrapper [2] to decide which of its own fields are "actually" pinned and which aren't.
And I can conceptually see that, but my point is that it's not obvious that a type can choose both. The way the documentation is laid out, it seems that a type chooses either structural pinning or not. The truth is that it chooses for each field.
[1] A term which I am not familiar with, so using it in the docs doesn't help me.
[2] Well, anything that messes with Wrapper's internals, so I roughly assume that Wrapper's implementation is hidden away in a module for privacy.
That is fair. I was unsure how early to bring this up because it seemed like an advanced topic, and evidently I picked the worst possible kind of middle-ground.
Do you have any good suggestion for an alternative to "projection" that hopefully would work better for people less entrenched in theory? (People are not wrong when they claim we academics use our own weird language.)
Though to be fair, the pin-utils crate also uses that term.
I recognize that I'm falling into a logical fallacy that I hate ("If I do it, then so must everyone else, therefore it's common"), but having a future with multiple fields in it seems like a common occurrence. While newtypes are A Thing, I still expect that most structs are going to have multiple fields.
I'm not sure that I understand the concept thoroughly enough to offer alternatives. I don't even know that you have to come up with a new term; but defining it somewhere and referencing that definition where it's first used in the docs would be a good start, I think.
I can use the word "kwyjibo" multiple times and you still might not know what it is. I'd think the pin-utils crate is a poor example here because they seemingly rely on the users already knowing what it means.
This was supposed to happen "en passant" with
> When can a struct have a "pinning projection", i.e., an operation with type fn(Pin<&Struct>) -> Pin<&Field>?
That clearly did not work.
I think that one is definitely true. What still might be missing in the docs (and in most people's heads) is a good explanation of what Pin actually means. I couldn't yet find the best one nor can I claim that I fully understand Pin. However my own conclusion during the last couple of weeks with it is that it might not be what the word and the current explanation describe, which is about "Pinning an object in memory", so that it can't be moved. This property would be super easy to explain, it's just &obj == const.
However as far as I understand that's only one of two properties of pinning.
If it were, then mutably accessing and assigning fields wouldn't be unsafe in the general case.
The other property is that the object must be structurally intact. Which means all interior pointers must not get invalidated. Fields which are not interior pointers can be changed for Pin<&mut>, others not (or can they, as long as one makes sure that after the manipulation everything matches up again and no interior pointer dangles? I'm not even sure). Structurally intact seems to be some weak kind of immutable.
What is required for this property is a type by type decision. And that's also why the same applies for "pin projections". This property unfortunately makes it pretty hard to describe in a generic fashion what consequences pinning has on members, and whether those are pinned too. Since the question came up around "projections": Maybe those can be described as "whether pinning an object also requires its fields to be pinned or not", or "the mechanism of going from a pinned object to either its pinned or non-pinned field".
So in total my impression is that Pin<T> is something like Pinned<StructurallyFrozen<T>>. Or maybe it's just WhatEverIsRequiredToMakeAsyncAwaitWork<T> - which varies from type to type.
I do not follow. What do you mean by "structurally intact"? Mutably accessing and assigning fields is only unsafe because without support from rustc, pinning projections cannot be implemented safely. In particular, the requirement that drop must not move out of fields with structural pinning cannot be enforced. If we had a way to make the compiler ensure that
- drop must be implemented with Pin<&mut Self> instead of &mut self, and
- T: Unpin only if Field: Unpin, and
- repr(packed) is not used,
then we could make it safe to go from Pin<&mut Self> to Pin<&mut Field>. There is nothing inherently unsafe about this.
Similarly, it would be legal today to have something like this
#[repr(transparent)]
struct Unpinned<T>(T);
impl<T> Unpin for Unpinned<T> {}
// All the usual constructor, `Deref`, `DerefMut` and so on.
which could be used to wrap fields that we do not want structural pinning for. Together with safe pinning projections as outlined above, we could then safely go from Pin<&mut Self> to Pin<&mut Unpinned<Field>> (via the projection) to &mut Field (via Pin::get_mut).
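Filling in the elided constructor and Deref impls, the wrapper could look roughly as follows. This is one possible sketch, not a proposed API: NotUnpin and through_get_mut are hypothetical names used only to show that Pin::get_mut becomes available even when the wrapped type is !Unpin.

```rust
use std::marker::PhantomPinned;
use std::ops::{Deref, DerefMut};
use std::pin::Pin;

/// Wrapper that is always `Unpin`, whatever it contains.
#[repr(transparent)]
pub struct Unpinned<T>(T);

impl<T> Unpin for Unpinned<T> {}

impl<T> Unpinned<T> {
    pub fn new(value: T) -> Self {
        Unpinned(value)
    }
}

impl<T> Deref for Unpinned<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> DerefMut for Unpinned<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

/// Hypothetical payload that is `!Unpin` on its own.
pub struct NotUnpin {
    pub value: u32,
    _pin: PhantomPinned,
}

/// Goes from `Pin<&mut Unpinned<NotUnpin>>` to `&mut NotUnpin` in safe code.
pub fn through_get_mut() -> u32 {
    let mut boxed = Box::pin(Unpinned::new(NotUnpin { value: 1, _pin: PhantomPinned }));
    // `Pin::get_mut` requires `Unpin`, which `Unpinned<_>` always provides.
    let wrapper: &mut Unpinned<NotUnpin> = boxed.as_mut().get_mut();
    let inner: &mut NotUnpin = &mut **wrapper;
    inner.value += 41;
    boxed.value
}
```

The wrapper is sound for the same reason as the f field of Map: it never hands out a pinned reference to its contents, so the contents never receive the pinning guarantees in the first place.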
It's certainly not that, because then we would not have the drop guarantee.
My main problem is that I don't know how to express this. I have written a blog post and a follow-up on the topic last year, let me try again here.
The key point is: A type in pinned state owns the memory it is stored in.
All of the rules for pinning, including the drop guarantee, basically fall out from that idea.
What I mean by this is the following: if we consider x: Box<T>, then no matter the T we can call
fn leak_content<T>(x: Box<T>) { mem::forget(*x) }
which will deallocate the Box without any "intervention" from T. We can also call
fn move_to_different_box<T>(x: Box<T>) -> Box<T> { Box::new(*x) }
which will put the T into a new place and deallocate the old place where T was, again without any "intervention" from T. And we can do
fn repurpose_for_different_instance<T>(mut x: Box<T>, t: T) -> Box<T> {
    mem::forget(*x);
    *x = t;
    x
}
which will "throw away" the old T and put a different thing into the same place in memory.
All of these demonstrate that the Box owns the memory that T is stored in -- and because there can be only one owner, that means that T does not own that memory. T could be a pointer and own the memory that pointer points to (e.g. if T is Vec<i32>), but the memory occupied by T itself is not under T's control -- T could be moved to a different piece of memory any time (move_to_different_box) or the memory might just go away (leak_content) or be repurposed entirely (repurpose_for_different_instance). In particular, T can not just coordinate (in unsafe code) with some other party "hey there, let me just hand you some part of the ownership of the memory I am stored in so that you can do stuff with that memory any time it pleases you". T cannot give away ownership of something it does not have!
This prohibits self-referential structs (the reason why futures need pinning): a reference borrows (temporarily owns) the memory it points to, but since T does not own the memory it is stored in, it cannot borrow that away to a reference.
Pinning enables T to do exactly this. A pinned T does control ownership of the memory it is located in, so it can hand out ownership of that memory to other parties. This enables self-referential structs: now T does own the memory it is stored in, i.e., the memory its fields are stored in, so it can borrow that away to create a reference. This also enables intrusive collections where T gives up ownership of its own memory entirely and hands it off to the list that it becomes a part of. I invite you to imagine that in the three example functions I have above, the T is an element of an intrusive linked list -- and then try to understand why any of these operations is catastrophic for safety.
I hope y'all can get something out of my long ramblings here. I realized one thing though: the part about "replacing Some by None" in the docs is somewhat of a red herring. Any kind of replacement without previously calling the destructor is bad. After all, if you have stored an intrusive linked list element somewhere, overwriting that with another intrusive linked list element would break the list. So it's not just moving or deallocating or switching the enum variant, even ptr::write of another thing of the same type breaks the drop guarantee (like repurpose_for_different_instance above). This goes in-line with what I said above about ownership: to overwrite the memory that some data is stored in, you need to own that memory! But if that memory contains pinned data, ownership is in the hand of the pinned type, and your only way to get it back is to call drop.
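As a contrast to repurpose_for_different_instance, the safe way std offers to overwrite pinned data is Pin::set, which assigns through the pinned pointer and therefore runs the old value's destructor first. A minimal sketch, assuming a hypothetical Node type that merely counts its drops in place of a real intrusive list element:

```rust
use std::cell::Cell;
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical stand-in for an intrusive list element: its destructor is
/// where it would unlink itself; here it only bumps a counter.
pub struct Node<'a> {
    drops: &'a Cell<u32>,
    _pin: PhantomPinned, // makes `Node` `!Unpin`
}

impl Drop for Node<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Overwrites a pinned `Node` and reports how many destructors had run
/// by the time the overwrite completed.
pub fn drops_before_overwrite() -> u32 {
    let drops = Cell::new(0);
    let mut pinned: Pin<Box<Node<'_>>> =
        Box::pin(Node { drops: &drops, _pin: PhantomPinned });
    // Safe even though `Node: !Unpin`: the old value is dropped in place
    // before the new one is stored in the same memory.
    pinned.as_mut().set(Node { drops: &drops, _pin: PhantomPinned });
    drops.get()
}
```

The old Node gets its drop call before its memory is reused, which is exactly the hand-back of ownership described above; a ptr::write in the same spot would skip that step.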
I submitted a PR to improve the documentation of structural pinning and pinning projections. Please let me know what you think!
Quite nice improvements, I think it is finally getting to the point where it is pretty clear
I, too, have been working on this Pin issue. This thread had me think about unsafe being required in many places (e.g., to construct a self-referential struct) because Rust cannot make sure the pin-projections + Drop invariants are respected. However, with proc_macros such a thing should be doable.
So here is a PoC I intend to expand into a fully-documented crate if you confirm it is sound (you can look at the examples and the code generated by the macro, e.g., with cargo expand).
It provides a PinDrop trait, that the macro uses to derive the trivially sound but unsafe Drop implementation that casts self: &mut Self into a Pin<&mut Self> (it currently requires a parameter to specify whether there is one such implementation or if it is empty; with specialization this will not even be necessary).
For each field, either the #[transitively_pinned] or #[unpinned] attribute is expected, to specify the projection semantics (I expect #[unpinned] to be a sane default, but I think it is better to require that the author stop and think to make the choice for each field).
Instead of PhantomPinned (given all the talk about structural pinning, I realise that for once the phantom field hack is quite ugly), there is a (transparent) wrapper that unimplements Unpin, and that provides a NonNull address getter from a (shared) Pin-ned reference.
futures::Map
use ::std::{
    pin::Pin,
    future::Future,
    task::{Context, Poll},
};
use ::easy_pin::easy_pin;

#[easy_pin(Unpin)]
struct Map<Fut, F> {
    #[transitively_pinned]
    future: Fut,
    #[unpinned]
    f_opt: Option<F>,
}

impl<Fut, F, Ret> Future for Map<Fut, F>
where
    Fut : Future,
    F : FnOnce(Fut::Output) -> Ret,
{
    type Output = Ret;

    fn poll (
        mut self: Pin<&'_ mut Self>,
        cx: &'_ mut Context,
    ) -> Poll<Self::Output>
    {
        match self.as_mut().pinned_future_mut().poll(cx) {
            | Poll::Pending => Poll::Pending,
            | Poll::Ready(output) => {
                let f =
                    self.unpinned_f_opt_mut()
                        .take()
                        .expect(concat!(
                            "Map must not be polled after ",
                            "it returned `Poll::Ready`",
                        ));
                Poll::Ready(f(output))
            },
        }
    }
}
No unsafe required.
self-referential struct
#[easy_pin]
pub
struct SelfReferential {
    #[transitively_pinned]
    string: PinSensitive<String>,
    #[unpinned]
    at_string: NonNull<String>,
}

impl SelfReferential {
    pub
    fn new (string: impl Into<String>) -> Pin<Box<Self>>
    {
        let mut pinned_box = Box::pin(Self {
            string: PinSensitive::new(string.into()),
            at_string: NonNull::dangling(),
        });
        let string_address: NonNull<String> =
            pinned_box.as_ref()
                .pinned_string()
                .pinned_address()
        ;
        *pinned_box.as_mut().unpinned_at_string_mut() = string_address;
        pinned_box
    }

    #[inline]
    pub
    fn at_string<'__> (self: Pin<&'__ Self>) -> &'__ String
    {
        unsafe {
            // Safety: the only way to get a Pin<&Self> is through
            // Self::new().as_ref(), ensuring the pointer is well-formed.
            self.get_ref().at_string.as_ref()
        }
    }
}
As you can see, no unsafe is required in the constructor (only in a getter to promote the NonNull to a reference).
EDIT: Removed the intrusive linked list since it was actually unsound: it allowed having a Pin<&mut Node> while there may be an existing &Node pointing to the same memory, and even worse, it could lead to a dangling reference.
Still, this just shows a bad implementation of intrusive linked list where the .next() and .prev() methods were unsound. It had nothing to do with easy_pin constructions; it was just caused by an oversimplified example.
I'm afraid I can't read proc_macro code very well. But how do you ensure that I don't do something like
#[easy_pin]
pub struct SelfReferential {
    #[transitively_pinned]
    string: PinSensitive<String>,
    #[unpinned]
    at_string: NonNull<String>,
}

impl Drop for SelfReferential {
    fn drop(&mut self) {
        // *oops* I got unpinned access to `self`!
    }
}
Basically, some compile-fail tests would be good for this crate.
And similarly, how do you prevent
#[easy_pin]
pub struct SelfReferential {
    #[transitively_pinned]
    string: PinSensitive<String>,
    #[unpinned]
    at_string: NonNull<String>,
}

impl Unpin for SelfReferential {}
And finally you have to make sure the struct does not get a #[repr(packed)].
The Drop problem is prevented because the proc macro implements an empty Drop; in case you want a specific Drop, you can use #[easy_pin(Drop)], which will use PinDrop (thus erroring unless the programmer has implemented it). In both cases there is no direct access to Drop and its unpinned &mut Self.
This has not yet been implemented, but AFAIK the proc_macro has access to the other attributes, so I can compile_error! if exactly that attribute is written (if another proc_macro adds this attribute then depending on the order this may not be detectable; an explicit #[repr(not(packed))] attribute would be needed to truly enforce it: what do you think about it?)
unsafe, so that it errors if the struct is packed!
Damn, I hadn't thought of that one. I will have to test if it conflicts with the generated bounded Unpin impl (sadly I was offering such impl as an opt_in; given that impls with trivial bounds are not yet stable, I will have to make it opt_out with a keyword containing the word unsafe).
Plus with specialization it may become possible to have specialized marker traits, may it not?
Yep, will do; I wish those were easier to set up, though
Nice!
This has some ergonomic issues though; types without Drop have some extra properties (e.g. you can move out of them). But I guess that won't be a big problem for the use-cases here.
Does the proc macro also receive attributes that are added "before" it?
#[repr(packed)]
#[easy_pin]
pub struct ...
Nice
We might even allow overlapping marker trait impls without specialization as they don't have all the problems of overlapping general impls.
So, yeah, here it becomes very annoying that Unpin is safe to implement. Cc @withoutboats @cramertj
Fully agreed.