When is it safe to move a member value out of a pinned future?

I'm cross-posting my question from Stack Overflow in the hopes that someone knowledgeable about pinning and futures has a chance to see it:

https://stackoverflow.com/q/56058494/155423

If it's relevant, there's a 250 point bounty on the question.


@Nemo157 provided some information in Discord, but I'm still hoping for a complete, cohesive answer...

nemo157

@shepmaster the key for Map is that a Pin<&mut F> is never observed, so the f field is never considered to have been pinned

shepmaster

But I don't know what you mean by "observed" here

nemo157

The Map type has chosen to propagate the pinning guarantees to its future field, but not to its f field

Which you can also see with the conditional Unpin implementation

shepmaster

Why is it allowed to make that choice about guarantees?

nemo157

Map::new takes an F by value, so it’s definitely not pinned there, after that Map has ownership of the F but never allows construction of a pinned reference to it

shepmaster

Couldn't the owner of Map have pinned the entire thing, which means that F has been pinned at some point?

And the whole "once pinned forever pinned"?

nemo157

If nothing ever sees a pinned reference to the F then it’s not considered pinned, even if the Map is pinned it has ownership of the F and privacy boundaries stop anything else constructing a reference to one of its fields

shepmaster

"If nothing ever sees a pinned reference to the F then it’s not considered pinned" — is that documented anywhere?

nemo157

I guess it’s not explicitly stated anywhere, it implicitly comes from Pin<P> being the type that provides the “pinning invariants”, so if you’ve not seen one you haven’t been provided those invariants to rely on

So by taking in Pin<&mut Self> in <Map as Future>::poll Map has been provided those invariants, but it has chosen to not pass them along to F ever

That is what structural pinning is about:

First, I will write P<T> for something like impl Deref<Target = T>, that is, some (smart) pointer type P that Deref::derefs to a T (Pin only "applies" to / makes sense on such (smart) pointers).

Let's say we have:

struct Wrapper<Field> {
    field: Field,
}

Now, the question is, whether we can get a Pin<P< Field >> from a Pin<P< Wrapper<Field> >>, by "projecting" our Pin<P<_>> from the Wrapper to its field.

This already requires the basic projection P<Wrapper<Field>> -> P<Field>, which is only even possible for

  • shared references: P<T> = &T (this is not a very interesting case given that Pin<P<T>> always derefs to T)

  • unique references: P<T> = &mut T.

I will write this as &[mut] T.

So, the question is:

Can we go from Pin<&[mut] Wrapper<Field>> to Pin<&[mut] Field>?

The point that may still be unclear in the documentation is the following: it is up to the creator of Wrapper to decide!

So there are two possible choices for the library author, regarding each one of the struct fields:

  • either there is a structural Pin projection to that field;
    (for instance, when the ::pin_utils::unsafe_pinned! macro is used to define such projection)

    Then, for the Pin projection to be sound:

    • the whole struct must only implement Unpin when all the fields for which there is a structural Pin projection implement Unpin,

      • Thus, no implementation is allowed to use unsafe to move such fields out of a Pin<&mut Wrapper<Field>> (or Pin<&mut Self> when Self = Wrapper<Field>); for instance, Option::take() is forbidden.
    • the whole struct may only implement Drop if Drop::drop does not move any of the fields for which there is a structural projection,

    • the struct cannot be #[repr(packed)] (a corollary of the previous item).

    • In your given future::Map example, this is the case of the future field of the Map struct.

  • or there is no Pin projection to that field;

    In that case, that field is not considered pinned! (by a Pin<&mut Wrapper<Field>>)

    • thus whether Field is Unpin or not, does not matter;

      • implementations are allowed to use unsafe to move such fields out of a Pin<&mut Wrapper<Field>>; for instance, Option::take() is allowed.
    • and ::pin_utils::unsafe_unpinned! is safe to use to define a Pin<&mut Wrapper<Field>> -> &mut Field projection.

    • Drop::drop is also allowed to move such fields,

    • In your given future::Map example, this is the case of the f field of the Map struct.
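To make the two cases concrete, here is a minimal hand-written sketch of a Map-like future using only std. It is not the actual futures implementation: the method names project_future and project_f, the NoopWake type and the poll_once helper are made up for illustration. The future field gets a structural projection, while f is only ever handed out as a plain &mut, so taking it out of the Option is fine.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for `futures::future::Map`.
struct Map<Fut, F> {
    future: Fut,  // structurally pinned
    f: Option<F>, // NOT structurally pinned
}

// `Unpin` depends only on the structurally pinned field.
impl<Fut: Unpin, F> Unpin for Map<Fut, F> {}

impl<Fut, F> Map<Fut, F> {
    fn new(future: Fut, f: F) -> Self {
        Map { future, f: Some(f) }
    }

    /// Structural projection: Pin<&mut Map> -> Pin<&mut Fut>.
    fn project_future<'a>(self: Pin<&'a mut Self>) -> Pin<&'a mut Fut> {
        // Safety: `future` is never moved out of a pinned `Map`, `Map` has
        // no `Drop` impl that moves it, and `Map` is not `repr(packed)`.
        unsafe { self.map_unchecked_mut(|this| &mut this.future) }
    }

    /// Non-structural projection: Pin<&mut Map> -> &mut Option<F>.
    fn project_f<'a>(self: Pin<&'a mut Self>) -> &'a mut Option<F> {
        // Safety: no `Pin<&mut F>` is ever created, so `f` was never
        // promised to stay in place; only this field is touched.
        unsafe { &mut self.get_unchecked_mut().f }
    }
}

impl<Fut, F, T> Future for Map<Fut, F>
where
    Fut: Future,
    F: FnOnce(Fut::Output) -> T,
{
    type Output = T;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        match self.as_mut().project_future().poll(cx) {
            Poll::Pending => Poll::Pending,
            Poll::Ready(output) => {
                // Moving `f` out is allowed: it is not structurally pinned.
                let f = self.project_f().take().expect("polled after completion");
                Poll::Ready(f(output))
            }
        }
    }
}

/// Waker that does nothing (illustration only).
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a pinned future exactly once with a no-op waker.
fn poll_once<Fut: Future>(fut: Pin<&mut Fut>) -> Poll<Fut::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    fut.poll(&mut cx)
}
```

Polling Map::new(std::future::ready(20), |x: i32| x + 1) once through poll_once should yield Poll::Ready(21), with the closure moved out of the pinned Map along the way.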


Also see the relevant module docs.

That's the same documentation that I quote and link to in the question, correct? To some level, that documentation is the reason for the question.

From @Yandros' answer (slightly reworded), the key part missing in the docs is:

the point that may still be unclear in the documentation is the following: it is up to the creator of Wrapper to decide [...] regarding each one of the struct fields

By my reading of the documentation, once a Pin<&mut Wrapper> has ever been constructed, it's not possible to ever move a value out of the Wrapper:

From the pin module documentation on the Drop guarantee (emphasis mine):

Concretely, for pinned data you have to maintain the invariant that its memory will not get invalidated from the moment it gets pinned until when drop is called. Memory can be invalidated by deallocation, but also by replacing a Some(v) by None, or calling Vec::set_len to "kill" some elements off of a vector.

And Projections and Structural Pinning (emphasis mine):

You must not offer any other operations that could lead to data being moved out of the fields when your type is pinned. For example, if the wrapper contains an Option<T> and there is a take-like operation with type fn(Pin<&mut Wrapper<T>>) -> Option<T>, that operation can be used to move a T out of a pinned Wrapper<T> -- which means pinning cannot be structural.

@RalfJung, would you mind pointing me to the section of the documentation that I must have misread that aligns with @Yandros' answer (or explain why the answer is wrong, if that's the unfortunate case)?

I second this reading: the documentation does not mention possible structurally unpinned fields when postulating a summary rule against moving fields out of a pinned value.


It is, yes. Sorry I missed that.

@Yandros is right when they say

We went back-and-forth in writing the docs, and the docs do explicitly say

For a type like Vec<T>, both possibilities (structural pinning or not) make sense, and the choice is up to the author

But it seems that has not been clear enough. :slight_smile:

The issue is that yes, it is up to the creator to make that choice, but the choice has consequences. Namely, if you choose structural pinning, the docs list a "few extra requirements". So if any of these requirements are violated, there's not really a choice any more -- for those fields you cannot do structural pinning. That's why we went away from starting the docs by saying "it is up to the author", because people felt it would be weird to then continue "except when it is not".

Maybe someone else wants to give updating the docs a shot? I can help edit/review. I do like the way @Yandros set things up in their post, putting the choice first and the consequences second. Just when I tried the same in the docs people felt it was unclear.
The alternative is to put the criteria first: if you ever want to create a Pin<&mut Field>, then you must have structural pinning, and hence you have to be careful in your drop. But that seems harder to explain for me?


Pin is a very clever type-level / API trick to forbid some memory moves. But, to be honest, it sometimes feels like it is too clever; I do think that it is the toughest API to design libraries with (luckily very few people should be doing that).

Structural Pin-ning, or more precisely, the lack thereof, just adds to the confusion. For instance, even if it is intuitive that Pin<P<Box<T>>> does not pin T, this pattern becomes less clear for Pin<P<(T, U)>>: "if (T, U) is pinned, then surely both T and U should also be pinned" corresponds to our human intuition (at least it did for me).

The best thing to fight against an intuition (again, at least for me) is a counterexample. I, for instance, appreciated the exploit_ref_cell example.

So, in a similar vein to the Nomicon's Implementing Vec, I think that a series of articles about Pin usage in a library would help (e.g., implementing an intrusive doubly linked list (with &mut mutation, thus different from ::intrusive_collections')).

  • (I could try to do that, if nobody else wants to / has the time for it)

The thing that may not have helped is the sentence quoted by @shepmaster and @mzabaluev:

  • I agree that the part emphasized by @shepmaster would need to mention that it applies to a structurally Pin-ned field (at least that's how I understand it):

    1. Wrapper would provide / rely on structural Pin-ning of some field: Option<F>
      (non-unsafe fn(Pin<&'_ mut Wrapper>) -> Option<Pin<&'_ mut F>>),

    2. The Drop that fixes invariants would be in Wrapper (instead of F),

    3. There would also be a non-unsafe fn(Pin<&mut Wrapper>) able to clear field by setting it to None (calling F's drop glue, but not Wrapper's)

    ⇒ Unsound.

I am actually confused because that sentence is not even about projections, it's about drop. It says that when data is pinned, you cannot "invalidate" its storage, which includes switching to a different enum variant (if the pinned data lives inside an enum).

What is the connection to projections?

Normally setting to None would call the drop glue of the old data in there, so that would be okay. But if you ptr::write the None, then indeed you broke the drop guarantee.
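The difference between the two ways of clearing the field can be observed with a drop counter. This is only a sketch on unpinned data, where skipping the destructor is a mere leak; the Tracked type and both helper functions are hypothetical names chosen for illustration.

```rust
use std::cell::Cell;
use std::ptr;
use std::rc::Rc;

/// Hypothetical payload that counts how many times it has been dropped.
struct Tracked(Rc<Cell<u32>>);

impl Drop for Tracked {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Plain assignment: the old `Some(..)` has its drop glue run first.
fn clear_by_assignment(slot: &mut Option<Tracked>) {
    *slot = None;
}

/// `ptr::write`: the old value is overwritten WITHOUT running its destructor.
/// Harmless (a leak) here, but on pinned data this would break the drop
/// guarantee.
fn clear_by_ptr_write(slot: &mut Option<Tracked>) {
    unsafe { ptr::write(slot, None) };
}
```

With a shared counter, clear_by_assignment increments it while clear_by_ptr_write leaves it unchanged, which is exactly the "memory repurposed without drop" situation that pinned data must never see.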

I truly think that my missing piece is that "projections" 1 are evidently a field-by-field decision. I read the documentation as saying:

Once a Pin<&mut Wrapper> value exists, no field may ever be moved out of that value.

This is (evidently) not true because

  1. Once a Pin<&mut Wrapper> value exists, it's now up to Wrapper 2 to decide which of its own fields are "actually" pinned and which aren't.
  2. This is on a field-by-field basis.

And I can conceptually see that, but my point is that it's not obvious that a type can choose both. The way the documentation is laid out, it seems that a type chooses either structural pinning or not. The truth is that it chooses for each field


1 — a term which I am not familiar with so using it in the docs doesn't help me.

2 — well, anything that messes with Wrapper's internals, so I roughly assume that Wrapper's implementation is hidden away in a module for privacy.

That is fair. I was unsure how early to bring this up because it seemed like an advanced topic, and evidently I picked the worst possible kind of middle-ground.

Do you have any good suggestions for an alternative to "projection" that hopefully would work better for people less entrenched in theory? (People are not wrong when they claim we academics use our own weird language. :wink: )

Though to be fair, the pin-utils crate also uses that term.

I recognize that I'm falling into a logical fallacy that I hate ("If I do it, then so must everyone else, therefore it's common"), but having a future with multiple fields in it seems like a common occurrence. While newtypes are A Thing, I still expect that most structs are going to have multiple fields.

I'm not sure that I understand the concept thoroughly enough to offer alternatives. I don't even know that you have to come up with a new term; but defining it somewhere and referencing that definition where it's first used in the docs would be a good start, I think.

I can use the word "kwyjibo" multiple times and you still might not know what it is. I'd think the pin-utils crate is a poor example here because they seemingly rely on the users already knowing what it means.

This was supposed to happen "en passant" with

When can a struct have a "pinning projection", i.e., an operation with type fn(Pin<&Struct>) -> Pin<&Field> ?

That clearly did not work. :wink:

I think that one is definitely true. What still might be missing in the docs (and in most people's heads) is a good explanation of what Pin actually means. I couldn't yet find the best one, nor can I claim that I fully understand Pin. However, my own conclusion after working with it during the last couple of weeks is that it might not be what the word and the current explanation describe, which is "pinning an object in memory" so that it can't be moved. This property would be super easy to explain, it's just &obj == const.

However as far as I understand that's only one of two properties of pinning.
If it were the only one, then mutably accessing and assigning fields wouldn't be unsafe in the general case.

The other property is that the object must be structurally intact, which means that no interior pointer may get invalidated. Fields which are not interior pointers can be changed through a Pin<&mut>, others cannot (or can they, as long as one makes sure that after the manipulation everything matches up again and no interior pointer dangles? I'm not even sure). Structurally intact seems to be some weak kind of immutable.

What is required for this property is a type-by-type decision. And that's also why the same applies for "pin projections". Unfortunately, this property makes it pretty hard to describe in a generic fashion what consequences pinning has on members, and whether those are pinned too. Since the question came up around "projections": maybe those can be described as "whether pinning an object also requires its fields to be pinned or not", or "the mechanism of going from a pinned object to either its pinned or non-pinned field".

So in total my impression is that Pin<T> is something like Pinned<StructurallyFrozen<T>>. Or maybe it's just WhatEverIsRequiredToMakeAsyncAwaitWork<T>, which varies from type to type.

I do not follow. What do you mean by "structurally intact"? Mutably accessing and assigning fields is only unsafe because without support from rustc, pinning projections cannot be implemented safely. In particular, the requirement that drop must not move out of fields with structural pinning cannot be enforced. If we had a way to make the compiler ensure that

  • drop must be implemented with Pin<&mut Self> instead of &mut self, and
  • T: Unpin only if Field: Unpin, and
  • the struct is not repr(packed)

then we could make it safe to go from Pin<&mut Self> to Pin<&mut Field>. There is nothing inherently unsafe about this.

Similarly, it would be legal today to have something like this

#[repr(transparent)]
struct Unpinned<T>(T);

impl<T> Unpin for Unpinned<T> {}

// All the usual constructor, `Deref`, `DerefMut` and so on.

which could be used to wrap fields that we do not want structural pinning for. Together with safe pinning projections as outlined above, we could then safely go from Pin<&mut Self> to Pin<&mut Unpinned<Field>> (via the projection) to &mut Field (via Pin::get_mut).
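Filling in the elided parts, the wrapper could look roughly like the sketch below. The NotUnpin type and the bump function are hypothetical and only serve to show that a Pin<&mut Unpinned<Field>> can be turned into a plain &mut Field with the safe Pin::get_mut, even when Field itself is not Unpin.

```rust
use std::marker::PhantomPinned;
use std::ops::{Deref, DerefMut};
use std::pin::Pin;

/// Wrapper that opts its content out of pinning.
#[repr(transparent)]
struct Unpinned<T>(T);

// Always `Unpin`, no matter what `T` is.
impl<T> Unpin for Unpinned<T> {}

impl<T> Deref for Unpinned<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> DerefMut for Unpinned<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

/// Hypothetical field type that is deliberately `!Unpin`.
struct NotUnpin {
    value: u32,
    _pin: PhantomPinned,
}

/// Goes from a pinned wrapper to a plain `&mut` of the inner field
/// without any `unsafe`.
fn bump(field: Pin<&mut Unpinned<NotUnpin>>) -> u32 {
    // `Pin::get_mut` is safe here because `Unpinned<_>: Unpin`.
    let inner: &mut NotUnpin = field.get_mut().deref_mut();
    inner.value += 1;
    inner.value
}
```

Because Unpinned<NotUnpin> is Unpin, even Pin::new(&mut field) is available to build the pinned reference that bump takes.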

It's certainly not that, because then we would not have the drop guarantee.

My main problem is that I don't know how to express this. :wink: I have written a blog post and a follow-up on the topic last year, let me try again here.

What I think pinning actually means

The key point is: A type in pinned state owns the memory it is stored in.
All of the rules for pinning, including the drop guarantee, basically fall out from that idea.

What I mean by this is the following: if we consider x: Box<T>, then no matter the T we can call

fn leak_content<T>(x: Box<T>) { mem::forget(*x) }

which will deallocate the Box without any "intervention" from T. We can also call

fn move_to_different_box<T>(x: Box<T>) -> Box<T> { Box::new(*x) }

which will put the T into a new place and deallocate the old place where T was, again without any "intervention" from T. And we can do

fn repurpose_for_different_instance<T>(mut x: Box<T>, t: T) -> Box<T> {
  mem::forget(*x);
  *x = t;
  x
}

which will "throw away" the old T and put a different thing into the same place in memory.

All of these demonstrate that the Box owns the memory that T is stored in -- and because there can be only one owner, that means that T does not own that memory. T could be a pointer and own the memory that pointer points to (e.g. if T is Vec<i32>), but the memory occupied by T itself is not under T's control -- T could be moved to a different piece of memory any time (move_to_different_box) or the memory might just go away (leak_content) or be repurposed entirely (repurpose_for_different_instance). In particular, T can not just coordinate (in unsafe code) with some other party "hey there, let me just hand you some part of the ownership of the memory I am stored in so that you can do stuff with that memory any time it pleases you". T cannot give away ownership of something it does not have!

This prohibits self-referential structs (the reason why futures need pinning): a reference borrows (temporarily owns) the memory it points to, but since T does not own the memory it is stored in, it cannot borrow that away to a reference.

Pinning enables T to do exactly this. A pinned T does control ownership of the memory it is located in, so it can hand out ownership of that memory to other parties. This enables self-referential structs: now T does own the memory it is stored in, i.e., the memory its fields are stored in, so it can borrow that away to create a reference. This also enables intrusive collections where T gives up ownership of its own memory entirely and hands it off to the list that it becomes a part of. I invite you to imagine that in the three example functions I have above, the T is an element of an intrusive linked list -- and then try to understand why any of these operations is catastrophic for safety.
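As a rough std-only illustration of a pinned value "lending out" its own memory, here is a sketch of a self-referential struct. The names SelfRef, ptr_to_data and data_via_ptr are invented for this example; the pointer is only valid because the value is pinned inside its Box and therefore can never be moved out of it.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr::NonNull;

/// Hypothetical self-referential struct: `ptr_to_data` points at `data`.
struct SelfRef {
    data: String,
    ptr_to_data: NonNull<String>,
    _pin: PhantomPinned, // makes the type `!Unpin`
}

impl SelfRef {
    fn new(data: String) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(SelfRef {
            data,
            ptr_to_data: NonNull::dangling(),
            _pin: PhantomPinned,
        });
        // The value now lives at its final address, so a pointer into it
        // stays valid until the value is dropped.
        let ptr = NonNull::from(&boxed.data);
        // Safety: we only overwrite a field; nothing is moved out.
        unsafe { boxed.as_mut().get_unchecked_mut().ptr_to_data = ptr };
        boxed
    }

    fn data_via_ptr<'a>(self: Pin<&'a Self>) -> &'a str {
        // Safety: `ptr_to_data` was set in `new` and `self` is pinned,
        // so it still points at `self.data`.
        unsafe { self.get_ref().ptr_to_data.as_ref() }
    }
}
```

Any of the three Box functions above, applied to the content of such a value, would leave ptr_to_data pointing at memory the struct no longer occupies; Pin<Box<SelfRef>> rules them out in safe code because SelfRef is not Unpin.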

I hope y'all can get something out of my long ramblings here. :wink: I realized one thing though: the part about "replacing Some by None" in the docs is somewhat of a red herring. Any kind of replacement without previously calling the destructor is bad. After all, if you have stored an intrusive linked list element somewhere, overwriting that with another intrusive linked list element would break the list. So it's not just moving or deallocating or switching the enum variant, even ptr::write of another thing of the same type breaks the drop guarantee (like repurpose_for_different_instance above). This is in line with what I said above about ownership: to overwrite the memory that some data is stored in, you need to own that memory! But if that memory contains pinned data, ownership is in the hands of the pinned type, and your only way to get it back is to call drop.


I submitted a PR to improve the documentation of structural pinning and pinning projections. Please let me know what you think!

https://github.com/rust-lang/rust/pull/61878


Quite nice improvements, I think it is finally getting to the point where it is pretty clear :slight_smile:


I, too, have been working on this Pin issue. This thread had me think about unsafe being required in many places (e.g., to construct a self-referential struct) because Rust cannot make sure the pin-projections + Drop invariants are respected. However, with proc_macros such a thing should be doable.

So here is a PoC I intend to expand into a fully-documented crate if you confirm it is sound (you can look at the examples and the code generated by the macro, e.g., with cargo expand).

  • It provides a PinDrop trait, that the macro uses to derive the trivially sound but unsafe Drop implementation that casts self: &mut Self into a Pin<&mut Self> (it currently requires a parameter to specify whether there is one such implementation or if it is empty; with specialization this will not even be necessary).

  • For each field, either the #[transitively_pinned] or #[unpinned] attribute is expected, to specify the projection semantics (I expect #[unpinned] to be a sane default, but I think it is better to require that the author stop and think to make the choice for each field).

  • Instead of PhantomPinned (given all the talk about structural pinning, I realise that for once the phantom field hack is quite ugly), there is a (transparent) wrapper that unimplements Unpin, and that provides a NonNull address getter from a (shared) Pinned reference.

Examples

  1. futures::Map

    use ::std::{
        pin::Pin,
        future::Future,
        task::{Context, Poll},
    };
    
    use ::easy_pin::easy_pin;
    
    #[easy_pin(Unpin)]
    struct Map<Fut, F> {
        #[transitively_pinned]
        future: Fut,
    
        #[unpinned]
        f_opt: Option<F>,
    }
    
    impl<Fut, F, Ret> Future for Map<Fut, F>
    where
        Fut : Future,
        F : FnOnce(Fut::Output) -> Ret,
    {
        type Output = Ret;
    
        fn poll (
            mut self: Pin<&'_ mut Self>,
            cx: &'_ mut Context,
        ) -> Poll<Self::Output>
        {
            match self.as_mut().pinned_future_mut().poll(cx) {
                | Poll::Pending => Poll::Pending,
                | Poll::Ready(output) => {
                    let f =
                        self.unpinned_f_opt_mut()
                            .take()
                            .expect(concat!(
                                "Map must not be polled after ",
                                "it returned `Poll::Ready`",
                            ));
                    Poll::Ready(f(output))
                },
            }
        }
    }
    

    No unsafe required.

  2. self-referential struct

    #[easy_pin]
    pub
    struct SelfReferential {
        #[transitively_pinned]
        string: PinSensitive<String>,
    
        #[unpinned]
        at_string: NonNull<String>,
    }
    
    impl SelfReferential {
        pub
        fn new (string: impl Into<String>) -> Pin<Box<Self>>
        {
            let mut pinned_box = Box::pin(Self {
                string: PinSensitive::new(string.into()),
                at_string: NonNull::dangling(),
            });
            let string_address: NonNull<String> =
                pinned_box.as_ref()
                    .pinned_string()
                    .pinned_address()
            ;
            *pinned_box.as_mut().unpinned_at_string_mut() = string_address;
            pinned_box
        }
    
        #[inline]
        pub
        fn at_string<'__> (self: Pin<&'__ Self>) -> &'__ String
        {
            unsafe {
                // Safety: the only way to get a Pin<&Self> is through
                // Self::new().as_ref(), ensuring the pointer is well-formed.
                self.get_ref().at_string.as_ref()
            }
        }
    }
    

    As you can see, no unsafe is required in the constructor (only in a getter to promote the NonNull to a reference).


EDIT: Removed the intrusive linked list since it was actually unsound: it allowed having a Pin<&mut Node> while there may be an existing &Node pointing to the same memory, and even worse, it could lead to a dangling reference :confused:

Still, this just shows a bad implementation of an intrusive linked list where the .next() and .prev() methods were unsound. It had nothing to do with easy_pin constructions; it was just caused by an oversimplified example :sweat_smile:

I'm afraid I can't read proc_macro code very well. But how do you ensure that I don't do something like

#[easy_pin]
pub struct SelfReferential {
    #[transitively_pinned]
    string: PinSensitive<String>,

    #[unpinned]
    at_string: NonNull<String>,
}

impl Drop for SelfReferential {
  fn drop(&mut self) {
    // *oops* I got unpinned access to `self`!
  }
}

Basically, some compile-fail tests would be good for this crate. :wink:

And similarly, how do you prevent

#[easy_pin]
pub struct SelfReferential {
    #[transitively_pinned]
    string: PinSensitive<String>,

    #[unpinned]
    at_string: NonNull<String>,
}

impl Unpin for SelfReferential {}

And finally you have to make sure the struct does not get a #[repr(packed)].

The Drop problem is prevented because the proc macro implements an empty Drop; in case you want a specific Drop, you can use #[easy_pin(Drop)], which will use PinDrop (thus erroring unless the programmer has implemented it). In both cases there is no direct access to Drop and its unpinned &mut Self.
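For readers who have not expanded the macro, the idea can be sketched by hand as follows. This is an assumption about the general shape, not the code easy_pin actually generates: the PinDrop trait signature and the Node type are hypothetical, and the pattern mirrors the one suggested in the std pin module docs.

```rust
use std::cell::Cell;
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::rc::Rc;

/// Hypothetical trait: a destructor that only ever sees a pinned `self`.
trait PinDrop {
    fn pin_drop(self: Pin<&mut Self>);
}

/// Hypothetical pinned type that records when its destructor ran.
struct Node {
    dropped: Rc<Cell<bool>>,
    _pin: PhantomPinned,
}

// What the user writes: no unpinned `&mut Self` is available here.
impl PinDrop for Node {
    fn pin_drop(self: Pin<&mut Self>) {
        self.dropped.set(true);
    }
}

// What a macro could generate: the single place that touches `&mut self`.
impl Drop for Node {
    fn drop(&mut self) {
        // Safety: `self` is never used again after `drop`, so treating it
        // as pinned from here on cannot lead to a later move.
        let this = unsafe { Pin::new_unchecked(self) };
        this.pin_drop();
    }
}
```

Since the generated impl is the only Drop impl the type can have, a user-written impl Drop with unpinned &mut self would be rejected as a conflicting implementation.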

This has not yet been implemented, but AFAIK the proc_macro has access to the other attributes, so I can compile_error! if exactly that attribute is written (if another proc_macro adds this attribute then depending on the order this may not be detectable; an explicit #[repr(not(packed))] attribute would be needed to truly enforce it: what do you think about it?)

  • Actually I could "hack" with a hidden reference access to one of the fields somewhere, without unsafe, so that it errors if the struct is packed!

Damn, I hadn't thought of that one :sweat_smile: . I will have to test if it conflicts with the generated bounded Unpin impl (sadly I was offering such an impl as opt-in, given that impls with trivial bounds are not yet stable; I will have to make it opt-out, with a keyword containing the word unsafe).
Plus with specialization it may become possible to have specialized marker traits, may it not? :grimacing:

Yep, will do; I wish those were easier to set up, though

Nice!

This has some ergonomic issues though; types without Drop have some extra properties (e.g. you can move out of them). But I guess that won't be a big problem for the use-cases here.

Does the proc macro also receive attributes that are added "before" it?

#[repr(packed)]
#[easy_pin]
pub struct ...

Nice :rofl:

We might even allow overlapping marker trait impls without specialization as they don't have all the problems of overlapping general impls.

So, yeah, here it becomes very annoying that Unpin is safe to implement. :confused: Cc @withoutboats @cramertj

Fully agreed.