How does the Pin smart pointer fix data in place?

I know that the purpose of Rust's Pin is to solve the issue of self-referential structs. It is used to prevent certain data from being moved within the data itself. When I looked at the source code of Pin, it essentially seems like a pointer. And I came up with a solution to address the immovability of data, which is to allocate the data on the heap and use a pointer to reference it. Then, when moving, we can simply copy the pointer. This way, the data remains fixed on the heap while the pointer's position keeps changing, but its value, which points to the data on the heap, remains constant.

Even if the data is not allocated on the heap but in the stack, the same approach can be applied by using a pointer to the starting position of the data in the stack. And since the deallocation of the stack is uncertain, we can use unsafe operations when manipulating the pointer. Although the pointer may point to empty data due to stack deallocation, it is still within the realm of unsafe logic. When using the pointer, we can ensure that the stack data is valid. If we cannot guarantee it, there's no solution, as you are already performing unsafe operations.

Furthermore, the pointers I mentioned above can be implemented using some smart pointers.

However, despite searching online, I haven't found a clear explanation of how Pin solves the immovability issue and what kind of data is being moved. Additionally, I noticed that some uses of Pin involve nesting another layer of pointers, which I feel is unnecessary. For example,

``````
Pin<Box<dyn Future<Output = T> + Send + 'static>>
``````

Pin itself is already a pointer, and it is then nested inside a Box pointer. Can't we remove Pin? It can be transformed into

``````
Box<dyn Future<Output = T> + Send + 'static>
``````

In this way, the Box pointer points to the future data, and when moving, we can simply copy the Box's data. This way, the data of the future is not being moved either. It's just like a pointer, pointing to the future. We just need to copy the pointer, and the data's position within the pointer remains unchanged.

So, is there something wrong with what I mentioned? How does Pin actually solve the issue of immovable data?

When I looked at the source code of Pin, it essentially seems like a pointer.
… Pin itself is already a pointer,

This is critical to understand: No, a `Pin` is not a pointer. `Pin` is a type that (when used as intended) contains a pointer. When you write `Pin<Box<T>>`, the `Box` is the pointer that `Pin` is controlling. There is only one pointer.

And I came up with a solution to address the immovability of data, which is to allocate the data on the heap and use a pointer to reference it. Then, when moving, we can simply copy the pointer.

This is exactly what `Pin<Box<T>>` does. `Pin` is a formalization of this idea that, if we point to something, we can refrain from moving it and only move the pointer.
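As a rough sketch of that idea (the `Tracked` type and the helper names are made up for illustration), the following moves a `Pin<Box<T>>` handle in and out of a `Vec` and checks that the heap address of the pinned value stays the same. A plain `Box` would show the same address stability; what `Pin` adds is that safe code cannot move the value out of the box afterwards.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical type that opts out of `Unpin`, so pinning it is meaningful.
struct Tracked {
    value: u32,
    _pin: PhantomPinned,
}

// Address of the pinned value. `Pin<Box<T>>` implements `Deref<Target = T>`.
fn address_of(p: &Pin<Box<Tracked>>) -> usize {
    &**p as *const Tracked as usize
}

// Moves the handle into a Vec and back out again, returning the address of
// the pinned value before and after, plus the stored value.
fn move_handle_around() -> (usize, usize, u32) {
    let pinned: Pin<Box<Tracked>> = Box::pin(Tracked { value: 7, _pin: PhantomPinned });
    let before = address_of(&pinned);

    let mut handles = Vec::new();
    handles.push(pinned); // only the pointer is moved here
    let pinned_again = handles.pop().unwrap();
    let after = address_of(&pinned_again);

    (before, after, pinned_again.value)
}

fn main() {
    let (before, after, value) = move_handle_around();
    assert_eq!(before, after);
    println!("value {} stayed at {:#x}", value, before);
}
```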

Even if the data is not allocated on the heap but in the stack, the same approach can be applied by using a pointer to the starting position of the data in the stack.

This is what the `pin!` macro does: it creates a pointer to a stack variable, and provides it to you as `Pin<&mut T>`.
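A minimal sketch of stack pinning with `std::pin::pin!` (available in recent stable Rust) might look like this. The `Counter` type and its `bump` method are hypothetical; the point is only that the macro hands back a `Pin<&mut T>` to a value that lives in the current stack frame.

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

// Hypothetical `!Unpin` type used only for this illustration.
struct Counter {
    count: u32,
    _pin: PhantomPinned,
}

impl Counter {
    // Methods that work on a pinned value take `self: Pin<&mut Self>`.
    fn bump(self: Pin<&mut Self>) -> u32 {
        // SAFETY: we only update a field in place and never move `*self`.
        let this = unsafe { self.get_unchecked_mut() };
        this.count += 1;
        this.count
    }
}

// Pins a `Counter` in this stack frame and bumps it twice.
fn bump_twice_on_stack() -> u32 {
    let mut counter: Pin<&mut Counter> = pin!(Counter { count: 0, _pin: PhantomPinned });
    counter.as_mut().bump();
    counter.as_mut().bump()
}

fn main() {
    assert_eq!(bump_twice_on_stack(), 2);
    println!("bumped a stack-pinned counter twice");
}
```

Because the pinned value is owned by the stack frame, the `Pin<&mut Counter>` cannot outlive the function, which the borrow checker enforces like for any other reference.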

This is how `Pin<Box<T>>` already works. The key thing that `Pin<Box<T>>` does that `Box<T>` does not is: prohibit moving out of the `Box`. In general, `Pin` takes an existing pointer type and provides the additional guarantee that the pinned value (the value the pointer points to) won't be moved.

The value of having `Pin` rather than just a `PinnedBox` type is that you can produce `Pin<&mut ...>` pointers to parts of the pinned value ("pin projection"). This allows a complex pinned type (like a future), composed out of other pinning-needing types, to address those parts of itself without needing more than one pinned allocation.
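To make "pin projection" a bit more concrete, here is a hand-written sketch using `Pin::map_unchecked_mut`. The `Outer` and `Inner` types are invented for this example, and in real code a helper such as the `pin-project` crate would usually generate the `unsafe` part.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical inner type that needs pinning.
struct Inner {
    ticks: u32,
    _pin: PhantomPinned,
}

impl Inner {
    fn tick(self: Pin<&mut Self>) -> u32 {
        // SAFETY: updates a field in place; the value is never moved.
        let this = unsafe { self.get_unchecked_mut() };
        this.ticks += 1;
        this.ticks
    }
}

// Hypothetical outer type; it is `!Unpin` automatically because `Inner` is.
struct Outer {
    label: &'static str,
    inner: Inner,
}

impl Outer {
    // Projection: from a pinned `Outer` to its pinned `inner` field.
    fn inner(self: Pin<&mut Self>) -> Pin<&mut Inner> {
        // SAFETY (assumed for this sketch): `inner` is treated as structurally
        // pinned. `Outer` never moves it out, has no `Drop` impl that could
        // move it, and is not `#[repr(packed)]`.
        unsafe { self.map_unchecked_mut(|outer| &mut outer.inner) }
    }
}

// One pinned allocation for `Outer`; `Inner` is reached through projection.
fn project_and_tick() -> (u32, &'static str) {
    let mut outer = Box::pin(Outer {
        label: "outer",
        inner: Inner { ticks: 0, _pin: PhantomPinned },
    });
    outer.as_mut().inner().tick();
    let ticks = outer.as_mut().inner().tick();
    (ticks, outer.label)
}

fn main() {
    let (ticks, label) = project_and_tick();
    assert_eq!(ticks, 2);
    println!("{} ticked {} times", label, ticks);
}
```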

This is the clearest description of Pin I've read and the questions that Rgoogle asked were really insightful. Thank you, both of you!

So, how does Pin ensure that the pinned data is not moved?

It is not just about move or copy. When we use * to dereference an Arc or Rc pointer, or any other smart pointer, there are two possibilities. If the data is not copyable, dereferencing the pointer may fail or result in moving ownership (which should result in an error). If the data is copyable, * dereference will create a copy of the data.

Therefore, Pin must handle the deref and deref_mut functions differently. However, I looked at the implementation of Pin for these two functions. Their purpose is to dereference the pointer and add &, then wrap it in another Pin.

The result is Pin, where data is the content wrapped in Box. If we continue to dereference it *, it depends on whether the data has implemented deref or deref_mut. If it has, then adding * will give us the same result as above - dereferencing the wrapped data and wrapping it in another Pin.

So, is this how Pin ensures that the data is not moved? It relies on deref and deref_mut to never obtain a reference to the data, but only the inner data wrapped in Pin. It's like dereferencing Pin and getting the same type of Pin, except that the data wrapped inside Pin is different - it is the result of dereferencing.

So, how does Pin ensure that the data is not moved?

It's really confusing.

In order for data not to be movable, you need two factors. The data needs to be behind a pinned reference, which limits the API as discussed above. Crucially, the data itself must also be of a type whose author thinks it can be relevant for the type not to be movable, which is expressed by not implementing the `Unpin` trait. Specifically, only data types `T` that don't implement `Unpin` can actually be prevented from being moved in the first place.

`Unpin` is an auto-trait, so it's automatically implemented for most types; the way to opt out is to include a field of type `PhantomPinned` (which incidentally is probably one of the only types that is `!Unpin` but `Copy`). Making use of the pinning guarantees, however, typically also involves `unsafe` code from the type's author.[1]
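These auto-trait claims can be checked at compile time with small helper functions. The sketch below uses hypothetical `Plain` and `OptedOut` types and helper names of my own; the commented-out line is the one I would expect the compiler to reject.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Compile-time checks: calling these only type-checks if the bound holds.
fn assert_unpin<T: Unpin>() {}
fn assert_copy<T: Copy>() {}

// Hypothetical types for illustration.
#[allow(dead_code)]
struct Plain {
    n: u32,
}

#[allow(dead_code)]
struct OptedOut {
    n: u32,
    _pin: PhantomPinned,
}

fn check_auto_traits() -> usize {
    assert_unpin::<u32>();
    assert_unpin::<Plain>(); // all fields are `Unpin`, so the struct is too
    assert_unpin::<Box<OptedOut>>(); // the pointer itself can always be moved
    assert_unpin::<Pin<Box<OptedOut>>>(); // so can the pinned handle
    assert_copy::<PhantomPinned>(); // `!Unpin` but `Copy`

    // assert_unpin::<OptedOut>(); // expected not to compile: `PhantomPinned` is `!Unpin`

    // The marker is zero-sized, so opting out costs no memory.
    std::mem::size_of::<PhantomPinned>()
}

fn main() {
    assert_eq!(check_auto_traits(), 0);
    println!("auto-trait checks passed");
}
```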

Such types will most typically also not implement `Copy` (in fact I believe that's essentially always the case) because, as you correctly described, otherwise you could simply copy (and thus sort-of "move") a pinned value by dereferencing.

Otherwise, data behind a reference in Rust can only be moved by helper APIs, such as `mem::replace`. Such APIs can move data from behind mutable references, which is why the main job of `Pin<PointerTo<T>>` is to withhold any safe API that can obtain a `&mut T` reference to the target. Types like `Pin<Box<T>>` or `Pin<&mut T>` thus do not implement `DerefMut` (except for target types `T: Unpin`, which cannot "really" be pinned in the first place), but they do implement `Deref`, because immutable `&T` references provide no way to move the values.
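Here is a small sketch of that difference, using a hypothetical `Stuck` type. For an `Unpin` target such as `String`, `Pin<Box<T>>` still offers `DerefMut`, so `mem::replace` can move the value out; for a `!Unpin` target only shared access is available in safe code.

```rust
use std::marker::PhantomPinned;
use std::mem;
use std::pin::Pin;

// Hypothetical `!Unpin` type.
struct Stuck {
    name: String,
    _pin: PhantomPinned,
}

// `String: Unpin`, so safe code can obtain `&mut String` and move the value out.
fn move_out_of_unpin() -> (String, String) {
    let mut pinned: Pin<Box<String>> = Box::pin(String::from("old"));
    let target: &mut String = &mut *pinned; // uses `DerefMut`
    let moved_out = mem::replace(target, String::from("new"));
    (moved_out, (*pinned).clone())
}

// `Stuck: !Unpin`, so only `Deref` is available.
fn read_only_access() -> usize {
    let pinned: Pin<Box<Stuck>> = Box::pin(Stuck {
        name: String::from("stuck"),
        _pin: PhantomPinned,
    });
    let shared: &Stuck = &*pinned; // fine: `&T` cannot be used to move `T` here

    // let exclusive: &mut Stuck = &mut *pinned; // expected not to compile: no `DerefMut`

    shared.name.len()
}

fn main() {
    let (old, new) = move_out_of_unpin();
    assert_eq!((old.as_str(), new.as_str()), ("old", "new"));
    assert_eq!(read_only_access(), 5);
    println!("moved {:?} out, left {:?} behind", old, new);
}
```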

1. However, there are non-`unsafe` ways to define sensible `!Unpin` types, too. One is that all `async {}` blocks or `async fn` futures are compiler-generated anonymous types that are `!Unpin` (and `!Copy`) and make use of this property in their implementation. The other is that types that wrap other `!Unpin` types can make use of "structural pinning" of fields without explicit `unsafe` through helper macros like the `pin-project` crate.

So you're saying that all types in Rust are movable by default and automatically implement the Unpin trait.

For types that are not movable, you manually implement the !Unpin trait.

What is the purpose of these two traits? Does it mean that the compiler will give an error if it detects moving of non-movable data?

If that's the case, what is the significance of wrapping it with Pin?

As mentioned by the previous user, this is how Pin<Box> already works. The important thing that Pin<Box> does, which Box does not, is to prohibit moving out of the Box.

How does Pin ensure that data cannot be moved out?

So, how does Pin ensure that the pinned data is not moved?

`Pin` is two things here.

First, when `T: !Unpin`, `Pin<SomePtr<T>>` never hands out a `&mut T` except as an unsafe operation. This way, the owner of a `Pin<SomePtr<T>>` cannot use `std::mem::swap` or similar to move the pinned data. @steffahn just wrote more about that above.

Second, the act of creating a `Pin` (also unsafe when `T: !Unpin`) is a promise by its creator that the data is not going to be moved. For example, `Box::pin()` is safe, and creates a `Pin<Box<T>>`; it does not allow `T` to be moved because in general `Box` is always a unique owning pointer, so there is no way to get at the `T` except through the `Box` pointer. Thus, `Pin<Box<T>>` is both unique like `Box`, and prohibits the ways you might move out of `Box` because it's wrapped in `Pin` (so you cannot call `Box::into_inner()` or anything else that would move out).

`Pin` doesn't do much; it is a type that denotes a prior promise, and its methods are designed to not break that promise. When you see a `Pin<SomePtr<T>>` you can rely on the fact that either

• `T: Unpin` (read this as "T does not need to be immovable; you are allowed to unpin it"), or
• Somebody called `Pin::new_unchecked` in order to make the promise of not moving.

`Pin` is a way to communicate that promise from the creator of the `Pin` to the user of it.
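The different ways of making that promise can be put side by side. This is only a sketch with a hypothetical `Fixed` type: `Pin::new` is safe but requires `T: Unpin`, `Box::pin` is safe for any `T`, and `Pin::new_unchecked` is `unsafe` because the caller has to uphold the promise themselves.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical `!Unpin` type.
struct Fixed {
    id: u32,
    _pin: PhantomPinned,
}

fn three_ways() -> (u32, u32, u32) {
    // 1. `Pin::new`: safe, only available because `u32: Unpin`.
    let mut plain = 1u32;
    let a: Pin<&mut u32> = Pin::new(&mut plain);

    // 2. `Box::pin`: safe for any `T`; the box owns the value uniquely.
    let b: Pin<Box<Fixed>> = Box::pin(Fixed { id: 2, _pin: PhantomPinned });

    // 3. `Pin::new_unchecked`: the caller makes the promise.
    let mut boxed = Box::new(Fixed { id: 3, _pin: PhantomPinned });
    // SAFETY: the value lives on the heap, and this function never moves it
    // out of `boxed` or swaps it; it is only dropped in place at the end.
    let c: Pin<&mut Fixed> = unsafe { Pin::new_unchecked(&mut *boxed) };

    (*a, b.id, c.id)
}

fn main() {
    assert_eq!(three_ways(), (1, 2, 3));
    println!("created three pinned pointers");
}
```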

No, it's done purely in the library, without compiler support for enforcing anything. For example, the library function for obtaining `&mut T` from `Pin<Box<T>>` has a `T: Unpin` bound, while the library function that gets you a `Pin<&mut T>` doesn't.

The reason it's done purely in libraries is, I think, mostly "because we can": it was possible to do it with an API defined in the standard library, which avoided extra complication of the language.[1]

The purpose of `Unpin` (which is only one trait; I'm writing `!Unpin` to mean "doesn't implement `Unpin`") is of course a bit curious: why make it so that types can be put behind `Pin` but then aren't actually pinned? The answer is that `Pin`ned references (particularly `Pin<&mut T>`) very commonly have to be created in order to conform to a particular generic interface. First and foremost, and the very thing `Pin` was invented for, this is the `Future` trait with its `fn poll(self: Pin<&mut Self>, …) -> …` method.

This method signature uses `Pin<&mut Self>` instead of `&mut Self` to allow the common use case of futures involving an `async fn` or `async` block with self-referencing, unmovable-after-first-poll data types. However, it doesn't intend to force pinning on users whose `Future` types don't need it. Hence the design of `Pin` is such that a type that opts out of pinning by implementing `Unpin` (often automatically generated, so arguably pinning is an opt-in rather than an opt-out) can be freely converted between `Pin<&mut T>` and `&mut T`. The direction I had not yet mentioned is `Pin::new`, which creates, for example, a `Pin<&mut T>` from a `&mut T` when `T: Unpin`.

So TL;DR, the purpose of `Unpin` is to let APIs like the `Future` trait leave the choice of whether pinning is used to the right actor: the author of the type that might require pinned values for implementing its functionality.
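As an illustration of the "doesn't need pinning" case, here is a sketch of a hand-written future that is `Unpin`, polled through `Pin::new`. The `Countdown` and `NoopWake` types are hypothetical, and the do-nothing waker is only there so that `poll` can be called without an async runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical future without self-references, so it is automatically `Unpin`.
struct Countdown(u32);

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `Countdown: Unpin`, so `Pin<&mut Self>` implements `DerefMut`
        // and the field can be mutated without any `unsafe`.
        if self.0 == 0 {
            Poll::Ready("done")
        } else {
            self.0 -= 1;
            Poll::Pending
        }
    }
}

// A waker that does nothing, just enough to build a `Context`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls in a busy loop and reports how many polls were needed.
fn poll_until_ready(mut fut: Countdown) -> (u32, &'static str) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        // `Pin::new` is safe here because `Countdown: Unpin`.
        if let Poll::Ready(output) = Pin::new(&mut fut).poll(&mut cx) {
            return (polls, output);
        }
    }
}

fn main() {
    assert_eq!(poll_until_ready(Countdown(3)), (4, "done"));
    println!("countdown finished");
}
```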

Also, this demonstrates well why `Pin` is so hard to learn. It is a concept that is generic along two axes:

• it's generic over the type of pointer being used, which is actually a higher-order kind of generics, sort-of, because it involves a generic parameter where you input types that themselves usually have a generic parameter
• it's generic over whether or not pinning actually takes place, which is determined by `Unpin` implementations of the target type

If you imagine we had two pinned-pointer types that always actually disallow moving, like `PinnedBox<T>` and `PinnedMutRef<'a, T>`, then you can translate most use cases of `Pin` like this:

| Type in Rust | Meaning if `Foo: Unpin` | Meaning if `Foo: !Unpin` |
| --- | --- | --- |
| `Pin<Box<Foo>>` | essentially just a `Box<Foo>` | actually pinned `PinnedBox<Foo>` |
| `Pin<&'a mut Foo>` | essentially just a `&'a mut Foo` | actually pinned `PinnedMutRef<'a, Foo>` |

1. Of course, this also has downsides. For example, structural pinning / pinning projections, which I've mentioned in a footnote in my previous answer, require either unsafe code or help from some macros hiding `unsafe` code, instead of having the compiler directly "understand" what you're trying to do, which can be less ergonomic.

So I summarized it. In Rust, there are two things: Unpin (movable) and Copy. In Rust, there is a rule that if all fields of a struct implement Unpin, then the struct is Unpin, otherwise it is !Unpin (fixed). The same rule applies to Copy.

Furthermore, Rust has an empty struct called PhantomPinned. It implements !Unpin (fixed). So when we add a field of this type to our struct, it means that our type is not movable (although it can still be moved if it implements Copy). This is just a representation, not a requirement (like a marker). If it does not implement Copy, then moving will occur (when assigning or passing parameters).

There are four possibilities when putting data into Pin. The data may be Unpin or !Unpin, and it may be Copy or not Copy. So it's 2 * 2 = 4.

As follows:

``````
The following table shows the meaning of Pin being fixed. It does not refer to the Pin struct itself.

Pin   Copy
0      0           With DerefMut, dereferencing results in an error. No Copy.
0      1           With DerefMut, dereferencing makes a copy of the data.
1      0           Without DerefMut, dereferencing results in an error. No Copy.
1      1           Without DerefMut, dereferencing makes a copy of the data.
``````

The key point is when Pin is 1 and Copy is 0 or 1.

When the data is fixed, the Pin struct does not provide a deref_mut method because it is constrained by the generics. Only types that implement Unpin have this method, while !Unpin does not.

But why does not providing the deref_mut method ensure that data implementing !Unpin cannot be moved? This is because Rust has a function: std::mem::replace(&mut T, data) or swap() function. The replace function writes the data into the &mut T position. And this function is safe, not unsafe. So, if the data wrapped by Pin is !Unpin, and if we can call the deref_mut method, we will get a mutable reference of &mut T. We can pass this reference to the first parameter of the replace function, which returns the old value. Then we can assign the old value to other variables. At this point, the position has changed, and ownership has been moved. It can be observed that as long as there is a mutable reference, position movement can occur. Therefore, Pin must solve this by not providing a return of &mut T, which means not providing the deref_mut method. This is achieved by constraining it through a trait bound. Only types that implement Unpin (movable) have this method. Naturally, types that implement !Unpin do not have this method.

So, Pin ensures that immovable data is not moved by not providing a deref_mut method. However, if movable data is inside Pin, it depends on the developer's decision. You can choose to move it by calling deref_mut to get a mutable reference and then using replace or swap to move it. This will result in a change in the position of the variables in memory. Or you can choose not to move it by not calling deref_mut.

Thus, Pin does not forcibly guarantee that the data enclosed by it cannot be moved. It only ensures that in the case of !Unpin, the data cannot be moved by not providing the deref_mut method. In other cases, it is up to the developer to decide.

Another question is why Pin must contain a pointer. This is because the pointer allows the data to be obtained through deref_mut and then swap can be called. This way, the data can be moved. There might not be a smart pointer that decides whether to provide the deref_mut method based on whether the data implements Unpin. If there is one, it would be at the same level as Pin.

So, understanding these principles, would you like to create your own Pin implementation?

But what if this data implements !Unpin (fixed) and Copy?

Then you can dereference and copy the data, but it doesn't mean the ownership of the data has moved. The copied data still remains in the same location. It just creates an additional copy of the data.
Does this extra data have any unsafe situations? I tested it, and it seems not because I cannot drop the copied data. When I pass the data to drop, the compiler actually secretly copies the data, so what I'm killing is actually the copy.
So, this is as far as I can test.

yes or no ????

yes or no ??

yes or no??

Please don't spam, the community is very responsive without the need to make repeated demands for answers to your question.

Not quite. `Copy` implementations are explicit or `derive`d; it's not an auto-trait. It's not an ordinary trait either, though: you can only implement it if all fields implement it.

Dereferencing is not necessarily an error without `Copy`, only if you access the referenced thing by value. So e.g.

``````
let by_value = *x;
``````

might error, but

``````
let by_mutable_ref = &mut *x;
``````

not necessarily.

Also, if the type does implement `Copy`, dereferencing only makes a copy if you access the dereferenced expression by value. (And in this case, the access goes through `Deref`, not `DerefMut`, because copying only needs immutable access.)
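Put together, the by-value and by-reference forms behave roughly like this. The sketch uses plain `&mut` references (no `Pin` involved), and the variable names are just for illustration.

```rust
// Shows by-value and by-reference dereferencing for a `Copy` type (`i32`)
// and a non-`Copy` type (`String`).
fn deref_forms() -> (i32, i32, String) {
    let mut number = 10;
    let mut text = String::from("pinned?");

    let x: &mut i32 = &mut number;
    let by_value: i32 = *x; // `i32: Copy`, so this copies the value
    let by_mutable_ref: &mut i32 = &mut *x; // re-borrow, nothing is copied or moved
    *by_mutable_ref += 1;

    let s: &mut String = &mut text;
    // let moved: String = *s; // expected not to compile: cannot move out of `*s`
    let reborrowed: &mut String = &mut *s; // fine: only a re-borrow
    reborrowed.push('!');

    (by_value, number, text)
}

fn main() {
    let (copied, updated, text) = deref_forms();
    assert_eq!((copied, updated), (10, 11));
    assert_eq!(text, "pinned?!");
    println!("copied {}, original is now {}, text is {:?}", copied, updated, text);
}
```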

That's exactly right!

The whole picture is a bit bigger. It's essentially about all APIs that could end up moving the value (or providing a mutable reference); `deref_mut` and `mem::replace`/`mem::swap` are just prominent examples, and a case covered in the API of `Pin` itself. Other examples include `Pin::new`, which is restricted to `Unpin` because otherwise you could re-borrow a `&mut T`, put it into a `Pin<&mut T>`, and still have the original `&mut T` after the `Pin<&mut T>` is dropped, providing a way to move out of data that was once declared/promised as "pinned". It's an overall API promise, a contract of sorts, that the data cannot be moved, and users of `unsafe` Rust should make sure not to break this contract.

(It's also a contract between the one defining a datatype and the one using it. The one defining the datatype can decide to move the struct, or parts of it, as they know the underlying reason for having the pinning restriction on their type; understanding structural pinning / projections is useful for a deeper understanding here.)

The answer to this that I would give is two-fold.

On one hand, pinning data (i.e. making sure it cannot move) must include a pointer indirection: you can always move the `Pin<…>` handle you have gotten, and its contents must not move, so they must be behind a pointer indirection. On the other hand, the reason you need to put a pointer type argument into `Pin`, like `Pin<Box<T>>` or `Pin<&mut T>`, instead of `Pin` itself being defined to be a pointer, is generality and a smaller API. The alternative would be that each pointer type gets its own pinned version, like the `PinnedBox<T>` or `PinnedMutRef<'a, T>` I have sketched above. That is a larger API surface (more new types), whereas having a single wrapper type `Pin` that turns a non-pinned pointer type into a pinned pointer type is quite elegant.
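For comparison, a dedicated pinned pointer type could be sketched roughly as follows. `PinnedBox` is not a real standard library type; the name, the `as_pin_mut` method, and the `Node` type are all assumptions made up for this illustration of the "one pinned type per pointer" alternative.

```rust
use std::marker::PhantomPinned;
use std::ops::Deref;
use std::pin::Pin;

// Hypothetical single-purpose pinned pointer (not part of std).
struct PinnedBox<T> {
    inner: Box<T>,
}

impl<T> PinnedBox<T> {
    fn new(value: T) -> Self {
        PinnedBox { inner: Box::new(value) }
    }

    // Mutable access is only handed out in pinned form.
    fn as_pin_mut(&mut self) -> Pin<&mut T> {
        // SAFETY: the value lives on the heap, and this type never exposes
        // `&mut T` or moves the value out, so it stays put until it is dropped.
        unsafe { Pin::new_unchecked(&mut *self.inner) }
    }
}

// Shared access is fine: `&T` gives no way to move the value here.
impl<T> Deref for PinnedBox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.inner
    }
}
// Deliberately no `DerefMut` and no `into_inner`.

// Hypothetical `!Unpin` payload.
struct Node {
    value: u32,
    _pin: PhantomPinned,
}

fn handle_moves_value_stays() -> (bool, u32) {
    let a = PinnedBox::new(Node { value: 1, _pin: PhantomPinned });
    let before = &*a as *const Node as usize;
    let mut b = a; // moves only the handle
    let pinned: Pin<&mut Node> = b.as_pin_mut();
    let after = &*pinned as *const Node as usize;
    (before == after, b.value)
}

fn main() {
    assert_eq!(handle_moves_value_stays(), (true, 1));
    println!("handle moved, value stayed in place");
}
```

Every other pointer type would need its own wrapper of this kind, which is the extra API surface that the generic `Pin<Ptr>` design avoids.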

Dropping is always irrelevant (a no-op) for data that implements `Copy`. So concerns about "dropping a copy" are moot, as dropping never does anything.

Similarly, thinking about "ownership" being moved or not is not the most useful thing for `Copy` types, because types implementing `Copy` are arguably the least "owning" kind of types in Rust. I would still agree with the statement

but partially for the reason that there isn't any ownership that could be moved to begin with. A "move" operation in Rust isn't much more in the first place than

• copying the shallow data
• never using the old value again
• making sure no drop implementation is called on the old value either
  • this is usually done by the compiler with static analysis or generated dynamic "drop flags"
  • but if you resort to manual usage of `ptr::copy`-like operations and `unsafe` code, you might encounter the need to reason about these conditions manually

Given that `Copy` types are always free of any drop code, you can hence logically (pretend to) "move" a `Copy` type by just copying it and never using the old value again.
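The three ingredients of a move can be spelled out by hand. This is only a sketch of the idea (the function names are mine): `ptr::read` does the shallow copy, and `mem::forget` makes sure the old value is neither used nor dropped. For a `Copy` type none of that bookkeeping is needed.

```rust
use std::mem;
use std::ptr;

// A "move" written out manually for a non-`Copy` type.
fn manual_move(original: String) -> String {
    // Step 1: shallow (bitwise) copy of the pointer, length and capacity.
    // SAFETY: `original` is a valid `String`, and it is forgotten right below,
    // so only `moved` will ever free the heap buffer.
    let moved = unsafe { ptr::read(&original) };
    // Steps 2 and 3: never use the old value again and never run its destructor.
    mem::forget(original);
    moved
}

// For a `Copy` type the old value simply stays usable.
fn copy_is_not_consumed() -> (u64, u64) {
    let a: u64 = 5;
    let b = a; // copies the bits; `a` is still valid
    drop(a); // drops yet another copy and does nothing (the compiler may warn)
    (a, b)
}

fn main() {
    assert_eq!(manual_move(String::from("hello")), "hello");
    assert_eq!(copy_is_not_consumed(), (5, 5));
    println!("manual move and copy both behaved as described");
}
```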
