Is `Unpin` required in `Pin::new`?

beroal · December 27, 2024, 9:54pm

I'm talking about the Unpin bound in

impl<Ptr: Deref<Target: Unpin>> Pin<Ptr> {
    pub const fn new(pointer: Ptr) -> Pin<Ptr> { ... }
}

Can you give me an example program that manages memory incorrectly with the following function:

fn my_new<Ptr: Deref>(pointer: Ptr) -> Pin<Ptr> {
    unsafe { Pin::new_unchecked(pointer) }
}

josephcsible · December 27, 2024, 10:03pm

Isn't my_new exactly equivalent to Pin::new_unchecked? So wouldn't the "Potential UB" examples in its documentation all qualify?
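As a minimal sketch of what the bound buys (the type `NotUnpin` and both function names are made up for illustration): when the target is `Unpin`, a pin can be created and undone again in safe code, so pinning promises nothing. For a `!Unpin` target, the safe routes are ones that make a later move impossible, such as `std::pin::pin!`.

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

// Hypothetical type for illustration; `PhantomPinned` makes it `!Unpin`.
struct NotUnpin {
    value: u32,
    _pinned: PhantomPinned,
}

// `u32: Unpin`, so `Pin::new` and `Pin::get_mut` are both available in safe code.
fn roundtrip_unpin() -> u32 {
    let mut x = 7u32;
    let p: Pin<&mut u32> = Pin::new(&mut x);
    let r: &mut u32 = p.get_mut();
    *r += 1;
    x
}

// `Pin::new(&mut not_unpin)` would not compile for this type. `pin!` is a safe
// route because it takes ownership of the value, so the caller cannot move it later.
fn read_through_pin() -> u32 {
    let p: Pin<&mut NotUnpin> = pin!(NotUnpin { value: 7, _pinned: PhantomPinned });
    p.value
}

fn main() {
    println!("{} {}", roundtrip_unpin(), read_through_pin());
}
```

`my_new` removes exactly this distinction: it hands out a `Pin<&mut NotUnpin>` while the caller still owns the value and is free to move it.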

beroal · December 27, 2024, 11:40pm

I'm asking about incorrect memory management, not violations of contracts. For example, a.f is, and should remain, a reference to a field of a, but after some operation it becomes a reference to a field of another struct.

For instance, the move_pinned_ref example in the documentation never uses p, so it doesn't cause incorrect memory management.

josephcsible · December 27, 2024, 11:58pm

A type being !Unpin means exactly that it may manage memory incorrectly if it is moved after it has been pinned.

beroal · December 28, 2024, 12:29am

Sorry, by incorrect memory management I mean real problems like I described above or, say, dangling pointers, double free, a pointer of type u64 pointing to u32, etc.

drewtato · December 28, 2024, 2:07am

I've put together an attempt at an example. I'm pretty sure it could be improved, but it should be enough to explain why Pin::new_unchecked is unsafe.

use std::marker::PhantomPinned;
use std::pin::Pin;

struct SelfRef {
    c: Option<[u8; 5]>,
    d: *const u8,
    _phantom_pinned: PhantomPinned,
}

impl SelfRef {
    fn new() -> Self {
        Self {
            c: Some(*b"hello"),
            d: std::ptr::null(),
            _phantom_pinned: PhantomPinned,
        }
    }

    fn reference_self(self: Pin<&mut Self>) {
        let this = unsafe { self.get_unchecked_mut() };
        this.d = this.c.as_ref().unwrap().as_ptr();
    }

    fn clear(&mut self) {
        self.c = None;
    }
}

impl Drop for SelfRef {
    fn drop(&mut self) {
        if self.d.is_null() {
            return;
        }
        let s = unsafe {
            let slice = core::slice::from_raw_parts(self.d, 5);
            String::from_utf8_lossy(slice)
        };
        println!("{s:?}");
    }
}

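fn move_pinned_ref(mut a: SelfRef, mut b: SelfRef) {
    b.clear();
    unsafe {
        // Contract violation: `a` is pinned here but moved below.
        let p = Pin::new_unchecked(&mut a);
        p.reference_self();
    }
    // Moves the "pinned" value: `b.d` now points into `a`, whose `c` is `None`.
    std::mem::swap(&mut a, &mut b);
}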

fn main() {
    let a = SelfRef::new();
    let b = SelfRef::new();
    move_pinned_ref(a, b);
}

The struct SelfRef has a pointer that can point into the same struct. It has been written so that no UB can occur as long as Pin's contract is upheld.

Then we introduce the bad function from the Pin::new_unchecked example. SelfRef::reference_self can only be called through a Pin because drop is later going to read through d. But then we violate that by moving a. The pointer that was stored in a (and now lives in b) no longer points into its own struct: it still points into a, whose c is now None, so drop ends up reading a None as 5 bytes.
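The broken invariant can also be observed without dereferencing anything. The following is a simplified sketch, not the code above: this `SelfRef` has no `Option` and no `Drop`, and `points_into_self`, `pinned_properly`, and `pinned_then_moved` are hypothetical helpers that only compare addresses.

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

// Simplified, hypothetical variant of `SelfRef`: no `Option`, no `Drop`.
struct SelfRef {
    c: [u8; 5],
    d: *const u8,
    _phantom_pinned: PhantomPinned,
}

impl SelfRef {
    fn new() -> Self {
        Self { c: *b"hello", d: std::ptr::null(), _phantom_pinned: PhantomPinned }
    }

    fn reference_self(self: Pin<&mut Self>) {
        // SAFETY: we only write a field and never move out of `this`.
        let this = unsafe { self.get_unchecked_mut() };
        this.d = this.c.as_ptr();
    }

    // Compares addresses only; `d` is never dereferenced.
    fn points_into_self(&self) -> bool {
        std::ptr::eq(self.d, self.c.as_ptr())
    }
}

// Correct use: `pin!` owns the value, so it cannot be moved afterwards.
fn pinned_properly() -> bool {
    let mut a = pin!(SelfRef::new());
    a.as_mut().reference_self();
    a.points_into_self()
}

// Deliberate contract violation: pin, then move the value to the heap.
fn pinned_then_moved() -> bool {
    let mut a = SelfRef::new();
    let p = unsafe { Pin::new_unchecked(&mut a) };
    p.reference_self();
    let b = Box::new(a); // the struct moves, but `d` still holds the old address
    b.points_into_self()
}

fn main() {
    println!("{} {}", pinned_properly(), pinned_then_moved());
}
```

In the second function `d` still holds the stack address of the old `a.c`, which is the "reference to another place's field" situation asked about; any code that trusts `d`, such as the `Drop` impl above, then reads the wrong memory.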

You can create types that rely on Pin without writing unsafe by using async functions. Imagine pinning the future returned by an async function, calling poll on it, and then swapping it to another location. References inside the future that point into the future itself would become invalid, since the future swapped into its place may not have progressed far enough to have data there.

steffahn · December 28, 2024, 2:14am

To turn unsound interaction with Pin's contract into a concrete dangling pointer, we need a type that relies on pinning guarantees – usually that’s going to be an async-block Future, because those are why Pin was introduced in the first place.

For example, it should be unsound for an API to produce a Pin<&mut T> reference, and then move the value anyway. (As long as the Pin<&mut T> is exposed to the API user, and the type T isn't owned by the API.)

use std::pin::Pin;
use std::ops::Deref;

fn my_new<Ptr: Deref>(pointer: Ptr) -> Pin<Ptr> {
    unsafe { Pin::new_unchecked(pointer) }
}

fn unsound_pin_then_move<T>(x: T, expose_to: impl FnOnce(Pin<&mut T>)) {
    let mut x_in_initial_place = x;
    let r: Pin<&mut T> = my_new(&mut x_in_initial_place);
    expose_to(r);
    let x_in_different_place = x_in_initial_place; // the move that violates the pinning contract
}

To simplify the exploit, let's also expose a Pin<&mut T> at this new position (though that isn't necessary; e.g. a Drop in the new place would also be unsound).

fn unsound_pin_then_move_then_pin<T>(x: T, mut expose_to: impl FnMut(Pin<&mut T>)) {
    let mut x_in_initial_place = x;
    let r: Pin<&mut T> = my_new(&mut x_in_initial_place);
    expose_to(r);
    let mut x_in_different_place = x_in_initial_place;
    let r2: Pin<&mut T> = my_new(&mut x_in_different_place);
    expose_to(r2);
}

Let's convert unsound_pin_then_move_then_pin into a dangling reference. Here's an example future we could try to use.

async {
    let my_array = [1, 2, 3, 4, 5];
    let r = &my_array;
    something().await;
    println!("array through r: {r:?}");
}

Futures like this one rely on pinning. The future directly contains both the array my_array and the reference r. If the two are moved together to a different place while the future is suspended at the .await point, r still points to the original position.
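A rough way to check that the array really is stored inside the future is to look at the future's size. This is only a sketch: `future_size` is a made-up helper, `std::future::ready(())` stands in for `something()`, and the exact layout of an async block is a compiler implementation detail, so only a lower bound is compared.

```rust
use std::mem::{size_of, size_of_val};

fn future_size() -> usize {
    let fut = async {
        let my_array = [1i32, 2, 3, 4, 5];
        let r = &my_array;
        std::future::ready(()).await; // stand-in for `something().await`
        println!("array through r: {r:?}");
    };
    // The future is never polled here; we only inspect how large it is.
    size_of_val(&fut)
}

fn main() {
    println!(
        "future: {} bytes, array alone: {} bytes",
        future_size(),
        size_of::<[i32; 5]>()
    );
}
```

Because `my_array` is borrowed across the `.await`, it has to survive suspension, so the future needs at least enough room for the array in addition to `r`.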

We need something that actually yields at the await:

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

enum YieldOnce {
    Initial,
    Yielded,
}
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _: &mut Context<'_>) -> Poll<()> {
        use YieldOnce::*;
        match *self {
            Initial => {
                *self = Yielded;
                Poll::Pending
            }
            Yielded => Poll::Ready(())
        }
    }
}
async fn something() {
    YieldOnce::Initial.await
}
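A quick sanity check of `YieldOnce` on its own, as a sketch: the `poll_twice` helper is made up for this illustration, and the no-op waker is the same `Arc`-based one built later in this post. Since `YieldOnce` holds no self-references it is `Unpin`, so the safe `Pin::new` is enough to poll it.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

enum YieldOnce {
    Initial,
    Yielded,
}

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _: &mut Context<'_>) -> Poll<()> {
        use YieldOnce::*;
        match *self {
            Initial => {
                *self = Yielded;
                Poll::Pending
            }
            Yielded => Poll::Ready(()),
        }
    }
}

// A waker that does nothing when woken.
struct Noop;
impl Wake for Noop {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical helper: polls a fresh `YieldOnce` twice and returns both results.
fn poll_twice() -> (Poll<()>, Poll<()>) {
    let waker: Waker = Arc::new(Noop).into();
    let mut cx = Context::from_waker(&waker);
    let mut fut = YieldOnce::Initial;
    // `YieldOnce: Unpin`, so re-pinning with `Pin::new` for each poll is fine.
    let first = Pin::new(&mut fut).poll(&mut cx);
    let second = Pin::new(&mut fut).poll(&mut cx);
    (first, second)
}

fn main() {
    println!("{:?}", poll_twice());
}
```

The first poll returns `Pending` and the second returns `Ready`, which is what lets the outer async block be suspended exactly once at the `.await`.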

So we get this far:

fn main() {
    let fut = async {
        let my_array = [1, 2, 3, 4, 5];
        let r = &my_array;
        something().await;
        println!("array through r: {r:?}");
    };
    unsound_pin_then_move_then_pin(fut, |pinned| {
        // TODO
    });
}

Now, we need to use the pinned handle to move the future to the first await point, by polling once. To call Future::poll, we need to construct a no-op Context…

let w: Waker = todo!();
let mut cx = Context::from_waker(&w);

containing a no-op Waker…

use std::sync::Arc;
use std::task::{Wake, Waker};

struct Noop;
impl Wake for Noop {
    fn wake(self: Arc<Self>) {}
}
…
let w: Waker = Arc::new(Noop).into();

Finally, polling:

    unsound_pin_then_move_then_pin(fut, |pinned| {
        let _ = pinned.poll(&mut cx);
    });

(playground)

array through r: [1, 2, 3, 4, 5]

OKAY… it isn’t completely broken yet. It is already partially broken, though; let's add some debug printing to see it:

async {
    let my_array = [1, 2, 3, 4, 5];
    let r = &my_array;
    println!("{:p}, {:p}", &my_array, r);
    something().await;
    println!("{:p}, {:p}", &my_array, r);
    println!("array through r: {r:?}");
}

0x7ffc936e4cf8, 0x7ffc936e4cf8
0x7ffc936e4d18, 0x7ffc936e4cf8
array through r: [1, 2, 3, 4, 5]

Aha, so the array is now in a different spot from the one the reference points to, even though the reference is supposed to point at the array.

Really all that's left is to make this more clearly broken. A segfault would be nice. One way to start would be to do something more to the initial place. How about an Option? Those tend to have fun usage of niches for None that can break the previously contained value…

fn unsound_pin_then_move_then_pin<T>(x: T, mut expose_to: impl FnMut(Pin<&mut T>)) {
    let mut x_in_initial_place = Some(x);
    let r: Pin<&mut T> = my_new(x_in_initial_place.as_mut().unwrap());
    expose_to(r);
    let mut x_in_different_place = x_in_initial_place.take().unwrap();
    let r2: Pin<&mut T> = my_new(&mut x_in_different_place);
    expose_to(r2);
}

0x7ffc2ae55248, 0x7ffc2ae55248
0x7ffc2ae55268, 0x7ffc2ae55248
array through r: [-1154135295, 24654, 719671872, 32764, 719671856]

Ah, already much more broken. And I’m not even sure how exactly it went wrong, haha!
(playground)
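For background on the niche remark, here is a small sketch of the mechanism with a plain reference and a byte array. Both helper names are made up, and how the compiler represents `None` for an `Option` holding this particular future is not specified, so this shows the general idea rather than what happened in the run above.

```rust
use std::mem::size_of;

// `Option<&T>` needs no separate tag: `None` is represented by the null bit
// pattern, which a valid reference can never have (its "niche").
fn option_ref_has_no_tag() -> bool {
    size_of::<Option<&u8>>() == size_of::<&u8>()
}

// `take()` moves the value out and writes `None` back into the original place,
// so the original place is overwritten as part of the move.
fn take_leaves_none() -> (Option<[u8; 5]>, Option<[u8; 5]>) {
    let mut slot = Some(*b"hello");
    let moved = slot.take();
    (slot, moved)
}

fn main() {
    println!("{} {:?}", option_ref_has_no_tag(), take_leaves_none());
}
```

The relevant part for the exploit is the write of `None` into the original place, since `r` inside the moved future still points there.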

But I promised segfaults! Easy, we already have something where integers get messed up… what if these were pointers?

async {
    let my_array = [Some(&42)];
    let r = &my_array;
    println!("{:p}, {:p}", &my_array, r);
    println!("array through r (before await): {r:?}");
    something().await;
    println!("{:p}, {:p}", &my_array, r);
    println!("array through r (after await): {r:?}");
}

0x7ffe58ff88d8, 0x7ffe58ff88d8
array through r (before await): [Some(42)]
0x7ffe58ff88f0, 0x7ffe58ff88d8
array through r (after await): [Some(415531848)]

Promising. Miri has been unhappy for a long time already, by the way, but apparently the OS was still fine with this pointer access. Only one solution: double indirection! [1]

let my_array = [Some(&&42)];

0x7ffc6a80b138, 0x7ffc6a80b138
array through r (before await): [Some(42)]
0x7ffc6a80b150, 0x7ffc6a80b138
…

Exited with signal 11 (SIGSEGV): segmentation violation

…aaaaand BOOM

(playground)

[1] If the outer pointer points to a slightly different position, but the data there is wildly different, then dereferencing twice should throw us out of the legal address space entirely.

beroal · December 28, 2024, 1:07pm

Okay, I somehow forgot that the caller can still reach the value behind the argument of my_new after the call and move it. However, if Ptr is specifically Box<T>, the caller gives up the argument, so my_new is sound in that case, and it already exists as Box::into_pin. Thanks.
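A minimal sketch of the Box case, assuming a simplified `YieldOnce(bool)` and a made-up helper `poll_move_handle_poll`: after `Box::into_pin`, moving the `Pin<Box<_>>` handle moves only the pointer, while the future stays at the same heap address, so the self-reference remains valid.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing when woken.
struct Noop;
impl Wake for Noop {
    fn wake(self: Arc<Self>) {}
}

// Simplified stand-in for the `YieldOnce` future from earlier in the thread.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

// Hypothetical helper: poll once, move the handle, poll to completion.
fn poll_move_handle_poll() -> [i32; 5] {
    let fut = async {
        let my_array = [1, 2, 3, 4, 5];
        let r = &my_array;
        YieldOnce(false).await;
        *r
    };
    let mut pinned: Pin<Box<_>> = Box::into_pin(Box::new(fut));
    let waker: Waker = Arc::new(Noop).into();
    let mut cx = Context::from_waker(&waker);
    assert!(pinned.as_mut().poll(&mut cx).is_pending());
    // Moving the handle through a Vec moves the Box pointer, not the future.
    let mut moved = vec![pinned].pop().unwrap();
    match moved.as_mut().poll(&mut cx) {
        Poll::Ready(array) => array,
        Poll::Pending => unreachable!("YieldOnce yields only once"),
    }
}

fn main() {
    println!("array through r: {:?}", poll_move_handle_poll());
}
```

Unlike the `&mut` case, there is no way in safe code to get the future back out of the `Pin<Box<_>>`, which is why no `Unpin` bound is needed here.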

system · March 28, 2025, 1:08pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
