Confusion regarding pinning

I spent the whole day trying to understand pinning, but it was so confusing that I ended up with a ton of questions!

  1. They say we pin the future being polled because self (i.e. the future) can get moved, and then we would get invalid references when accessing any values on self. But why would the executor ever want to move a promise that is being polled? When we call poll using dot notation, like future.poll, there is no chance of the future getting moved.

  2. The example in the async book is also quite confusing to me! They swap using std::mem, but that swap changes the ownership of the internal data, not the ownership of the whole struct! But in future pinning we are trying to pin the whole future struct by pinning self.

  3. Why is Rust even generating the state machines from async blocks in an unsafe manner? Rust normally doesn't let us create self-referential structs, so why is it creating those behind the scenes?

I hope one of the experienced people here can help.

1 Like

Because it can't be done safely. Futures generated from an async context with local variables are necessarily self-referential, and self-referential types are necessarily unsafe.

I have no idea what any of this has to do with the particular choice of syntax, but in Rust, all values can be moved. It's not that the executor "will" or "wants to" move a future. It's that it can be moved – by whomever. And if it can happen, then the language and the compiler must account for it, otherwise everything async would be terribly unsound.

Please show actual code; a vague description in broken English doesn't allow us to infer what the problem is.

1 Like

There's nothing broken about what you quoted, it's just extremely vague.

1 Like

Thanks for replying! :heavy_heart_exclamation: I am very sorry if my question is off the mark. I learnt Rust from the book just a few days ago, and I am now trying to dive into the asynchronous world properly. But my previous understanding seemed to contradict it a little, and that's the whole reason for asking.

So is it like: if I leave the house an accident can happen, so I pin myself to make myself accident-proof?

Because Rust itself does not allow us to write code like that, and I am also sure that runtime engineers are smart enough not to move the owner of a future, because at that point self would become invalid. So if no one is doing anything unsound, then having a self-referential future is absolutely fine, because even with unsafe Rust it would be very tough for anyone to move ownership while it's in use! So is pinning meant for the 0.1% case where it might happen due to some magic?

This was the example I was talking about! Here they try to show us why we need to pin. But they are not actually pinning the whole struct; they are changing the ownership of its internal fields! But when we move a struct, we change the ownership of the struct, not of its inner fields.

https://rust-lang.github.io/async-book/04_pinning/01_chapter.html#pinning-in-detail

I would suggest not trying to understand async and Pin in detail until you have got much more experience of Rust.

( You can still use async without really understanding how it works, I do so all the time! )

3 Likes

It's as if you remembered the position of your room based on its GPS coordinates (i.e. your house would be the struct and the GPS coordinates would be the self-referential pointer). If you then change house (i.e. you move the struct) you would still think your room was at the old GPS coordinates, while there may be a different house or it may be demolished (i.e. the struct was replaced or the allocation was freed).

One could think "yeah but I don't remember the GPS coordinates of my room, I know it's up the stairs" (i.e. why not store the offset of the data relative to the struct). A problem with this approach is that it doesn't work for everything, for example the supermarket might be just outside your house, but if you change house the supermarket won't be outside it anymore, it will still be at the old absolute position (i.e. this happens for example if the body of the async function borrows from somewhere on the heap, like a &str from a String). The compiler cannot detect when something should move with the struct and when it shouldn't, and this might even depend on some runtime values, think of an if that in one case borrows from a local variable and the other borrows from a heap value!

1 Like

Thanks! I understood it the same way you just described! But my problem is that I couldn't imagine a case where the move we are so worried about would actually happen to a self-referential struct!

Everywhere it says that moving a self-referential struct will cause a problem! But my point is: who will move it? Maybe the answer is no one! Just imagine you are writing a method. Would you ever try to move the actual struct while the method call is in progress? Because at that point your &self would become null!

The only valid case I can think of where Pin is useful is when we have interdependent future blocks.

Suppose future A holds a reference to data that actually lives in future B. In this case pinning makes sense to me!

Because future A and future B will be polled independently by the executor, released independently, and enqueued again for execution when they are woken.

So future A will always refer to the actual GPS location of B to access its data. If the executor changes the memory location of future B while juggling the futures within its thread pool, then the ownership or memory location of B may change!

In this case future A won't be able to poll correctly any more, because the data it refers to has been destroyed.
So we need pinning.

Now, if interdependent future blocks are the same thing as a self-referential struct, then it makes sense to me.

But otherwise I couldn't find a case.

It may also be that I am new to Rust and that's why I am not getting it. Maybe after coding for some more time things will make sense.

TL;DR: this is a question you never ask in Rust. Rust uses an entirely different idea, and the question becomes: what would prevent the executor from moving the promise?

By that reasoning the whole Rust machinery is not needed at all. Smart people would write smart code, smart people wouldn't make dumb mistakes, and everything would be just peachy.

That's the whole premise behind C and C++. And you know what? People are not smart enough and not careful enough; they do write stupid things by accident, if not on purpose.

You are looking in the wrong place and thinking about the wrong things.

What is the future that an async function returns? Is it a “specially crafted state machine that is designed to do smart things”? Nope. The difference between fn foo and async fn foo is, in reality, pretty small: fn foo places all its variables on the stack (using the %sp register on most architectures), while async fn foo places all its variables in some other place (although the optimizer may change that).

And you may write something like this:

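    use std::future::Future;

    // `x` is any future; `a` and `b` must survive across the await point.
    async fn foo(x: impl Future<Output = ()>) {
      let a: i32 = 42;
      let b: &i32 = &a; // reference to a local of this same async fn
      x.await;
      println!("{a} {b}");
    }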

I mean: who would forbid that code and why?

Now the question arises: what if I start such a function, execute it for a while, and then decide to postpone it? What happens in that case?

Well, now you have a self-referential data structure which you may, suddenly, move somewhere, because every type must be ready to be blindly memcopied to somewhere else in memory. But hey! That's not fair: if someone moved that future while it's executing… everything would fall apart.
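To make the "start it, run it for a while, postpone it" scenario concrete, here is a minimal sketch that polls such a function by hand. The names `NoopWake`, `yield_once`, and `poll_twice` are made up for illustration, and `foo` is a simplified variant of the function above; it assumes a toolchain with `std::pin::pin!` and `std::future::poll_fn` available.

```rust
use std::future::{self, Future};
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns `Pending` on the first poll and `Ready` on the second.
async fn yield_once() {
    let mut polled = false;
    future::poll_fn(move |_cx| {
        if polled {
            Poll::Ready(())
        } else {
            polled = true;
            Poll::Pending
        }
    })
    .await
}

async fn foo() -> i32 {
    let a: i32 = 42;
    let b: &i32 = &a; // conceptually points into the future's own storage
    yield_once().await; // suspended here: `a` and `b` must both be kept alive
    a + *b
}

/// Polls `foo` twice and returns both results.
fn poll_twice() -> (Poll<i32>, Poll<i32>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // `pin!` pins the future to this stack frame; safe code cannot move it afterwards.
    let mut fut = pin!(foo());
    let first = fut.as_mut().poll(&mut cx); // runs up to the await, then postpones
    let second = fut.as_mut().poll(&mut cx); // resumes and uses `b` again
    (first, second)
}
```

Calling `poll_twice()` from `main` gives `Pending` for the first poll and `Ready(84)` for the second. Between those two calls the future is just a value sitting in memory, which is exactly the window in which a move would hurt.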

It's not a question of being new to Rust or anything like that. You just have to understand what Rust is trying to achieve.

And its goal is “simple”: reduce the number of places where we have to appeal to the smartness of people who are not supposed to do dumb things.

Even the smartest people I know sometimes make stupid mistakes, and the compiler is supposed to stop them. But it can only do that if it is told about the capabilities of the things we are dealing with.

If we want to have postponable functions, then we have to pick one of three options:

  1. Ensure that people can't use references that refer to local variables (note that since async fns are merged into one large blob, it may not be a truly local variable, but also a variable which you received via reference from another async fn).
  2. Change Rust fundamentally so that types can be moved in some “smart” fashion (think C++ move constructors).
  3. Invent some mechanism which guarantees that a self-referential data structure won't be moved after creation, even though the compiler is still convinced that the type is always freely movable.

Rust picked option #3 and [ab]used an interesting property of the future returned from an async fn: before you start executing the function and pass arguments into it, the future may be moved freely (because it's just an empty piece of memory with nothing initialized in it). Thus we may permit returning a “raw” !Unpin Future, which has to be handled unsafely at some point because of the rule stated above: every type must be ready to be blindly memcopied to somewhere else in memory.

2 Likes

Let's apply the same reasoning to raw pointers: who will ever dereference a null pointer? Nobody wants to do that. So why is it unsafe? Because it could happen, and if it happens you get UB and everything breaks. Similarly someone could potentially/accidentally move the future because there's nothing preventing this, so you have to guard against that.

Of course a future won't try to move itself and break everything while poll is being called. The problem, however, is in between two calls to poll. During the first call the self-reference could be created; then in the second call the future expects that reference to still be valid, which is true only if the future was not moved since the first call.
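As a rough illustration of how an executor can still juggle tasks between polls, here is a sketch of a toy round-robin queue. It is not how any real runtime is implemented; `NoopWake`, `yield_once`, `task`, and `run_queue` are hypothetical names. The point is that the queue moves the `Pin<Box<...>>` handles around, while each boxed future stays at the same heap address for its whole life.

```rust
use std::collections::VecDeque;
use std::future::{self, Future};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical waker that does nothing; the loop below simply re-polls.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns `Pending` on the first poll and `Ready` on the second.
async fn yield_once() {
    let mut polled = false;
    future::poll_fn(move |_cx| {
        if polled {
            Poll::Ready(())
        } else {
            polled = true;
            Poll::Pending
        }
    })
    .await
}

async fn task(n: i32) -> i32 {
    let local = n * 2;
    let r = &local; // borrow held across the await point
    yield_once().await;
    *r + 1
}

/// Toy executor: polls tasks in turn and re-queues the ones that are pending.
fn run_queue() -> Vec<i32> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut queue: VecDeque<Pin<Box<dyn Future<Output = i32>>>> = VecDeque::new();
    queue.push_back(Box::pin(task(1)));
    queue.push_back(Box::pin(task(10)));

    let mut done = Vec::new();
    // Popping and pushing moves only the pointer, never the future itself.
    while let Some(mut fut) = queue.pop_front() {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(v) => done.push(v),
            Poll::Pending => queue.push_back(fut),
        }
    }
    done
}
```

Here `run_queue()` returns `[3, 21]`: each task is polled once, re-queued (its handle is moved), and then completes on its second poll with its borrow still valid.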

Nitpick: it becomes dangling (i.e. points to something invalid), not null (i.e. becomes the "zero" pointer).

This is not possible. future B needs exclusive access to its state in order to be polled, so future A cannot have a reference to it.

3 Likes

It totally does:

let future = some_async_fn();
let other = future; // moved
2 Likes

Note that futures in Rust do nothing by themselves and must be actively polled to make progress. This is different from Promises in JS, where the scheduler automatically starts spawned promises. An implication of this is that a future in Rust can be suspended at any await point. The problem that pinning solves is that one could move that future before polling it again. Consider the following example:

#![feature(noop_waker)]

use std::{
    future::{self, Future},
    mem,
    pin::Pin,
    task::{Context, Poll, Waker},
};

async fn bar() {
    let mut polled = false;
    future::poll_fn(move |_| {
        if polled {
            Poll::Ready(())
        } else {
            polled = true;
            Poll::Pending
        }
    })
    .await
}

async fn foo(x: i32) -> i32 {
    let x = Box::new(x);
    bar().await;
    *x
}

fn main() {
    let waker = Waker::noop();
    let mut context = Context::from_waker(&waker);

    let mut future = foo(42);

    let pinned = unsafe { Pin::new_unchecked(&mut future) };
    let poll1 = pinned.poll(&mut context);
    assert_eq!(poll1, Poll::Pending);

    let mut other = foo(100);
    unsafe {
        let _ = Pin::new_unchecked(&mut other).poll(&mut context);
    }

    // XXX: Undefined behavior here!
    mem::swap(&mut future, &mut other);

    let pinned = unsafe { Pin::new_unchecked(&mut future) };
    let poll2 = pinned.poll(&mut context);
    // NOTE: This assertion will fail.
    assert_eq!(poll2, Poll::Ready(42));
}

The state machine that the compiler generates for this needs to store the address of x. After the first poll, the future returned by foo returns Poll::Pending. If we were now to move the future, the pointer to x would be invalidated, and when we poll again we would be reading from an address we are not allowed to read. In other words: undefined behavior. Imagine that Future::poll just took &mut Self instead of Pin<&mut Self>. There would be nothing in the type system to prevent this error.

Pin is the solution to this problem. It is a contract between you and the compiler. By creating it, you promise that, whatever happens, the pointee will never move again for the rest of its lifetime (unless the pointee implements Unpin). If you make this promise, the compiler can safely assume that self-referential pointers in the state machine it generates from the async block will never be invalidated. As a result, you can safely write async blocks and store local references across await points.
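The "unless the pointee implements Unpin" part can be seen in a small sketch. For an `Unpin` type such as `i32`, the promise is vacuous: `Pin::new` and `Pin::get_mut` are both safe, and swapping through them is fine. The helper `assert_unpin`, the type `NotMovable`, and `swap_through_pin` are hypothetical names used only for this illustration.

```rust
use std::marker::PhantomPinned;
use std::mem;
use std::pin::Pin;

/// Compiles only for types that implement `Unpin`.
fn assert_unpin<T: Unpin>() {}

/// Hypothetical type that opts out of `Unpin`, as a self-referential type would.
#[allow(dead_code)]
struct NotMovable {
    data: u8,
    _pin: PhantomPinned,
}

/// Swaps two integers through pinned references, which is allowed for `Unpin` types.
fn swap_through_pin() -> (i32, i32) {
    assert_unpin::<i32>();
    // assert_unpin::<NotMovable>(); // would not compile: NotMovable is !Unpin

    let mut a = 1;
    let mut b = 2;
    // `Pin::new` is a safe function because `i32: Unpin`.
    let pa = Pin::new(&mut a);
    let pb = Pin::new(&mut b);
    // `get_mut` exists only when the pointee is `Unpin`, so this swap is harmless.
    mem::swap(pa.get_mut(), pb.get_mut());
    (a, b)
}
```

`swap_through_pin()` returns `(2, 1)`. For `NotMovable`, neither `Pin::new(&mut value)` nor `get_mut` would compile, which is how the type system closes the door on the `mem::swap` shown in the example above.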

2 Likes

Two things help understand it:

  • Rust checks based on rules of its type system, based on broad general rules. It doesn't care that your particular code doesn't do something unsafe, when the rules of what it could do in theory don't completely forbid it from doing something unsafe. The rules say every type can be moved, so the safety boundary must act as if you tried to move every type, without checking whether you do or not.

  • Pin is a workaround for a missing language feature (non-move or self-referential types), and as such is more complicated and less ergonomic than a proper native feature could be, and may do less than you would expect (it's more like a "be careful" warning than a language feature). It needs unsafe and has requirements that you must manually uphold, because the compiler was not given the ability to do this itself. Fortunately, you don't need to bother with Pin if you use the high-level async/await syntax.

4 Likes

Thank you, everyone, for being incredibly supportive! Your encouragement will undoubtedly contribute to the rapid growth of the Rust community! I've finally grasped it! :heavy_heart_exclamation::sparkles:

After gaining understanding, I can truly see that it's just an additional feature, much like the borrow checker and ownership checks, designed to prevent risky moves of unsafe self-referential structs at compile time.

@akrauze @kornel @SkiFire13 @paramagnetic @khimru

Soon, I'll create a video tutorial using this code template. It will help newcomers avoid the same struggles I faced!

use std::future::Future;
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::task::{Context, Poll};

enum State {
    Start,
    Middle,
    End,
}

struct MyFuture {
    state: State,
    value: String,
    self_pointer: *const String,
    _marker: PhantomPinned,
}

impl MyFuture {
    fn new() -> Self {
        let val = String::from("Hello");
        MyFuture {
            state: State::Start,
            value: val,
            self_pointer: std::ptr::null(),
            _marker: PhantomPinned, // This makes our type `!Unpin`
        }
    }
    //init the self pointer
    pub fn init(self: Pin<&mut Self>) {
        let self_ptr: *const String = &self.value;
        let this = unsafe { self.get_unchecked_mut() };
        this.self_pointer = self_ptr;
    }
    //print self pointer
    fn show_self_pointer(self: Pin<&Self>) {
        println!("self_pointer: {:p}", self.self_pointer);
    }
    //show value pointer
    fn show_val_pointer(self: Pin<&Self>) {
        println!("val_pointer: {:p}", &self.value);
    }
}

impl Future for MyFuture {
    type Output = i32;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        let future = unsafe { Pin::get_unchecked_mut(self) };
        match future.state {
            State::Start => {
                // transition to the next state
                future.state = State::Middle;
                // return Poll::Pending to indicate that the future is not ready
                Poll::Pending
            }
            State::Middle => {
                // transition to the next state
                future.state = State::End;
                // return Poll::Pending to indicate that the future is not ready
                Poll::Pending
            }
            State::End => {
                // return Poll::Ready to indicate that the future is ready
                Poll::Ready(32)
            }
        }
    }
}

#[derive(Debug)]
struct MemoeryDemo {
    value: String,
    self_pointer: *const String,
}

//before the move
// Stack:
// - MemoeryDemo.value (at address2): the String header (pointer, length, capacity); its bytes live on the heap
// - MemoeryDemo.self_pointer (at address3): contains address2

// Heap:
// - String data (at address1): the actual bytes of the String

//after the move
// Stack:
// - MemoeryDemo.value (at address4): the same String header, copied to a new location
// - MemoeryDemo.self_pointer (at address5): still contains address2 (the old, now stale address)

// Heap:
// - String data (at address1): unchanged; a move copies only the header,
//   not the heap bytes

impl MemoeryDemo {
    fn new() -> Self {
        let val = String::from("Hello");
        MemoeryDemo {
            value: val,
            self_pointer: std::ptr::null(),
        }
    }
    fn init(&mut self) {
        let self_ptr: *const String = &self.value;
        self.self_pointer = self_ptr;
    }
    fn show_self_pointer(&self) {
        println!("b: {:p}", self.self_pointer);
    }
    fn show_val_pointer(&self) {
        println!("a: {:p}", &self.value);
    }
    fn address_experiment() {
        let mut my_future = MemoeryDemo::new();
        my_future.init();
        my_future.show_val_pointer();
        my_future.show_self_pointer();
        let my_future_2 = my_future;
        my_future_2.show_val_pointer();
        my_future_2.show_self_pointer();
    }
}

#[tokio::main]
async fn main() {
    //try to run the address_experiment -> you will see the raw pointer containing the old address
    MemoeryDemo::address_experiment();

    //using pin to solve the problem of address experiment
    let mut my_future = MyFuture::new();
    let mut pinned_future = unsafe { Pin::new_unchecked(&mut my_future) };
    MyFuture::init(pinned_future.as_mut());

    let print_pointers = |pinned_future: &mut Pin<&mut MyFuture>| {
        //print value pointer
        pinned_future.as_ref().show_val_pointer();
        //print self pointer
        pinned_future.as_ref().show_self_pointer();
        //print the value at self pointer
        println!("Value at self_pointer: {}", unsafe {
            &*pinned_future.as_ref().self_pointer
        });
    };

    //you can move the pinned future easily without any address issues -> here you are changing the wrapper not the underlying value

    print_pointers(&mut pinned_future.as_mut());

    let mut moved_pinned_future = pinned_future;

    print_pointers(&mut moved_pinned_future.as_mut());

    //but you will get a compile-time error if you try to move the underlying pinned value (i.e. the future itself),
    //because the future is !Unpin (it is self-referential) and Pin::get_mut requires Unpin:
    // let old = std::mem::replace(moved_pinned_future.get_mut(), MyFuture::new());

    //poll the future slowly
    let mut pinned_future = moved_pinned_future;

    let _mid = MyFuture::poll(
        pinned_future.as_mut(),
        &mut Context::from_waker(futures::task::noop_waker_ref()),
    );
    let _end = MyFuture::poll(
        pinned_future.as_mut(),
        &mut Context::from_waker(futures::task::noop_waker_ref()),
    );

    let res = MyFuture::poll(
        pinned_future.as_mut(),
        &mut Context::from_waker(futures::task::noop_waker_ref()),
    );
    if let Poll::Ready(val) = res {
        println!("Polled value: {}", val);
    }

    //use await to poll the future
    let res2 = pinned_future.await;
    println!("Resolved Value: {:?}", res2);
}

1 Like

Btw this can be done safely with let mut pinned_future = std::pin::pin!(my_future);
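A minimal sketch of that suggestion, using a cut-down, hypothetical `SelfRef` type rather than the full `MyFuture` from the post above (`pinned_demo` is also an illustrative name). The only remaining `unsafe` is inside `init`, where a field is written through the pinned reference; the call site needs none.

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

/// Hypothetical self-referential type: `ptr` points at `value`.
struct SelfRef {
    value: String,
    ptr: *const String,
    _pin: PhantomPinned, // makes the type `!Unpin`
}

impl SelfRef {
    fn new(s: &str) -> Self {
        SelfRef {
            value: s.to_string(),
            ptr: std::ptr::null(),
            _pin: PhantomPinned,
        }
    }

    /// Stores the address of `value`; only valid once the value is pinned.
    fn init(self: Pin<&mut Self>) {
        // SAFETY: we only write a field and never move the value out.
        let this = unsafe { self.get_unchecked_mut() };
        let p: *const String = &this.value;
        this.ptr = p;
    }

    /// True if the stored pointer still points at `value`.
    fn is_consistent(self: Pin<&Self>) -> bool {
        std::ptr::eq(self.ptr, &self.value)
    }
}

fn pinned_demo() -> bool {
    // `pin!` yields a `Pin<&mut SelfRef>` without `unsafe` at the call site.
    let mut s = pin!(SelfRef::new("Hello"));
    s.as_mut().init();
    // Moving the handle moves the `Pin<&mut _>`, not the `SelfRef` behind it.
    let handle = s;
    let ok = handle.as_ref().is_consistent();
    ok
}
```

`pinned_demo()` returns `true`. Because `pin!` takes ownership of the value, there is no leftover variable through which the unpinned value could later be moved, which is the hazard `Pin::new_unchecked(&mut my_future)` leaves open.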

3 Likes