Mistake in an example in The Rustonomicon (pin / ptr::write)?

This is an excerpt from The Rustonomicon book:

Let's consider a simplified implementation of Rc:

struct Rc<T> {
    ptr: *mut RcBox<T>,
}

struct RcBox<T> {
    data: T,
    ref_count: usize,
}

impl<T> Rc<T> {
    fn new(data: T) -> Self {
        unsafe {
            // Wouldn't it be nice if heap::allocate worked like this?
            let ptr = heap::allocate::<RcBox<T>>();
            ptr::write(ptr, RcBox {
                data,
                ref_count: 1,
            });
            Rc { ptr }
        }
    }
}

I do not understand how this part can possibly work:

            ptr::write(ptr, RcBox {
                data,
                ref_count: 1,
            });

What we do here is create an object on the stack and then memcpy its bytes to dst.

But what if the object is self-referential?
Shouldn't this simplified version require T to be pinned (wrapped in Pin)?
Or was that just omitted for the sake of example/brevity?

Rust objects are, in general, required to be relocatable like this.

The entire point of Pin existing is to retrofit support for self-referential objects onto this existing paradigm. It does this by preventing any access to the pinned target that might change its address, even when the outer Pin gets moved around like a normal Rust object.
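
As a rough illustration of what "relocatable" means here, the sketch below moves a String from the stack into a Box. The function name is made up for this example. The String's own header (pointer, length, capacity) is copied to a new address, yet the value stays valid because nothing inside it points back at its old location.

```rust
/// Moves a String into a Box and reports whether its heap buffer stayed put.
/// (Hypothetical helper, written only for this illustration.)
fn move_keeps_value_valid() -> (bool, String) {
    let s = String::from("asdf");
    // Address of the heap buffer that holds the bytes "asdf".
    let buf_before = s.as_str().as_ptr();

    // Moving `s` copies its header into the Box's heap allocation.
    // The buffer it points to is not touched.
    let boxed = Box::new(s);
    let buf_after = boxed.as_str().as_ptr();

    // Move the String back out of the Box: yet another relocation.
    (buf_before == buf_after, *boxed)
}
```

A type that stored a pointer to its own fields would break under the same moves, which is exactly the situation Pin exists to handle.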

Yes, usually every object is relocatable like this... And when one is not, we can create something like Box::pin(X), meaning that the box itself can be moved, but whatever it points to stays at the same address for as long as it lives.

But what happens if I pass such a self-referential object to the Rc::new() from that example?

If it has been made self-referential without being behind an indirection and Pin, then someone has written bad unsafe code (and it is their own fault if any pointers end up dangling when it is moved). If it is behind an indirection like Box, then moving the Box will not affect the object's address, and having the Box wrapped in a Pin ensures no one can move the object out of the Box (but the Rc doesn't know it is in a Box and doesn't try to move out of it anyway).

Also, Pin's guarantees against relocation don't kick in until after the first Pin is made.

So, if you want to make a Pin<Rc<T>> that relies on the fact that RcBox has a stable address, there still isn't a problem: you can build the object on the stack first, move it into the Rc::new call, and then construct your Pin<Rc<T>> afterwards to activate the guarantee.
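
A minimal sketch of that order of operations, using the standard library's std::rc::Rc rather than the book's simplified one. The Node type and the function name are invented for this example. Rc::pin does the "move into the Rc, then wrap in Pin" step in a single call.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::rc::Rc;

// Hypothetical type for this example; PhantomPinned makes it !Unpin.
struct Node {
    label: String,
    _pin: PhantomPinned,
}

fn build_then_pin() -> Pin<Rc<Node>> {
    // Built on the stack. Nothing is pinned yet.
    let node = Node {
        label: String::from("asdf"),
        _pin: PhantomPinned,
    };

    // Still unpinned, so this is an ordinary move.
    let moved = node;

    // Rc::pin moves the value into the Rc's heap allocation and only
    // then wraps the pointer in Pin. The guarantee starts here.
    Rc::pin(moved)
}
```

After this point every clone of the Pin<Rc<Node>> refers to the same, fixed address.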

Correct me if I am wrong, but this is NOT the logic/argument that is used in async Rust code.

Instead of saying that it is the responsibility of a human to check that everything is fine, all poll_XXX functions require Pin (which is essentially a way to say something like: "either X can be moved around, or it will fail compilation").

If the above is correct, then why not use the same technique/tools in this case? Wouldn't it be safer / better?

This is the wrong way round; Pin is a way to say "X can rely on not being moved around". Without Pin, everything can be moved around freely, as in the example code you've shown from the Rustonomicon; with Pin, you're saying that X will not be moved around, and X can depend on not being moved around.

In async code the poll function is what makes the Future self-referential (it can't be self-referential before it is polled for the first time). The fact that poll takes a Pin<&mut Self> means it has to be behind a Pin before it becomes self-referential (because otherwise you can't call poll). The unsafe code here is usually generated by rustc inside the poll function, so there is nothing to check (and it can all be done with safe functions). If someone is breaking the requirements of the unsafe functions on Pin or writing their own code to make something self-referential outside of a Pin, then they are causing UB.

Generally, even values that are !Unpin like futures are not self-referential when created. It is not until you actually call poll for the first time that the future becomes self-referential.

This means that !Unpin values can be moved around as long as you haven't pinned them yet.
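
The sketch below tries to show this with a hand-written future instead of a compiler-generated one, so it is only an approximation. Countdown, NoopWake and move_then_poll are names invented for this example, and Countdown is !Unpin only because it contains PhantomPinned; it is not actually self-referential.

```rust
use std::cell::Cell;
use std::future::Future;
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical !Unpin future: returns Pending `remaining` times, then Ready.
struct Countdown {
    remaining: Cell<u32>,
    _pin: PhantomPinned,
}

impl Future for Countdown {
    type Output = &'static str;

    // poll is only reachable through Pin<&mut Self>.
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let left = self.remaining.get();
        if left == 0 {
            Poll::Ready("done")
        } else {
            self.remaining.set(left - 1);
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

fn move_then_poll() -> (u32, &'static str) {
    let fut = Countdown { remaining: Cell::new(2), _pin: PhantomPinned };

    // Not pinned yet: moving into a Vec and back out is ordinary safe code.
    let mut slots = vec![fut];
    let fut = slots.pop().unwrap();

    // Pinning is required before the first poll.
    let mut pinned = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    let mut pending = 0;
    loop {
        match pinned.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return (pending, out),
            Poll::Pending => pending += 1,
        }
    }
}
```

The same pattern applies to futures produced by async blocks: they can be stored, returned, and moved as plain values up to the point where something pins them in order to call poll.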

And yes, if you passed a pinned value to Rc::new, then that would be unsound. So all safe ways to pin a value will prevent you from later passing it to Rc::new.

This is the wrong way round

I am not sure I understand why...

This is what the official async book has to say about pin:

The Pin type wraps pointer types, guaranteeing that the values behind the pointer won't be moved if it is not implementing Unpin. For example, Pin<&mut T>, Pin<&T>, Pin<Box<T>> all guarantee that T won't be moved if T: !Unpin.

I interpret this as:

  • if X is Unpin (which is the case for 99% of types), then Pin<X> can be moved, and Pin alone does not prevent anything.
  • if X is !Unpin (some futures), then Pin<X> cannot be moved (and in this case we often have to use something like Pin<Box<X>>).

which is why I said:

either X can be moved around (X is Unpin, and Pin can be moved too), or it (Pin<X>) will fail compilation

Am I missing something?

Yes, but the question was, why in the example, they:

  • rely on T being memcpy-able (aka Unpin)
    • they are using: ptr::write(ptr, RcBox { data, ...
  • but do not express/enforce the fact in the interface/signature of the function via Pin<> (just like async rust does)?

This is incorrect. Pin<X> can always be moved safely, just like any other Rust type. If X implements Deref<Target = T> and T: !Unpin, then the API of Pin<X> ensures that safe code cannot move the pointed-to T no matter how hard it tries.


Because !Unpin doesn’t mean anything for types that haven’t been proactively placed behind a Pin<Something<T>>. By definition, you are allowed to move around any value that you own as much as you like. Pin works by “owning” the pinned T and giving that extra guarantee that the T won’t move even if the Pin’s owner decides to move the Pin itself to a new memory location.
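
A small sketch of that distinction, assuming the Test type from this thread and an invented function name. The Pin<Box<Test>> is moved into a Vec, and the address of the Test it points to is compared before and after.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Test {
    a: String,
    _marker: PhantomPinned,
}

/// Returns true if the pinned Test kept its address (and its contents)
/// while the Pin<Box<Test>> that owns it was moved.
fn pointee_stays_put() -> bool {
    let pinned: Pin<Box<Test>> = Box::pin(Test {
        a: String::from("asdf"),
        _marker: PhantomPinned,
    });
    let before: *const Test = &*pinned;

    // Moving the Pin<Box<Test>> copies only the pointer into the Vec.
    let holder = vec![pinned];
    let after: *const Test = &*holder[0];

    // Getting the Test back out by value is not possible in safe code:
    // Pin::into_inner(..) requires the target to be Unpin, and Test is not.
    before == after && holder[0].a == "asdf"
}
```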

If you own directly a value of type T (as Rc::new() does while it is executing), then one of these two things is true:

  • The T value has not been pinned, so you may move it as you see fit.
  • You are implementing a pinned container Pin<YourType<T>>, and it is your obligation not to move the value, reflected in the fact that you'll have some unsafe code introducing the Pin.

Rc::new() falls into the first case, as does most Rust code: there is no Pin involved, so there is no pinning guarantee for you to break.

It's only possible to break pinning by writing incorrect unsafe code somewhere.

Here I have a concrete example:

Why does this code compile?

use std::pin::Pin;
use std::marker::PhantomPinned;

#[derive(Debug)]
struct Test {
    a: String,
    _marker: PhantomPinned,
}



struct Rc<T> {
    ptr: *mut RcBox<T>,
}

struct RcBox<T> {
    data: T,
    ref_count: usize,
}

impl<T> Rc<T> {
    fn new(data: T) -> Self {
        unsafe {
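            // Stand-in for the book's hypothetical heap::allocate: ask the global
            // allocator for uninitialized memory. The layout is never zero-sized
            // because RcBox always contains a usize.
            let layout = std::alloc::Layout::new::<RcBox<T>>();
            let ptr = std::alloc::alloc(layout) as *mut RcBox<T>;
            if ptr.is_null() {
                std::alloc::handle_alloc_error(layout);
            }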
            std::ptr::write(ptr, RcBox {
                data,
                ref_count: 1,
            });
            Rc { ptr }
        }
    }
}

fn main() {
    let test = Test {
        a: String::from("asdf"),
        _marker: PhantomPinned,
    };
    let rc = Rc::new(test);
}

Test is !Unpin and says "I cannot be moved".

And yet, it gets copied by std::ptr::write.

The question is, why the example:

  • relies on T being memcpy-able (aka Unpin)
  • but does not express/enforce the fact in the interface/signature of the function via Pin<> (just like async rust does)?

(the question is ONLY about that example from the book, not real implementation of Rc, as I guess, it is different)

A T: !Unpin type is still memcpy-able if it has not yet been pinned.
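
To make that concrete, here is a sketch that moves the Test type from the question through several standard containers using only safe code. The function name is invented for this example, and std::rc::Rc stands in for the simplified Rc.

```rust
use std::marker::PhantomPinned;
use std::rc::Rc;

struct Test {
    a: String,
    _marker: PhantomPinned, // makes Test !Unpin
}

/// Relocates an unpinned Test several times and returns its payload.
fn shuffle_unpinned() -> String {
    let test = Test {
        a: String::from("asdf"),
        _marker: PhantomPinned,
    };

    let boxed = Box::new(test); // stack -> heap
    let mut v = vec![*boxed]; // Box -> Vec buffer
    let rc = Rc::new(v.pop().unwrap()); // Vec -> Rc allocation

    // Only one handle exists, so the value can be moved back out.
    match Rc::try_unwrap(rc) {
        Ok(t) => t.a,
        Err(_) => unreachable!("only one Rc handle exists"),
    }
}
```

None of this needs unsafe, because no Pin was ever created around the value; !Unpin only changes what a Pin is allowed to hand out later.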

Let me ask this question...

If you were writing a function that takes an object of type T and performs this:
std::ptr::write(ptr, data) (where data is of type T), what function signature would you use?

Something like:

fn do_the_thing<T>(data: T)

Without reference to Pin in here, data can be freely moved by Rust. If you wanted it to be pinned in place so that you couldn't do std::ptr::write(ptr, data), then you'd write:

fn do_the_thing<T>(data: Pin<&mut T>)
// or
fn do_the_thing<T>(data: Pin<Box<T>>)

Notice two differences:

  1. It's got Pin in it, telling you that you can't safely move the thing (in general - if it's Unpin, you can)
  2. I'm no longer referring to an owned T; I'm referring to it indirectly, via either a reference or a smart pointer type. It's either the smart pointer, or outside my code, that owns it.
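
The two kinds of signature can be put side by side in a short sketch. relocate, peek and demo are names made up for this example, and Test is the type from earlier in the thread.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct Test {
    a: String,
    _marker: PhantomPinned,
}

// Takes ownership: the function is free to relocate the value.
fn relocate<T>(data: T) -> Box<T> {
    Box::new(data)
}

// Takes a pinned reference: the function can look at the value in place,
// but safe code has no way to move a !Unpin value out through it.
fn peek(data: Pin<&mut Test>) -> usize {
    data.a.len()
}

fn demo() -> usize {
    let owned = Test {
        a: String::from("asdf"),
        _marker: PhantomPinned,
    };

    // Owned and unpinned, so passing it by value (a move) is fine.
    let boxed = relocate(owned);

    // Converting the Box into a Pin<Box<Test>> starts the pinning guarantee.
    let mut pinned: Pin<Box<Test>> = Box::into_pin(boxed);
    peek(pinned.as_mut())
}
```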

Well, you already have an example, in what Rc::new() does. Here's another: Vec::push(). Again, it takes value: T.

More generally: ptr::write() is unsafe, but the safety concerns are to do with the destination being valid, not about T being moved. The fact that you own a T is, by default, sufficient information to know that the T may be moved; that's a general property of Rust, which is not changed by the existence of ptr::write().
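
A minimal sketch of where the unsafe obligation actually lies, using MaybeUninit as the destination. The helper name is invented for this example. The safety comment is about the destination pointer only; the function places no requirement on T beyond owning it.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Moves `value` into a fresh slot with ptr::write and reads it back out.
fn write_into_slot<T>(value: T) -> T {
    let mut slot = MaybeUninit::<T>::uninit();
    unsafe {
        // SAFETY: `slot.as_mut_ptr()` is non-null, properly aligned and
        // valid for writes of one T. Those are ptr::write's requirements.
        ptr::write(slot.as_mut_ptr(), value);
        // SAFETY: the slot was fully initialized by the write above.
        slot.assume_init()
    }
}
```

Vec::push follows the same shape internally: it takes value: T by ownership and writes it into spare capacity.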

The entire Pin mechanism is built around marking things as non-moving by not owning them directly, but through a pinning pointer type.
