Why do we need this ManuallyDrop in Waker?

When reading ~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/task/wake.rs, I found this line

    pub fn wake(self) {
        // The actual wakeup call is delegated through a virtual function call
        // to the implementation which is defined by the executor.

        // Don't call `drop` -- the waker will be consumed by `wake`.
        let this: ManuallyDrop<Waker> = ManuallyDrop::new(self);

        // SAFETY: This is safe because `Waker::from_raw` is the only way
        // to initialize `wake` and `data` requiring the user to acknowledge
        // that the contract of `RawWaker` is upheld.
        unsafe { (this.waker.vtable.wake)(this.waker.data) };
    }

I am wondering why we need this ManuallyDrop here. If I change the code like this

    pub fn wake(self) {
        // The actual wakeup call is delegated through a virtual function call
        // to the implementation which is defined by the executor.

        unsafe { (self.waker.vtable.wake)(self.waker.data) };
    }

After compiling some test code with cargo run --target x86_64-unknown-linux-gnu -Z build-std, there's no problem. The compiling passes and the async code work well as usual.

My understanding is like this:
Logically speaking, this.waker.data is consumed. In order to ensure the integrity of the self when it is dropped at the end of the wake function, a new copy on the stack is made here. But directly using self is not only grammatically sound because drop check can pass and the compiler would not cavil at my modified code, but also logical in a sense: consuming one's owned data (logically consumed but not actually consumed because you cannot say "consume a raw pointer") and then dropping itself.

Based on this understanding, the problem is "Is it really meaningful to pursue semantic perfection in the unsafe world at the cost of the time spent by LLVM to optimize the copying of useless stack space objects?" Why don't we just remove this copy and use the modified version if my analysis is correct?

Your modified code results in a use-after-free with certain wakers. The vtable.wake function consumes a refcount of the waker, so if you also run the destructor, you now decremented the refcount twice.

3 Likes

Thanks for your answer. However, I cannot understand why vtable.wake consumes a refcount of the waker. The simplified code involved in the invocation is as follows. It seems that it works well.

fn wake(data: *const ()) {
    let data_ptr = data as *const &str;

    // Unsafe block to dereference the raw pointer
    unsafe {
        if !data_ptr.is_null() {
            println!("Wake: {}", *data_ptr);
        } else {
            println!("Wake: null pointer");
        }
    }
}


struct RawWakerVTable{
    wake: unsafe fn(*const ()),
}

struct RawWaker{
    data: *const (),
    vtable: &'static RawWakerVTable,
}

struct Waker{
    waker: RawWaker,
}

impl Waker{
    pub fn wake(self) {
        unsafe { (self.waker.vtable.wake)(self.waker.data) }; // modified line
    }
}

static VTABLE: RawWakerVTable = RawWakerVTable{wake};

fn main() {
    let u = &"233" as *const &str as *const ();
    let w = Waker{waker: RawWaker{data: u, vtable: &VTABLE }};
    w.wake();
}

The output is

Wake: 233

If I have missed anything, it would be great if you were willing to modify the code above to demonstrate the "use-after-free" case.

The reference count (if any) is part of the implementation that the creator of the RawWakerVTable is expected to provide:

wake

This function will be called when wake is called on the Waker. It must wake up the task associated with this RawWaker.

The implementation of this function must make sure to release any resources that are associated with this instance of a RawWaker and associated task.

wake_by_ref

This function will be called when wake_by_ref is called on the Waker. It must wake up the task associated with this RawWaker.

This function is similar to wake, but must not consume the provided data pointer.

RawWakerVTable in std::task - Rust

If the Waker has any resources it owns, then it must clean them up when wake is called, or there will be a leak. Therefore, if there are such resources, calling wake twice is a use-after-free. This is just the usual logic of ownership: things should be freed exactly once.

If you were planning to keep the waker around, you’d call wake_by_ref() instead of wake().

I understand the problem by adding a drop function to the VTable.

fn wake(data: *const ()) {
    unsafe {
        println!("wake: {}", *(data as *const &str));
    }
}

fn drop(data: *const ()){
    unsafe {
        println!("drop: {}", *(data as *const &str));
    }
}

struct RawWakerVTable{
    wake: unsafe fn(*const ()),
    drop: unsafe fn(*const ()),
}

struct RawWaker{
    data: *const (),
    vtable: &'static RawWakerVTable,
}

struct Waker{
    waker: RawWaker,
}

impl Waker{
    pub fn wake(self) {
        unsafe { (self.waker.vtable.wake)(self.waker.data) }; // modified line
    }
}

impl Drop for Waker {
    fn drop(&mut self) {
        unsafe { (self.waker.vtable.drop)(self.waker.data) }
    }
}


static VTABLE: RawWakerVTable = RawWakerVTable{wake, drop};

fn main() {
    let u = &"233" as *const &str as *const ();
    let w = Waker{waker: RawWaker{data: u, vtable: &VTABLE }};
    w.wake();
}

Horribly, the output is

wake: 233
drop: 233

which is exact use-after-free as mentioned by @alice :exploding_head:! The gist lies in the drop of self at the end the the wake function in Waker.

Thank you to all who helped me figure it out. :hugs:

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.