PhantomData and dropck confusion

Imagine the following "lib" (I am creating a custom Box rather than a Vec for the sake of simplicity):

A custom Box type

use ::std::{*,
    ptr::NonNull,
};

/// Simplified raw allocation interface
pub
trait IsAllocator {
    /// Allocates memory for a T
    fn alloc<T> () -> NonNull<T>;

    /// Frees the pointed memory, but does **not** drop the pointee
    unsafe fn free<T> (_: NonNull<T>);
}

pub
enum Allocator {}

impl IsAllocator for Allocator { /* ... */ }

/// Box<T> is simpler than Vec<T> (always len == capacity == 1) 
pub
struct Box<T> {
    ptr: NonNull<T>, // Since we are the unique owner of T, covariance is fine
}

impl<T> Box<T> {
    pub fn new (value: T) -> Self
    {
        let non_null = Allocator::alloc();
        unsafe { ptr::write(non_null.as_ptr(), value); }
        Self { ptr: non_null }
    }

    pub
    fn get (&'_ self) -> &'_ T
    {
        unsafe { self.ptr.as_ref() }
    }

    pub
    fn get_mut (&'_ mut self) -> &'_ mut T
    {
        unsafe { self.ptr.as_mut() }
    }
}

impl<T> Drop for Box<T> { fn drop (&mut self) {
    unsafe {
        // before freeing the allocated memory, we may need to recursively
        // drop the pointee, in case it owns data with drop glue
        // (else we would leak memory, with, for instance, a Box<Box<i32>>)
        eprintln!("Dropping the Box at {:p}", self);
        if mem::needs_drop::<T>() {
            eprintln!(
                "[Recursively dropping the pointee at {:p}]", self.ptr,
            );
            ptr::drop_in_place::<T>(self.as_mut());
        } else {
            eprintln!("[No need to drop the pointee at {:p}]", self.ptr);
        }
        Allocator::free(self.ptr);
    }
}}

Basic usage

Now, imagine it being tested with the following program:

fn main ()
{
    let s = String::from("Hello, World!");
    let mut p = Box::new(None);
    *p.get_mut() = Some(&s);
    dbg!(p.get());
} // p is dropped; then s => OK

This compiles and run fine.

More convoluted example

Let's invert the order of the p and s declarations:

fn main ()
{
    let mut p = Box::new(None);
    let s = String::from("Hello, World!");
    *p.get_mut() = Some(&s);
    dbg!(p.get());
} // s is dropped before p, but &String has no drop glue => OK

This no longer compiles:

error[E0597]: `s` does not live long enough
  --> src/main.rs:92:25
   |
92 |     *p.get_mut() = Some(&s);
   |                         ^^ borrowed value does not live long enough
93 |     dbg!(p.get());
94 | } // s is dropped before p, but &String has no drop glue => OK
   | -
   | |
   | `s` dropped here while still borrowed
   | borrow might be used here, when `p` is dropped and runs the `Drop` code for type `Box`
   |
   = note: values in a scope are dropped in the opposite order they are defined

emphasis on:

But in our case the only way we might use (deref) the pointee is when dropping it if it has drop glue, which &String does not have.

So, when there is no drop glue, the pointer may dangle. Luckily, this can be expressed with the dropck_eyepatch unstable feature:

unsafe impl<#[may_dangle] T> Drop for Box<T> { fn drop (&mut self) {
    /* [the Drop logic has not changed] */
}}

This compiles and runs fine

"This compiles and runs fine" ?? Road to UB-land

Now that we have the (unstable and) unsafe impl Drop<#[may_dangle] T>, with nothing else changed, we will find ourselves with the problem that without realizing it, we have told dropck to trust us too much: when T has drop glue, we do use the pointer!; and there is actually no place where we have told dropck that (remember that as far as dropck is concerned, all the implementations of all functions are loop {}, i.e., it can obviously not look at the code being actually run; it just interprets function signatures).

Let's abuse it in action, with the following program:

#[repr(transparent)]
#[derive(Debug)]
struct PrintOnDrop<T : Debug> (T);
impl<T : Debug> Drop for PrintOnDrop<T> { fn drop(&mut self) {
    eprintln!("Dropping {0:?} at {0:p}", self);
}}

fn main ()
{
    let mut p = Box::new(None);
    let s = String::from("Hello, World!");
    *p.get_mut() = Some(PrintOnDrop(&s));
    dbg!(p.get());
} // s is dropped before p, and PrintOnDrop<&String> has drop glue => UAF

This prints the following:

[src/main.rs:105] p.get() = Some(
    PrintOnDrop(
        "Hello, World!"
    )
)
[Dropping the Box at 0x7ffedcb20b20]
[Recursively dropping the pointee at 0x556876d62a40]
Dropping PrintOnDrop("\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}orld!") at 0x556876d62a40

And there it is, a magnificent UAF, because dropck allows us to have Foo<T> exit a scope with a dangling T (such as T = PrintOnDrop<&'dangling String>) when:

  1. unsafe impl<#[may_dangle] T> Drop for Foo<T> is used,

  2. struct-wise, Foo<T> does not own T

    • when Foo<T> owns a T it is expected that T::drop() be called when Foo<T> is dropped (else the implementation would leak memory), thus "ignoring" the <#[may_dangle] T> attribute when T does have drop glue.

That second point is then paramount, and in our Box<T> case, all we had was a NonNull<T> pointer which on its own does not own T (it's just a newtype wrapper around *const T (with a non-zero lang attached to it)).

What does own T? Well, for instance, PhantomData<T> does.

FINAL FIXED VERSION

Hence, we must fix the library code with the following one:

/// Box<T> is simpler than Vec<T> (always len == capacity == 1)
#[allow(non_snake_case)]
pub
struct Box<T> {
    ptr: NonNull<T>,
    _owns_T: marker::PhantomData<T>,
}

impl<T> Box<T> {
    pub fn new (value: T) -> Self
    {
        let non_null = Allocator::alloc();
        unsafe { ptr::write(non_null.as_ptr(), value); }
        Self { ptr: non_null, _owns_T: Default::default() }
    }

    pub
    fn get (&'_ self) -> &'_ T
    {
        unsafe { self.ptr.as_ref() }
    }

    pub
    fn get_mut (&'_ mut self) -> &'_ mut T
    {
        unsafe { self.ptr.as_mut() }
    }
}

unsafe impl<#[may_dangle] T> Drop for Box<T> { fn drop (&mut self) {
    unsafe {
        // before freeing the allocated memory, we may need to recursively
        // drop the pointee, in case it owns data with drop glue
        // (else we would leak memory, with, for instance, a Box<Vec<_>>)
        eprintln!("Dropping the Box at {:p}", self);
        if mem::needs_drop::<T>() {
            eprintln!(
                "[Recursively dropping the pointee at {:p}]", self.ptr,
            );
            ptr::drop_in_place::<T>(self.get_mut());
        } else {
            eprintln!("[No need to drop the pointee at {:p}]", self.ptr);
        }
        Allocator::free(self.ptr);
    }
}}

This no longer compiles, and if we replace PrintOnDrop(&s) with &s it does still compile: yay!

error[E0597]: `s` does not live long enough
   --> src/main.rs:105:37
    |
105 |     *p.get_mut() = Some(PrintOnDrop(&s));
    |                                     ^^ borrowed value does not live long enough
106 |     dbg!(p.get());
107 | } // s is dropped before p, and PrintOnDrop<&String> has drop glue => UAF
    | -
    | |
    | `s` dropped here while still borrowed
    | borrow might be used here, when `p` is dropped and runs the `Drop` code for type `Box`
    |
    = note: values in a scope are dropped in the opposite order they are defined

Conclusion

Using PhantomData<T> for its dropck properties for some struct Foo<T> is only required when there is:

  • unsafe impl<#[may_dangle] T> Drop for Foo<T> (with the contract that T should then not be accessed except for a "potential" recursive T::drop())

  • Foo<T> does not own T struct-wise (for instance, when it only has a non-Unique pointer to T like *const T) and yet it drops T

    • Aside: when, by construction, the pointer is guaranteed to be NonNull, unaliased, and dropping the pointee, then (NonNull<T>, PhantomData<T>) is the right way to express it: that is, Unique<T>.
12 Likes