PhantomData and dropck confusion

I'm trying to understand the relation between PhantomData and dropck in greater depth esp. for the case like Vec or LinkedList that PhantomData makes generic drop sound.

I've already read nomicon as well as RFC-0769-sound-generic-drop and RFC-1238-nonparametric-dropck. Also found this explanations by @pnkfelix and @shepmaster or the same example.

However, due to many updates to rustc, none of the examples that suppose to not compile, compile now and I'm in a confusion state!

Here's the updated example given by Felix (though this is less about PhantomData and raw ptr I guess)!

Here's the example given by Jake.

Can anyone provide a clear example capturing unsoundness?

Mainly, for example, how can I clearly show that (in code)

struct Vec<T> {
    ptr: NonNull<T>,
    cap: usize,
    len: usize,
}

is unsound whereas

struct Vec<T> {
    ptr: NonNull<T>,
    cap: usize,
    len: usize,
    _marker: PhantomData<T>,
}

is sound in a normal code without using any uninitialized let binding let (a, b);?

Imagine the following "lib" (I am creating a custom Box rather than a Vec for the sake of simplicity):

A custom Box type

use ::std::{*,
    ptr::NonNull,
};

/// Simplified raw allocation interface
pub
trait IsAllocator {
    /// Allocates memory for a T
    fn alloc<T> () -> NonNull<T>;

    /// Frees the pointed memory, but does **not** drop the pointee
    unsafe fn free<T> (_: NonNull<T>);
}

pub
enum Allocator {}

impl IsAllocator for Allocator { /* ... */ }

/// Box<T> is simpler than Vec<T> (always len == capacity == 1) 
pub
struct Box<T> {
    ptr: NonNull<T>, // Since we are the unique owner of T, covariance is fine
}

impl<T> Box<T> {
    pub fn new (value: T) -> Self
    {
        let non_null = Allocator::alloc();
        unsafe { ptr::write(non_null.as_ptr(), value); }
        Self { ptr: non_null }
    }

    pub
    fn get (&'_ self) -> &'_ T
    {
        unsafe { self.ptr.as_ref() }
    }

    pub
    fn get_mut (&'_ mut self) -> &'_ mut T
    {
        unsafe { self.ptr.as_mut() }
    }
}

impl<T> Drop for Box<T> { fn drop (&mut self) {
    unsafe {
        // before freeing the allocated memory, we may need to recursively
        // drop the pointee, in case it owns data with drop glue
        // (else we would leak memory, with, for instance, a Box<Box<i32>>)
        eprintln!("Dropping the Box at {:p}", self);
        if mem::needs_drop::<T>() {
            eprintln!(
                "[Recursively dropping the pointee at {:p}]", self.ptr,
            );
            ptr::drop_in_place::<T>(self.as_mut());
        } else {
            eprintln!("[No need to drop the pointee at {:p}]", self.ptr);
        }
        Allocator::free(self.ptr);
    }
}}

Basic usage

Now, imagine it being tested with the following program:

fn main ()
{
    let s = String::from("Hello, World!");
    let mut p = Box::new(None);
    *p.get_mut() = Some(&s);
    dbg!(p.get());
} // p is dropped; then s => OK

This compiles and run fine.

More convoluted example

Let's invert the order of the p and s declarations:

fn main ()
{
    let mut p = Box::new(None);
    let s = String::from("Hello, World!");
    *p.get_mut() = Some(&s);
    dbg!(p.get());
} // s is dropped before p, but &String has no drop glue => OK

This no longer compiles:

error[E0597]: `s` does not live long enough
  --> src/main.rs:92:25
   |
92 |     *p.get_mut() = Some(&s);
   |                         ^^ borrowed value does not live long enough
93 |     dbg!(p.get());
94 | } // s is dropped before p, but &String has no drop glue => OK
   | -
   | |
   | `s` dropped here while still borrowed
   | borrow might be used here, when `p` is dropped and runs the `Drop` code for type `Box`
   |
   = note: values in a scope are dropped in the opposite order they are defined

emphasis on:

But in our case the only way we might use (deref) the pointee is when dropping it if it has drop glue, which &String does not have.

So, when there is no drop glue, the pointer may dangle. Luckily, this can be expressed with the dropck_eyepatch unstable feature:

unsafe impl<#[may_dangle] T> Drop for Box<T> { fn drop (&mut self) {
    /* [the Drop logic has not changed] */
}}

This compiles and runs fine

"This compiles and runs fine" ?? Road to UB-land

Now that we have the (unstable and) unsafe impl Drop<#[may_dangle] T>, with nothing else changed, we will find ourselves with the problem that without realizing it, we have told dropck to trust us too much: when T has drop glue, we do use the pointer!; and there is actually no place where we have told dropck that (remember that as far as dropck is concerned, all the implementations of all functions are loop {}, i.e., it can obviously not look at the code being actually run; it just interprets function signatures).

Let's abuse it in action, with the following program:

#[repr(transparent)]
#[derive(Debug)]
struct PrintOnDrop<T : Debug> (T);
impl<T : Debug> Drop for PrintOnDrop<T> { fn drop(&mut self) {
    eprintln!("Dropping {0:?} at {0:p}", self);
}}

fn main ()
{
    let mut p = Box::new(None);
    let s = String::from("Hello, World!");
    *p.get_mut() = Some(PrintOnDrop(&s));
    dbg!(p.get());
} // s is dropped before p, and PrintOnDrop<&String> has drop glue => UAF

This prints the following:

[src/main.rs:105] p.get() = Some(
    PrintOnDrop(
        "Hello, World!"
    )
)
[Dropping the Box at 0x7ffedcb20b20]
[Recursively dropping the pointee at 0x556876d62a40]
Dropping PrintOnDrop("\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}\u{0}orld!") at 0x556876d62a40

And there it is, a magnificent UAF, because dropck allows us to have Foo<T> exit a scope with a dangling T (such as T = PrintOnDrop<&'dangling String>) when:

  1. unsafe impl<#[may_dangle] T> Drop for Foo<T> is used,

  2. struct-wise, Foo<T> does not own T

    • when Foo<T> owns a T it is expected that T::drop() be called when Foo<T> is dropped (else the implementation would leak memory), thus "ignoring" the <#[may_dangle] T> attribute when T does have drop glue.

That second point is then paramount, and in our Box<T> case, all we had was a NonNull<T> pointer which on its own does not own T (it's just a newtype wrapper around *const T (with a non-zero lang attached to it)).

What does own T? Well, for instance, PhantomData<T> does.

FINAL FIXED VERSION

Hence, we must fix the library code with the following one:

/// Box<T> is simpler than Vec<T> (always len == capacity == 1)
#[allow(non_snake_case)]
pub
struct Box<T> {
    ptr: NonNull<T>,
    _owns_T: marker::PhantomData<T>,
}

impl<T> Box<T> {
    pub fn new (value: T) -> Self
    {
        let non_null = Allocator::alloc();
        unsafe { ptr::write(non_null.as_ptr(), value); }
        Self { ptr: non_null, _owns_T: Default::default() }
    }

    pub
    fn get (&'_ self) -> &'_ T
    {
        unsafe { self.ptr.as_ref() }
    }

    pub
    fn get_mut (&'_ mut self) -> &'_ mut T
    {
        unsafe { self.ptr.as_mut() }
    }
}

unsafe impl<#[may_dangle] T> Drop for Box<T> { fn drop (&mut self) {
    unsafe {
        // before freeing the allocated memory, we may need to recursively
        // drop the pointee, in case it owns data with drop glue
        // (else we would leak memory, with, for instance, a Box<Vec<_>>)
        eprintln!("Dropping the Box at {:p}", self);
        if mem::needs_drop::<T>() {
            eprintln!(
                "[Recursively dropping the pointee at {:p}]", self.ptr,
            );
            ptr::drop_in_place::<T>(self.get_mut());
        } else {
            eprintln!("[No need to drop the pointee at {:p}]", self.ptr);
        }
        Allocator::free(self.ptr);
    }
}}

This no longer compiles, and if we replace PrintOnDrop(&s) with &s it does still compile: yay!

error[E0597]: `s` does not live long enough
   --> src/main.rs:105:37
    |
105 |     *p.get_mut() = Some(PrintOnDrop(&s));
    |                                     ^^ borrowed value does not live long enough
106 |     dbg!(p.get());
107 | } // s is dropped before p, and PrintOnDrop<&String> has drop glue => UAF
    | -
    | |
    | `s` dropped here while still borrowed
    | borrow might be used here, when `p` is dropped and runs the `Drop` code for type `Box`
    |
    = note: values in a scope are dropped in the opposite order they are defined

Conclusion

Using PhantomData<T> for its dropck properties for some struct Foo<T> is only required when there is:

  • unsafe impl<#[may_dangle] T> Drop for Foo<T> (with the contract that T should then not be accessed except for a "potential" recursive T::drop())

  • Foo<T> does not own T struct-wise (for instance, when it only has a non-Unique pointer to T like *const T) and yet it drops T

    • Aside: when, by construction, the pointer is guaranteed to be NonNull, unaliased, and dropping the pointee, then (NonNull<T>, PhantomData<T>) is the right way to express it: that is, Unique<T>.
12 Likes

Wow ... I could not hope to get a better explanation that this! thank you so much :grinning:

The conclusion makes perfect sense and I can see how PhantomData and dropck_eyepatch work in tandem elegantly in std, making dropck less conservative while providing no-UAF guarantees in these very important cases.

cc. @Gankra, I hope this can be added to the nomicon in Ownership capture right after PhantomData Section.

4 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.