Is `PhantomData` useless to dropck now?


#1

RFC 0769 has a section about PhantomData. It says that any struct like Vec<T> should carry a marker (PhantomData) to express that values of that type own instances of T, in order to satisfy the drop check rule.

So if a struct is like Vec<T> but has no PhantomData field in it, the compiler should not apply the drop check rule to that type. To test this conclusion, I wrote some code:

use std::ptr;

struct MyVec<T> {
    ptr: *const T,
}

impl<T> MyVec<T> {
    fn new(t: T) -> Self {
        MyVec {ptr: ptr::null()}
    }
}

impl<T> Drop for MyVec<T> {
    fn drop(&mut self) {}
}

fn main() {
    let (a,b);
    a = 5;
    b = MyVec::new(&a);
}

the compiler complains:

error: `a` does not live long enough
  --> src/main.rs:22:1
   |
21 |     b = MyVec::new(&a);
   |                     - borrow occurs here
22 | }
   | ^ `a` dropped here while still borrowed
   |
   = note: values in a scope are dropped in the opposite order they are created

There is only a raw pointer in MyVec<T> and no PhantomData, but it still triggers the compiler's drop check rule.

So my question is: is PhantomData useless to dropck now?
If I am wrong, could someone give me some examples of the effect of PhantomData on dropck?


#2

You’re triggering the borrow checker here, not dropck. Your MyVec doesn’t own anything in the example because you’re passing it a reference to something. The compiler is just telling you that the reference you’re giving to MyVec may get invalidated when ‘a’ gets dropped, but that’s really a borrow violation.

Edit: this may be helpful to you, https://doc.rust-lang.org/nomicon/phantom-data.html, and the previous section there on dropck.


#3

You can try like this:

use std::ptr;

struct MyVec {
ptr: *const T,
}

impl MyVec {
fn new(t: T) -> Self {
MyVec {ptr: ptr::null()}
}
}

impl Drop for MyVec {
fn drop(&mut self) {}
}

fn main() {
let a;
let b;
a = 5;
b = MyVec::new(&a);
}


#4

Thanks for your reply.
I ran your code, but it failed to compile:

error[E0412]: cannot find type `T` in this scope
 --> src\main.rs:4:13
  |
4 | ptr: *const T,
  |             ^ not found in this scope

error[E0412]: cannot find type `T` in this scope
 --> src\main.rs:8:11
  |
8 | fn new(t: T) -> Self {
  |           ^ not found in this scope


#6

use std::ptr;

struct MyVec<T> {
    ptr: *const T,
}

impl<T> MyVec<T> {
    fn new(t: T) -> Self {
        MyVec {ptr: ptr::null()}
    }
}

impl<T> Drop for MyVec<T> {
    fn drop(&mut self) {}
}

fn main() {
    let a;
    let b;
    a = 5;
    b = MyVec::new(&a);
}

#7

In your example, ‘a’ strictly outlives ‘b’, so it compiles.

What I want to know is: what bad thing will happen to a struct like std::vec::Vec without a PhantomData field?
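
The "strictly outlives" point can be checked with a small sketch. With two separate `let` statements, the later binding is dropped first, so anything it borrows from the earlier one is still alive in its destructor. The names `Named` and `drop_order` are made up for this illustration and are not from the thread.

```rust
use std::cell::RefCell;

// A value that records its name in a shared log when it is dropped.
// `Named` and `drop_order` are hypothetical names for this sketch.
struct Named<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Named<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which `_a` and `_b` were dropped.
fn drop_order() -> Vec<&'static str> {
    // The log is declared first, so it strictly outlives both values.
    let log = RefCell::new(Vec::new());
    {
        let _a = Named { name: "a", log: &log };
        let _b = Named { name: "b", log: &log };
        // End of scope: `_b` is dropped first, then `_a`.
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```

Here `_a` is declared before `_b`, so `_a` is still alive while `_b`'s destructor runs, which is the situation in the `let a; let b;` version of the code.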


#8

Thanks for your reply.

The nomicon says:

The drop checker will generously determine that Vec does not own any values of type T. This will in turn make it conclude that it doesn’t need to worry about Vec dropping any T’s in its destructor for determining drop check soundness. This will in turn allow people to create unsoundness using Vec’s destructor.

I cannot write unsound code with

struct Vec<T> {
    data: *const T, // *const for variance!
    len: usize,
    cap: usize,
}

Can you show me some unsound code that uses it?


#9

:sob: I’m sorry, I do not know. I have a little idea, but I do not know how to describe it.


#10

I’m on my phone, but did you read this section there: https://doc.rust-lang.org/nomicon/dropck.html

The gist, AFAICT, is given a Vec of some generic type T, we must ensure that any reference(s) contained within a T instance strictly outlive the Vec itself. Since the Vec is dropping its Ts (it’s the owner of them), how do we ensure that when a given T is dropped that its references are still valid? After all, T::drop() may attempt to use those references and if they’ve already been dropped, you’ll get unsoundness. This is one of the things dropck verifies.

Now, given Vec’s definition as containing just an arbitrary pointer, dropck thinks that a Vec doesn’t own its Ts. If that’s the case, it won’t ensure that T’s references outlive the Vec itself, and will allow dropping to proceed “in the wrong order”. That’s where PhantomData comes in: it tells dropck that Vec actually owns the T, and then normal ownership/borrow rules come into play.

I’m basically paraphrasing that nomicon section above. There are also other related reasons to use PhantomData, but the above is the one for generic types.
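
For reference, here is a sketch adapted from what I recall of the example in that dropck chapter: a struct where one field borrows from a sibling field. The function name `inspect_sibling` is made up here. As written, `Inspector` has no destructor, so nothing can observe the reference while the struct is being torn down, and this should compile on stable.

```rust
// Holds a reference, but has no Drop impl, so it cannot
// look at the reference while it is being destroyed.
struct Inspector<'a>(&'a u8);

struct World<'a> {
    inspector: Option<Inspector<'a>>,
    days: Box<u8>,
}

// If a destructor like this were added, `Inspector` could read `days`
// during drop. As I understand the nomicon, dropck then rejects the
// function below, because `days` does not strictly outlive `inspector`.
//
// impl<'a> Drop for Inspector<'a> {
//     fn drop(&mut self) {
//         println!("I was only {} days from retirement!", self.0);
//     }
// }

// Hypothetical helper: builds the self-borrowing value and reads through it.
fn inspect_sibling() -> u8 {
    let mut world = World {
        inspector: None,
        days: Box::new(1),
    };
    world.inspector = Some(Inspector(&world.days));
    world.inspector.as_ref().map(|i| *i.0).unwrap_or(0)
}

fn main() {
    println!("inspector sees {}", inspect_sibling());
}
```

The commented-out destructor is the part dropck cares about: it is only types with destructor code that could touch borrowed data during drop that need the "strictly outlives" guarantee.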


#11

I have read that section many times and I agree with what you said, but I cannot prove that PhantomData is really necessary. Please show me some code when you have time. Thanks.


#12

I’ve been trying to come up with a distilled example, but not having much luck yet.


#13

I think you are right (PhantomData doesn’t matter, unless you need it to carry the type parameter in some way) and that https://github.com/rust-lang/rfcs/blob/master/text/1327-dropck-param-eyepatch.md is the new approach.


#14

Here is code demonstrating that you still need PhantomData, at least if you are also using #[may_dangle]:

// Illustration of a case where PhantomData is providing necessary ownership
// info to rustc.
//
// MyBox2<T> uses just a `*const T` to hold the `T` it owns.
// MyBox3<T> has both a `*const T` AND a PhantomData<T>; the latter communicates
// its ownership relationship with `T`.
//
// Skim down to `fn f2()` to see the relevant case, 
// and compare it to `fn f3()`. When you run the program,
// the output will include:
//
// drop PrintOnDrop(mb2b, PrintOnDrop("v2b", 13, INVALID), Valid)
//
// (However, in the absence of #[may_dangle], the compiler will constrain
// things in a manner that may indeed imply that PhantomData is unnecessary;
// pnkfelix is not 100% sure of this claim yet, though.)

#![feature(alloc, dropck_eyepatch, generic_param_attrs, heap_api)]

extern crate alloc;

use alloc::heap;
use std::fmt;
use std::marker::PhantomData;
use std::mem;
use std::ptr;

#[derive(Copy, Clone, Debug)]
enum State { INVALID, Valid }

#[derive(Debug)]
struct PrintOnDrop<T: fmt::Debug>(&'static str, T, State);

impl<T: fmt::Debug> PrintOnDrop<T> {
    fn new(name: &'static str, t: T) -> Self {
        PrintOnDrop(name, t, State::Valid)
    }
}

impl<T: fmt::Debug> Drop for PrintOnDrop<T> {
    fn drop(&mut self) {
        println!("drop PrintOnDrop({}, {:?}, {:?})",
                 self.0,
                 self.1,
                 self.2);
        self.2 = State::INVALID;
    }
}

struct MyBox1<T> {
    v: Box<T>,
}

impl<T> MyBox1<T> {
    fn new(t: T) -> Self {
        MyBox1 { v: Box::new(t) }
    }
}

struct MyBox2<T> {
    v: *const T,
}

impl<T> MyBox2<T> {
    fn new(t: T) -> Self {
        unsafe {
            let p = heap::allocate(mem::size_of::<T>(), mem::align_of::<T>());
            let p = p as *mut T;
            ptr::write(p, t);
            MyBox2 { v: p }
        }
    }
}

unsafe impl<#[may_dangle] T> Drop for MyBox2<T> {
    fn drop(&mut self) {
        unsafe {
            // We want this to be *legal*. This destructor is not 
            // allowed to call methods on `T` (since it may be in
            // an invalid state), but it should be allowed to drop
            // instances of `T` as it deconstructs itself.
            //
            // (Note however that the compiler has no knowledge
            //  that `MyBox2<T>` owns an instance of `T`.)
            ptr::read(self.v);
            heap::deallocate(self.v as *mut u8,
                             mem::size_of::<T>(),
                             mem::align_of::<T>());
        }
    }
}

struct MyBox3<T> {
    v: *const T,
    _pd: PhantomData<T>,
}

impl<T> MyBox3<T> {
    fn new(t: T) -> Self {
        unsafe {
            let p = heap::allocate(mem::size_of::<T>(), mem::align_of::<T>());
            let p = p as *mut T;
            ptr::write(p, t);
            MyBox3 { v: p, _pd: Default::default() }
        }
    }
}

unsafe impl<#[may_dangle] T> Drop for MyBox3<T> {
    fn drop(&mut self) {
        unsafe {
            ptr::read(self.v);
            heap::deallocate(self.v as *mut u8,
                             mem::size_of::<T>(),
                             mem::align_of::<T>());
        }
    }
}

fn f1() {
    // `let (v, _mb1);` and `let (_mb1, v)` won't compile due to dropck
    let v1; let _mb1;
    v1 = PrintOnDrop::new("v1", 13);
    _mb1 = MyBox1::new(PrintOnDrop::new("mb1", &v1));
}

fn f2() {
    {
        let (v2a, _mb2a); // Sound, but not distinguished from below by rustc!
        v2a = PrintOnDrop::new("v2a", 13);
        _mb2a = MyBox2::new(PrintOnDrop::new("mb2a", &v2a));
    }

    {
        let (_mb2b, v2b); // Unsound!
        v2b = PrintOnDrop::new("v2b", 13);
        _mb2b = MyBox2::new(PrintOnDrop::new("mb2b", &v2b));
        // namely, v2b dropped before _mb2b, but latter contains
        // value that attempts to access v2b when being dropped.
    }
}

fn f3() {
    let v3; let _mb3; // `let (v, mb3);` won't compile due to dropck
    v3 = PrintOnDrop::new("v3", 13);
    _mb3 = MyBox3::new(PrintOnDrop::new("mb3", &v3));
}

fn main() {
    f1(); f2(); f3();
}

(This is taken from a StackOverflow answer I wrote; that answer includes a full discussion of why PhantomData continues to be necessary on things like Vec for dropck, in addition to any purpose it might serve for variance stuff.)
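
The program above targets a nightly compiler of that time; as far as I know, `alloc::heap` and those feature gates are not usable on current stable. Below is a rough stable-Rust sketch of the `MyBox3` shape, assuming `std::alloc` for the allocation. It leaves out `#[may_dangle]`, so the compiler treats the destructor conservatively and the unsound ordering from `f2` cannot be reproduced with it. The names `MyBox`, `CountOnDrop` and `drops_after_scope` are made up for this sketch.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::cell::Cell;
use std::marker::PhantomData;
use std::ptr;

// Same shape as MyBox3: a raw pointer plus a marker saying "I own a T".
struct MyBox<T> {
    v: *const T,
    _pd: PhantomData<T>,
}

impl<T> MyBox<T> {
    fn new(t: T) -> Self {
        let layout = Layout::new::<T>();
        // Assumption of this sketch: zero-sized types are not supported,
        // because allocating a zero-sized layout is not allowed.
        assert!(layout.size() != 0, "sketch does not handle zero-sized types");
        unsafe {
            let p = alloc(layout) as *mut T;
            if p.is_null() {
                handle_alloc_error(layout);
            }
            ptr::write(p, t);
            MyBox { v: p, _pd: PhantomData }
        }
    }
}

// No #[may_dangle] here, so T must be fully valid when this runs.
impl<T> Drop for MyBox<T> {
    fn drop(&mut self) {
        unsafe {
            // Run T's destructor in place, then free the memory.
            ptr::drop_in_place(self.v as *mut T);
            dealloc(self.v as *mut u8, Layout::new::<T>());
        }
    }
}

// Payload that bumps a counter when it is dropped.
struct CountOnDrop<'a>(&'a Cell<u32>);

impl<'a> Drop for CountOnDrop<'a> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Returns how many times the payload's destructor ran.
fn drops_after_scope() -> u32 {
    let counter = Cell::new(0);
    {
        // `counter` is declared first, so it outlives the box.
        let _mb = MyBox::new(CountOnDrop(&counter));
    }
    counter.get()
}

fn main() {
    println!("payload drops: {}", drops_after_scope());
}
```

The payload's destructor runs exactly once, from inside `MyBox`'s own `drop`, which is the ownership relationship the `PhantomData<T>` field is meant to document.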


#15

Great answer, thank you very much.