Wrapped value and PhantomData

I'm working on a custom map based on hashbrown's RawTable. It has a quirk: from the application's perspective it is generic over K and V, just like the standard HashMap, but internally the RawTable stores a pointer to a struct that wraps the value and is explicitly allocated.

Very simplified:

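use std::hash::Hash;
use std::mem::ManuallyDrop;
use hashbrown::raw::RawTable; // assumes a hashbrown version that still exposes the raw API

// Wrapper that is allocated explicitly; the value is dropped by hand.
struct Node<V> {
  value: ManuallyDrop<V>,
}

struct InnerMyMap<K, V>
where
  K: Eq + Hash,
{
  // The table stores raw pointers to the nodes, not the nodes themselves.
  table: RawTable<(K, *mut Node<V>)>,
}

pub struct MyMap<K, V>
where
  K: Eq + Hash,
{
  inner: InnerMyMap<K, V>,
}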

InnerMyMap's Drop implementation calls ManuallyDrop::drop() on each value and then explicitly releases the corresponding Node allocation.

The nomicon states that this isn't sufficient:

The drop checker will generously determine that Vec<T> does not own any values of type T. This will in turn make it conclude that it doesn't need to worry about Vec dropping any T's in its destructor for determining drop check soundness. This will in turn allow people to create unsoundness using Vec's destructor.

Is there some simple practical example to illustrate what type of soundness problem one is trying to avoid by using PhantomData?

Do I understand correctly that the most basic issue is that the compiler doesn't know what the raw pointer represents, i.e. whether it stands for ownership or for a borrow (with a lifetime, etc.)?

So in my example, should the InnerMyMap have:

struct InnerMyMap<K, V>
where
  K: Eq + Hash
{
  table: RawTable<(K, *mut Node<V>)>,
  _marker: PhantomData<Node<V>>
}

... because the pointers stored in the RawTable refer to values that the map owns?
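To make the marker variant concrete, here is a minimal sketch of that layout together with the Drop logic described in the question. It is not the real map: a plain Vec stands in for hashbrown's RawTable so the code has no external dependencies, and the `insert`, `Counted` and `dropped_count` names are hypothetical helpers added only to show that each value is dropped exactly once.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::mem::ManuallyDrop;
use std::rc::Rc;

struct Node<V> {
    value: ManuallyDrop<V>,
}

// Stand-in for the question's InnerMyMap. Assumption: a plain Vec replaces
// hashbrown's RawTable so the sketch needs no external crate.
struct InnerMyMap<K, V> {
    table: Vec<(K, *mut Node<V>)>,
    // Marker stating that the map logically owns Node<V> values.
    _marker: PhantomData<Node<V>>,
}

impl<K, V> InnerMyMap<K, V> {
    fn new() -> Self {
        InnerMyMap { table: Vec::new(), _marker: PhantomData }
    }

    // Hypothetical insert: allocates the node and stores only the raw pointer.
    fn insert(&mut self, key: K, value: V) {
        let node = Box::new(Node { value: ManuallyDrop::new(value) });
        self.table.push((key, Box::into_raw(node)));
    }
}

impl<K, V> Drop for InnerMyMap<K, V> {
    fn drop(&mut self) {
        for (_key, ptr) in self.table.drain(..) {
            // SAFETY: every pointer came from Box::into_raw in `insert`
            // and is visited exactly once here.
            unsafe {
                // Drop the wrapped value first, then free the node itself.
                ManuallyDrop::drop(&mut (*ptr).value);
                drop(Box::from_raw(ptr));
            }
        }
    }
}

// Hypothetical helper type that counts how often it is dropped.
struct Counted(Rc<Cell<u32>>);

impl Drop for Counted {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

fn dropped_count() -> u32 {
    let counter = Rc::new(Cell::new(0));
    {
        let mut map = InnerMyMap::new();
        for key in 0..3 {
            map.insert(key, Counted(Rc::clone(&counter)));
        }
        // `map` is dropped here and releases all three nodes.
    }
    counter.get()
}

fn main() {
    println!("values dropped: {}", dropped_count());
}
```

Because Node keeps the value in a ManuallyDrop, freeing the Box afterwards does not drop the value a second time.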

In the case of Vec<T>, the drop check comes up when you are dealing with a type annotated with a lifetime, e.g. Vec<T<'a>>. The question is then whether you require the destructor of the vector to run before the end of 'a or not.
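A small sketch of what that looks like for Vec, assuming a reasonably recent stable compiler. The function name is made up for illustration. The vector is declared before the string it borrows from, so the string is dropped first and the vector's destructor runs after the borrow has ended. The standard library's Vec is allowed to do this because its destructor does not look at the references it holds.

```rust
// Hypothetical helper: the Vec outlives the value its elements borrow from.
fn vec_outlives_borrowed_value() -> usize {
    let mut refs = Vec::new();
    let text = String::from("hello");
    refs.push(&text);
    // Locals are dropped in reverse declaration order: `text` goes first,
    // then `refs`, which at that point holds a dangling reference that its
    // destructor never reads.
    refs.len()
}

fn main() {
    println!("elements: {}", vec_outlives_borrowed_value());
}
```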


Here's an example, but see the rest of this reply.

Due to the RFC I linked above, there was a time when the compiler had an algorithm for determining whether Struct<Param> "owned" a Param or not. The nomicon's tale about using PhantomData here is that "owns" is a property that PhantomData<Param> would confer, for cases where the algorithm produced the wrong answer.

The idea was to allow Struct<HasLifetime<'x>> to drop after 'x if it wasn't going to observe anything invalid outside of 'x.

However, another RFC eventually came around that made it so Struct<Param> owns Param by default (though this can still be side-stepped in order to make structs unsafely more flexible [1]). So today, you don't actually need the PhantomData to get the "owns" property (unless you use the unstable #[may_dangle] attribute).
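To illustrate the current default, here is a minimal sketch with a hypothetical `Owner<T>` type that holds its value only through a raw pointer and has no PhantomData field. Its destructor reads the value, which is exactly the situation the drop check has to guard against. The `LOG` thread-local and the `demo` function are illustration-only helpers.

```rust
use std::cell::RefCell;
use std::fmt::Debug;

thread_local! {
    // Illustration-only log so the destructor's observation can be inspected.
    static LOG: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

// Hypothetical owner that stores T behind a raw pointer, without PhantomData.
struct Owner<T: Debug> {
    ptr: *mut T,
}

impl<T: Debug> Owner<T> {
    fn new(value: T) -> Self {
        Owner { ptr: Box::into_raw(Box::new(value)) }
    }
}

impl<T: Debug> Drop for Owner<T> {
    fn drop(&mut self) {
        // SAFETY: `ptr` came from Box::into_raw in `new` and is freed once.
        let boxed = unsafe { Box::from_raw(self.ptr) };
        // The destructor looks at T, so T must still be valid here.
        LOG.with(|log| log.borrow_mut().push(format!("{:?}", boxed)));
    }
}

fn demo() -> Vec<String> {
    {
        let text = String::from("hello");
        let _owner = Owner::new(&text);
        // `_owner` is dropped before `text`, so the borrow is still valid.
    }
    // Declaring the owner first is rejected even without PhantomData:
    //
    //     let owner;
    //     let text = String::from("hello");
    //     owner = Owner::new(&text); // error: `text` does not live long enough
    //
    LOG.with(|log| std::mem::take(&mut *log.borrow_mut()))
}

fn main() {
    println!("{:?}", demo());
}
```

If the commented-out ordering were accepted, the destructor would format a reference to an already freed String. That is the kind of unsoundness the drop check rules out, and because `Owner` implements Drop it does so here without any marker field.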

The RFCs were careful to leave the door open to changing the behavior back [2], but it seems pretty unlikely now. That said, it doesn't hurt to add the PhantomData, either (other than having to type more).

Here's a previous discussion on the topic, with some additional links.


  1. Basically by promising not to look at your Params when you drop; Vec uses this, for example.

  2. Thus, presumably, the stale nomicon documentation.


This is incredibly stale documentation that has been misleading people for several months: as @quinedot pointed out, it is only relevant when you are also toying with perma-unstable unsafe impl<#[may_dangle] T> Drop shenanigans.

So not only can you dismiss it, but it would also be nice if it were removed altogether.
