Is PhantomData needed without generics?

When we have a collection with generics, PhantomData is used to inform the compiler that the struct owns a T:

use std::marker::PhantomData;
use std::ptr::NonNull;

struct MyVec<T> {
    ptr: NonNull<T>,
    len: usize,
    _marker: PhantomData<[T]>,
}

impl<T> Drop for MyVec<T> {
    fn drop(&mut self) { ... }
}

If we had a non-generic collection where T is a very simple type (without references or a destructor), such as:

struct MyString {
    ptr: NonNull<u8>,
    len: usize,
    _marker: PhantomData<[u8]>,
}

impl Drop for MyString {
    fn drop(&mut self) { ... }
}

Would that PhantomData<[u8]> be needed (to specify that we own u8 values)? u8 is a simple type that contains no references and has no drop function to be called when MyString is dropped.

I am asking this per the discussion regarding PhantomData at https://github.com/psychon/x11rb/pull/205

No.

The PhantomData<T> type is purely there to make rustc treat your type as containing a T without actually storing anything at runtime, for the purposes of type parameters, variance, auto traits, and drop check. It has no effect on layout: PhantomData<T> always has size 0 and alignment 1.
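To make the "without actually storing anything" part concrete, here is a minimal sketch. The names Plain, Marked and sizes are made up for illustration; the only thing it relies on is the layout guarantee documented for std::marker::PhantomData.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Hypothetical struct with no marker field.
pub struct Plain {
    pub len: usize,
}

/// The same struct plus a marker; the marker is a zero-sized field.
pub struct Marked {
    pub len: usize,
    pub _marker: PhantomData<[u8]>,
}

/// Returns (size of Plain, size of Marked) so the two can be compared.
pub fn sizes() -> (usize, usize) {
    (size_of::<Plain>(), size_of::<Marked>())
}
```

Both sizes come out the same, so whether to keep the marker is purely a type-system question, not a runtime one.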

If you haven't already seen it, you may find this nomicon chapter useful.

Making T = u8, you could say that the PhantomData<u8> (or <[u8]>) type is purely there to make rustc treat your type as containing a u8 without actually storing anything at runtime.

That's why I am asking about the situation of a non-generic type. What about a more complex type? Let's say that instead of u8 I am using a MyType<'a, 'b>, with its own drop. Would a PhantomData<MyType<'a, 'b>> be needed?

After reading https://doc.rust-lang.org/nomicon/dropck.html, it seems that for simple types (such as u8), it is not needed, but for complex types (or a generic T), it is needed to avoid self-referential structs.

Lifetimes are generics in your case, too. You have to use the lifetimes in your struct, anyway, so you may not even be able to get around PhantomData when working with raw pointers and generic lifetimes.
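A minimal sketch of that situation, assuming a hypothetical ByteView type that keeps a raw pointer into a borrowed slice. Raw pointers carry no lifetime, so without the marker field rustc rejects the struct with E0392 (parameter 'a is never used).

```rust
use std::marker::PhantomData;

/// Hypothetical read-only view over borrowed bytes, stored as a raw pointer.
pub struct ByteView<'a> {
    ptr: *const u8,
    len: usize,
    // Ties the struct to 'a; removing this field makes the struct fail to
    // compile because 'a would otherwise be unused.
    _borrow: PhantomData<&'a [u8]>,
}

impl<'a> ByteView<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        ByteView {
            ptr: data.as_ptr(),
            len: data.len(),
            _borrow: PhantomData,
        }
    }

    /// Bounds-checked read of the byte at index `i`.
    pub fn get(&self, i: usize) -> Option<u8> {
        if i < self.len {
            // SAFETY: i < len, and the marker keeps the source slice
            // borrowed for as long as this view exists.
            Some(unsafe { *self.ptr.add(i) })
        } else {
            None
        }
    }
}
```

Here the struct is "generic" only over a lifetime, yet the marker is mandatory.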

I don't think it matters in this case since u8 has no destructor, but I would probably have it anyway just for good measure.

You use PhantomData when you want your type to behave like another type even though in reality it is implemented differently. I use them for lifetimes the most: say I'm interacting with a C library and the documentation states that some variable is only valid until another variable is freed. Using PhantomData, I can fake a borrow from the second variable in the first, and then the borrow checker will enforce the constraint I mentioned earlier.
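Roughly what that looks like, as a sketch only. Context and Cursor are invented stand-ins for two handles from a C library, and a Box allocation plays the role of the C-side memory; no real FFI is involved.

```rust
use std::marker::PhantomData;

/// Stand-in for a C handle that owns some memory.
pub struct Context {
    raw: *mut i32,
}

impl Context {
    pub fn new(value: i32) -> Self {
        Context { raw: Box::into_raw(Box::new(value)) }
    }

    /// Hands out a second handle that is only valid while `self` is alive.
    pub fn cursor(&self) -> Cursor<'_> {
        Cursor { raw: self.raw, _ctx: PhantomData }
    }
}

impl Drop for Context {
    fn drop(&mut self) {
        // SAFETY: `raw` came from Box::into_raw in `new` and is freed once.
        unsafe { drop(Box::from_raw(self.raw)) }
    }
}

/// Stand-in for a C handle that must not outlive its Context.
pub struct Cursor<'ctx> {
    raw: *const i32,
    // The "fake borrow": no reference is stored, but the borrow checker
    // treats Cursor as if it held a `&'ctx Context`.
    _ctx: PhantomData<&'ctx Context>,
}

impl Cursor<'_> {
    pub fn read(&self) -> i32 {
        // SAFETY: the Context is still alive because of the fake borrow.
        unsafe { *self.raw }
    }
}
```

With this in place, dropping the Context while a Cursor is still in use is a compile error instead of a use-after-free.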

Taking the "this compiles and runs fine" example of PhantomData and dropck confusion, and reducing its "type genericity" to a lifetime genericity, we still get a use-after-free (UAF) when not using PhantomData:


In summary, there are three usages of PhantomData:

  • removing auto-traits, such as Send, Sync, Unpin (or Freeze). This does not require generics, so this is a clear case where usage of PhantomData would be "needed" (until, if ever, negative impls get stabilized).

  • removing accidental co[ntra]variance (in some way, one could perceive variance as a form of auto-traits, but for generic types / type constructors). But contrary to auto-traits, variance makes no sense outside of a generic context / type constructors. Thus, without generics there is no variance whatsoever, and so no need for that usage of PhantomData.

  • "expressing ownership" (cf. the linked post). The only case where that is useful, AFAIK, is when using the unsafe #[may_dangle] unstable feature (definitely the Rust feature with the scariest name, safety-wise). And given that such a (terrifying) attribute can only be applied to a generic parameter, the answer seems to go in your direction.
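For the first bullet, here is a small sketch of a non-generic type losing Send through a marker. ThreadBound, Plain, Fallback and Probe are all hypothetical names. The Probe part is just a compile-time trick so the difference can be observed as a bool: an inherent associated const is preferred over the trait's default, but only when its Send bound holds.

```rust
use std::marker::PhantomData;

/// Non-generic handle; the raw-pointer marker removes Send and Sync.
pub struct ThreadBound {
    pub id: u32,
    pub _not_send_sync: PhantomData<*const ()>,
}

/// Same data without the marker: Send and Sync are derived automatically.
pub struct Plain {
    pub id: u32,
}

/// Fallback answer, available for every type.
pub trait Fallback {
    const IS_SEND: bool = false;
}
impl<T: ?Sized> Fallback for T {}

/// Wrapper whose inherent const only exists when T: Send.
pub struct Probe<T: ?Sized>(PhantomData<T>);
impl<T: ?Sized + Send> Probe<T> {
    pub const IS_SEND: bool = true;
}

pub fn plain_is_send() -> bool {
    <Probe<Plain>>::IS_SEND
}

pub fn thread_bound_is_send() -> bool {
    <Probe<ThreadBound>>::IS_SEND
}
```

No generic parameter appears on ThreadBound, and the marker still changes what the type is allowed to do.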

That being said, some people may consider that a type like &str is not generic (we know what it is: a reference to a str!), or at least not truly generic (contrary to something like <T> / Option<T>, etc.). But such genericity suffices to bring variance questions, as well as "ownership expressiveness" questions, as I showed at the beginning.

If T is simple enough, then no, you don't need PhantomData.
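As a sketch of the u8 case from the original question, here is a MyString with no marker field at all. The constructor, as_str and the Drop body are my own filler (the original elides them), and going through Box<[u8]> is just one convenient way to get an owned allocation.

```rust
use std::ptr::NonNull;

/// Non-generic owned string buffer without any PhantomData field.
pub struct MyString {
    ptr: NonNull<u8>,
    len: usize,
}

impl MyString {
    pub fn new(s: &str) -> Self {
        // Copy the bytes into a heap allocation and take ownership of it.
        let boxed: Box<[u8]> = s.as_bytes().into();
        let len = boxed.len();
        let raw: *mut [u8] = Box::into_raw(boxed);
        // SAFETY: Box::into_raw never returns a null pointer.
        let ptr = unsafe { NonNull::new_unchecked(raw as *mut u8) };
        MyString { ptr, len }
    }

    pub fn as_str(&self) -> &str {
        // SAFETY: ptr/len describe the allocation made in `new`, which
        // holds a copy of valid UTF-8.
        unsafe {
            let bytes = std::slice::from_raw_parts(self.ptr.as_ptr(), self.len);
            std::str::from_utf8_unchecked(bytes)
        }
    }
}

impl Drop for MyString {
    fn drop(&mut self) {
        // SAFETY: rebuilds the Box created in `new`; it is freed exactly once.
        unsafe {
            let slice = std::ptr::slice_from_raw_parts_mut(self.ptr.as_ptr(), self.len);
            drop(Box::from_raw(slice));
        }
    }
}
```

This compiles and frees its buffer without the marker, since u8 has no destructor and no lifetime for drop check to care about.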

I think, though, that for more complex types you might still need it. Maybe something like

struct PointlessStringWrapper {
    parts: [usize; 3],
    phantom: PhantomData<String>,
}
