PhantomData<T> vs PhantomData<fn(T) -> T>, what about Send and Sync?

jbe · April 3, 2022, 12:44pm

Hi all, I wondered about the following thing regarding PhantomData.

If I include PhantomData<T> in a struct, I understand that if T is !Send or !Sync, my struct will also be !Send or !Sync, respectively.

What if I instead use PhantomData<fn(T) -> T>? Then I get invariance. But how does it behave with regard to Send and Sync? I would say a function that takes a non-Send or non-Sync value isn't itself non-Send or non-Sync.

There is a table in the Rust reference ~~explaining (co/contra)variance in regard to different uses of PhantomData~~, but it doesn't talk about Send and Sync.

Edit: the section in the reference actually isn't about PhantomData.

Neither does the table in the Nomicon, and additionally, I believe the remark about drop check is outdated (unless you use eyepatch).

alice · April 3, 2022, 12:47pm

Your value will still be both Send and Sync.

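use std::marker::PhantomData;

// Rc<u8> is neither Send nor Sync.
type T = std::rc::Rc<u8>;

// These compile only if the type argument is Send / Sync.
// (Their generic parameter T shadows the alias above inside each signature.)
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

fn main() {
    assert_send::<PhantomData<fn(T) -> T>>();
    assert_sync::<PhantomData<fn(T) -> T>>();
}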

   Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 1.42s

jbe · April 3, 2022, 12:50pm

Maybe a note should be added to these tables? I (wrongly) used these different "styles" of PhantomData to change variance only.

P.S.: I just noticed that so far I have only used them on lifetimes, so it won't be a problem for my previous uses.

Cerber-Ursi · April 3, 2022, 12:55pm

Well, that's not about the "styles". PhantomData<T> works "as if" the struct contains T. So, PhantomData<fn(T) -> T> works "as if" the struct contains a function pointer. Function pointers are always Send and Sync, no matter the types involved; so the corresponding PhantomData is Send and Sync, too.
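The "as if" rule can be checked on whole structs. The following is a minimal sketch, not anything from the thread: `Probe`, `Fallback`, and the three example structs are hypothetical names. `Probe::<X>::IS_SEND` picks the inherent constant when `X: Send` holds and otherwise falls back to the trait default, which makes the negative case observable without a compile error.

```rust
use std::marker::PhantomData;
use std::rc::Rc;

// Fallback constants, used when the inherent ones on `Probe` do not apply.
trait Fallback {
    const IS_SEND: bool = false;
    const IS_SYNC: bool = false;
}
impl<T: ?Sized> Fallback for T {}

// Hypothetical helper: the inherent constants exist only if the bound holds.
#[allow(dead_code)]
struct Probe<T: ?Sized>(PhantomData<T>);
impl<T: ?Sized + Send> Probe<T> {
    const IS_SEND: bool = true;
}
impl<T: ?Sized + Sync> Probe<T> {
    const IS_SYNC: bool = true;
}

// Behaves as if it stored a `T`.
#[allow(dead_code)]
struct ActsLikeOwner<T> {
    id: u64,
    marker: PhantomData<T>,
}

// Really stores a function pointer.
#[allow(dead_code)]
struct HoldsFnPtr<T> {
    id: u64,
    callback: fn(T) -> T,
}

// Behaves as if it stored a function pointer.
#[allow(dead_code)]
struct ActsLikeFnPtr<T> {
    id: u64,
    marker: PhantomData<fn(T) -> T>,
}

fn main() {
    // Rc<u8> is neither Send nor Sync.
    type NotThreadSafe = Rc<u8>;

    assert!(!Probe::<ActsLikeOwner<NotThreadSafe>>::IS_SEND);
    assert!(!Probe::<ActsLikeOwner<NotThreadSafe>>::IS_SYNC);

    assert!(Probe::<HoldsFnPtr<NotThreadSafe>>::IS_SEND);
    assert!(Probe::<HoldsFnPtr<NotThreadSafe>>::IS_SYNC);

    assert!(Probe::<ActsLikeFnPtr<NotThreadSafe>>::IS_SEND);
    assert!(Probe::<ActsLikeFnPtr<NotThreadSafe>>::IS_SYNC);
}
```

The phantom function pointer gives the same answers as the real one, and both differ from the struct that pretends to own a `T`.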

jbe · April 3, 2022, 1:00pm

Yeah it makes sense. I still think a note that this affects Send and Sync might be helpful, because it's suggested in the context of using them for variance:

Table of PhantomData patterns

Here’s a table of all the wonderful ways PhantomData could be used:

| Phantom type | 'a | T |
|---|---|---|
| PhantomData<T> | - | covariant (with drop check) |
| PhantomData<&'a T> | covariant | covariant |
| PhantomData<&'a mut T> | covariant | invariant |
| PhantomData<*const T> | - | covariant |
| PhantomData<*mut T> | - | invariant |
| PhantomData<fn(T)> | - | contravariant |
| PhantomData<fn() -> T> | - | covariant |
| PhantomData<fn(T) -> T> | - | invariant |
| PhantomData<Cell<&'a ()>> | invariant | - |
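To make the variance column concrete, here is a small sketch that puts a lifetime in the `T` position of the three function-pointer rows. The struct and function names are hypothetical. Each function compiles only because of the variance of its marker type.

```rust
use std::marker::PhantomData;

// Hypothetical markers: `T` from the table is `&'a ()` here, so variance
// in `T` shows up as variance in `'a`.
struct Covariant<'a>(PhantomData<fn() -> &'a ()>);
struct Contravariant<'a>(PhantomData<fn(&'a ())>);
#[allow(dead_code)]
struct Invariant<'a>(PhantomData<fn(&'a ()) -> &'a ()>);

// Covariant: a value with the longer lifetime is accepted where the
// shorter one is expected.
fn shorten<'short, 'long: 'short>(x: Covariant<'long>) -> Covariant<'short> {
    x
}

// Contravariant: the conversion goes in the opposite direction.
fn lengthen<'short, 'long: 'short>(x: Contravariant<'short>) -> Contravariant<'long> {
    x
}

// Invariant: writing either function above for `Invariant` is rejected
// by the compiler, so there is nothing to show here.

fn main() {
    let _ = shorten(Covariant(PhantomData));
    let _ = lengthen(Contravariant(PhantomData));
}
```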

alice · April 3, 2022, 1:02pm

But, it doesn't affect it with function pointers?

jbe · April 3, 2022, 1:05pm

It was confusing to me that turning PhantomData<T> to PhantomData<fn(T) -> T> affects more than variance: it will also allow my type to be Send or Sync even if T is not.

The table suggests that the difference is in variance only. Of course that was just a wrong conclusion on my part, but I would assume other people might draw the same wrong conclusion when looking at the table. Or not?

alice · April 3, 2022, 1:06pm

Perhaps others would also make the same assumption.

jbe · April 3, 2022, 1:15pm

So I decided to open an issue. But I found it was already reported previously:

github.com/rust-lang/nomicon

PhantomData patterns table should include interactions with `Send`/`Sync`

opened 07:09PM - 14 Oct 21 UTC

lilyball

According to the `PhantomData` patterns table, there seems to be no difference between using e.g. `*const T` and `fn() -> T`. Both are covariant on `T` and have no lifetime. But there actually is a difference, which is that `PhantomData<*const T>` is not `Send` or `Sync`, whereas `PhantomData<fn() -> T>` is `Send` and `Sync`. This is something that users can figure out by thinking about how `PhantomData` implements auto traits according to its `T` parameter, but this isn't explicitly called out in the nomicon (or in the `std::marker::PhantomData` docs) and so it's easy to forget.

For example, I changed some code from `PhantomData<T>` to `PhantomData<fn() -> T>` specifically to avoid the drop check (my type doesn't own a `T`, but instead it can produce `T`s). It just so happens that this makes my type `Send` and `Sync` too, which it should have been in the first place but nobody realized that until well after the change.

This could be improved by updating the Nomicon to call out the fact that `PhantomData` will implement auto traits according to its type parameter, and to give an example of this by including `Send`/`Sync` columns in the table that lists whether the given pattern is `Send`/`Sync`. This would look something like

**Phantom type** | `'a` | `T` | `Send` | `Sync`
-|-|-|-|-
`PhantomData<T>` | - | covariant (with drop check) | `T: Send` | `T: Sync`
`PhantomData<&'a T>` | covariant | covariant | `T: Sync` | `T: Sync`
`PhantomData<&'a mut T>` | covariant | invariant | `T: Send` | `T: Sync`
`PhantomData<*const T>` | - | covariant | - | -
`PhantomData<*mut T>` | - | invariant | - | -
`PhantomData<fn(T)>` | - | contravariant | `Send` | `Sync`
`PhantomData<fn() -> T>` | - | covariant | `Send` | `Sync`
...
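The proposed `Send`/`Sync` columns can be spot-checked with generic functions whose bounds match one table cell each. This is only a sketch with hypothetical function names. It shows that each listed bound is sufficient; it does not show that the raw-pointer rows are never `Send`/`Sync`, since that would require code that fails to compile.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::rc::Rc;

fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

// PhantomData<T>: Send needs `T: Send`, Sync needs `T: Sync`.
fn owned_send<T: Send>() {
    assert_send::<PhantomData<T>>();
}
fn owned_sync<T: Sync>() {
    assert_sync::<PhantomData<T>>();
}

// PhantomData<&T>: `T: Sync` alone is enough for both columns.
fn shared_ref<T: Sync + 'static>() {
    assert_send::<PhantomData<&'static T>>();
    assert_sync::<PhantomData<&'static T>>();
}

// PhantomData<&mut T>: same bounds as for an owned T.
fn mut_ref_send<T: Send + 'static>() {
    assert_send::<PhantomData<&'static mut T>>();
}
fn mut_ref_sync<T: Sync + 'static>() {
    assert_sync::<PhantomData<&'static mut T>>();
}

// Function pointers: no bound on T is needed at all.
fn fn_pointers<T>() {
    assert_send::<PhantomData<fn(T)>>();
    assert_sync::<PhantomData<fn(T)>>();
    assert_send::<PhantomData<fn() -> T>>();
    assert_sync::<PhantomData<fn() -> T>>();
}

fn main() {
    owned_send::<u8>();
    owned_sync::<u8>();
    shared_ref::<u8>();
    // Cell<u8> is Send but not Sync, so only the Send check applies to it.
    mut_ref_send::<Cell<u8>>();
    mut_ref_sync::<u8>();
    // Rc<u8> is neither Send nor Sync, yet these checks still pass.
    fn_pointers::<Rc<u8>>();
}
```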

H2CO3 · April 3, 2022, 4:40pm

So basically, the thing is: PhantomData isn't special as in "PD<fn> is treated differently compared to any other PD<T>". Rather, the compositional nature of types (what PD pretends to be, and what happens when a struct contains a fn pointer) gives rise to an idiom that you can use to achieve the specific effect you are observing.

Neither the language nor PhantomData was designed with this single behavior in mind, just like the language is not specifically designed to implement e.g. blockchains, but it happens to be used for that a lot, because its perf and safety characteristics fit the distributed-cryptography-and-database use case.

jbe · April 3, 2022, 5:09pm

Yeah, I see that PhantomData<whichever> just makes the struct behave like it contained some field of type whichever. That means there are more consequences than just variance on its type arguments (or allowing unused type arguments syntactically).

In that respect, the documentation isn't wrong. It was me drawing wrong conclusions.

Until now, I had the following strategy: If the compiler complains about an unused type argument T, then just add a PhantomData<T> to the struct. But apparently that's bad! One needs to think more carefully about it (e.g. whether it's desired that T: !Send + !Sync will "infect" the struct, making it !Send and/or !Sync too).

H2CO3 · April 3, 2022, 5:16pm

Yes, exactly! For example, when I'm writing strongly-typed database IDs (aka struct Uid<T>(u64)), I'm never blindly slapping a PD<T> onto them, because that potentially makes the poor raw-integer newtype cease to be Send + Sync, which in turn is a huge pain for downstream code. So yes, you have to carefully consider how your type uses the type parameter, and choose the appropriate idiom.

LegionMammal978 · April 3, 2022, 6:24pm

Something which that table doesn't make clear is how to decide which variance is correct for one's own pointer-based struct. AFAICT, it depends on the operations that the struct's public interface allows. Perhaps the docs could be improved in that aspect.

jbe · April 3, 2022, 6:49pm

What would you use in your case then? PhantomData<fn(T) -> T>?

I have the following case:

/// Constraints on database (type argument `C` to [`Db`])
pub trait Constraint: 'static {
    /// Duplicate keys allowed?
    const DUPLICATE_KEYS: bool;
}

/// Type argument to [`Db`] indicating unique keys
pub struct KeysUnique;

impl Constraint for KeysUnique {
    const DUPLICATE_KEYS: bool = false;
}

/// Type argument to [`Db`] indicating non-unique keys
pub struct KeysDuplicate;

impl Constraint for KeysDuplicate {
    const DUPLICATE_KEYS: bool = true;
}

/// Database handle
#[derive(Debug)]
pub struct Db<K: ?Sized, V: ?Sized, C> {
    key: PhantomData<fn(K) -> K>,
    value: PhantomData<fn(V) -> V>,
    constraint: PhantomData<fn(C) -> C>,
    backend: ArcByAddr<DbBackend>,
}

/// Pointer types that can be converted into an owned type
pub trait PointerIntoOwned: Sized {
    /// The type the pointer can be converted into
    type Owned;
    /// Convert into owned type
    fn into_owned(self) -> Self::Owned;
}

/// Types that can be stored
pub unsafe trait Storable: Ord + 'static {
    /* … */
    /// Pointer to aligned version of Self
    type AlignedRef<'a>: Deref<Target = Self> + PointerIntoOwned;
    /* … */
}

/// Read-write or read-only transaction
pub trait Txn {
    /// Get reference to value in database
    fn get<K, V, C>(
        &self,
        db: &Db<K, V, C>,
        key: &K,
    ) -> Result<Option<V::AlignedRef<'_>>, io::Error>
    where
        K: ?Sized + Storable,
        V: ?Sized + Storable,
        C: Constraint;
    /// Get owned value from database
    fn get_owned<'a, K, V, C>(
        &'a self,
        db: &Db<K, V, C>,
        key: &K,
    ) -> Result<Option<<<V as Storable>::AlignedRef<'a> as PointerIntoOwned>::Owned>, io::Error>
    where
        K: ?Sized + Storable,
        V: ?Sized + Storable,
        C: Constraint,
    {
        Ok(self.get(db, key)?.map(|x| x.into_owned()))
    }
    /* … */
}

impl<'a> TxnRw<'a> {
    /* … */
    /// Delete all values from database that match a given key
    pub fn delete<K, V, C>(&mut self, db: &Db<K, V, C>, key: &K) -> Result<bool, io::Error>
    where
        K: ?Sized + Storable,
        V: ?Sized + Storable,
        C: Constraint,
    {
        /* … */
    }
}

Edit: Extended the code excerpt to also cover the get and get_owned methods.

Is PhantomData<fn(T) -> T> the right choice here?

H2CO3 · April 3, 2022, 7:36pm

For my database ID, it's PhantomData<fn() -> T>. That makes it covariant, not invariant. Basically, fn() -> T is very similar to T, but instead of being a value, it produces a value.
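A minimal sketch of such an ID type might look like the following. Only the `Uid<T>` name, the `u64` payload, and the `fn() -> T` marker come from this thread. The field names, the methods, the manual `Clone`/`Copy` impls, and the `Session` type are assumptions made for illustration.

```rust
use std::marker::PhantomData;
use std::rc::Rc;

// Typed ID: a plain integer tagged with the entity type it refers to.
struct Uid<T> {
    raw: u64,
    marker: PhantomData<fn() -> T>,
}

impl<T> Uid<T> {
    fn new(raw: u64) -> Self {
        Uid { raw, marker: PhantomData }
    }

    fn raw(&self) -> u64 {
        self.raw
    }
}

// Written by hand so that `Uid<T>: Copy` does not require `T: Copy`.
impl<T> Clone for Uid<T> {
    fn clone(&self) -> Self {
        *self
    }
}
impl<T> Copy for Uid<T> {}

// Hypothetical entity type that is neither Send nor Sync.
#[allow(dead_code)]
struct Session {
    shared: Rc<()>,
}

fn assert_send_sync<T: Send + Sync>() {}

fn main() {
    // The ID stays thread-safe even though `Session` is not.
    assert_send_sync::<Uid<Session>>();

    let id = Uid::<Session>::new(42);
    let copy = id; // `id` is still usable afterwards because Uid is Copy
    assert_eq!(id.raw(), copy.raw());
}
```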

I'm not immediately sure about your code. I'll have a look later.

chrefr · April 3, 2022, 8:32pm

I generally think PhantomData is too overloaded, and we need to split it to multiple structs handling variance, dropck, auto traits etc.

Ryan1729 · April 3, 2022, 11:19pm

I think you could get a lot of the benefits of that for users of PhantomData just with type aliases like these:

use std::marker::PhantomData;

type Contravariant<T> = PhantomData<fn(T)>;
type Covariant<T> = PhantomData<fn() -> T>;
type Invariant<T> = PhantomData<fn(T) -> T>;

I would guess that having one underlying thing would be easier for compiler developers, but that is just a guess.
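For illustration, a short sketch of how such aliases might read at the use site. `Channel` and its fields are hypothetical names. The alias only names the field type, and the field value is still written as `PhantomData` here.

```rust
use std::marker::PhantomData;

type Covariant<T> = PhantomData<fn() -> T>;
type Invariant<T> = PhantomData<fn(T) -> T>;

// Hypothetical handle: the field types now state the intended variance.
struct Channel<Msg, Tag> {
    id: u32,
    produces: Covariant<Msg>,
    tag: Invariant<Tag>,
}

impl<Msg, Tag> Channel<Msg, Tag> {
    fn new(id: u32) -> Self {
        Channel {
            id,
            produces: PhantomData,
            tag: PhantomData,
        }
    }
}

fn main() {
    let ch = Channel::<String, ()>::new(3);
    // Both markers are zero-sized, so only the u32 takes up space.
    assert_eq!(std::mem::size_of_val(&ch), 4);
    assert_eq!(ch.id, 3);
    let _ = (ch.produces, ch.tag);
}
```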

chrefr · April 4, 2022, 12:16am

Indeed, but the problem is that nobody bothers to do that, as it's easier to just go with PhantomData (there are crates that do these things, but they're not commonly downloaded). I think if it were in std, people would use it more.

Ryan1729 · April 4, 2022, 5:50pm

I’m unaware of a reason why these type aliases couldn’t be added to the standard library.

I am also under the impression that since such a change would mainly consist of documentation, and the only added code would be type aliases, that change could just be a PR to the main Rust repo. That is, the change probably doesn’t need to go through the RFC process.

I’m basing this on the following sentence from the RFC repo's README:

Many changes, including bug fixes and documentation improvements can be implemented and reviewed via the normal GitHub pull request workflow.

trentj · April 4, 2022, 6:58pm

IMO the biggest problem is not PhantomData's documentation but the fact that the compiler always suggests it when it encounters an unused generic parameter, even though the compiler itself has no way to know what kind of PhantomData is appropriate (obviously, since the whole point of it is to tell the compiler what it can't figure out on its own). PhantomData is not the best solution to the majority of cases where the error message suggests it, and PhantomData<T> is not the least error-prone suggestion it could make, but that's what it suggests anyway. This leads a lot of programmers to believe they should be using PhantomData a lot more than they really ought to, and discourages people from thinking critically about what it means for a struct to be generic (as opposed to a trait, an impl, or a function).
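One possible reading of the "struct versus function" point, sketched with a hypothetical `Parser` type: when nothing stored in the struct depends on `T`, the parameter can move to the method, and the unused-parameter error (and with it the PhantomData question) never comes up.

```rust
use std::str::FromStr;

// Hypothetical parser: its state does not depend on the output type,
// so the struct itself has no type parameter.
struct Parser {
    trim: bool,
}

impl Parser {
    // `T` is chosen per call instead of per struct.
    fn parse<T: FromStr>(&self, input: &str) -> Option<T> {
        let input = if self.trim { input.trim() } else { input };
        input.parse().ok()
    }
}

fn main() {
    let parser = Parser { trim: true };
    let n: Option<u32> = parser.parse(" 42 ");
    let x: Option<f64> = parser.parse("1.5");
    assert_eq!(n, Some(42));
    assert_eq!(x, Some(1.5));
}
```

Whether this applies depends on the type in question. A handle like the `Db<K, V, C>` shown earlier ties its parameters to the handle itself, so a marker field is a reasonable fit there.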

Adding type aliases wouldn't fix this problem; it would just be more type theory gobbledygook that most people would not bother to understand before blindly following the compiler suggestion.
