Typed IDs with PhantomData and serde

Rust Playground for what I'm attempting: Rust Playground

I have a project which already has a bunch of related entities that all have an id field (they get saved to/retrieved from MongoDB, if you're curious). Right now they all have the same Id type[1], which is fine, but it means that you can pass the Id of one entity (e.g. User) where the Id of another (e.g. Group) is expected and get no compile error.

I'm thinking of changing this by adding a PhantomData to the Id type, so that whenever you hold onto an Id, it's always typed by the entity that it belongs to. Something like

use std::marker::PhantomData;

struct Id<R> {
    pub ulid: String,
    // Zero-sized marker tying this Id to the entity type R
    pub id_type: PhantomData<R>,
}

[2]

Question 1: Is this a reasonable/good use for PhantomData? Or am I overdesigning here? Would it be better to have an explicit newtype for each struct?

The next thing I'm trying to do is to serialize/deserialize the Id<R> type. I can use serde_derive fine on the struct; but when I try to serialize it using serde_json (as an example), the type parameter goes away. Thus, you can serialize an Id<User> to JSON and then deserialize it to an Id<Group>.

Question 2: Is there a way to make serde (or serde_json) preserve that type, such that a serialized Id<R> won't deserialize to an Id<S>? Or will I have to implement such logic myself?


  1. Done for foreign trait implementation reasons ↩︎

  2. The actual type of the ulid field is from the ulid crate, but I don't think it matters for this example ↩︎

Have you taken a look at jetpack-io/typeid on GitHub (type-safe, K-sortable, globally unique identifiers inspired by Stripe IDs)? It solves precisely this problem and there are two Rust implementations available.


Yes, but make it PhantomData<fn() -> R>, because that makes the type parameter used without unnecessarily affecting variance and dropck.
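To make the suggestion concrete, here is a minimal sketch of such an Id. The User and Group marker types, the new constructor, and the load_user function are hypothetical names made up for illustration, and the ULID is kept as a plain String as in the question. Clone and PartialEq are written by hand because #[derive] would add R: Clone / R: PartialEq bounds that the marker types do not need to satisfy.

```rust
use std::marker::PhantomData;

// Hypothetical entity marker types, used only as type-level tags.
#[allow(dead_code)]
struct User;
#[allow(dead_code)]
struct Group;

struct Id<R> {
    ulid: String,
    // `fn() -> R` mentions R without implying that Id owns an R.
    id_type: PhantomData<fn() -> R>,
}

impl<R> Id<R> {
    fn new(ulid: impl Into<String>) -> Self {
        Id { ulid: ulid.into(), id_type: PhantomData }
    }
}

// Manual impls so that no bounds are placed on R.
impl<R> Clone for Id<R> {
    fn clone(&self) -> Self {
        Id::new(self.ulid.clone())
    }
}

impl<R> PartialEq for Id<R> {
    fn eq(&self, other: &Self) -> bool {
        self.ulid == other.ulid
    }
}

// Hypothetical function that only accepts user IDs.
fn load_user(id: &Id<User>) -> String {
    format!("user {}", id.ulid)
}

fn main() {
    let user_id: Id<User> = Id::new("01ARZ3NDEKTSV4RRFFQ69G5FAV");
    let group_id: Id<Group> = Id::new("01ARZ3NDEKTSV4RRFFQ69G5FAV");

    println!("{}", load_user(&user_id));
    // load_user(&group_id); // does not compile: expected `&Id<User>`, found `&Id<Group>`
    let _ = group_id;
}
```

The two IDs hold the same string, but the compiler treats Id<User> and Id<Group> as unrelated types, which is the swap protection asked about in Question 1.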

You can serialize the type_name or (a hash of) the TypeId and check it upon deserialization, but this is not reliable (not necessarily reproducible across compiler versions). A more robust way would be to derive your own type ID system and serialize/check that instead.
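One way to sketch the "own type ID system" idea without relying on type_name or TypeId is to give each entity a hand-written, stable tag and check it when parsing. Everything here is an assumption for illustration: the Entity trait, its TAG constant, the "<tag>_<ulid>" wire format, and the to_wire/from_wire names. The example uses plain strings instead of serde to stay self-contained; the same check could live inside hand-written Serialize/Deserialize impls.

```rust
use std::marker::PhantomData;

// Hypothetical trait: every entity chooses a tag that never changes.
trait Entity {
    const TAG: &'static str;
}

#[allow(dead_code)]
struct User;
#[allow(dead_code)]
struct Group;

impl Entity for User {
    const TAG: &'static str = "user";
}
impl Entity for Group {
    const TAG: &'static str = "group";
}

struct Id<R> {
    ulid: String,
    id_type: PhantomData<fn() -> R>,
}

impl<R: Entity> Id<R> {
    fn new(ulid: impl Into<String>) -> Self {
        Id { ulid: ulid.into(), id_type: PhantomData }
    }

    // Assumed wire format: "<tag>_<ulid>", e.g. "user_01ARZ...".
    fn to_wire(&self) -> String {
        format!("{}_{}", R::TAG, self.ulid)
    }

    // Rejects input whose tag does not match the expected entity.
    fn from_wire(s: &str) -> Result<Self, String> {
        // Split at the last underscore; ULIDs themselves contain none.
        match s.rsplit_once('_') {
            Some((tag, ulid)) if tag == R::TAG => Ok(Id::new(ulid)),
            Some((tag, _)) => Err(format!("expected tag `{}`, found `{}`", R::TAG, tag)),
            None => Err(format!("missing tag separator in `{}`", s)),
        }
    }
}

fn main() {
    let id: Id<User> = Id::new("01ARZ3NDEKTSV4RRFFQ69G5FAV");
    let wire = id.to_wire();
    println!("{}", wire);

    // Round-trips as a user ID, but is rejected as a group ID.
    assert!(Id::<User>::from_wire(&wire).is_ok());
    assert!(Id::<Group>::from_wire(&wire).is_err());
}
```

Because the tags are chosen by hand, they stay the same across compiler versions and refactorings, at the cost of having to keep them unique yourself.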


Can you explain that in more detail? I can sort of visualize how variance could be more flexible with that type signature but I'm not certain about dropck.

I don't think dropck is affected by this, unless the (unstable) #[may_dangle] attribute is used. Variance is the same either way, too: both PhantomData<R> and PhantomData<fn() -> R> are covariant in R. Sometimes you might not even want the covariance, in which case PhantomData<fn(R) -> R> is an option for making the type invariant.
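A small sketch of the variance difference, using a lifetime-carrying type parameter since variance only becomes observable through lifetimes. CoId, InvId, and shorten_co are hypothetical names for illustration. For plain marker types such as User or Group, which carry no lifetimes, the choice makes no practical difference.

```rust
use std::marker::PhantomData;

// Covariant in R (PhantomData<R> would behave the same here).
struct CoId<R> {
    ulid: String,
    #[allow(dead_code)]
    id_type: PhantomData<fn() -> R>,
}

// Invariant in R, because R appears in both argument and return position.
#[allow(dead_code)]
struct InvId<R> {
    ulid: String,
    id_type: PhantomData<fn(R) -> R>,
}

// Compiles: covariance lets the 'static lifetime shrink to any 'a.
fn shorten_co<'a>(id: CoId<&'static str>) -> CoId<&'a str> {
    id
}

// The invariant counterpart is rejected by the compiler:
// fn shorten_inv<'a>(id: InvId<&'static str>) -> InvId<&'a str> {
//     id // error: lifetime may not live long enough
// }

fn main() {
    let id = CoId::<&'static str> {
        ulid: String::from("01ARZ3NDEKTSV4RRFFQ69G5FAV"),
        id_type: PhantomData,
    };
    let short: CoId<&str> = shorten_co(id);
    println!("{}", short.ulid);
}
```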

Using fn() -> R in the PhantomData does, however, avoid unnecessary extra auto trait restrictions (e.g. for Send and Sync): PhantomData<R> is only Send or Sync when R is, whereas a function pointer type is always both.
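The auto trait point can be checked at compile time with a small sketch. OwningId, FnId, Session, and assert_send_sync are hypothetical names for illustration; Session stands in for an entity type that happens to be neither Send nor Sync because it contains an Rc.

```rust
use std::marker::PhantomData;
use std::rc::Rc;

// Marker written as PhantomData<R>: inherits Send/Sync from R.
#[allow(dead_code)]
struct OwningId<R> {
    ulid: String,
    id_type: PhantomData<R>,
}

// Marker written as PhantomData<fn() -> R>: always Send + Sync.
#[allow(dead_code)]
struct FnId<R> {
    ulid: String,
    id_type: PhantomData<fn() -> R>,
}

// Hypothetical entity that is !Send and !Sync because of the Rc.
#[allow(dead_code)]
struct Session {
    cache: Rc<String>,
}

// Compiles only if T is both Send and Sync.
fn assert_send_sync<T: Send + Sync>() {}

fn main() {
    // Fine: function pointers are Send + Sync regardless of R.
    assert_send_sync::<FnId<Session>>();

    // Does not compile: `Rc<String>` cannot be sent between threads safely.
    // assert_send_sync::<OwningId<Session>>();
}
```

With the fn() -> R form, an Id can be passed between threads even when the entity it refers to cannot, which is usually what you want for a value that is just a string.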
