Type erasure? in Rust

Suppose we start with:

pub struct S<T> { /* ... */ }

impl<T> S<T> {
    // part A: functions that do not use T

    // part B: functions that use T
}

Then we want an 'untyped' version, of the following form:


pub trait Untyped_Parts_Of_S {
    // the part A signatures, copied down
}

impl<T> Untyped_Parts_Of_S for S<T> {
    // forward calls for part A
}

At this point, is a &dyn Untyped_Parts_Of_S a 'type erasure' of S<T>?

If not:

  • what is type erasure,
  • what is type erasure in the context of Rust
  • what is the above, when we take an S<T> and try to factor out the functions it implements that do NOT depend on T?

Yes, but that's also true if you implemented part B too.

Coercing to a dyn Trait + '_, generally. One might also include passing around *mut () with some other way of tracking the type at runtime, I suppose.
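For the `*mut ()` variant, here is a rough sketch of what "some other way of tracking the type at runtime" could look like. `ErasedRef` and `downcast` are names made up for this illustration, and it uses a `*const ()` plus a `TypeId` tag; it is one possible shape, not a recommended design (the standard library's `dyn Any` already covers this use case).

```rust
use std::any::TypeId;
use std::marker::PhantomData;

/// A borrowed value whose static type has been erased (hypothetical helper).
pub struct ErasedRef<'a> {
    ptr: *const (),
    // Runtime tag remembering which type `ptr` really points to.
    type_id: TypeId,
    // Ties the raw pointer to the lifetime of the original borrow.
    _borrow: PhantomData<&'a ()>,
}

impl<'a> ErasedRef<'a> {
    pub fn new<T: 'static>(value: &'a T) -> Self {
        ErasedRef {
            ptr: value as *const T as *const (),
            type_id: TypeId::of::<T>(),
            _borrow: PhantomData,
        }
    }

    /// Recovers the reference only if `T` is the type that was erased.
    pub fn downcast<T: 'static>(&self) -> Option<&'a T> {
        if self.type_id == TypeId::of::<T>() {
            // SAFETY: the tag matches, so `ptr` came from a `&'a T`.
            Some(unsafe { &*(self.ptr as *const T) })
        } else {
            None
        }
    }
}
```

Asking for the wrong type returns `None` instead of reinterpreting memory, which is the part a bare `void *` cast in C leaves up to the programmer.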

I can't think of a specific phrase for factoring out the non-generic functionality (which doesn't mean such a phrase doesn't exist). It's related to composition though.

pub struct NonGenericParts { /* ... */ }
impl NonGenericParts { /* ... */ }

struct S<T> {
    ngp: NonGenericParts,
    // ...
}

// then
impl<T> S<T> { /* forward `NonGenericParts` methods */ }
// and/or
impl<T> AsRef<NonGenericParts> for S<T> { /* ... */ }
impl<T> AsMut<NonGenericParts> for S<T> { /* ... */ }

N.b. my understanding/use of the phrase was built organically, not from some authoritative source I can cite.


I do not understand this. Are you saying:

pub trait Part_B_Of_S<T> { ... }
S<T> -> &dyn Part_B_Of_S<T>

is also type erasure? What is being erased here? T occurs in both parts.

I mean it doesn't matter if the erased type has a generic or not, and if it does have a generic it doesn't matter if you utilized them in the implementation of the trait or not. By coercing it to dyn Trait, you've erased the base type, be it String, S<T> implemented without utilizing T, S<T> implemented using T, whatever.
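To make that concrete, here is a small sketch where the trait itself is generic over `T`. The struct `Single` and the method `first_item` are invented for the example. The element type `i32` stays visible in `dyn PartBOfS<i32>`; what gets erased is which concrete struct sits behind the pointer.

```rust
/// Part B as a trait: it mentions `T`, so `T` stays in the trait object type.
pub trait PartBOfS<T> {
    fn first_item(&self) -> Option<&T>;
}

pub struct S<T> {
    pub items: Vec<T>,
}

/// A second, unrelated implementor (hypothetical).
pub struct Single<T>(pub T);

impl<T> PartBOfS<T> for S<T> {
    fn first_item(&self) -> Option<&T> {
        self.items.first()
    }
}

impl<T> PartBOfS<T> for Single<T> {
    fn first_item(&self) -> Option<&T> {
        Some(&self.0)
    }
}

/// The caller cannot tell `S<i32>` from `Single<i32>` here:
/// the base type is erased, the parameter `i32` is not.
pub fn firsts(parts: &[Box<dyn PartBOfS<i32>>]) -> Vec<i32> {
    parts.iter().filter_map(|p| p.first_item().copied()).collect()
}
```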


This clarifies. I think what I want is different from canonical "type erasure". Canonical "type erasure" is:

some concrete Struct/Enum -> &dyn SomeTrait

What I am after is:

  • We have struct S<T>

  • I want to say this is an S<A> for some A, but I can't do that since the size of S<A> may depend on A

  • so then I say Rc<S<A>>, but that also makes no sense, because there are functions there whose type depends on A

  • so I split the funcs of S<T> into partA (does not depend on T) and partB (does depend on T)

  • then we say Rc<dyn UnTyped_Parts_Of_S>

So although this is a case of type erasure since it involves concrete type -> &dyn Trait, I'm going for something a bit more specific
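The steps above might look roughly like this in code. The fields `label` and `items` and the methods `count`, `first_item`, and `summarize` are placeholders invented for the sketch; only the part A / part B split comes from the description.

```rust
use std::rc::Rc;

/// Part A: operations that never mention `T` (method names are hypothetical).
pub trait UntypedPartsOfS {
    fn label(&self) -> &str;
    fn count(&self) -> usize;
}

pub struct S<T> {
    pub label: String,
    pub items: Vec<T>,
}

impl<T> S<T> {
    /// Part B: depends on `T`, so it stays an inherent method.
    pub fn first_item(&self) -> Option<&T> {
        self.items.first()
    }
}

impl<T> UntypedPartsOfS for S<T> {
    fn label(&self) -> &str {
        &self.label
    }

    fn count(&self) -> usize {
        self.items.len()
    }
}

/// Each element is "an `S<A>` for some `A`"; behind the `Rc` they all have one size.
pub fn summarize(all: &[Rc<dyn UntypedPartsOfS>]) -> Vec<String> {
    all.iter()
        .map(|s| format!("{}: {}", s.label(), s.count()))
        .collect()
}
```

An `Rc<S<i32>>` and an `Rc<S<String>>` both coerce to `Rc<dyn UntypedPartsOfS>`, so they can share one collection, while part B stays reachable for anyone who still holds the typed `Rc<S<T>>`.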

Now it sounds like composition again. Why does UnTypedPartsOfS need to be a dyn _, instead of just a nominal struct? Especially if stashed in an Rc; you can just hand out a clone of the Rc'd field without consuming the S<T>.

use std::rc::Rc;

struct NonGeneric {}

pub struct S<T> {
    n: Rc<NonGeneric>,
    t: T,
}

impl<T> S<T> {
    fn non_generic(&self) -> Rc<NonGeneric> {
        self.n.clone()
    }
}

"type erasure" is just a fancy way to say dynamic dispatch. there's really nothing special about it, other than the phrase itself. type erasure just means erasing the static type that is known at compile time. for example, in C, the most common and simplest way to erase a type is to cast a typed pointer to void *.

the common C++ technique to do `type erasure`

I guess the confusion may come from the fact that the C++ community often refers to one common technique/pattern for doing type erasure (i.e. using a class template to generate wrapper types and the corresponding vtables) as "type erasure". this technique is often preferred to the traditional OO inheritance approach because the concrete implementation can be decoupled from the abstract interface, i.e. you don't need to inherit the "interface" (the interface class that defines the vtable structure doesn't even need to be public). you just make the method signatures match the expected argument and return types (actually, it's enough to make them compatible with regard to implicit type conversions).

this technique is sometimes referred to as "compile time duck typing" in C++. there was even a proposal (P0201) to add a std::polymorphic_value to the standard library to standardize this kind of technique.

here's a simple example.

#include <memory>
#include <utility>

/// this is the interface that defines the vtable structure
/// this doesn't need to be publicly accessible
struct Adder {
    virtual int add(int x, int y) const & = 0;
    virtual ~Adder() {}
};
/// this template wraps any duck typed `T` and generates a vtable for it
template <typename T>
class PolymorphicAdder: public Adder {
    // stored by value, so the wrapper never outlives the thing it wraps
    T inner;
public:
    PolymorphicAdder(T inner): inner(std::move(inner)) {}
    /// due to the way C++ templates work, this method only requires
    /// that the inner type has a method named `add` which
    /// can be called with two `int` arguments and whose return type can be
    /// implicitly converted to int
    virtual int add(int x, int y) const & override {
        return inner.add(x, y);
    }
};

/// this function requires the argument `adder` to have a method that is compatible with the above interface
/// but the type doesn't need to inherit the `Adder` interface
template <typename Arg>
void foo(Arg const &adder) {
    auto const polymorphic_adder = PolymorphicAdder<Arg>(adder);
    // forward declaration
    extern void foo_impl(Adder const &);
    return foo_impl(polymorphic_adder);
}
/// this function takes the abstract interface as argument
void foo_impl(Adder const &adder) {
    // this call is through the vtable
    int sum = adder.add(123, 456);
    // ...
}

/// another example using the `PolymorphicAdder` wrapper template:
/// note the constructor takes a duck typed `T`
class ContainsErasedAdder {
    std::unique_ptr<Adder> erased_adder;
public:
    template <typename T>
    ContainsErasedAdder(T adder): erased_adder(std::make_unique<PolymorphicAdder<T>>(std::move(adder))) {}
};

/// a duck typed implementation
/// notice the signature doesn't need to exactly match the interface
/// as long as it's compatible to call with implicit conversions.
class MyAdder {
public:
    char add(int x, int y) const & {
        return '\xFF';
    }
};

well, whenever you coerce a value of concrete type to a trait object, you are erasing its type.

I don't know if there's a widely used term for it; I think it's similar to the OO refactoring technique "extract superclass".

but whatever it might be called, it's orthogonal to type erasure. the refactor doesn't necessarily use dynamic dispatching.
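a tiny sketch of that point, with made-up names (`UntypedParts`, `count`, `count_static`, `count_dyn`): the same factored-out trait can be used through generics, where nothing is erased, or through a trait object, where the concrete type is erased.

```rust
/// The factored-out, `T`-independent functionality (hypothetical names).
pub trait UntypedParts {
    fn count(&self) -> usize;
}

pub struct S<T> {
    pub items: Vec<T>,
}

impl<T> UntypedParts for S<T> {
    fn count(&self) -> usize {
        self.items.len()
    }
}

/// Static dispatch: monomorphized per concrete `P`, no type erasure involved.
pub fn count_static<P: UntypedParts>(p: &P) -> usize {
    p.count()
}

/// Dynamic dispatch: the concrete type behind `p` is erased.
pub fn count_dyn(p: &dyn UntypedParts) -> usize {
    p.count()
}
```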

so back to this question:

&dyn Untyped_Parts_Of_S indeed erases the type S<T>. actually, it erases any type that can be coerced to the trait object. but I don't think I ever heard the saying "Foo is a type erasure of Bar", at least not in the C++ world.


an aside

in rust, there's less need to use "clever" tricks or abuse language features like in C++, but if you want, you can kind of "emulate" the C++ duck typing trick in rust. there's no strict equivalence though, due to the differences between the type systems of the two languages:

pub trait Adder {
    fn add(&self, x: i32, y: i32) -> i32;
}
pub struct PolyAdder<Inner>(Inner);
pub fn foo<T>(adder: T) where PolyAdder<T>: Adder {
    fn foo_impl(adder: &dyn Adder) {
        todo!()
    }
    // the wrapper must be passed by reference to coerce to `&dyn Adder`
    foo_impl(&PolyAdder(adder))
}
pub struct ContainsErasedAdder(Box<dyn Adder>);
impl ContainsErasedAdder {
    // `T: 'static` is required because `Box<dyn Adder>` means `Box<dyn Adder + 'static>`
    fn new<T: 'static>(adder: T) -> Self where PolyAdder<T>: Adder {
        Self(Box::new(PolyAdder(adder)))
    }
}

// C++ can use templates to add blanket implementations for the duck type
// but rust generics must type check, so we must add the impl block manually
// here I can define a macro for it.
// note sometimes, these `into()`s can cause type inference trouble, if the
// duck type method signature contains generic types.
macro_rules! duck_type_adder {
    ($duck:ty) => {
        impl Adder for PolyAdder<$duck> {
            fn add(&self, x: i32, y: i32) -> i32 {
                self.0.add(x.into(), y.into()).into()
            }
        }
    };
}

struct MyAdder {}
impl MyAdder {
    fn add(&self, x: i32, y: i32) -> i8 {
        -1
    }
}
duck_type_adder!(MyAdder);

if I had to give it a name, I think "decomposition" might be it.


Let's say we are writing a GUI toolkit and we are working on the hotkey manager part. We might have something like this:

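use std::collections::HashMap;

// Key must be hashable and comparable to be part of a HashMap key
#[derive(PartialEq, Eq, Hash)]
pub struct Key { c: u8, shift: bool, ctrl: bool, alt: bool, meta: bool }

pub struct Hk_Manager<T> { inner: HashMap<Vec<Key>, T> }

pub enum EditorCmd {}
pub enum ReplCmd {}
pub enum TerminalCmd {}

// instantiated as Hk_Manager<EditorCmd>, Hk_Manager<ReplCmd>, Hk_Manager<TerminalCmd>

impl<T> Hk_Manager<T> {
    // depends on T
    fn get(&self, hk: &[Key]) -> Option<&T> { self.inner.get(hk) }

    // does not depend on T (body elided)
    fn print(&self) -> Vec<Vec<String>> { todo!() }
}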

Now, I want to be able to point to an Rc<Hk_Manager<A>> for some A (A is determined at runtime and can change), and call print on it.

So the "depends on T / not depends on T" split happens at the fn level, not the struct fields level.

Fair enough. Type erasure is what you want.

With this concrete example (which would have been beneficial from the start), perhaps what you're getting at is that the generic must be erased from any included functions as well. In this case the answer is just omission of irrelevant methods from the trait. I can't think of a name for that either; it just is how traits work.[1] You include the desired functionality and ignore anything extraneous. Maybe the fact that you're starting from the POV of a specific struct and not that of the desired functionality is throwing you off.
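As a sketch of "omit the irrelevant methods from the trait" applied to the hotkey example: the trait name `PrintHotkeys`, the `Debug` bound, the row format, and the enum variants `Save` and `Eval` are all assumptions made for illustration, since the original leaves them unspecified.

```rust
use std::collections::HashMap;
use std::fmt::Debug;
use std::rc::Rc;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Key { pub c: u8, pub shift: bool, pub ctrl: bool, pub alt: bool, pub meta: bool }

pub struct HkManager<T> { pub inner: HashMap<Vec<Key>, T> }

impl<T> HkManager<T> {
    /// Depends on `T`, so it is simply left out of the trait.
    pub fn get(&self, hk: &[Key]) -> Option<&T> {
        self.inner.get(hk)
    }
}

/// Only the `T`-independent surface (hypothetical trait name).
pub trait PrintHotkeys {
    fn print(&self) -> Vec<Vec<String>>;
}

impl<T: Debug> PrintHotkeys for HkManager<T> {
    fn print(&self) -> Vec<Vec<String>> {
        let mut rows: Vec<Vec<String>> = self
            .inner
            .iter()
            .map(|(keys, cmd)| {
                let chord: Vec<String> =
                    keys.iter().map(|k| (k.c as char).to_string()).collect();
                vec![chord.join(" "), format!("{:?}", cmd)]
            })
            .collect();
        rows.sort(); // HashMap iteration order is unspecified
        rows
    }
}

// Placeholder variants; the original enums are empty.
#[derive(Debug)]
pub enum EditorCmd { Save }
#[derive(Debug)]
pub enum ReplCmd { Eval }
```

A variable of type `Rc<dyn PrintHotkeys>` can then be reassigned from an `Rc<HkManager<EditorCmd>>` to an `Rc<HkManager<ReplCmd>>` at runtime, and `print` works on either.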

When a trait has generic return values and you type erase those, it's sometimes called an erased trait, as explored in this blog post. Or if you replace Hash with a DynHash trait that takes a &mut dyn Hasher instead of a &mut impl Hasher.[2]
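A minimal sketch of the `DynHash` idea, assuming the trait and method names `DynHash` / `dyn_hash` and a helper `hash_of` that exists only for the demonstration. The generic `H: Hasher` parameter of `Hash::hash` is replaced by a `&mut dyn Hasher`, which makes the trait usable as a trait object.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Object-safe counterpart of `Hash`: the generic hasher becomes `dyn Hasher`.
pub trait DynHash {
    fn dyn_hash(&self, state: &mut dyn Hasher);
}

/// Every sized `Hash` type gets `DynHash` for free.
impl<T: Hash> DynHash for T {
    fn dyn_hash(&self, mut state: &mut dyn Hasher) {
        // `&mut dyn Hasher` itself implements `Hasher`, so it can stand in for `H`.
        Hash::hash(self, &mut state);
    }
}

/// And the trait object can implement `Hash` again by forwarding.
impl Hash for dyn DynHash + '_ {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.dyn_hash(state);
    }
}

/// Demo helper (hypothetical): hash a value whose type has been erased.
pub fn hash_of(value: &dyn DynHash) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}
```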

If you wanted to generalize the entire functionality of the struct (i.e. including get), that might be relevant,[3] but apparently you don't and it's not, in this case.


  1. Sometimes people coming from OO call base types "instances" of some specific trait, but that always misdirects me (I think they mean a dyn Trait). String isn't "an instance of Display (and Debug and AsRef<str> and a dozen other traits)" as far as I'm concerned, it's an implementor of Display and all the others. "Instance" makes it sound like the methods of the trait under discussion are all that the base type can do. This iteration of the conversation reminds me of that. A trait usually doesn't encapsulate every ability a base type has. And it's not rare or special for a generic struct to implement a non-generic trait. ↩︎

  2. And then you can implement Hash for dyn DynHash. ↩︎

  3. though I wouldn't call it an erased trait, since there is no pre-existing trait to erase ↩︎


Hotkeys actually wasn't the original example. The original use case is this complicated mess (many more structs / traits) where we're trying to build typed Gui widgets. So we have something like

pub struct Widget<T: Some_Functionality>

where given T satisfies Some_Functionality, we can wrap it into a Widget via struct Widget<T>.

Then, I want to be able to elsewhere have:

currently_focused_widget: Rc<Widget< DAMN IT >>

and this becomes

currently_focused_widget: Rc< dyn Widget_Funcs_Independent_Of_T >

i.e. a bunch of stuff we can do to a widget independent of its contents (e.g. draw, move, resize, hide, handle key press event, close, etc.)
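One way this could be sketched, assuming some of those operations need `&mut self` (which a plain `Rc<dyn Trait>` does not allow, hence the `RefCell`). The names `WidgetOps`, `FocusedWidget`, `focus`, and `describe`, and the fields on `Widget`, are invented for the example, and the `Some_Functionality` bound is left out to keep it short.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Operations that do not depend on the widget's content type.
pub trait WidgetOps {
    fn describe(&self) -> String;
    fn resize(&mut self, width: u32, height: u32);
    fn hide(&mut self);
}

pub struct Widget<T> {
    pub content: T,
    pub width: u32,
    pub height: u32,
    pub visible: bool,
}

impl<T> Widget<T> {
    pub fn new(content: T, width: u32, height: u32) -> Self {
        Widget { content, width, height, visible: true }
    }
}

impl<T> WidgetOps for Widget<T> {
    fn describe(&self) -> String {
        let state = if self.visible { "visible" } else { "hidden" };
        format!("{}x{} ({})", self.width, self.height, state)
    }

    fn resize(&mut self, width: u32, height: u32) {
        self.width = width;
        self.height = height;
    }

    fn hide(&mut self) {
        self.visible = false;
    }
}

/// The "currently focused widget" slot, with `T` erased.
pub type FocusedWidget = Rc<RefCell<dyn WidgetOps>>;

/// `T: 'static` because the alias means `dyn WidgetOps + 'static`.
pub fn focus<T: 'static>(widget: Widget<T>) -> FocusedWidget {
    Rc::new(RefCell::new(widget))
}
```

The slot can be reassigned to widgets with different content types, and anything that needs `T` back has to keep its own typed handle to the widget.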

This still feels a bit awkward, but I think we have interactively explored all possible options. Thanks for the help. Cheers.