Some abstractions cannot be implemented

pub trait Codec {
    fn encode<T: Serialize>(&self, arg: T) -> Result<Vec<u8>, Error>;
    fn decode<T: DeserializeOwned>(&self, arg: &[u8]) -> Result<T, Error>;
}
pub struct Client {
    codec: Box<dyn Codec>,
}

6 | pub trait Codec {
  |           ----- this trait cannot be made into an object...
7 |     fn encode<T: Serialize>(&self, arg: T) -> Result<Vec<u8>, Error>;
  |        ^^^^^^ ...because method `encode` has generic type parameters
8 |     fn decode<T: DeserializeOwned>(&self, arg: &[u8]) -> Result<T, Error>;
  |        ^^^^^^ ...because method `decode` has generic type parameters
  = help: consider moving `decode` to another trait
  = help: consider moving `encode` to another trait

This may be safe, but it gives up almost all programming flexibility and makes extension nearly impossible... Why not consider supporting this? Why are simple abstractions so difficult?

Are you asking for why this isn't supported, or how to get around it?


Yes, I wanted to implement a swappable, universal encoder/decoder (on top of Serde), and it seems the compiler will not accept such a design.

Do you need to erase the codec type, or will a generic parameter work for you?

pub trait Codec {
    fn encode<T: Serialize>(&self, arg: T) -> Result<Vec<u8>, Error>;
    fn decode<T: DeserializeOwned>(&self, arg: &[u8]) -> Result<T, Error>;
}

pub struct Client<T: Codec> {
    codec: T,
}

Which of @alice’s two suggestions are you answering “yes” to?

A generic parameter is what I need, because it leaves the design open to anyone using any serialization library.

How to get around it?

This is my final implementation:

pub enum Codecs {
    BinCodec(BinCodec),
    JsonCodec(JsonCodec),
    Custom(Box<dyn AnyCodec>),
}

impl Default for Codecs{
    fn default() -> Self {
        Self::BinCodec(BinCodec{})
    }
}

pub trait Codec {
    fn encode<T: Serialize + 'static>(&self, arg: T) -> Result<Vec<u8>, Error>;
    fn decode<T: DeserializeOwned + 'static>(&self, arg: &[u8]) -> Result<T, Error>;
}

pub trait AnyCodec {
    fn encode(&self, arg: Box<dyn Any>) -> Result<Vec<u8>, Error>;
    fn decode(&self, arg: &[u8]) -> Result<Box<dyn Any>, Error>;
}

pub struct JsonCodec {}

impl Codec for JsonCodec {
    fn encode<T: Serialize>(&self, arg: T) -> Result<Vec<u8>, Error> {
        match serde_json::to_vec(&arg) {
            Ok(ok) => { Ok(ok) }
            Err(e) => { Err(err!("{}",e)) }
        }
    }

    fn decode<T: DeserializeOwned>(&self, arg: &[u8]) -> Result<T, Error> {
        match serde_json::from_slice(arg) {
            Ok(v) => {
                Ok(v)
            }
            Err(e) => {
                Err(err!("{}",e))
            }
        }
    }
}

pub struct BinCodec {}

impl Codec for BinCodec {
    fn encode<T: Serialize>(&self, arg: T) -> Result<Vec<u8>, Error> {
        match bincode::serialize(&arg) {
            Ok(ok) => { Ok(ok) }
            Err(e) => { Err(err!("{}",e)) }
        }
    }

    fn decode<T: DeserializeOwned>(&self, arg: &[u8]) -> Result<T, Error> {
        match bincode::deserialize(arg) {
            Ok(ok) => { Ok(ok) }
            Err(e) => { Err(err!("{}",e)) }
        }
    }
}

impl Codec for Codecs {
    fn encode<T: Serialize + 'static>(&self, arg: T) -> Result<Vec<u8>, Error> {
        match self {
            Codecs::BinCodec(s) => { s.encode(arg) }
            Codecs::JsonCodec(s) => { s.encode(arg) }
            Codecs::Custom(s) => {
                s.encode(Box::new(arg))
            }
        }
    }

    fn decode<T: DeserializeOwned + Any + 'static>(&self, arg: &[u8]) -> Result<T, Error> {
        match self {
            Codecs::BinCodec(s) => { s.decode(arg) }
            Codecs::JsonCodec(s) => { s.decode(arg) }
            Codecs::Custom(s) => {
                let data = s.decode(arg)?;
                let t = {
                    match data.downcast(){
                        Ok(v)=>{
                            v
                        }
                        Err(e)=>{
                            // `(*e)` reports the boxed value's type; `e.type_id()` would report `Box<dyn Any>` itself
                            return Err(err!("downcast fail! type_id = {:?}", (*e).type_id()))
                        }
                    }
                };
                Ok(*t)
            }
        }
    }
}

fwiw, this isn't possible in C++ either.

A Box<dyn Codec> uses virtual dispatch to call encode(), but methods with <T: Serialize> generics must be monomorphised; otherwise we'd need to generate a vtable with infinitely many entries.
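To make the vtable point concrete, here is a minimal std-only sketch. The `Shout` trait and the `Upper` type are hypothetical names invented for the illustration, and `Display` stands in for `Serialize`. The generic method is opted out of the trait object with `where Self: Sized`, while the non-generic method gets a single vtable slot.

```rust
use std::fmt::Display;

/// Hypothetical trait, used only to illustrate the rule.
trait Shout {
    /// Generic method: one copy per `T`, so it cannot live in a vtable.
    /// `where Self: Sized` keeps it out of the trait object, which keeps
    /// `dyn Shout` legal.
    fn shout<T: Display>(&self, arg: T) -> String
    where
        Self: Sized,
    {
        self.shout_dyn(&arg)
    }

    /// Non-generic method: exactly one vtable slot.
    fn shout_dyn(&self, arg: &dyn Display) -> String;
}

struct Upper;

impl Shout for Upper {
    fn shout_dyn(&self, arg: &dyn Display) -> String {
        arg.to_string().to_uppercase()
    }
}

fn through_box(s: Box<dyn Shout>) -> String {
    // `s.shout(42)` would be rejected here: `shout` requires `Self: Sized`.
    s.shout_dyn(&"hello")
}
```

The generic method stays available on concrete types such as `Upper`, and only the `dyn`-typed method is callable through the box.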

The technical term for this issue is Object Safety.

What you will need to do is erase the T type somehow. This should be possible using the Serialize and Deserializer traits from the erased_serde crate.


Making a trait dyn safe

trait Codec {
    fn encode<T: Serialize> (&self, arg: &'_ T)
      -> Result<Vec<u8>, Error>
    ;
    fn decode<T: DeserializeOwned> (&self, arg: &[u8])
      -> Result<T, Error>
    ;
}

1- A simple (but verbose) solution for simple cases

Usually, the downstream usage will not need to be as generic as the generic trait API is. Say you'll want to serialize a String and/or an i32.

In that case, you can very easily write a helper trait with those specific hard-coded choices of generics / specific monomorphisations:

trait DynCodec {
    fn encode_str (&self, arg: &'_ str)
      -> Result<Vec<u8>, Error>
    ;
    fn encode_i32 (&self, arg: i32)
      -> Result<Vec<u8>, Error>
    ;
    fn decode_string (&self, arg: &[u8])
      -> Result<String, Error>
    ;
    fn decode_i32 (&self, arg: &[u8])
      -> Result<i32, Error>
    ;
}
// From the general `Codec`, get this narrower `DynCodec`
impl<T : Codec> DynCodec for T {
    fn encode_str (&self, arg: &'_ str)
      -> Result<Vec<u8>, Error>
    {
        self.encode::<&str>(&arg) // extra indirection because of missing `?Sized` on the original `<T>` generic.
    }

    fn encode_i32 (&self, arg: i32)
      -> Result<Vec<u8>, Error>
    {
        self.encode::<i32>(&arg) // `encode` takes `&T`
    }

    fn decode_string (&self, arg: &[u8])
      -> Result<String, Error>
    {
        self.decode::<String>(arg)
    }

    fn decode_i32 (&self, arg: &[u8])
      -> Result<i32, Error>
    {
        self.decode::<i32>(arg)
    }
}

and then use Box<dyn DynCodec + …> for Fun And Profit (any Box<impl 'lt + Codec> will magically coerce to Box<dyn 'lt + DynCodec>).

This does have the drawback of being cumbersome to write, but when that happens, Rust macros can be quite effective at alleviating it.
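As a rough sketch of how a macro could take over that boilerplate: the `dyn_codec!` macro below is hypothetical, and the `Codec` trait is a std-only stand-in in which `Display` and `FromStr` play the roles of `Serialize` and `DeserializeOwned`, so the example needs no external crates. It is one possible shape, not the only one.

```rust
use std::fmt::Display;
use std::str::FromStr;

/// Stand-in for the generic `Codec` of this thread (an assumption made to
/// keep the sketch free of external crates).
trait Codec {
    fn encode<T: Display>(&self, arg: &T) -> Vec<u8>;
    fn decode<T: FromStr>(&self, arg: &[u8]) -> Option<T>;
}

/// Generates the dyn-safe helper trait and its blanket impl from a list of
/// `encode_name / decode_name: Type` entries.
macro_rules! dyn_codec {
    ($($enc:ident / $dec:ident : $ty:ty),* $(,)?) => {
        trait DynCodec {
            $(
                fn $enc(&self, arg: &$ty) -> Vec<u8>;
                fn $dec(&self, arg: &[u8]) -> Option<$ty>;
            )*
        }
        impl<C: Codec> DynCodec for C {
            $(
                fn $enc(&self, arg: &$ty) -> Vec<u8> { self.encode::<$ty>(arg) }
                fn $dec(&self, arg: &[u8]) -> Option<$ty> { self.decode::<$ty>(arg) }
            )*
        }
    };
}

dyn_codec! {
    encode_string / decode_string: String,
    encode_i32 / decode_i32: i32,
}

/// Toy codec that round-trips values through their text form.
struct TextCodec;

impl Codec for TextCodec {
    fn encode<T: Display>(&self, arg: &T) -> Vec<u8> {
        arg.to_string().into_bytes()
    }
    fn decode<T: FromStr>(&self, arg: &[u8]) -> Option<T> {
        std::str::from_utf8(arg).ok()?.parse().ok()
    }
}
```

Supporting one more type then costs a single extra line in the `dyn_codec!` invocation, and a `Box::new(TextCodec)` still coerces to `Box<dyn DynCodec>`.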


2- Generalizing/extending this solution

So, the previous approach had two issues:

  • it only handled a fixed number of types,

  • it gets unwieldy as the number of types grows / does not scale to increasing that fixed number of types.

Both aspects have a nice solution: only use one type! Indeed, it's not because you only use one type that you have to give up on polymorphism: that's exactly what dyn Traits are for!

Hence:

/// Let's only handle `encode` for the moment
trait DynCodec {
    fn dyn_encode (&self, arg: &'_ dyn Serialize)
      -> Result<Vec<u8>, Error>
    ;
    /* decode not handled yet */
}

impl<T : Codec> DynCodec for T {
    fn dyn_encode (&self, arg: &'_ dyn Serialize)
      -> Result<Vec<u8>, Error>
    {
        self.encode(&arg)
    }
}

And then you can make a Box<dyn DynCodec + …>, and call .dyn_encode(&some_str), or .dyn_encode(&some_integer) on it, etc. Indeed, a &some_str can coerce to that one &dyn Serialize type, and so can &some_integer.

  • And more generally, for any T : Serialize, a &T can coerce to a &dyn Serialize … which means we can go back to featuring an ergonomic generic façade atop our dyn_encode method!

    impl Codec for dyn DynCodec + '_ {
        fn encode<T : Serialize> (&self, arg: &T)
          -> Result<Vec<u8>, Error>
        {
            self.dyn_encode(arg) // `arg as &dyn Serialize`
        }
    }
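The façade trick can be tried out without serde. In the sketch below, `Display` stands in for a dyn-compatible `Serialize` and `Plain` is a hypothetical implementor; both are assumptions made for the illustration.

```rust
use std::fmt::Display;

/// Dyn-safe core: a single non-generic method.
trait DynCodec {
    fn dyn_encode(&self, arg: &dyn Display) -> Vec<u8>;
}

/// Generic façade; not dyn-safe by itself.
trait Codec {
    fn encode<T: Display>(&self, arg: &T) -> Vec<u8>;
}

/// The façade is implemented on the trait object type itself.
impl Codec for dyn DynCodec + '_ {
    fn encode<T: Display>(&self, arg: &T) -> Vec<u8> {
        self.dyn_encode(arg) // `&T` coerces to `&dyn Display`
    }
}

/// Hypothetical implementor that writes the text form of the value.
struct Plain;

impl DynCodec for Plain {
    fn dyn_encode(&self, arg: &dyn Display) -> Vec<u8> {
        arg.to_string().into_bytes()
    }
}

/// Generic calls made through a trait object.
fn encode_both(codec: &dyn DynCodec) -> (Vec<u8>, Vec<u8>) {
    (codec.encode(&"abc"), codec.encode(&7_u8))
}
```

Callers keep the generic `encode` signature even though the only thing in the vtable is `dyn_encode`.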
    

There is only one caveat, though… ::serde::Serialize is not dyn-safe / can't be made into a dyn Trait.

Seeing this error can be disheartening, since we seem to be back to square one. But it's actually not the case: we needed a dyn Codec-like thing, and we got it, provided we had a dyn Serialize-like thing. And since we now know the recipe to make a Trait become a dyn Trait-like thing, we can actually rinse and repeat with Serialize:

  1. Find a restricted set of APIs to use in a non-generic fashion (if a dyn Trait can be found for that part, e.g., a dyn Serializer, then even better);

  2. Write a DynSerialize trait with only those non-generic methods, and a blanket impl for all the impl Serialize types.

  3. and so on…

Luckily, in our case, the ::serde framework itself has us covered with its companion crate, ::erased-serde, which is basically this whole approach already written in a maximally versatile and efficient manner. It takes advantage of the fact that the foundations of the serde model are covered by a fixed number of root cases (serde bool, serde u8, serde sequence of serdables, serde map of serdables, etc.).
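To see why a fixed set of root cases makes the erasure possible, here is a toy version of the idea. Every name in it (`Sink`, `Ser`, `ErasedSer`, `Text`, `dump`) is hypothetical and much simpler than the real serde and erased-serde traits; it only mirrors the shape of the technique.

```rust
/// Hypothetical mini data model: every value bottoms out in a few root cases.
trait Sink {
    fn put_bool(&mut self, v: bool);
    fn put_i32(&mut self, v: i32);
    fn put_str(&mut self, v: &str);
}

/// Generic over the sink, in the spirit of `serde::Serialize`: not dyn-safe.
trait Ser {
    fn ser<S: Sink + ?Sized>(&self, sink: &mut S);
}

/// Erased twin: the generic `S` becomes `&mut dyn Sink`, which is enough
/// because `Sink` only has a fixed set of non-generic methods.
trait ErasedSer {
    fn erased_ser(&self, sink: &mut dyn Sink);
}

/// Blanket impl: every `Ser` type gets the erased method for free.
impl<T: Ser> ErasedSer for T {
    fn erased_ser(&self, sink: &mut dyn Sink) {
        self.ser(sink)
    }
}

impl Ser for bool {
    fn ser<S: Sink + ?Sized>(&self, sink: &mut S) { sink.put_bool(*self) }
}
impl Ser for i32 {
    fn ser<S: Sink + ?Sized>(&self, sink: &mut S) { sink.put_i32(*self) }
}
impl Ser for String {
    fn ser<S: Sink + ?Sized>(&self, sink: &mut S) { sink.put_str(self) }
}

/// A text sink standing in for a real format.
#[derive(Default)]
struct Text(String);

impl Sink for Text {
    fn put_bool(&mut self, v: bool) { self.0.push_str(&format!("{};", v)); }
    fn put_i32(&mut self, v: i32) { self.0.push_str(&format!("{};", v)); }
    fn put_str(&mut self, v: &str) { self.0.push_str(&format!("{};", v)); }
}

/// Serializes a heterogeneous list through the erased trait.
fn dump(values: &[&dyn ErasedSer]) -> String {
    let mut out = Text::default();
    for &v in values {
        v.erased_ser(&mut out);
    }
    out.0
}
```

A heterogeneous slice of `&dyn ErasedSer` works here only because every value eventually reduces to calls on the three non-generic `Sink` methods.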

3- Fixing the dyn Serialize problem using ::erased-serde

/// Let's only handle `encode` for the moment
trait DynCodec {
-   fn dyn_encode (&self, arg: &'_ dyn Serialize)
+   fn dyn_encode (&self, arg: &'_ dyn ::erased_serde::Serialize)
      -> Result<Vec<u8>, Error>
    ;
    /* decode not handled yet */
}

impl<T : Codec> DynCodec for T {
-   fn dyn_encode (&self, arg: &'_ dyn Serialize)
+   fn dyn_encode (&self, arg: &'_ dyn ::erased_serde::Serialize)
      -> Result<Vec<u8>, Error>
    {
        self.encode(&arg)
    }
}

and voilà.

4- Handling .dyn_decode() in a polymorphic fashion.

This is the main thorny / genuinely challenging one.

  • For instance, notice how there is no Deserialize trait in ::erased-serde.

Indeed, we'd like to end up with generics once all this dance has been done, but if we were to go down that road, we'd kind of need to provide some dynamic representation of the type we'd like to decode (e.g., some form of TypeId parameter), to then get a Box<dyn Decoded> kind of dynamic type, to then go back to downcasting or something. It wouldn't be pretty, nor efficient, nor nice.
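For the curious, the TypeId-plus-downcast route would look roughly like the sketch below. `AnyDecoder` and `TextDecoder` are hypothetical names and the "format" is plain UTF-8 text; the point is only to show the shape, and why it scales poorly.

```rust
use std::any::{Any, TypeId};

/// Hypothetical dyn-safe decoder: the caller names the wanted type at run
/// time and gets an opaque box back.
trait AnyDecoder {
    fn decode_any(&self, wanted: TypeId, bytes: &[u8]) -> Option<Box<dyn Any>>;
}

/// Toy decoder that only knows `i32` and `String`, parsed from UTF-8 text.
struct TextDecoder;

impl AnyDecoder for TextDecoder {
    fn decode_any(&self, wanted: TypeId, bytes: &[u8]) -> Option<Box<dyn Any>> {
        let text = std::str::from_utf8(bytes).ok()?;
        let boxed: Box<dyn Any> = if wanted == TypeId::of::<i32>() {
            Box::new(text.parse::<i32>().ok()?)
        } else if wanted == TypeId::of::<String>() {
            Box::new(text.to_owned())
        } else {
            // Every supported type has to be listed by hand.
            return None;
        };
        Some(boxed)
    }
}

/// Generic front end: ask by `TypeId`, then downcast the box.
fn decode<T: Any>(d: &dyn AnyDecoder, bytes: &[u8]) -> Option<T> {
    let boxed = d.decode_any(TypeId::of::<T>(), bytes)?;
    boxed.downcast::<T>().ok().map(|b| *b)
}
```

Each decoded value costs an allocation and a run-time type check, and any type missing from the hand-written list simply fails to decode.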

So, how does ::erased-serde handle that / circumvent that problem? Thanks to, again, the fixed number of root cases: "serde bool, serde i32, serde sequence of serdables, etc."

With it, we can actually have a dyn ::erased_serde::Deserializer, that is, a type-unified entity which can be queried for these root elements, and which thus makes it a ::serde::Deserializer itself, that is, something that can eat generics for breakfast.

So, our "internally dyn-compatible" layer will be based on Deserializers, and then we'll go back to generic <T : DeserializeOwned> atop it:

trait Codec : DecodeMethod {
    // fn encode…

    // no decode here (see below for the default impl dance).
    // Instead, implementors are expected to provide the deserializer directly
    fn deserializer<'buf> (
        &'_ self,
        buf: &'buf [u8],
    ) -> Box<dyn ::erased_serde::Deserializer<'buf>> // + 'buf
    ;
}
/// `.decode()` default-implemented here:
impl<C : Codec> DecodeMethod for C {
    fn decode<T : DeserializeOwned> (
        &self,
        buf: &'_ [u8],
    ) -> Result<T, Error>
    {
        ::erased_serde::deserialize(&mut self.deserializer(buf))
    }
}
// where:
trait DecodeMethod {
    fn decode<'buf, T : DeserializeOwned> ( // this could be `T : Deserialize<'buf>` btw
        &self,
        buf: &'buf [u8],
    ) -> Result<T, Error>
    ;
}

And thus you'd need to adjust the impls a bit so that they yield the whole Deserializer rather than perform the deserialization directly:

pub struct JsonCodec {}

impl Codec for JsonCodec {
    // fn encode…

    fn deserializer<'buf> (
        &'_ self,
        buf: &'buf [u8],
    ) -> Box<dyn ::erased_serde::Deserializer<'buf>> // + 'buf
    {
        // 1- Get a concrete deserializer (a json one, here)
        let json_deserializer = ::serde_json::Deserializer::from_slice(buf);
        // 2- Box-dyn it, as shown in erased-serde.
        Box::new(<dyn ::erased_serde::Deserializer<'buf>>::erase(
            json_deserializer
        ))
    }
}

You'd get decode for free, incidentally factoring out the deserializer-agnostic code you had (the match arms, etc.).

5- Bonus: avoiding the Box on the dyn Deserializer.

By using a callback-based / CPS / scoped API rather than returning the deserializer. See

for more details about this approach.

Using the sugar from that crate, in pseudo-code, it would be about changing deserializer to be:

  fn deserializer (…)
-   -> Box<dyn Deserializer…
+   -> &'local mut dyn Deserializer…
More precisely:
trait Codec : DecodeMethod {
    fn encode (&self, arg: &impl Serialize)
      -> Result…
    ;

    fn with_deserializer<'buf> (
        self: &'_ Self,
        buf: &'buf [u8],
        yield_: &'_ mut dyn for<'local> FnMut(
  /* -> */ &'local mut dyn Deserializer<'buf>
        ),
    )
    ;
}
/// Default-impl of `decode`
impl<C : Codec> DecodeMethod for C {
    fn decode<'buf, T : Deserialize<'buf>> (
        self: &'_ Self,
        buf: &'buf [u8],
    ) -> Result<T, Error>
    {
        let mut ret = None;
        self.with_deserializer(buf, &mut |deserializer| ret = Some({
            ::erased_serde::deserialize(deserializer)
        }));
        ret.unwrap()
    }
}

with a concrete impl then being:

impl Codec for JsonCodec {
    // fn encode…

    fn with_deserializer<'buf> (
        self: &'_ Self,
        buf: &'buf [u8],
        yield_: &'_ mut dyn for<'local> FnMut(
  /* -> */ &'local mut dyn Deserializer<'buf>
        ),
    )
    {
        // 1- Get a concrete deserializer (a json one, here)
        let mut json_deserializer = ::serde_json::Deserializer::from_slice(buf);
        // 2- "return" it `&mut dyn`ed (only `&mut` of the json deserializer is a `::serde::Deserializer`)
        yield_(&mut <dyn ::erased_serde::Deserializer<'buf>>::erase(
            &mut json_deserializer
        ))
    }
}

Here is a final snippet that compiles (no erased_serde on the Playground):

use ::{
    anyhow::{
        anyhow as err,
        Result,
    },
    erased_serde::{
        Deserializer,
    },
    serde::{
        Deserialize,
    },
};

trait Codec {
    fn encode (
        self: &'_ Self,
        arg: &'_ dyn ::erased_serde::Serialize,
    ) -> Result<Vec<u8>>
    ;

    // no decode here (see below for the default impl dance).
    // Instead, implementors are expected to provide the deserializer directly
    fn with_deserializer<'buf> (
        &'_ self,
        buf: &'buf [u8],
        yield_: &'_ mut dyn for<'local> FnMut(
    /* -> */ &'local mut dyn Deserializer<'buf>,
        ),
    )
    ;
}
/// `.decode()` default-implemented here:
impl<C : ?Sized + Codec> DecodeMethod for C {
    fn decode<'buf, T : Deserialize<'buf>> (
        &self,
        buf: &'buf [u8],
    ) -> Result<T>
    {
        let mut ret = None;
        self.with_deserializer(buf, &mut |deserializer| ret = Some({
            ::erased_serde::deserialize(deserializer)
        }));
        ret .unwrap()
            .map_err(|e| err!("{e}"))
    }
}
// where:
trait DecodeMethod {
    fn decode<'buf, T : Deserialize<'buf>> (
        &self,
        buf: &'buf [u8],
    ) -> Result<T>
    ;
}

pub struct JsonCodec {}

impl Codec for JsonCodec {
    fn encode (
        self: &'_ Self,
        arg: &'_ dyn ::erased_serde::Serialize,
    ) -> Result<Vec<u8>>
    {
        ::serde_json::to_vec(&arg)
            .map_err(|e| err!("{e}"))
    }

    fn with_deserializer<'buf> (
        &'_ self,
        buf: &'buf [u8],
        yield_: &'_ mut dyn for<'local> FnMut(
    /* -> */ &'local mut dyn Deserializer<'buf>,
        ),
    )
    {
        // 1- Get a concrete deserializer (a json one, here)
        let json_deserializer = ::serde_json::Deserializer::from_slice(buf);
        // 2- `dyn`-erase it.
        yield_(&mut <dyn ::erased_serde::Deserializer>::erase(
            &mut { json_deserializer }
        ))
    }
}

fn check (b: Box<dyn Codec>)
  -> Result<String>
{
    let s = "Hello, World!";
    let bytes = b.encode(&s)?;
    dbg!(String::from_utf8_lossy(&bytes));
    b.decode::<String>(&bytes)
}

fn main ()
{
    assert_eq!(
        "Hello, World!",
        check(Box::new(JsonCodec {}))
            .unwrap()
        ,
    );
}

There was a disappointing surprise when writing it: a ::serde_json::Deserializer<…> is not a ::serde::Deserializer<'_>!! Only a &mut ::serde_json::Deserializer<…> is.

This makes it so the initial Boxed API cannot be featured directly, and so the "bonus" CPS approach kind of becomes mandatory :weary:.
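The ownership problem can be reproduced without serde. In the sketch below, all names (`DynSource`, `Reader`, `Raw`, `first_byte`) are hypothetical stand-ins: `Reader` plays the role of the json deserializer, and, as with serde_json, only `&mut Reader` is assumed to implement the dyn-safe trait. Returning a `Box<dyn DynSource>` would require the box to hold a borrow of a local that dies when the function returns, whereas lending it to a callback is fine.

```rust
/// Hypothetical dyn-safe trait, standing in for an erased deserializer.
trait DynSource {
    fn next_byte(&mut self) -> Option<u8>;
}

/// Concrete reader over a borrowed buffer.
struct Reader<'buf> {
    buf: &'buf [u8],
    pos: usize,
}

/// Assumption mirroring the situation above: only `&mut Reader` implements
/// the trait, not `Reader` itself.
impl DynSource for &mut Reader<'_> {
    fn next_byte(&mut self) -> Option<u8> {
        let b = self.buf.get(self.pos).copied();
        self.pos += 1;
        b
    }
}

trait Codec {
    /// Scoped API: the reader is a local of this call and is only lent out.
    fn with_source(&self, buf: &[u8], yield_: &mut dyn FnMut(&mut dyn DynSource));
}

struct Raw;

impl Codec for Raw {
    fn with_source(&self, buf: &[u8], yield_: &mut dyn FnMut(&mut dyn DynSource)) {
        let mut reader = Reader { buf, pos: 0 };
        // `&mut reader` borrows a local, so it could not escape in a
        // returned `Box<dyn DynSource>`; passing it to the callback works.
        yield_(&mut &mut reader)
    }
}

/// The result is smuggled out of the callback through a captured variable.
fn first_byte(codec: &dyn Codec, buf: &[u8]) -> Option<u8> {
    let mut ret = None;
    codec.with_source(buf, &mut |src| ret = src.next_byte());
    ret
}
```

`Codec` stays dyn-safe here because the callback parameter is itself a `dyn FnMut`, so no generics appear in the method signature.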

I actually ran into this when implementing my own serialization format. It does not appear to be possible to implement Deserializer directly due to how ownership works, if a (de)serializer needs to keep mutable references to sub-serializers (which is common when e.g. de/serializing from/into a Value).


I wonder if ::erased_serde::Serializer could be implemented for types T where &mut T : ::serde::Serializer or something along those lines :thinking:.

Anyhow, the self vs. &mut self for Serializer is indeed not what I expected; I imagine the API had to be designed like that for other reasons which my shallow experiments have not run into, but it is nonetheless a quite surprising thing to be honest :sweat_smile:

Thank you for your reply. erased_serde might solve the problem this time, but if you work with other trait objects, you're back to square one. :sweat_smile:

If only there were a library that provided a generic implementation of type erasure.

I have my doubts this would be practical. A generic "type erasure" library would need to abstract over everything that makes a trait not object safe. However, the things which make the trait not object safe are often integral to how the trait works and not easily abstracted unless you have domain-specific knowledge of how it's used (i.e. erased_serde).

In my mind, this is analogous to asking for a library that automatically rewrites OO code in a functional style. You could maybe do it for a curated subset of problems, but there's a reason programs that automatically port C# to Haskell aren't mainstream.


I have to say, this has been very difficult, and now I am left with erased_serde, which doesn't work with Box<serde::Serializer> :pensive:

I read through the erased_serde implementation, and I still can't really abstract Codec at this point.