Serde and trait objects (and associated types)

A problem I have now run into for the second time: how do I work with a trait dynamically (i.e. through trait objects) while also being able to serialize/deserialize those "trait objects", when the trait additionally has an associated type?

Here is the concrete setup. I'm building some sort of server, and I want to build an abstraction for a Request+Response pair. The idea is the user should basically provide three things:

  • n "request types"
  • n "response types"
  • n handle() functions, where the i-th handle function takes an object of request type i and returns an object of response type i

From that data, I then provide a server which handles everything: it deserializes an incoming request into the respective request type, calls the respective handle, serializes the resulting response, and sends it back.

I model this as follows:

trait Request {
    type Response;

    fn handle(self) -> Self::Response;
}

Then the user should implement Request on a bunch of types, let's call them ReqA, ReqB, ReqC, and also implement Serialize+Deserialize on those types and on the associated Response types. (I left out the trait bounds for simplicity.)

Now one could imagine that I would create some enum with all the request types:

enum Req {
    A(ReqA),
    B(ReqB),
    C(ReqC),
}

Then my server should look roughly like this (pseudocode):

let request_binary = get_from_socket();
let request: Req = bincode::deserialize(&request_binary)?;
let response = request.handle(); // doesn't work: `Req` itself has no `handle` to call,
                                 // and the type of `response` is not clear
let response_binary = bincode::serialize(&response)?;
send_to_socket(response_binary); // send the encoded bytes, not `response` itself

Now there are two problems:

  1. I don't know how to abstract over calling handle. One might think of implementing Request for the enum Req, but that doesn't work because the return type of handle is different for each request. So I guess I definitely need some trait object machinery.

  2. But as soon as I use trait objects, I have a problem with serialization/deserialization: a Request still needs to be serialized/deserialized as if it were an enum variant.

    (Not so for the response: the client knows which response type it expects, so it can directly (try to) deserialize into that type.)
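To make problem 1 concrete, here is a minimal std-only sketch. The `Encode` trait is a hypothetical stand-in for serde's `Serialize`, and the bodies of `ReqA`/`ReqB` are made up for illustration. It shows why the match arms cannot simply return `handle()`'s result, and one possible way to give all arms a common type by encoding inside each arm.

```rust
// Hypothetical stand-in for serde::Serialize, so the sketch needs no external crates.
trait Encode {
    fn encode(&self) -> Vec<u8>;
}

impl Encode for u32 {
    fn encode(&self) -> Vec<u8> {
        self.to_le_bytes().to_vec()
    }
}

impl Encode for bool {
    fn encode(&self) -> Vec<u8> {
        vec![*self as u8]
    }
}

trait Request {
    type Response: Encode;
    fn handle(self) -> Self::Response;
}

// Two illustrative request types with different response types.
struct ReqA(u32);
struct ReqB(String);

impl Request for ReqA {
    type Response = u32;
    fn handle(self) -> u32 {
        self.0 + 1
    }
}

impl Request for ReqB {
    type Response = bool;
    fn handle(self) -> bool {
        self.0.is_empty()
    }
}

enum Req {
    A(ReqA),
    B(ReqB),
}

impl Req {
    // `match self { Req::A(a) => a.handle(), Req::B(b) => b.handle() }` would not
    // compile, because the arms have the types u32 and bool. Encoding inside each
    // arm gives every arm the same type, Vec<u8>.
    fn handle_encoded(self) -> Vec<u8> {
        match self {
            Req::A(a) => a.handle().encode(),
            Req::B(b) => b.handle().encode(),
        }
    }
}
```

This sidesteps the associated type rather than abstracting over it, and the match still has to be written (or generated) per variant.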

For the second problem I found typetag which also lists a quite similar example. However, it doesn't support associated types.

So I was wondering if you have any ideas. Maybe I could do something similar to typetag, but customized for my own problem?

You can deserialize into serde_json::Value and further deserialize from that based on some runtime logic. (That includes compound types that have fields of serde_json::Value). Not sure if this is what you're asking, but it's one of the coolest features of serde.

Nah, that's sort of what I did when I had the problem for the first time in the context of a web API, but this is an internal thing with Unix sockets and I really want to go directly to/from a binary encoding.

I think there is no way around one of the following. Either you add a .handle_and_serialize() function to the trait (or to an extension trait) that returns a value of the same type for all possible variants (e.g. bytes::Bytes or something along those lines); this can have a blanket impl for all T: Request, which "hides" the actual response type from the caller through monomorphization. Or you actually use dynamic dispatch. There is no way to have "different" types based on runtime behavior, somewhat obviously (that's what dynamic dispatch is for). An alternative would be to create an enum of the response types as well, and return that.
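A rough sketch of the extension-trait idea, under some assumptions: `Encode` is a hypothetical stand-in for serde's `Serialize`, the return type is `Vec<u8>` rather than `bytes::Bytes`, and `Ping`/`Echo` are made-up request types.

```rust
// Hypothetical stand-in for serde::Serialize (std-only sketch).
trait Encode {
    fn encode(&self) -> Vec<u8>;
}

impl Encode for u64 {
    fn encode(&self) -> Vec<u8> {
        self.to_le_bytes().to_vec()
    }
}

impl Encode for String {
    fn encode(&self) -> Vec<u8> {
        self.as_bytes().to_vec()
    }
}

trait Request {
    type Response: Encode;
    fn handle(self) -> Self::Response;
}

// Extension trait whose return type is the same for every request type.
trait HandleAndSerialize {
    fn handle_and_serialize(self) -> Vec<u8>;
}

// Blanket impl: monomorphized once per request type, so the concrete
// response type never shows up in the caller's signature.
impl<T: Request> HandleAndSerialize for T {
    fn handle_and_serialize(self) -> Vec<u8> {
        self.handle().encode()
    }
}

// Made-up request types for illustration.
struct Ping;
struct Echo(String);

impl Request for Ping {
    type Response = u64;
    fn handle(self) -> u64 {
        7
    }
}

impl Request for Echo {
    type Response = String;
    fn handle(self) -> String {
        self.0
    }
}
```

With this in place, `Ping.handle_and_serialize()` and `Echo(..).handle_and_serialize()` both yield plain bytes, so an enum of requests can forward to it from every match arm.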

I am sorry if I misunderstood your usecase, feel free to correct me if I did :slight_smile:

No I definitely want to use dynamic dispatch.

I think the biggest problem I have is how to deserialize then. The binary message should consist of a discriminator (to indicate which request it is) followed by the encoded request data. So basically exactly like an enum would be decoded. It should deserialize into dyn Request.

To have the same type for all handle functions I think of making it return dyn Serialize.

Do you have a set binary format you want to use or do you have flexibility in choosing / creating one?

I'm free to use whatever. I chose bincode because it's fast and that's what similar crates use too. (Ideally whatever I'm doing abstracts over the format, though. It would just use serde.)

Is there any specific reason you wish to go for a trait for each request / response combination rather than having an enum Request and an enum Response? With 2 enums it would probably be pretty trivial to implement. Just trying to understand the problem you wish to solve :slight_smile:

If you just want the traits to have a way to actually write the handler and not wish to worry about the response enum inside that handler, I would do something along the lines of

trait Request
where
    Self: serde::de::DeserializeOwned,
{
    type Response: Into<ApiResponse>;

    fn handle(self) -> Self::Response;
}

or optionally

trait Request
where
    Self: serde::de::DeserializeOwned,
{
    type Response: Into<ApiResponse>;
    type Error: Into<ApiError>;

    fn handle(self) -> Result<Self::Response, Self::Error>;
}

(imho it's debatable if you actually want the deserialize / serialize / other trait bounds on the traits / assoc types since you could also just add them where you are using it)

and then have something like

enum ApiRequest {
    RequestA(RequestA),
    RequestB(RequestB),
    ...
}

impl Request for ApiRequest {
    type Response = ApiResponse;
    type Error = ApiError;
    
    fn handle(self) -> Result<Self::Response, Self::Error> {
         match self {
              Self::RequestA(a) => a.handle().map(Into::into).map_err(Into::into),
              Self::RequestB(b) => b.handle().map(Into::into).map_err(Into::into),
              ...
         }
    }
}

If you want to avoid that code duplication, you can easily write something like

macro_rules! ApiRequest {
    // Takes identifiers (not `ty`), so each name can serve as both the variant
    // name and its payload type; a trailing comma is allowed.
    ($($t:ident),* $(,)?) => {
        #[derive(Deserialize)]
        enum ApiRequest {
            $($t($t)),*
        }
        
        impl Request for ApiRequest {
            type Response = ApiResponse;
            type Error = ApiError;
            
            fn handle(self) -> Result<Self::Response, Self::Error> {
                 match self {
                     $(Self::$t(t) => t.handle().map(Into::into).map_err(Into::into)),*
                 }
            }
        }
    }
}

(no guarantee for an absence of mistakes on my part)
and use it like

ApiRequest!{
    RequestA,
    RequestB,
   ...
}

We actually need some kind of enum (or other organization) of the possible requests and responses if you want to have any kind of discriminant. Without that, there is no guarantee about which discriminants get assigned, especially regarding stability across different compiler invocations and compiler versions.

And just another idea: having a stable, versioned binary format is a pretty hard thing to do properly (add a version field at least!), so you might want to check whether you can adapt existing solutions that make this easier for your use case (I am thinking of something like protobufs). For example, if you ever want to remove an enum variant from the response or request, you need to ensure that the discriminants remain as they were to keep compatibility, and there might already be good tooling available to manage such things.

When a client sends a request of type A, there is only one possible response type, namely the response type of A. I want to reflect this in the type system. It shouldn't be possible that a client sends a request of type A and gets a response of type B. It also saves one byte of encoding in the response, since I don't need to encode the response type. The client knows it will get a response of a certain type and decodes it directly as that type.

For example, there could be two types of requests:

  • SystemStatus, which has no more data associated to it. The response would be of type
    enum SystemStatusResponse {
         Offline,
         Starting,
         Live,
    }
    
  • RegisterUser, which sends along username: String and email: String. The response is of type for example Result<u32, SomeErrorType> (where the u32 is the user ID or so).
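As a sketch of how the client side could tie the response type to the request type, consider the following std-only code. `Encode`, `Decode`, the `TAG` constant, and `call` are all hypothetical names introduced for illustration (stand-ins for serde's traits and for whatever the real transport looks like); only SystemStatus is shown.

```rust
// Hypothetical wire traits standing in for serde's Serialize / Deserialize.
trait Encode {
    fn encode(&self) -> Vec<u8>;
}
trait Decode: Sized {
    fn decode(bytes: &[u8]) -> Option<Self>;
}

trait Request: Encode {
    // One-byte discriminator sent before the request body (assumed layout).
    const TAG: u8;
    type Response: Decode;
}

struct SystemStatus;

#[derive(Debug, PartialEq)]
enum SystemStatusResponse {
    Offline,
    Starting,
    Live,
}

impl Encode for SystemStatus {
    // No data is associated with this request.
    fn encode(&self) -> Vec<u8> {
        Vec::new()
    }
}

impl Request for SystemStatus {
    const TAG: u8 = 0;
    type Response = SystemStatusResponse;
}

impl Decode for SystemStatusResponse {
    fn decode(bytes: &[u8]) -> Option<Self> {
        match bytes {
            [0] => Some(Self::Offline),
            [1] => Some(Self::Starting),
            [2] => Some(Self::Live),
            _ => None,
        }
    }
}

// Client-side call: the return type is fixed by the request type, so a
// SystemStatus request can only ever decode into a SystemStatusResponse.
// `transport` abstracts the socket round trip.
fn call<R: Request>(
    req: R,
    transport: impl FnOnce(Vec<u8>) -> Vec<u8>,
) -> Option<R::Response> {
    let mut frame = vec![R::TAG];
    frame.extend(req.encode());
    let reply = transport(frame);
    // The reply carries no discriminator; it is decoded directly as R::Response.
    <R::Response as Decode>::decode(&reply)
}
```

A caller would write something like `call(SystemStatus, |frame| round_trip(frame))`, where `round_trip` is whatever sends the frame over the socket and returns the reply bytes.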

The client makes a request by sending 1 byte with a discriminator for which kind of request it is, followed by the binary encoding of the request data. (So the server should read the first byte, dispatch to the correct deserialization, and put the deserialized thing into something like Box<dyn Request>.)
Then the server handles the request (by dynamically dispatching the handle() function and getting something like a Box<dyn Serializable>) and serializes the result by encoding only the response, without any discriminator.
This works because the client knows which response type they will get; after all, they sent the request themselves. They deserialize the response and are done.
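The server-side flow described above could be sketched as follows, using only std. `Encode` is a hypothetical stand-in for serde's `Serialize`, `DynRequest`, `decode_request` and `serve` are made-up names, and the handler bodies and byte layout are invented for illustration.

```rust
// Hypothetical stand-in for serde::Serialize (std-only sketch).
trait Encode {
    fn encode(&self) -> Vec<u8>;
}
impl Encode for u8 {
    fn encode(&self) -> Vec<u8> {
        vec![*self]
    }
}
impl Encode for u32 {
    fn encode(&self) -> Vec<u8> {
        self.to_le_bytes().to_vec()
    }
}

trait Request {
    type Response: Encode + 'static;
    fn handle(self) -> Self::Response;
}

// Object-safe view of a request with the response type erased.
trait DynRequest {
    fn dyn_handle(self: Box<Self>) -> Box<dyn Encode>;
}
impl<T: Request> DynRequest for T {
    fn dyn_handle(self: Box<Self>) -> Box<dyn Encode> {
        Box::new((*self).handle())
    }
}

struct SystemStatus;
struct RegisterUser {
    username: String,
}

impl Request for SystemStatus {
    type Response = u8;
    fn handle(self) -> u8 {
        2 // made-up code for "Live"
    }
}
impl Request for RegisterUser {
    type Response = u32;
    fn handle(self) -> u32 {
        // Made-up "user ID": the length of the username.
        self.username.len() as u32
    }
}

// Reads the discriminator byte and builds the matching trait object.
fn decode_request(frame: &[u8]) -> Option<Box<dyn DynRequest>> {
    let (&tag, body) = frame.split_first()?;
    match tag {
        0 => Some(Box::new(SystemStatus)),
        1 => Some(Box::new(RegisterUser {
            username: String::from_utf8(body.to_vec()).ok()?,
        })),
        _ => None,
    }
}

// Full round: decode with discriminator, handle dynamically,
// encode the response without any discriminator.
fn serve(frame: &[u8]) -> Option<Vec<u8>> {
    let request = decode_request(frame)?;
    Some(request.dyn_handle().encode())
}
```

The hand-written `match tag` is the part a crate like typetag or a custom macro would generate; everything after it is ordinary dynamic dispatch.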

Yeah, if I were to make this into a public server crate, then I'd totally get all the issues with incompatibility. But in this case, it's just for internal use, and client and server will be compiled simultaneously from the same crate/package. So there isn't really any API that's exposed or guaranteed to be stable. You should not think of what I'm building as a public API. It's just for inter-process communication (IPC), simply because there is no other way of communicating. Think of it as just wanting to get data from one thread to another, except that it's not different threads but different processes. (And both processes are compiled from the same package, so no compatibility issues ever.)

A typical example is a system service which has a CLI. Say NetworkManager. It has a service which is constantly running and doing some stuff, and then there is nmcli, which allows you to change some settings and so on. Running nmcli spawns a new process, which then has to communicate with the server process. This is often done through Unix sockets (or TCP or whatnot). And this is pretty much exactly the use case I have. (Building a system service with a CLI, both compiled together, and I need a way for them to communicate. The public, stable API is the CLI; the user should never use the socket themselves.)

In any case thank you very much for your inputs! I will definitely consider them.

What do you mean with “it doesn't support trait objects”?


Here’s a small example stub of how it could be used:

/*
[dependencies]
bincode = "1.3.3"
erased-serde = "0.4.5"
serde = { version = "1.0.216", features = ["derive"] }
typetag = "0.2.19"
*/

// #####################################
// #  TRAIT STRUCTURE
// #####################################

use serde::{Deserialize, Serialize};
pub trait Request: DynRequest {
    type Response: Serialize + 'static;

    fn handle(self) -> Self::Response;
}

#[typetag::deserialize]
pub trait DynRequest: DynRequestImpl {}

pub trait DynRequestImpl {
    fn dyn_handle(self: Box<Self>) -> Box<dyn erased_serde::Serialize>;
}
impl<T: Request> DynRequestImpl for T {
    fn dyn_handle(self: Box<Self>) -> Box<dyn erased_serde::Serialize> {
        Box::new(self.handle())
    }
}


// #####################################
// #  EXAMPLE IMPLEMENTOR
// #####################################

#[derive(Deserialize)]
struct ReqA {
    field: String,
}
#[derive(Serialize)]
struct ResponseA {
    something: usize,
}

#[typetag::deserialize]
impl DynRequest for ReqA {}
impl Request for ReqA {
    type Response = ResponseA;

    fn handle(self) -> Self::Response {
        ResponseA {
            something: self.field.len(),
        }
    }
}

// #####################################
// #  PROOF-OF-CONCEPT SERVER CODE
// #####################################

type Req = Box<dyn DynRequest>;

pub fn server() -> Result<()> {
    let request_binary = get_from_socket();
    let request: Req = bincode::deserialize(&request_binary)?;
    let response = request.dyn_handle();
    let response_binary = bincode::serialize(&response)?;
    send_to_socket(&response_binary);
    Ok(())
}

fn get_from_socket() -> Vec<u8> {
    todo!()
}
fn send_to_socket(_: &[u8]) {}

type Result<T> = std::result::Result<T, Box<dyn std::error::Error>>;

Sorry I wanted to write "associated types".

Thank you very much for your code, that looks promising!

One thing that confused me though is why you separate DynRequest and DynRequestImpl. Couldn't we put dyn_handle() directly into DynRequest and implement DynRequest for all T : Request? Or I guess then typetag complains about the return type of dyn_handle..?

Anyways thanks again very much! It's kind of hacked together I guess but at least it (probably) works. To get a fully satisfying solution I will probably have to understand how typetag works internally and adjust it to something like a DynRequest trait which directly includes dyn_handle.

The separation was there because the user must put the typetag macro on their concrete implementation themselves, or else typetag won’t work. On the other hand, I didn't want them to have to define dyn_handle manually.


I’ve now simplified it a little bit though:

/*
[dependencies]
bincode = "1.3.3"
erased-serde = "0.4.5"
serde = { version = "1.0.216", features = ["derive"] }
typetag = "0.2.19"
*/

// #####################################
// #  TRAIT STRUCTURE
// #####################################

use serde::{Deserialize, Serialize};

#[typetag::deserialize]
pub trait Request: RequestDyn {
    type Response: Serialize + 'static
    where
        Self: Sized;

    fn handle(self) -> Self::Response
    where
        Self: Sized;
}
// separated into a supertrait to offer a convenient
// implicit implementation of `dyn_handle`
pub trait RequestDyn {
    fn dyn_handle(self: Box<Self>) -> Box<dyn erased_serde::Serialize>;
}
impl<R: Request> RequestDyn for R {
    fn dyn_handle(self: Box<Self>) -> Box<dyn erased_serde::Serialize> {
        Box::new(self.handle())
    }
}

// #####################################
// #  EXAMPLE IMPLEMENTOR
// #####################################

#[derive(Deserialize)]
struct ReqA {
    field: String,
}
#[derive(Serialize)]
struct ResponseA {
    something: usize,
}

#[typetag::deserialize]
impl Request for ReqA {
    type Response = ResponseA;

    fn handle(self) -> Self::Response {
        ResponseA {
            something: self.field.len(),
        }
    }
}

// #####################################
// #  PROOF-OF-CONCEPT SERVER CODE
// #####################################

type Req = Box<dyn Request>;

pub fn server() -> Result<()> {
    let request_binary = get_from_socket();
    let request: Req = bincode::deserialize(&request_binary)?;
    let response = request.dyn_handle();
    let response_binary = bincode::serialize(&response)?;
    send_to_socket(&response_binary);
    Ok(())
}

fn get_from_socket() -> Vec<u8> {
    todo!()
}
fn send_to_socket(_: &[u8]) {}

type Result<T> = std::result::Result<T, Box<dyn std::error::Error>>;

With the Self: Sized restrictions, Request can become object safe itself. As annotated inline, a single level of supertrait stays, in order to facilitate an automatically implemented dyn_handle. (You can’t use a default method body directly, as the compiler is going to complain that Self might be unsized.)

Amazing, thank you very much!

I see you've had issues and then deleted the question (probably solving them)?

I've never practically used typetag myself either, and would be curious to learn more about challenges that might come up when using it. Hence (assuming it wasn't something entirely unrelated), if there's anything to learn from it, feel free to share what mistake you made along the way.

On the receiving side I was deserializing something like Box<dyn Request>, but on the sending side I was serializing a concrete type that implements Request. This doesn't work of course: when you serialize a concrete type, no discriminator is added at the beginning, but when you deserialize Box<dyn Request>, a discriminator is expected at the beginning. So the lesson is that when you use typetag, you should make sure to use dyn Request objects in every place of (de)serialization!
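The mismatch can be illustrated with a hand-rolled analogy. This is not typetag's actual wire format; `Ping`, `PING_TAG`, and the three functions are made up for the sketch. The point is only that a body-only encoding and a tag-expecting decoder disagree about the first byte.

```rust
#[derive(Debug, PartialEq)]
struct Ping {
    n: u8,
}

// Made-up discriminator for the Ping request.
const PING_TAG: u8 = 0;

// What serializing the concrete type corresponds to: body only, no tag.
fn encode_concrete(p: &Ping) -> Vec<u8> {
    vec![p.n]
}

// What serializing through the trait object corresponds to: tag, then body.
fn encode_tagged(p: &Ping) -> Vec<u8> {
    vec![PING_TAG, p.n]
}

// The receiver always expects a tag followed by the body.
fn decode_tagged(frame: &[u8]) -> Result<Ping, String> {
    match frame {
        [PING_TAG, n] => Ok(Ping { n: *n }),
        [tag, ..] => Err(format!("unknown or malformed frame, first byte {tag}")),
        [] => Err("empty frame".to_string()),
    }
}
```

Decoding the output of `encode_tagged` succeeds, while decoding the output of `encode_concrete` fails, because the single body byte is read as the tag and no body remains.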

Ah, yes, in hindsight that makes a lot of sense, thanks for sharing. Looks like typetag::serde is much more useful than either of the one-way halves. At least to send it, you don't need to box it (as far as I understand &dyn Request should also get the Serialize by typetag::serde).

Btw, in case anyone is interested, after reading how typetag works internally I found it not so elegant and decided to instead write myself a macro called serde_dispatch which handles everything cleanly. This also provides a much better interface for remote procedure calls now, I'm quite happy with the result!

Here is the repo: GitHub - jbirnick/serde-dispatch: Minimalistic RPC in Rust

I choked on this :joy: :

TODOs (which probably will never be resolved)
