Designing a new actor framework emphasizing polymorphism

Intro

One of the purported benefits of the actor model is that multiple actor types can implement the same message protocol and so be substituted for each other. A server project I'm building at the moment uses xactor, which does this by offering the Caller<T> type: a (polymorphic, dynamically dispatched) handle to some actor that can receive a message of type T and return a response (of type <T as xactor::Message>::Result) asynchronously to the original sender. I want to keep the strongly typed messages and the option of responses, but the existing crate doesn't do everything I need. I've written the following explanation partly for my own clarity in building a new one, but I'd also appreciate any feedback you have.

Existing issues

Firstly, in practice, trying to use xactor's structure hits a roadblock very quickly: a given Caller<T> instance can only deal with a single type T, and in many cases we want a handle that can process a group of possible messages. T can be an enum, but that still needs a lot of glue code to be ergonomic. I have mostly managed to wrap all this glue into a macro called multiplex, which takes the following:

#[multiplex(...)]
impl NameDirectoryReq for NameDirectory {
	fn get_mailbox(&mut self, _: &mut Context<Self>, target: Identifier) -> Option<Sender<MailboxSenderData>> {
		/* ... */
	}
	async fn join_channel(&mut self, ctx: &mut Context<Self>, client: Addr<ClientActor>, client_nick: Nick, room: Identifier) -> Option<Caller<ChannelReqCallerData>> {
		/* ... */
	}
	fn nick_change(&mut self, _: &mut Context<Self>, old: Nick, new: Nick) -> Result<(), ()> { 
		/* ...Ignore the weird return type... */	
	}
	fn register(&mut self, _: &mut Context<Self>, client: Addr<ClientActor>) -> Nick {
		/* ... */
	}
	fn unregister_channel(&mut self, _: &mut Context<Self>, name: Identifier) {
		/* ... */
	}
	fn unregister_user(&mut self, _: &mut Context<Self>, nick: Nick) {
		/* ... */
	}
}

It reproduces the block as is, except as an inherent impl NameDirectory, and then adds the rough equivalent of:

enum NameDirectoryReqData {
	get_mailbox(Identifier),
	join_channel((Addr<ClientActor>, Nick, Identifier)),
	nick_change((Nick, Nick)),
	register(Addr<ClientActor>),
	unregister_channel(Identifier),
	unregister_user(Nick),
}

enum NameDirectoryResponse {
	get_mailbox(Option<Sender<MailboxSenderData>>),
	join_channel(Option<Caller<ChannelReqCallerData>>),
	nick_change(Result<(), ()>),
	register(Nick),
	unregister_channel,
	unregister_user,
}
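impl Handler<NameDirectoryReqData> for NameDirectory {
	fn handle(&mut self, ctx: &mut Context<Self>, msg: NameDirectoryReqData) -> NameDirectoryResponse {
		match msg {
			// Each request variant is forwarded to the inherent method of the same name.
			NameDirectoryReqData::get_mailbox(target) => NameDirectoryResponse::get_mailbox(self.get_mailbox(ctx, target)),
			/* etc */
		}
	}
}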
trait NameDirectoryReqCaller { /* signatures of the methods implemented below */ }
impl<T> NameDirectoryReqCaller for T 
	where
		T: Caller<NameDirectoryReqData> 
{

	async fn get_mailbox(&self, target: Identifier) -> xactor::Result<Option<Sender<MailboxSenderData>>> {
		let msg = NameDirectoryReqData::get_mailbox(target);
		self.call(msg).await.map(|output| 
			if let NameDirectoryResponse::get_mailbox(response) = output {
				response
			} else { 
				unreachable!() 
			}
		)
	}
	/* etc */
}

But this does not entirely solve the problem, because it is only half of the story. xactor makes a distinction between a Caller<T> and a Sender<T> of an actor receiving a T: a call produces a future with a response, whereas a send is fire-and-forget. (This is not equivalent to calling a unit-returning method, as a Future<Output = ()> is not a (). I'm less sure of this, but I think dropping the future from call without awaiting it would mean the message is not sent at all.) The multiplex macro splits the methods based on whether they define a return type or not, and then generates both Caller and Sender implementations.
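
On the dropped-future point: Rust futures in general do nothing until they are polled, so if call is an async fn (an assumption about xactor, not something verified here), a call future that is dropped without being awaited would never send anything. A minimal std-only sketch of that laziness; fake_call, poll_once and lazy_future_demo are hypothetical names, not xactor APIs:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical stand-in for an async `call`: the "send" happens in the body.
async fn fake_call(sent: Arc<AtomicBool>) {
    sent.store(true, Ordering::SeqCst);
}

/// Polls a future exactly once, without any runtime.
fn poll_once<F: Future>(fut: Pin<&mut F>) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    fut.poll(&mut cx)
}

/// Returns (sent after drop-without-poll, sent after one poll).
fn lazy_future_demo() -> (bool, bool) {
    let dropped = Arc::new(AtomicBool::new(false));
    drop(fake_call(dropped.clone())); // created, never polled

    let polled = Arc::new(AtomicBool::new(false));
    let mut fut = Box::pin(fake_call(polled.clone()));
    let _ = poll_once(fut.as_mut());

    (dropped.load(Ordering::SeqCst), polled.load(Ordering::SeqCst))
}
```

A fire-and-forget send that is a plain (non-async) function would not have this property, which is one reason the two kinds of handle are not interchangeable.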

However, this means a single handle cannot provide both bidirectional messages and fire-and-forget ones. Unfortunately, this is exactly the use case I need. In the snippet above, I have Addr<ClientActor> values because of this issue, which are handles to the (concrete) type ClientActor, and this then means I cannot substitute ClientActor for testing, etc.

Additionally, there is no ability to compose the types of handles. That is, if an actor implements messages T and U, there is no relation between Caller<T>, Caller<U> and a Caller<T|U> of the sum of both. The only way to get the former two is to have an Addr of the original concrete actor type; it is not possible to get them from a Caller of a combined T|U enum type. While this isn't currently a showstopper, it is inconvenient, and it may make it difficult to retain the flexibility of passing interface handles as described above once my logic gets more complex.
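
To make the missing relation concrete, here is a rough sketch of deriving a single-message handle from a handle to the combined enum. It uses std::sync::mpsc senders as stand-ins for actor handles and only covers the fire-and-forget direction; Ping, Log, PingOrLog, Narrowed and narrow are all hypothetical names:

```rust
use std::sync::mpsc::{channel, Receiver, SendError, Sender};

// Hypothetical message types T and U, and their sum T|U.
#[derive(Debug, PartialEq)]
struct Ping(u32);
#[derive(Debug, PartialEq)]
struct Log(String);

#[derive(Debug, PartialEq)]
enum PingOrLog {
    Ping(Ping),
    Log(Log),
}

impl From<Ping> for PingOrLog {
    fn from(m: Ping) -> Self {
        PingOrLog::Ping(m)
    }
}

impl From<Log> for PingOrLog {
    fn from(m: Log) -> Self {
        PingOrLog::Log(m)
    }
}

/// A handle that accepts only `M`, backed by a queue of the wider type `Sum`.
struct Narrowed<M, Sum> {
    tx: Sender<Sum>,
    wrap: fn(M) -> Sum,
}

impl<M, Sum> Narrowed<M, Sum> {
    fn send(&self, msg: M) -> Result<(), SendError<Sum>> {
        self.tx.send((self.wrap)(msg))
    }
}

/// Derives a single-message handle from the combined one.
fn narrow<M, Sum: From<M>>(tx: &Sender<Sum>) -> Narrowed<M, Sum> {
    Narrowed { tx: tx.clone(), wrap: <Sum as From<M>>::from }
}

fn narrowing_demo() -> Vec<PingOrLog> {
    let (tx, rx): (Sender<PingOrLog>, Receiver<PingOrLog>) = channel();
    let pings: Narrowed<Ping, PingOrLog> = narrow(&tx);
    let logs: Narrowed<Log, PingOrLog> = narrow(&tx);
    pings.send(Ping(1)).unwrap();
    logs.send(Log("hi".to_string())).unwrap();
    drop((tx, pings, logs)); // close the channel so the iteration below ends
    rx.into_iter().collect()
}
```

Code holding a Narrowed<Ping, _> never needs to know which concrete actor, or which wider enum, sits behind it.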

I would also like to be able to set up the actors' mailboxes with bounded capacities, and this control over the underlying queue type seems to be missing from all the frameworks I've looked into, xactor included. Being able to do that would give me backpressure control within my system "for free", but I would prefer to not tie my implementation down by needing the capacity to be bounded if that's feasible. Similar logic applies to more varied behaviour, such as using a priority queue, with the additional issue that supporting a priority queue means that it must be possible to provide, e.g. an Ord implementation for the message type.
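
As a sketch of what control over the queue type could mean, the following abstracts the mailbox behind a small trait with a bounded FIFO and a priority-ordered implementation. The Mailbox trait, Bounded, Priority and Job are hypothetical names invented for illustration, and the example is synchronous for brevity:

```rust
use std::collections::{BinaryHeap, VecDeque};

/// Hypothetical abstraction over the queue behind a mailbox.
trait Mailbox<M> {
    /// Hands the message back if the mailbox refuses it (backpressure).
    fn push(&mut self, msg: M) -> Result<(), M>;
    fn pop(&mut self) -> Option<M>;
}

/// FIFO mailbox with a fixed capacity.
struct Bounded<M> {
    queue: VecDeque<M>,
    capacity: usize,
}

impl<M> Mailbox<M> for Bounded<M> {
    fn push(&mut self, msg: M) -> Result<(), M> {
        if self.queue.len() >= self.capacity {
            return Err(msg); // full: the sender has to wait, retry or drop
        }
        self.queue.push_back(msg);
        Ok(())
    }

    fn pop(&mut self) -> Option<M> {
        self.queue.pop_front()
    }
}

/// Unbounded mailbox that pops the greatest message first; needs `M: Ord`.
struct Priority<M: Ord> {
    heap: BinaryHeap<M>,
}

impl<M: Ord> Mailbox<M> for Priority<M> {
    fn push(&mut self, msg: M) -> Result<(), M> {
        self.heap.push(msg);
        Ok(())
    }

    fn pop(&mut self) -> Option<M> {
        self.heap.pop()
    }
}

/// Derived `Ord` orders variants by declaration, so `Urgent` sorts highest.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Job {
    Background(u32),
    Urgent(u32),
}

fn mailbox_demo() -> (Result<(), u8>, Option<Job>) {
    let mut bounded: Bounded<u8> = Bounded { queue: VecDeque::new(), capacity: 1 };
    bounded.push(1).unwrap();
    let overflow = bounded.push(2); // rejected: capacity is 1

    let mut prio: Priority<Job> = Priority { heap: BinaryHeap::new() };
    prio.push(Job::Background(7)).unwrap();
    prio.push(Job::Urgent(1)).unwrap();
    prio.push(Job::Background(9)).unwrap();
    (overflow, prio.pop())
}
```

The Ord bound only appears on the priority implementation, which is the constraint mentioned above: the message type itself has to be visible to the queue.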

Even if xactor exposed the queue type, it would not be able to support that last use case, because of the way it handles dispatching heterogeneous messages. (These are represented by the actor implementing a trait, Handler<T>, to receive a message of type T; a single actor is expected to implement Handler multiple times.) Internally, the T value is passed to a closure that will run the actor's message handler with that value as the appropriate parameter, along with the actor's state and some metadata associated with the actor called a Context. This closure is then boxed, added to the message queue (which can then have a unified element type, roughly Box<dyn FnOnce(&mut A, &mut Context<A>)> for an actor type A) and executed when the event loop pops it off. This not only precludes the queue taking advantage of any structure the type T might have, but also likely has performance implications, since every message sent is a heap allocation.
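
A simplified, synchronous sketch of that closure-based dispatch, to show why the queue cannot see the message types. This is not xactor's actual code; Counter, Context, Handle, Envelope and envelope are hypothetical stand-ins:

```rust
use std::collections::VecDeque;

/// Hypothetical stand-ins for an actor and its per-actor metadata.
struct Counter {
    total: i64,
}
struct Context {
    handled: usize,
}

/// Every message is erased to the same type: a boxed closure.
type Envelope<A> = Box<dyn FnOnce(&mut A, &mut Context) + Send>;

/// One implementation per message type, in the spirit of `Handler<T>`.
trait Handle<T> {
    fn handle(&mut self, ctx: &mut Context, msg: T);
}

struct Add(i64);
struct Reset;

impl Handle<Add> for Counter {
    fn handle(&mut self, ctx: &mut Context, msg: Add) {
        self.total += msg.0;
        ctx.handled += 1;
    }
}

impl Handle<Reset> for Counter {
    fn handle(&mut self, ctx: &mut Context, _msg: Reset) {
        self.total = 0;
        ctx.handled += 1;
    }
}

/// Wraps a typed message in a closure that calls the matching handler.
fn envelope<A, T>(msg: T) -> Envelope<A>
where
    A: Handle<T> + 'static,
    T: Send + 'static,
{
    // Boxing the closure moves the captured message to the heap.
    Box::new(move |actor: &mut A, ctx: &mut Context| actor.handle(ctx, msg))
}

fn run_queue() -> (i64, usize) {
    let mut queue: VecDeque<Envelope<Counter>> = VecDeque::new();
    queue.push_back(envelope(Add(5)));
    queue.push_back(envelope(Reset));
    queue.push_back(envelope(Add(2)));

    let mut actor = Counter { total: 0 };
    let mut ctx = Context { handled: 0 };
    while let Some(run) = queue.pop_front() {
        // The loop can only call the closure; it cannot inspect, compare
        // or reorder messages based on their contents.
        run(&mut actor, &mut ctx);
    }
    (actor.total, ctx.handled)
}
```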

My attempt

A lot of limitations in other frameworks seem to come from message handling being modelled as a trait that the actor type implements. For more flexibility, I'm planning to use macros, but that rather awkwardly means that the actor definition must also include all of its message types since I don't know of any reliable way to share state across macro calls.

As a sketch for what that macro might look like:

actor! {
	Olaf {
		alive: bool
	}
	impl Olaf {
		fn new() -> Olaf {
			/* ... */
		}
	}

	performs Funcoot {
		fn marry(&mut self, child: Baudelaire) -> Option<Fortune> {
			/* ... */
		}
	}

	performs Stephano, Genghis, Dupin;
}

This would produce all of the following items, assuming some sort of async runtime like tokio:

  • a struct Olaf, as written.

  • a struct OlafActor containing a Sender for every message type. More on that in a moment. This type is necessary because the one and only &mut Olaf is going to be owned by a new task, so there needs to be a separate type to pass out handles for.

  • impl blocks for the original type can just be reproduced as written

  • The performs block will, in addition to generating a queue field inside the Actor struct, generate

    • a trait and impl, impl Funcoot for OlafActor that defines a method, enqueue(&self, impl Into<FuncootData>). This is a "role", that multiple actors should be able to provide interchangeable implementations for, and the constructed OlafActor type has a separate Sender for each role.
    • FuncootData and FuncootResponse enums generated from the function signatures. There only need be one pair of these per role, since we don't have to maintain xactor's distinction between Callers and Senders.
    • this then allows multiplex-like glue to be defined on top of the trait to define a method Funcoot.marry(child) that queues a message and returns (essentially) a promise for an Option<Fortune>, using the function body as the handler for incoming Baudelaire messages. (The &mut self is still a &mut Olaf, not &mut OlafActor.) In addition, the enum variants corresponding to handlers that return values can be augmented with a field containing a oneshot channel, and the handler wrapped so that the returned value is sent through that channel back to the point where the message was originally queued. This removes the split I had issues with in xactor.
  • I would like to be able to have the implementation blocks spread across multiple files, so I want to allow a performs statement with no block. Since the definition of the actor has no way to "know" about implementations outside the macro call, it would be obligatory to list all roles an actor has, even if the implementations are elsewhere, so that the appropriate fields, trait implementations, etc. can be generated. The role implementation could then be defined separately by another macro call that would look something like this:

    performs! { Funcoot by Olaf {
        fn marry(&mut self, child: Baudelaire) -> Option<Fortune> {
            /* ... */
        }
      }
    }
    

    Only the type name is required by the struct definition, so the signature would only have to appear alongside the implementation rather than in both places. This is also the place where a role might define its underlying queue type, etc.
    Some roles do not conceptually "belong" to any particular actor, so a similar syntax will exist to define the interface without any implementations, e.g.

    role! { Funcoot {
        fn marry(&mut self, child: Baudelaire) -> Option<Fortune>;
      } 
    }
    

    (I don't like the macro syntax forcing me to have two layers of braces, but I'm not sure how best to address that.)

  • the actor's event loop, as a standalone task that does a select! over all the Receivers corresponding to the Senders in the OlafActor type. This task owns the Olaf struct, mutably loaning it to the event handlers. This also allows async event handlers to run when the actor starts and stops, which is an xactor feature I found useful.

  • a function for starting the actor, given an Olaf value, and producing a (newtype of) Arc<OlafActor>. This is the handle to the actor and can be used to enqueue any message OlafActor implements.

  • This handle can be upcast to a more general Arc<dyn Funcoot> role handle by the normal rules of the language.

    • Although I could use associated types so that I refer to Arc<OlafActor> as Newtype<Olaf>, which then stops the name OlafActor "leaking" out of the macro, I'm not sure how to do this while also allowing it to be upcast into a Newtype<dyn Funcoot>.

    • Edit: It is possible to do this but only with unstable features, which is extremely annoying.

  • As well as exposing the underlying enqueue, the role handle also provides a method add_stream that moves a Stream of FuncootData values into the actor and produces an AbortHandle. (xactor also does something similar, but its version goes through the Context which is only accessible during event handlers, which has complicated my code unnecessarily)

  • I have no current plans for global services or anything similar.
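
To make the list above more concrete, here is a hand-written approximation of what the macro might expand to for a single role. It is a sketch under several assumptions: std threads and std::sync::mpsc stand in for an async runtime and a oneshot channel, the fields of Baudelaire and Fortune are invented, start and funcoot_demo are hypothetical names, and enqueue takes FuncootData directly, because a generic impl Into<FuncootData> parameter would stop the trait from being usable as dyn Funcoot.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Arc;
use std::thread;

// Placeholder payload types; their fields are invented for this sketch.
#[derive(Debug, PartialEq)]
struct Fortune(u64);
struct Baudelaire {
    inheritance: u64,
}

/// The actor state, as written by the user.
struct Olaf {
    alive: bool,
}

impl Olaf {
    /// The user's handler body; `&mut self` is `&mut Olaf`.
    fn marry(&mut self, child: Baudelaire) -> Option<Fortune> {
        if self.alive { Some(Fortune(child.inheritance)) } else { None }
    }
}

/// Generated message enum: a variant whose handler returns a value
/// also carries a channel for the reply.
enum FuncootData {
    Marry(Baudelaire, Sender<Option<Fortune>>),
}

/// Generated role trait: anything that can queue `FuncootData`.
trait Funcoot {
    fn enqueue(&self, msg: FuncootData);
}

/// Glue defined once on the role, so it works for every implementor.
impl dyn Funcoot {
    /// Queues a `Marry` message; the receiver acts as a promise.
    fn marry(&self, child: Baudelaire) -> Receiver<Option<Fortune>> {
        let (reply_tx, reply_rx) = channel();
        self.enqueue(FuncootData::Marry(child, reply_tx));
        reply_rx
    }
}

/// Generated handle type: one sender per role.
struct OlafActor {
    funcoot: Sender<FuncootData>,
}

impl Funcoot for OlafActor {
    fn enqueue(&self, msg: FuncootData) {
        // If the actor has stopped, the message (and its reply sender) is
        // dropped, so the caller's `recv` returns an error.
        let _ = self.funcoot.send(msg);
    }
}

/// Generated start function: moves the state into its own thread
/// (standing in for a task) and returns the handle.
fn start(mut olaf: Olaf) -> Arc<OlafActor> {
    let (tx, rx) = channel();
    thread::spawn(move || {
        // Event loop: owns `olaf` and lends it to each handler in turn.
        for msg in rx {
            match msg {
                FuncootData::Marry(child, reply) => {
                    let _ = reply.send(olaf.marry(child));
                }
            }
        }
    });
    Arc::new(OlafActor { funcoot: tx })
}

fn funcoot_demo() -> Option<Fortune> {
    let olaf: Arc<OlafActor> = start(Olaf { alive: true });
    let role: Arc<dyn Funcoot> = olaf; // ordinary unsized coercion
    role.marry(Baudelaire { inheritance: 3 }).recv().unwrap()
}
```

A caller that ignores the returned receiver gets fire-and-forget behaviour from the same handle, which is the unification of Caller and Sender described above.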

Thanks for reading this far, I'd really appreciate hearing any comments you have about this design, particularly whether it would fit any of your projects, any important features you think might be missing and whether a syntax like this is a good idea or whether it should look more like conventional Rust.

I like this idea and would probably consider using this project. A couple notes:

  • I like that this macro fills in boilerplate, but for some reason I don't trust DSLs in macros as much as I trust derives / attribute macros. Maybe this is because when things go wrong in DSLs the diagnostics are poor, and DSLs-in-macros have poor support in rust-analyzer.
  • It's often important in actor systems to be able to have replies sent to your mailbox, instead of blocking your receive loop while a message handler waits for a reply. This is one easy way to resolve deadlocks.

This is a fair point, and I might well start off with attribute macros in early versions. I thought about a DSL primarily because the actor declaration has a lot of pieces to it, and adding new meaning and requirements on top of existing syntax might be unhelpful. For instance, the example I gave earlier might be written,

#[actor]
mod olaf { 
	struct Olaf {
		alive: bool
	}

	#[perform]		
	impl Funcoot for Olaf {
		fn marry(&mut self, child: Baudelaire) -> Option<Fortune> {
			/* ... */
		}
	}

	#[perform]
	impl Stephano for Olaf {}
}

And already things have gone a bit off the beaten path because, e.g., that module has to contain exactly one struct/enum definition, and that last impl block likely isn't a valid implementation and will not actually be output.

I imagine that actor addresses can be parameters of messages as much as any other type, and that event handlers will have some way of retrieving a copy of their own address, so an actor that wants the response to go to its mailbox can simply send a handle to itself and the recipient can use that handle to "manually" send the return message. Do you think there needs to be something more sophisticated involved, or do you mean that the framework should be handling this in the same way I talked about it handling direct-response messages?
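
A rough sketch of that "send a handle to yourself" approach, using std threads and channels in place of actors. ClientMsg, DirectoryMsg, spawn_directory and reply_to_mailbox_demo are hypothetical names:

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Messages the hypothetical client accepts in its own mailbox.
#[derive(Debug, PartialEq)]
enum ClientMsg {
    NickAssigned(String),
    Tick,
}

/// Messages for the hypothetical directory; the request carries the
/// address the answer should be delivered to.
enum DirectoryMsg {
    Register { reply_to: Sender<ClientMsg> },
}

fn spawn_directory() -> Sender<DirectoryMsg> {
    let (tx, rx) = channel();
    thread::spawn(move || {
        let mut next_id = 0u32;
        for msg in rx {
            match msg {
                DirectoryMsg::Register { reply_to } => {
                    next_id += 1;
                    // The reply is just another message in the client's mailbox.
                    let _ = reply_to.send(ClientMsg::NickAssigned(format!("guest{}", next_id)));
                }
            }
        }
    });
    tx
}

fn reply_to_mailbox_demo() -> Vec<ClientMsg> {
    let directory = spawn_directory();
    let (own_addr, mailbox) = channel();

    // The client sends the request along with a copy of its own address
    // and carries on; it never blocks waiting for the answer.
    directory
        .send(DirectoryMsg::Register { reply_to: own_addr.clone() })
        .unwrap();
    own_addr.send(ClientMsg::Tick).unwrap();

    // The reply and unrelated messages are interleaved in arrival order.
    mailbox.iter().take(2).collect()
}
```

The order in which Tick and NickAssigned arrive is not fixed, which is the cost of this approach: the client has to recognise the reply among its other messages.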

Right, that's another way of approximating the same thing. What I'm thinking of specifically is Erlang's convenient gen_server:send_request/gen_server:check_response which also handles correlating requests to responses using a unique id (a reference()).

I guess I just think that 1) deadlock from interdependent calls is such a footgun in actor systems, and 2) these kinds of "asynchronous/non-blocking calls" are needed frequently enough that they might warrant more convenience than manual demuxing.
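
For readers following along in Rust, here is a rough sketch of correlating replies with requests through a unique id. It borrows the names send_request and check_response for illustration only and is not a translation of the Erlang functions; Request, Reply, Pending and spawn_service are hypothetical:

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;

type RequestId = u64;

/// Request to a hypothetical squaring service; replies go to `reply_to`.
struct Request {
    id: RequestId,
    input: u64,
    reply_to: Sender<Reply>,
}

/// Reply delivered to the requester's mailbox, tagged with the request id.
struct Reply {
    id: RequestId,
    output: u64,
}

fn spawn_service() -> Sender<Request> {
    let (tx, rx) = channel::<Request>();
    thread::spawn(move || {
        for req in rx {
            let reply = Reply { id: req.id, output: req.input * req.input };
            let _ = req.reply_to.send(reply);
        }
    });
    tx
}

/// Requester-side bookkeeping: allocates ids and remembers what each was for.
struct Pending {
    next_id: RequestId,
    waiting: HashMap<RequestId, &'static str>,
}

impl Pending {
    /// Sends a request without waiting and returns its id.
    fn send_request(
        &mut self,
        service: &Sender<Request>,
        own_addr: &Sender<Reply>,
        label: &'static str,
        input: u64,
    ) -> RequestId {
        let id = self.next_id;
        self.next_id += 1;
        self.waiting.insert(id, label);
        let _ = service.send(Request { id, input, reply_to: own_addr.clone() });
        id
    }

    /// Matches an incoming reply against an outstanding request, if any.
    fn check_response(&mut self, reply: &Reply) -> Option<&'static str> {
        self.waiting.remove(&reply.id)
    }
}

fn correlation_demo() -> Vec<(&'static str, u64)> {
    let service = spawn_service();
    let (own_addr, mailbox) = channel();
    let mut pending = Pending { next_id: 0, waiting: HashMap::new() };
    pending.send_request(&service, &own_addr, "width", 3);
    pending.send_request(&service, &own_addr, "height", 4);

    let mut results = Vec::new();
    while !pending.waiting.is_empty() {
        let reply = mailbox.recv().unwrap();
        if let Some(label) = pending.check_response(&reply) {
            results.push((label, reply.output));
        }
    }
    results
}
```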

I don't quite understand enough Erlang to see how you'd actually use those functions in practice, but I think I can see the benefit of "give me a response to this but later" as a first-class notion. I'm less sure about what it'd look like to use though, and how much control you'd want over correlating responses to the requests they came from. Would you be able to give an example?
