Farming out asynchronous work chunks from an Actix Actor

#1

I am exploring the actix framework, and I wanted to know what the best approach would be to have an actor “farm” out work to a bunch of asynchronous tasks, and then combine all the results into a single result.

I know that the actor that is doing the distributing can be an ActorFuture, but I am not sure how to then distribute the work. The work being distributed can be anywhere from one chunk to many. Is it a good approach to create a bunch of child actors (one for each chunk) to receive the work, and then stop them once they return their answer? Is it efficient to constantly create and stop actors like this? If not, what would be another approach?
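To make the shape of the problem concrete, here is a minimal sketch of the fan-out/fan-in pattern using only the standard library. It is not Actix code: threads stand in for the short-lived child actors, a channel stands in for their reply messages, and `process_chunk` and `farm_out` are hypothetical names invented for this illustration.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical work function: here it just sums one chunk.
fn process_chunk(chunk: &[u64]) -> u64 {
    chunk.iter().sum()
}

/// Fans `chunks` out to one worker thread each, then combines the
/// partial results into a single value.
fn farm_out(chunks: Vec<Vec<u64>>) -> u64 {
    let (tx, rx) = mpsc::channel();
    for chunk in chunks {
        let tx = tx.clone();
        thread::spawn(move || {
            // Each worker sends back its partial result and then exits,
            // much like a child actor that stops after replying.
            tx.send(process_chunk(&chunk)).expect("receiver dropped");
        });
    }
    // Drop the original sender so the receiver's iterator ends once
    // every worker has finished and dropped its clone.
    drop(tx);
    rx.iter().sum()
}
```

Calling `farm_out(vec![vec![1, 2, 3], vec![4, 5], vec![6]])` returns 21. The combining step does not care how many workers there were, which is the property the question is after for "1 chunk to many".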

Bonus round: The work chunks actually might have a DAG like structure, where some depend on each other, so it would be good if they can communicate amongst themselves as well.
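As an illustration of the DAG case, the sketch below again uses only the standard library rather than Actix. It assumes a tiny hypothetical graph in which tasks `a` and `b` are independent and task `c` depends on both; the dependency edges are expressed by handing the join handles of `a` and `b` to `c`.

```rust
use std::thread;

/// Runs a tiny hypothetical DAG: `a` and `b` are independent,
/// while `c` needs the outputs of both.
fn run_dag() -> u32 {
    let a = thread::spawn(|| 2u32);
    let b = thread::spawn(|| 3u32);
    // `c` takes ownership of its dependencies' handles and waits on them
    // before doing its own work.
    let c = thread::spawn(move || {
        let a_out = a.join().expect("task a panicked");
        let b_out = b.join().expect("task b panicked");
        a_out * b_out
    });
    c.join().expect("task c panicked")
}
```

With actors, the same edges would presumably be messages from `a` and `b` to `c` rather than join handles, but the ordering constraint is the same.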

Thank you for the help.


#2

I’ve made a crate called desync that provides a (semi?) novel approach to this kind of task. It differs from the usual threaded paradigm in that it provides a way to schedule tasks on data.

In FlowBetween, I’ve relied heavily on the pipe() function and the basic stream functions to process and combine results for this kind of scheduling.

There’s also a fairly straightforward way to use the basic scheduling functions (sync(), desync(), after() and future()) to achieve the same thing in a way that makes it clear what’s going on. Say you have a set of data you want to operate on asynchronously, defined like this:

let data1: Arc<Desync<Data1>> = ...;
let data2: Arc<Desync<Data2>> = ...;
...

You can schedule some processing on them like this:

let future1 = data1.future(|data1| data1.get_result());
let future2 = data2.future(|data2| data2.get_result());
...

and also re-use results from one operation with another data object:

let future3 = data3.after(future2, |data3, future2_result| 
    data3.get_result(future2_result.unwrap()));

The sync() and desync() calls are also good for passing data around and can produce more straightforward code than using futures everywhere. Also note that desync has strict ordering semantics, so after() will wait for future2 to complete before any other operation can be performed on data3 here. I generally use pipe() when I want to treat the completion of the future as a fully asynchronous event. It’s also possible to use another Desync as a scheduler:

let scheduler = Desync::new(());
let future3 = scheduler.after(future2, move |_, future2_result|
    data3.sync(move |data3| data3.get_result(future2_result.unwrap())));
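For readers unfamiliar with the idea of scheduling tasks on data, the strict ordering mentioned above can be pictured with a small standard-library sketch. This is not how desync is implemented; `Serial`, `run_async` and `run_sync` are hypothetical names that only mimic the behaviour described here: one queue per piece of data, with jobs run in the order they were submitted.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread;

type Job<T> = Box<dyn FnOnce(&mut T) + Send>;

/// Hypothetical stand-in for a data-owning job queue
/// (illustration only, not desync's real implementation).
struct Serial<T> {
    jobs: Sender<Job<T>>,
}

impl<T: Send + 'static> Serial<T> {
    fn new(mut data: T) -> Self {
        let (jobs, queue) = mpsc::channel::<Job<T>>();
        // A single worker owns the data and runs jobs strictly in
        // submission order, so no lock is needed around `data`.
        thread::spawn(move || {
            for job in queue {
                job(&mut data);
            }
        });
        Serial { jobs }
    }

    /// Queues a job and returns immediately.
    fn run_async(&self, job: impl FnOnce(&mut T) + Send + 'static) {
        self.jobs.send(Box::new(job)).expect("worker stopped");
    }

    /// Queues a job and blocks until it, and everything queued
    /// before it, has run.
    fn run_sync<R: Send + 'static>(
        &self,
        job: impl FnOnce(&mut T) -> R + Send + 'static,
    ) -> R {
        let (tx, rx) = mpsc::channel();
        self.run_async(move |data| {
            let _ = tx.send(job(data));
        });
        rx.recv().expect("worker stopped")
    }
}
```

Because every job goes through the same queue, a `run_sync` call issued after two `run_async` calls always observes the effects of both, which is the guarantee that makes the after() example above safe to reason about.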

Finally, you can use select() to combine the results:

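let combined_result = future1.select(future3);

Note that select() resolves as soon as either future finishes; if you need both results, the join combinator from the futures crate waits for both instead.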