Best way to create a message-passing-heavy program with many more entities than available cores


#1

Hi everyone :slight_smile:,

A colleague of mine is implementing the same solution to a problem in a number of different languages in order to compare their effectiveness. This particular solution is based around synchronous message passing via channels. He found that the Rust version he wrote was by far the slowest, which was a surprise. I volunteered to take a look at it, since I’m slightly more familiar with Rust than he is, and managed to get the program running about three times faster, but apparently it is still relatively slow.

The major changes I made were to replace std::thread::spawn with Rayon’s threadpool, and to switch from std::sync::mpsc::channel to Crossbeam’s channels. I think the major issue is/was that this program was running directly on operating system-level threads, which are of course relatively heavyweight. Each thread in this program does relatively little work and will probably spend much of its time waiting to send or receive via a channel, which I think causes the whole thing to run relatively slowly.

I figured that simply reducing the number of threads in the threadpool should help, since I imagined that when something started waiting on a channel, it would yield control until the other end of the channel became available and the exchange could be scheduled. This turns out not to work: I need to ensure that the threadpool is created with at least as many threads as there are separate entities running. If I just let Rayon use the default number of threads in the pool, then the program deadlocks with entities waiting on one end of a channel. I am running this on a quad-core computer, but the smallest test run involves at least 32 separate logical entities. Crossbeam includes a standing-down mechanism so that this deadlock doesn’t really incur any actual CPU use (clearly, the OS threads are yielded to elsewhere), but no progress is made either.
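To illustrate the deadlock described above, here is a minimal sketch using only the standard library rather than Rayon and Crossbeam. It is not the actual program: the tiny pool, the `rendezvous_on_pool` name, and the 500 ms timeout (used so the demo terminates instead of hanging) are all assumptions made for illustration. A receiving entity and a sending entity are queued on a fixed pool; with a single worker, the blocked receiver occupies the only thread and the sender never gets to run.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

type Job = Box<dyn FnOnce() + Send>;

/// Queues a receiving entity and then a sending entity on a pool of
/// `workers` OS threads. Returns true if the message was delivered.
/// (Hypothetical helper, for illustration only.)
fn rendezvous_on_pool(workers: usize) -> bool {
    let (job_tx, job_rx) = mpsc::channel::<Job>();
    let job_rx = Arc::new(Mutex::new(job_rx));
    // Capacity 0 makes this a rendezvous (synchronous) channel.
    let (msg_tx, msg_rx) = mpsc::sync_channel::<u32>(0);
    let (result_tx, result_rx) = mpsc::channel::<bool>();

    // Entity 1: blocks its worker thread while waiting for a message.
    // The timeout stands in for "would wait forever" so the demo ends.
    let receiver: Job = Box::new(move || {
        let got = msg_rx.recv_timeout(Duration::from_millis(500)).is_ok();
        result_tx.send(got).unwrap();
    });
    // Entity 2: can only send once some worker is free to run it.
    let sender: Job = Box::new(move || {
        let _ = msg_tx.send(7);
    });
    job_tx.send(receiver).unwrap();
    job_tx.send(sender).unwrap();
    drop(job_tx); // lets workers exit once the queue is empty

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            thread::spawn(move || loop {
                // The lock is released at the end of this statement,
                // before the job itself runs.
                let job = job_rx.lock().unwrap().recv();
                match job {
                    Ok(job) => job(),
                    Err(_) => break,
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    result_rx.recv().unwrap()
}

fn main() {
    println!("1 worker:  delivered = {}", rendezvous_on_pool(1));
    println!("2 workers: delivered = {}", rendezvous_on_pool(2));
}
```

With one worker the receiver times out before the sender is ever scheduled; with two workers the exchange normally completes. The same effect scales up: 32 blocking entities on a 4-thread pool can leave every pool thread parked in a channel operation.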

Basically, so far as I can tell, it appears that what I’m really after is some sort of concept of lightweight/green threads. I’m aware that Rust used to have them and removed them, so of course, I looked around to see if any crates might provide what I need. All roads seemed to lead back to Tokio, but so far as I could see that is very much geared towards futures and IO-bound work, whereas what I’m looking at uses message passing and is CPU-bound. Moreover, even if the two are a good fit, I couldn’t work out how to use Tokio with Crossbeam’s channels.

I have modified the names of some parts of this program to entities from Norse mythology in order to (kinda) obfuscate details (the names weren’t chosen for consistency with the mythology, however - I am aware that the Jötnar probably wouldn’t normally be let into Asgard) and posted it on a GitHub Gist.

My questions are: Have I done this the best way possible in Rust as it currently stands? Is there some other crate I should use instead, or a different way of using the current crates? Or is this simply something that Rust isn’t really geared towards supporting at the present point in time?

Please do let me know if I need to clarify any point. Thanks in advance :smile:

EDIT: Looks like Tokio does indeed have some of what I’m looking for. I’ll have to give it a try. I would still love to hear from anyone who knows how I might use Crossbeam’s channels with Tokio’s system (or get equivalent functionality), as Crossbeam includes some extra functionality that I would love to use for something else.


#2

I think one fairly simple change would be to multiplex multiple entities over a single thread. So instead of # of threads = # of entities, you’d have one worker thread per core you want to allocate to the program, with each worker hosting roughly (# of entities) / (# of cores) entities. You’d track which entities were allocated to which worker and send messages to that worker identifying the entity the message is for. This is, essentially, poor man’s M:N threading. You may get work imbalance, but it should at least be able to make progress.
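A minimal sketch of that multiplexing idea, using only the standard library. Everything here is hypothetical and chosen for illustration: the `Msg` type, the `e % workers` placement rule, and the toy workload (a single token passed around a ring of entities). Each worker owns one inbox and the state of several entities, and every message names the entity it is for.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;

/// A message addressed to one logical entity, or a shutdown signal.
enum Msg {
    Deliver { to: usize, hops_left: u32 },
    Stop,
}

/// One OS thread hosting every entity `e` with `e % workers == id`.
/// Returns (entity id, messages handled) for the entities it owns.
fn worker(
    id: usize,
    workers: usize,
    entities: usize,
    inbox: Receiver<Msg>,
    peers: Vec<Sender<Msg>>,
) -> Vec<(usize, u32)> {
    let mut handled: Vec<(usize, u32)> = (0..entities)
        .filter(|e| e % workers == id)
        .map(|e| (e, 0))
        .collect();
    for msg in inbox {
        match msg {
            Msg::Deliver { to, hops_left } => {
                // Entity `to` sits at index `to / workers` in this worker's list.
                handled[to / workers].1 += 1;
                if hops_left == 0 {
                    // Last hop: tell every worker (including this one) to stop.
                    for p in &peers {
                        let _ = p.send(Msg::Stop);
                    }
                } else {
                    // Forward the token to the next entity via its worker's inbox.
                    let next = (to + 1) % entities;
                    peers[next % workers]
                        .send(Msg::Deliver { to: next, hops_left: hops_left - 1 })
                        .unwrap();
                }
            }
            Msg::Stop => break,
        }
    }
    handled
}

/// Passes a token around `entities` logical entities hosted on `workers`
/// threads (assumes `workers >= 1`). Returns messages handled per entity.
fn run_ring(entities: usize, workers: usize, hops: u32) -> Vec<u32> {
    let (senders, inboxes): (Vec<Sender<Msg>>, Vec<Receiver<Msg>>) =
        (0..workers).map(|_| mpsc::channel()).unzip();
    let handles: Vec<_> = inboxes
        .into_iter()
        .enumerate()
        .map(|(id, inbox)| {
            let peers = senders.clone();
            thread::spawn(move || worker(id, workers, entities, inbox, peers))
        })
        .collect();
    // Entity 0 lives on worker 0; hand it the token to start.
    senders[0].send(Msg::Deliver { to: 0, hops_left: hops }).unwrap();
    let mut counts = vec![0; entities];
    for h in handles {
        for (e, n) in h.join().unwrap() {
            counts[e] = n;
        }
    }
    counts
}

fn main() {
    let counts = run_ring(32, 4, 100);
    println!("total messages handled: {}", counts.iter().sum::<u32>());
}
```

Because an entity never blocks its thread here (it reacts to a message and returns to the inbox loop), 32 entities make progress on 4 threads. The trade-off is that entity logic has to be written as message handlers rather than as straight-line code with blocking sends and receives.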

Ultimately, you’re looking for fibers or coroutines. Tokio’s tasks kind of are that but, as you said, Tokio is geared more towards async IO. There may be some Rust crates offering this, but I don’t have any personal experience with them.

Another thing you may want to look at is Actix. It gives you an actor model, and you can likely model your entities as actors in that system. Actix will manage the threading and let you send and receive messages between actors.


#3

I did know of Actix, but had completely forgotten about it… :flushed: It turns out that Tokio could probably also do what I was looking for (see the EDIT in my post above), but I hadn’t found that part when I posted this. Thanks for your post :slight_smile: