What's the best way of creating a simple pool of worker threads?


#1

So I've noticed Rust has "multiple producer, single consumer". But what about "multiple consumer, single producer"?

Or how would I send a task to a channel and have one of the threads pick it up?
I was thinking of having, say, 4 threads (or a number tied to how many cores you have) that just pick work off the queue; if they're all busy, the queue fills up until one of them is free to take the next item.
I'd like to see a simple implementation that isn't just using threadpool, as I've been struggling to come up with a working example.


#2

Taking a quick glance at threadpool's source, it appears to be using an Arc<Mutex<mpsc::Receiver>>: basically converting the "multiple producer, single consumer" queue into a "multiple producer, multiple consumer" queue by locking on the consumer side.

A simple implementation would probably be just the same as threadpool minus

  • being able to mutate the number of threads while it’s running.
  • the Sentinel for running new threads when a thread dies from a panic (in fact, it seems that threadpool could drop this by switching to std::panic::catch_unwind)

Even with those features, threadpool is not very complicated: 368 lines, with around half of those being documentation.
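
Here's a minimal sketch of that pattern (not threadpool's actual code, just the same idea): one mpsc channel whose Receiver is shared behind an Arc<Mutex<...>>, with each worker locking it only long enough to pull one job off the queue. The catch_unwind line also shows how the panic case above could be handled without a Sentinel.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// A job is just a boxed closure the workers can run.
type Job = Box<dyn FnOnce() + Send + 'static>;

fn main() {
    let (tx, rx) = mpsc::channel::<Job>();
    // Share the single Receiver between all workers behind a Mutex,
    // turning the mpsc channel into an mpmc queue.
    let rx = Arc::new(Mutex::new(rx));

    let workers: Vec<_> = (0..4)
        .map(|id| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                // Lock only long enough to take one job off the queue;
                // the guard is dropped before the job runs.
                let job = rx.lock().unwrap().recv();
                match job {
                    Ok(job) => {
                        println!("worker {} picked up a job", id);
                        // A panicking job won't kill the worker thread.
                        let _ = catch_unwind(AssertUnwindSafe(job));
                    }
                    // All Senders dropped: no more work, shut down.
                    Err(_) => break,
                }
            })
        })
        .collect();

    for i in 0..8 {
        tx.send(Box::new(move || println!("task {} done", i))).unwrap();
    }
    drop(tx); // close the channel so the workers exit

    for w in workers {
        w.join().unwrap();
    }
}
```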


#3

Note that distributing work to many worker threads may be harder than you think, as there are many ways to do this that won’t scale to mildly stressful scenarios (e.g. your average 32-core dual-socket server processing a stream of short tasks originating from a 10 GBps network connection) due to either overload in the scheduling thread or contention on the work queue.

Even mature projects like GCC’s OpenMP implementation (libgomp) get this wrong by using non-scalable design patterns like a shared work queue synchronized through locking (instead of, for example, multiple SPSC work queues with a work stealing load balancing mechanism).
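
Not something you'd need for a first pool, but for reference, the "per-worker queues plus work stealing" shape can be sketched with the crossbeam-deque crate. This closely follows the scheduling example from that crate's documentation: each worker drains its own queue first, then pulls a batch from a global injector, and only then tries to steal from its peers.

```rust
use std::iter;

use crossbeam_deque::{Injector, Stealer, Worker};

// Find the next task for a worker, preferring its own local queue.
fn find_task<T>(
    local: &Worker<T>,
    global: &Injector<T>,
    stealers: &[Stealer<T>],
) -> Option<T> {
    local.pop().or_else(|| {
        iter::repeat_with(|| {
            // Try stealing a batch of tasks from the global queue...
            global
                .steal_batch_and_pop(local)
                // ...or a single task from one of the other workers.
                .or_else(|| stealers.iter().map(|s| s.steal()).collect())
        })
        // Retry while any steal operation reported Steal::Retry.
        .find(|s| !s.is_retry())
        .and_then(|s| s.success())
    })
}
```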

If this project becomes more serious at some point, you may want to spend some time studying how existing tasking libraries do it, both in a Rust context (threadpool, Rayon…) and elsewhere (TBB, HPX, StarPU…), and considering whether this is really something that you want to implement yourself.


#4

Thanks both, this isn't really for a project per se, just more to educate myself on how to do it in Rust. I've come from Go, which lets you do this both ways.
I think in reality I would just use threadpool.

What is the reason Rust only does mpsc and not both?


#5

I don't know why the stdlib doesn't have an spmc variant. In general, the concurrency and parallelism primitives in the stdlib are currently fairly bare-bones, intentionally so I believe (there are crates that are used for incubating ideas, such as the already-mentioned Rayon and also crossbeam).

For an spmc scenario, a basic implementation would be to take crossbeam's MsQueue (which offers a blocking pop) and then frame it with a Sender and Receiver, but flip the Send and Clone traits around: Sender is Send, and Receiver is Clone + Send.
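
Very roughly, that framing could look like the sketch below. Since MsQueue's API has moved around between crossbeam versions, this stand-in uses a Mutex + Condvar queue from std for the blocking behaviour, so the type names here are illustrative rather than crossbeam's real API.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Condvar, Mutex};

// Shared blocking queue, standing in for something like MsQueue.
struct Shared<T> {
    queue: Mutex<VecDeque<T>>,
    available: Condvar,
}

// Sender is Send (it moves to the one producer thread) but deliberately not Clone.
pub struct Sender<T> {
    shared: Arc<Shared<T>>,
}

// Receiver is Clone + Send, so every worker thread can hold its own handle.
pub struct Receiver<T> {
    shared: Arc<Shared<T>>,
}

impl<T> Clone for Receiver<T> {
    fn clone(&self) -> Self {
        Receiver { shared: Arc::clone(&self.shared) }
    }
}

pub fn channel<T>() -> (Sender<T>, Receiver<T>) {
    let shared = Arc::new(Shared {
        queue: Mutex::new(VecDeque::new()),
        available: Condvar::new(),
    });
    (Sender { shared: Arc::clone(&shared) }, Receiver { shared })
}

impl<T> Sender<T> {
    pub fn send(&self, value: T) {
        self.shared.queue.lock().unwrap().push_back(value);
        // Wake one waiting worker.
        self.shared.available.notify_one();
    }
}

impl<T> Receiver<T> {
    // Blocking receive: waits until a value is available.
    pub fn recv(&self) -> T {
        let mut queue = self.shared.queue.lock().unwrap();
        loop {
            if let Some(value) = queue.pop_front() {
                return value;
            }
            queue = self.shared.available.wait(queue).unwrap();
        }
    }
}
```

Because only Receiver implements Clone, you hand one clone to each worker thread while the single Sender stays with the producer, which is exactly the trait flip described above.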

You can also wrap one of the threadpool APIs in a similar manner.

As @HadrienG mentioned, a mesh of spsc queues with a mostly shared-nothing design tends to scale best, since contention and cross-core communication are minimized. I know you were just asking for educational purposes, but thought I'd throw my vote behind that nonetheless :smile:.