Limiting number of rayon threads globally per machine

The problem

I have a CLI tool that internally uses multiple threads via rayon. But users of the tool may already parallelize their jobs by launching many instances of it (e.g. via xargs or GNU parallel). In that case I'd like to dial down the number of threads within each process, to keep the machine-global thread count reasonable (closer to the number of CPU cores, rather than cores squared) and reduce memory usage and total overhead. In my case, launching 16 single-threaded copies of my program in parallel is better than running 16-threaded instances serially, but when only 1 copy is running, I still want to maximize parallelism. I'm OK with the limit causing less-than-optimal utilization for the last few tasks/programs.

Solution

I'm not sure. I know there's jobserver, which limits coarse-grained tasks globally, but it doesn't seem suitable for sizing thread pools or the tiny tasks that rayon works with.

I guess I'd need some IPC. I'd rather not launch a global daemon to coordinate instances, since that's not expected from a simple CLI tool. Is there a portable solution? What would a Unix-specific solution look like?

Maybe I am misunderstanding the problem:

Every x seconds, run the equivalent of ps aux | grep in Rust, count the number of running instances as N, and limit yourself to ceil(16 / N) threads?

Would it make sense for you to have one central "server" which does all the work in parallel, then the CLI tool is just a client which sends jobs to the server and waits for the results?

Are you sure that using a custom threadpool in Rayon isn't what you're looking for?

Maybe you could interact with GitHub - Nukesor/pueue: Manage your shell commands. by calling your CLI through shell scripts.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.