The problem
I have a CLI tool that internally uses multiple threads via rayon. But users of the tool may already parallelize their jobs by launching many instances of it (e.g. via xargs or GNU parallel). In that case I'd like to dial down thread usage within each process, so that the machine-global number of threads stays reasonable (closer to the number of CPU cores, rather than cores squared), reducing memory usage and total overhead. In my case, launching 16 single-threaded copies of my program in parallel is better than running 16-threaded instances serially, but when only 1 copy is running, I still want to maximize parallelism. I'm OK with the limit causing less-than-optimal utilization for the last few tasks/programs.
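To make the target concrete, the budget I'm after is roughly cores divided by running instances, floored at one thread. A minimal sketch of that arithmetic (the function name is hypothetical, and knowing `running_instances` is exactly the part I don't have a mechanism for):

```rust
use std::thread;

// Hypothetical helper: split the machine's cores across N cooperating
// instances, never going below one thread per process.
fn per_process_threads(running_instances: usize) -> usize {
    let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    (cores / running_instances.max(1)).max(1)
}

fn main() {
    // One instance gets the whole machine; many instances share it.
    println!("alone: {}", per_process_threads(1));
    println!("crowded: {}", per_process_threads(1_000)); // floors at 1
}
```

The resulting count would then be handed to rayon (e.g. through `ThreadPoolBuilder::num_threads` or the `RAYON_NUM_THREADS` environment variable) at startup.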
Solution
I'm not sure. I know there's the jobserver protocol, which limits coarse-grained tasks globally, but it doesn't seem suitable for sizing thread pools and the tiny tasks that rayon works with.
I guess I'd need some IPC. I'd rather not launch a global daemon to coordinate instances, since that's not expected of a simple CLI tool. Is there a portable solution? What would a unix-specific solution look like?
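One daemon-free shape I've been considering (a sketch only, all names and the directory layout are made up): each instance races to create per-core "slot" files with `O_CREAT|O_EXCL` semantics in a shared directory, and the number of slots it wins becomes its thread count. No coordinator process is needed, and slots are released when an instance exits. The obvious caveat is that a crashed process leaks its slot files until something cleans them up.

```rust
use std::fs::{self, OpenOptions};
use std::path::{Path, PathBuf};

struct Slots {
    held: Vec<PathBuf>,
}

impl Slots {
    // Try to claim up to `max` slot files; each create_new (O_CREAT|O_EXCL)
    // succeeds for exactly one process, so slots are never double-counted.
    fn acquire(dir: &Path, max: usize) -> std::io::Result<Slots> {
        fs::create_dir_all(dir)?;
        let mut held = Vec::new();
        for i in 0..max {
            let path = dir.join(format!("slot-{i}"));
            if OpenOptions::new().write(true).create_new(true).open(&path).is_ok() {
                held.push(path);
            }
        }
        Ok(Slots { held })
    }

    // Even a process that won no slots gets one thread.
    fn count(&self) -> usize {
        self.held.len().max(1)
    }
}

impl Drop for Slots {
    // Release slots on clean exit; a crash leaks them (known limitation).
    fn drop(&mut self) {
        for p in &self.held {
            let _ = fs::remove_file(p);
        }
    }
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("mytool-slots"); // hypothetical location
    let cores = std::thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let slots = Slots::acquire(&dir, cores)?;
    // This count would be passed to rayon's thread-pool configuration.
    println!("{}", slots.count());
    Ok(())
}
```

A unix-only variant could replace the slot files with a POSIX named semaphore (`sem_open`/`sem_trywait`), which the kernel cleans up more gracefully, but I don't know of a portable equivalent.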