Limiting the number of concurrent LTO jobs without lowering compilation parallelism?

I have a workspace with many binaries, but I don't have enough RAM to run more than one LTO job at once. Linking with LTO tries to link several of them at once, runs out of memory, and everything grinds to a halt until the OS kills ld and cargo.

OTOH if I run cargo build -j1, it takes forever to compile all the dependencies one by one.

Is there a way to keep full parallelism for building crates, but force LTO linking steps to be done one by one?
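(For context, I'm assuming the usual kind of profile here; the exact settings below are illustrative, not copied from my workspace:)

```toml
# Cargo.toml -- assumed setup, shown only for illustration
[profile.release]
lto = true        # "fat" cross-crate LTO; this is the memory-hungry link step
codegen-units = 1 # illustrative; makes each LTO link even more expensive
```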


I don't think there is a way to do that. The jobserver protocol doesn't provide a way for a process to request all job tokens, which would be necessary to prevent any other build process from running at the same time; you can only ask for a specific number of tokens. If you want to limit the LTO step of a single process to a single thread while still allowing other processes to run, -Zno-parallel-llvm disables all parallelism within the codegen backend of a single rustc process.
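Since it's a -Z flag, that requires nightly; it would be passed along these lines (and note it may slow down codegen overall, since it applies to every rustc invocation in the build):

```sh
RUSTFLAGS="-Zno-parallel-llvm" cargo +nightly build --release
```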

Purely for my curiosity (since I'm not currently volunteering to implement anything to help), when you say "jobserver protocol", do you mean the GNU make jobserver protocol?

If so, would this problem be mitigated if cargo, as the root jobserver, measured available memory and provided an equivalent of GNU make's --max-load option, requiring a certain minimum amount of available memory before allowing a process to claim a token, unless there are no child processes running?

There's a lot of detail that would have to be considered. The obvious problem I can see in my design is that processes don't claim all the memory they're going to use immediately on startup, so we'd need to consider the time since the last token was issued as well as available memory to keep parallelism high without exhausting memory.
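A minimal sketch of what such a gate might look like, assuming a Linux-only MemAvailable check; none of these names or thresholds exist in cargo or the jobserver crate, they're purely illustrative:

```rust
use std::fs;

/// Available memory in bytes, if it can be determined (Linux only).
fn available_memory_bytes() -> Option<u64> {
    let meminfo = fs::read_to_string("/proc/meminfo").ok()?;
    // The relevant line looks like "MemAvailable:   12345678 kB".
    let line = meminfo
        .lines()
        .find_map(|l| l.strip_prefix("MemAvailable:"))?;
    let kib: u64 = line.trim().trim_end_matches("kB").trim().parse().ok()?;
    Some(kib * 1024)
}

/// Would the root jobserver hand out another token right now?
/// `min_free_bytes` plays the role of a --max-load style threshold.
fn may_issue_token(running_children: usize, min_free_bytes: u64) -> bool {
    if running_children == 0 {
        // Never stall completely: one job is always allowed to run.
        return true;
    }
    available_memory_bytes().map_or(true, |free| free >= min_free_bytes)
}
```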

See Limiting the parallelism automatically · Issue #12912 · rust-lang/cargo · GitHub

Basically, there are two issues:

  • The jobserver manages CPU consumption, not memory; those are distinct resources.
  • Oversubscribed CPUs are non-fatal, OOM is. So the former is much more forgiving about transient excesses.

If we want to manage memory consumption, we need a new mechanism. Or, if that's too difficult, treat the whole thing as fallible and add some recovery.
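As a very rough illustration of the "fallible plus recovery" option, something like this could wrap each link job; this is entirely hypothetical, not something cargo does today:

```rust
use std::io;
use std::process::Command;
use std::{thread, time::Duration};

/// Run one link job; if the child is killed by a signal (which is what an
/// OOM kill looks like from the parent's perspective), back off and retry
/// instead of failing the whole build.
fn run_link_job_with_retry(mut make_cmd: impl FnMut() -> Command) -> io::Result<()> {
    loop {
        let status = make_cmd().status()?;
        if status.success() {
            return Ok(());
        }
        if status.code().is_none() {
            // Killed by a signal: wait for memory pressure to ease,
            // then retry this job on its own.
            thread::sleep(Duration::from_secs(10));
            continue;
        }
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("link job failed: {status}"),
        ));
    }
}
```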

That's not a thing. The jobserver is distributed, like a single shared semaphore that all processes access through the same API. The processes don't negotiate individually with a central broker over whether they can have a token.
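A minimal sketch of that shared-semaphore model, using the jobserver crate that cargo and rustc build on; the token count and the job function are placeholders:

```rust
use jobserver::Client;

fn do_one_compilation_job() { /* placeholder for real work */ }

fn main() -> std::io::Result<()> {
    // The root process creates the semaphore with N tokens...
    let client = Client::new(8)?;

    // ...and each participant simply blocks until a token is free, then
    // releases it when the guard is dropped. No central broker decides
    // per-request whether a particular process may have a token.
    let _token = client.acquire()?;
    do_one_compilation_job();
    Ok(())
}
```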


Is Cargo not aware when the linker is invoked? cargo build --verbose shows the invocation. Is that coming from rustc?

I thought it could be entirely independent and unrelated to jobservers. First Cargo could build the dependencies and all the lib targets, and then serialize the rest of the build graph.


Rustc invokes the linker; Cargo doesn't know anything about this. I don't see the linker invocation with cargo build -v. Also, in rustc, LTO is not done by the linker but by rustc itself (unless -Clinker-plugin-lto is used).
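For completeness, linker-plugin LTO is opted into with something along these lines (per the rustc book; the specific linker choice here is just an example):

```sh
# Defers LTO to an LLVM-based linker instead of having rustc perform it.
RUSTFLAGS="-Clinker-plugin-lto -Clinker=clang -Clink-arg=-fuse-ld=lld" \
    cargo build --release
```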
