Dramatic Increase in Compile Time with Fat LTO in Release Build: Causes and Troubleshooting?

We're working on a Rust project and have noticed a significant difference in compile times between the debug and release builds. When compiling to a debug binary, it completes in roughly 6 minutes. However, when we compile for a release build using fat LTO, the compile time dramatically increases to approximately 90 minutes. Notably, the LTO process alone accounts for more than 1 hour of this duration.
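For context, the fat LTO setting lives in the release profile in `Cargo.toml`. A sketch of that kind of profile (only `lto = "fat"` is the setting under discussion; the other line is an illustrative companion setting, not necessarily what we use):

```toml
# Cargo.toml -- illustrative release profile.
[profile.release]
lto = "fat"        # whole-program LTO: merges all crates into one LLVM module
codegen-units = 1  # often paired with fat LTO for maximum optimization
```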

What potential reasons might be causing this disparity in compile times? Are there any recommended methods or tools for troubleshooting this issue?

Fat LTO tends to have this effect. When we build Python (C) with LTO enabled on an embedded platform, it builds for several hours, as opposed to 20 minutes or so without LTO.

I believe it's in the nature of fat LTO rather than a problem to be solved.

Thank you for sharing your experience with LTO and Python (C) builds. I understand that fat LTO can naturally lead to longer compile times. However, in our other projects, the discrepancy between debug and LTO release compile times wasn't as pronounced. I'm keen to identify factors that could amplify this gap so that we might adjust our codebase to mitigate such extensive compile times.

I would suggest you experiment with using different linkers, though I'm not sure about LTO maturity with different linker alternatives out there. See Configuration - The Cargo Book
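For example, switching to `lld` on Linux might look like this in `.cargo/config.toml` (a sketch; the target triple is an assumption, and it's worth verifying how well your linker of choice handles LTO'd objects):

```toml
# .cargo/config.toml -- sketch: ask the C compiler driver to link with lld
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=lld"]
```

One caveat: with fat LTO most of the wall-clock time is spent in LLVM's optimization passes rather than in the final link step, so a faster linker may only shave off a modest amount.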

In my experience, that's just how fat LTO tends to be.

If you want most of the runtime perf benefit of fat LTO (and in some cases actually better runtime perf) with a much smaller compile-time hit, you should use ThinLTO instead. Fat LTO effectively merges all code into a single LLVM module and then optimizes and codegens it as a single unit, which means you get hit by optimizations that are quadratic in input size, and only a single CPU core can be used. ThinLTO, on the other hand, keeps all codegen units separate and optimizes them in parallel; it only has a single serial pass, where it collects and merges summaries of all codegen units to be used when optimizing the other codegen units. These summaries can, for example, contain the LLVM IR of inlining candidates and other information that helps optimizations. ThinLTO is capable of using all CPU cores for optimization, which makes it much faster.
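Concretely, that's a one-line change in the release profile in `Cargo.toml`:

```toml
[profile.release]
lto = "thin"  # ThinLTO: parallel per-CU optimization guided by merged summaries
```

It's worth measuring both compile time and runtime before and after, since the trade-off depends on the codebase.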


Compiler optimizations are computationally complex - that's why most compilers by default only apply them on a per-object basis. LTO applies optimizations across your entire codebase.
You should expect the increase in time to be superlinear. 90 minutes doesn't seem too bad to me; at least it finishes without OOMing.
