According to the rustc manual, setting codegen-units=1 disables thin-local LTO:
lto
- When -C lto is not specified:
  - codegen-units=1: disable LTO.
  - opt-level=0: disable LTO.
codegen-units
This flag controls the maximum number of code generation units the crate is split into. It takes an integer greater than 0.
When a crate is split into multiple codegen units, LLVM is able to process them in parallel. Increasing parallelism may speed up compile times, but may also produce slower code. Setting this to 1 may improve the performance of generated code, but may be slower to compile.
The default value, if not specified, is 16 for non-incremental builds. For incremental builds the default is 256 which allows caching to be more granular.
So which of the two approaches generates programs with better performance: enabling thin-local LTO (which requires more than one codegen unit), or setting codegen-units to 1?
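
For concreteness, the two builds being compared could be invoked roughly as follows. This is only a sketch: main.rs and the output names are placeholders, opt-level=3 is an assumed optimization level, and 16 is just the documented non-incremental default written out explicitly.

```sh
# Option A: several codegen units, -C lto left unspecified.
# Per the manual, rustc then attempts thin-local LTO across the crate's codegen units.
rustc -C opt-level=3 -C codegen-units=16 main.rs -o app-thin-local

# Option B: a single codegen unit, -C lto left unspecified.
# Per the manual, LTO is disabled in this case.
rustc -C opt-level=3 -C codegen-units=1 main.rs -o app-single-cgu
```

Timing both binaries on the same workload would be the direct way to compare them, since the manual only says that a single codegen unit "may" improve the performance of generated code.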