When running Profile Guided Optimization, I would presume that it becomes clear where most of the CPU time is spent. I would like to tell PGO to, for example, optimize as much code for size as possible while ensuring that at most 1% of CPU time (as measured by the profiled test run) is spent in the code optimized for size. However, I cannot find any information on whether:
This is a totally standard optimization that is enabled by default.
I think when PGO marks code paths as unlikely to be taken it should naturally optimize them for size by avoiding expensive hot-code optimizations like loop unrolling and inlining, but I don't know how this compares to explicit global opt-level = "s".
On stable Rust you have Cargo's profile overrides, which can select opt-level = "s" for individual crates (the override doesn't apply to generic or inlined code from that crate that gets compiled as part of another crate).
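As a rough sketch of such an override in Cargo.toml (the package name `rarely-used-dep` is a made-up placeholder, not a real crate):

```toml
# Everything is optimized for speed by default in release builds.
[profile.release]
opt-level = 3

# Hypothetical dependency that is assumed to be off the hot path:
# compile just this crate for size instead.
[profile.release.package.rarely-used-dep]
opt-level = "s"
```

Which crates are actually cold is something you would have to decide yourself, for example by looking at the profile, since Cargo does not pick them for you.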
There's also #[cold], which prevents inlining, uses a calling convention that is cheaper on the caller's side, and marks the blocks it's called from as unlikely to be taken. I think it also moves the function to a cold section of the executable and optimizes the function body for size, but I don't have a reliable source to confirm this (the high-level docs are non-committal about the details).
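For illustration, here is a minimal sketch of how #[cold] is typically applied to an error path. The function names are invented for this example, and the explicit #[inline(never)] is my own addition to make the intent obvious, not something #[cold] requires:

```rust
/// Hypothetical slow path: building the error message is assumed to be rare.
#[cold]
#[inline(never)]
fn report_invalid(value: i64) -> String {
    format!("invalid value: {value}")
}

/// Hot path: the branch that calls the cold function is hinted as unlikely.
fn checked_double(value: i64) -> Result<i64, String> {
    if value < 0 {
        return Err(report_invalid(value));
    }
    Ok(value * 2)
}

fn main() {
    assert_eq!(checked_double(21), Ok(42));
    assert!(checked_double(-1).is_err());
}
```

What the compiler actually does with the hint (section placement, size optimization of the body) is exactly the part I can't confirm, so inspect the generated assembly if it matters.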
Thanks. It seems that PGO has sensible defaults in this case. While fiddling with the settings might make things faster and smaller, just trusting PGO to make the right decision seems like a good place to start. I can always try overriding it when (if) I get time for micro-optimizations.