I'm encountering a weird situation. I have a very large module A with a lot of code. To make it cleaner, I moved some code from module A into its submodules A1, A2 and A3. However, this change degraded the performance of the code by 2-3%. Worse, the benchmarks of some functions in module B (a sibling of A) regressed by 10%. I compared the two branches multiple times and confirmed that no other changes were introduced. This does not make any sense to me. Has anyone run into a similar situation before?
Rust 1.63.0.
Benchmarked using criterion + criterion-perf-events, which counts CPU cycles instead of wall/CPU time. It gives stable results and has been reporting reasonable numbers so far.
I doubt this is "weird", rather it is to be expected. If you change anything, then you can expect significant changes in performance. For example, it may be that the code now has a different layout in memory and doesn't fit in some memory cache any more.
A quick sanity check ... we are discussing a release build. Correct?
I suspect adjusting codegen-units will help. Splitting your code across more modules, combined with a codegen-units value greater than 1, means the optimizer has less visibility across the pieces.
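For reference, here is a minimal sketch of that setting in Cargo.toml. It assumes the benchmarks are built with the default bench profile, which inherits its settings from the release profile. It is an experiment to try, not a guaranteed fix, and it will likely make compile times longer.

```toml
# Cargo.toml
# Compile each crate as a single codegen unit so the optimizer sees
# the whole crate at once (the release default is 16 units).
[profile.release]
codegen-units = 1
```

If the regression disappears with this setting, that points at how the code was partitioned across codegen units rather than at the moved code itself.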
If you've got 40 minutes, here's a good talk on code layout and related things, which Hyeonu mentioned as the likely culprit:
If you don't have 40 minutes, the short of it is that changing A can change B in ways that significantly affect its performance even if they're supposedly unrelated.