I noticed that if I initialize a variable with a constant input inside a loop, the compiler will automatically hoist it out of the loop to avoid the unnecessary cost. However, if the initialization requires an external source, such as I/O, the compiler correctly does not hoist it.
This godbolt illustrates what I mean: https://godbolt.org/z/M8n6hz3q4. Notice that if it's just creating an OsStr from a string literal, the assemblies are exactly the same, but not if it's reading a runtime environment variable.
Is this behavior guaranteed? Where can I learn more about this? Thanks!
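For reference, here's a minimal sketch of the two cases (my own reconstruction, not the exact godbolt source):

```rust
use std::ffi::OsStr;

// Loop-invariant case: OsStr::new on a string literal is a pure, constant
// computation, so the compiler may hoist it out of the loop.
fn constant_init() -> usize {
    let mut total = 0;
    for _ in 0..1000 {
        let s = OsStr::new("hello");
        total += s.len();
    }
    total
}

// I/O case: the environment is mutable global state, so the compiler must
// assume the result can change between iterations and keeps the call inside.
fn env_init() -> usize {
    let mut total = 0;
    for _ in 0..1000 {
        let s = std::env::var_os("PATH");
        total += s.map_or(0, |v| v.len());
    }
    total
}

fn main() {
    assert_eq!(constant_init(), 5 * 1000); // "hello".len() == 5
    println!("env_init = {}", env_init());
}
```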
Compiler optimizations are generally not guaranteed to happen. If the semantics of your code depend on such behavior, it's likely a bug.
This behavior exists even if I don't specify "-O", so I thought it was some sort of guaranteed feature (like the famous zero-cost abstractions). If not, I can easily hoist the variable myself, not a problem.
Some optimizations happen even without -O. Although it's not technically guaranteed, I don't think you should bother manually performing such micro-optimizations upfront. Most of the time they just don't matter, and you can always benchmark and see if they start to, once you've determined that your code isn't fast enough.
Note that manually hoisting a variable initialization out of the loop isn't just busywork micro-optimization; it also isn't guaranteed to stay that way in the generated code. The compiler is free to move your initialization back into the loop if it thinks that helps (although I find it hard to imagine where that could be beneficial). The compiler can arbitrarily move your code around, duplicate it, or remove it entirely, as long as it can prove that this doesn't affect the observable behaviour of your program.
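To illustrate, here's a sketch (my own example): both functions below have identical observable behaviour, so the optimizer is free to compile them to identical machine code, regardless of where the initialization is written in the source.

```rust
// The limit is a pure computation, so where it is written doesn't constrain
// the optimizer; it only has to preserve observable behaviour.
fn hoisted(data: &[u32]) -> u32 {
    let limit = 10_u32.pow(3); // initialized once, before the loop
    let mut total = 0;
    for &x in data {
        if x < limit {
            total += x;
        }
    }
    total
}

fn not_hoisted(data: &[u32]) -> u32 {
    let mut total = 0;
    for &x in data {
        let limit = 10_u32.pow(3); // "re-initialized" every iteration
        if x < limit {
            total += x;
        }
    }
    total
}

fn main() {
    let data = [1, 500, 999, 1000, 4000];
    assert_eq!(hoisted(&data), not_hoisted(&data));
    assert_eq!(hoisted(&data), 1500); // 1 + 500 + 999
}
```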
Now, you can try to hint the compiler by structuring the code in a certain way, making some optimizations more likely, but it's a very counterintuitive and brittle technique. Often the compiler will transform your code in entirely unanticipated ways, and that may change based on seemingly random details, such as the compiler version, code in other modules, or even whitespace (I believe I saw a post a few weeks ago where performance changed significantly due to code layout, since it affected some parameters of the compiler's inlining heuristics).
So as a rule: write code to be correct and maintainable first, think about big performance pitfalls second (use the right data structures, minimize memory allocations), and only micro-optimize when you have real benchmarks to prove that the performance issue exists and that your changes are an improvement.
It's called rematerialization. It saves a register / stack space.
Here is an example:
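A source-level sketch of the idea (my own illustrative example; the actual transform happens during register allocation, not in source): when a value is cheap to recompute, the compiler may recompute it at each use instead of keeping it live in a register or spilling it to the stack, effectively turning the first form below into the second.

```rust
// Conceptually, rematerialization turns this...
fn kept_live(xs: &[i64]) -> i64 {
    let base = 1_i64 << 20; // value held live across the whole loop
    xs.iter().map(|&x| x + base).sum()
}

// ...into this: the cheap constant is recomputed ("rematerialized") at the
// point of use, freeing the register it would otherwise occupy.
fn rematerialized(xs: &[i64]) -> i64 {
    xs.iter().map(|&x| x + (1_i64 << 20)).sum()
}

fn main() {
    let xs = [1, 2, 3];
    assert_eq!(kept_live(&xs), rematerialized(&xs));
    assert_eq!(kept_live(&xs), 3 * (1 << 20) + 6);
}
```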