I found that my matrix multiplication code is optimized to use the SSE instruction set by default. That's not what I want. How do I disable autovectorization at compile time?
There are some codegen flags you can set to disable (parts of) autovectorization.
However, note that on x86-64 platforms, you can't disable the use of SSE; this is not just the vector ISA, it's also the ISA for all floating-point operations. You will thus see SSE and SSE2 instructions for any floating point usage.
Is there something particular that you're trying to do, like target a particular old CPU?
I believe you can set a target CPU with flags to the compiler.
It looks like floating-point operations use SSE and SSE2 instructions by default on modern x86-64 platforms, as you mentioned. I had mistakenly assumed that floating-point matrix multiplication would use x87 FMUL instructions. I looked in the Intel SDM but couldn't find anything useful about this. Are there any docs or resources I can reference?
It's something that gets driven by your platform ABI; for example, the Linux ABI for x86-64 says that floating-point parameters and results are passed in and out of functions in SSE registers. Similarly, the Windows x64 calling convention puts FP values in SSE registers.
On top of this, the x87 FPU is unchanged since the last 32-bit processors, while all x86-64 platforms have double the SSE registers compared to 32-bit mode and are required to support SSE and SSE2. Because SSE2 is much easier to target than x87 if you're designing a compiler in the 1990s or later, most compilers stick to SSE2 for FP math.
Thanks for your advice, I will give it a try. So what you mean is that when targeting an old CPU without SSE/SSE2 support, the compiler won't use SSE instructions?
If I didn't follow what you meant, feel free to elaborate.
It helps a lot!