macOS performance issues (much slower than Linux/Windows!)

As a little follow-up to:


Why is performance (throughput) of my Rust application so much worse on macOS? :anxious_face_with_sweat:

Here are some test results:

(all values in MiB/s)


The corresponding flame graphs:

[Image: flamegraph-gnu]

[Image: flamegraph-macos]


Actually, the single-thread performance of the M2 Pro should blow away my Ryzen CPU.

At least according to Cinebench:


I understand that, on the PC (x86-64), using -Ctarget-cpu=native allows the compiler to use AES-NI, which certainly gives a big boost to my code that builds upon the AES block cipher.

Still, why is the macOS build so much slower, even in the "default" case?

Also, doesn't the Apple M2 CPU support ARM Crypto instructions that should accelerate AES computations in the -Ctarget-cpu=native case? But, apparently, on macOS, adding the -Ctarget-cpu=native option to RUSTFLAGS doesn't make any difference at all...

So, is rustc just not optimized that well for Apple Silicon, or am I missing something here?

Best regards.

Was the AES code you're using written to use those native instructions?

Yes, the Rust aes crate should support both AES-NI and ARMv8 intrinsics.

BTW: The documentation talks about an "aes_armv8" feature, but this does not seem to exist, neither as a crate feature nor as a -Ctarget-feature option in RUSTFLAGS :thinking:

Nonetheless, the graph shows that aes::armv8::expand::expand_key is used on macOS.
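A quick way to double-check what the toolchain and the CPU report is a small std-only program like the sketch below. The helper names `runtime_aes` and `compiled_with_aes` are made up for illustration and are not part of the aes crate. It only shows whether the "aes" target feature was enabled at compile time and whether the CPU reports hardware AES at runtime; it does not prove which backend the aes crate actually selects. If `compiled_with_aes` already reports true on macOS without any extra flags, that would be consistent with -Ctarget-cpu=native making no difference there.

```rust
// Sketch: report AES hardware support at compile time and at runtime.
// `runtime_aes` and `compiled_with_aes` are hypothetical helper names.

/// Runtime detection on AArch64 (e.g. Apple Silicon).
#[cfg(target_arch = "aarch64")]
fn runtime_aes() -> Option<bool> {
    Some(std::arch::is_aarch64_feature_detected!("aes"))
}

/// Runtime detection on x86 / x86-64 (AES-NI).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn runtime_aes() -> Option<bool> {
    Some(std::arch::is_x86_feature_detected!("aes"))
}

/// Other architectures: this sketch does not know how to detect AES there.
#[cfg(not(any(target_arch = "aarch64", target_arch = "x86", target_arch = "x86_64")))]
fn runtime_aes() -> Option<bool> {
    None
}

/// True if the compiler was allowed to assume AES instructions for this build
/// (via the target's defaults, -Ctarget-cpu, or -Ctarget-feature).
fn compiled_with_aes() -> bool {
    cfg!(target_feature = "aes")
}

fn main() {
    println!("target arch:               {}", std::env::consts::ARCH);
    println!("aes enabled at compile time: {}", compiled_with_aes());
    match runtime_aes() {
        Some(detected) => println!("aes detected at runtime:     {}", detected),
        None => println!("aes detected at runtime:     unknown on this architecture"),
    }
}
```

Running it once with the default flags and once with RUSTFLAGS="-Ctarget-cpu=native" on each machine should show whether the flag changes anything for the "aes" feature on that platform.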

If you want to dig into where the performance issues are, you'll need to do some instruction-level profiling. The Instruments tool in Xcode is probably the easiest way to do that on macOS.

Since I'm not an expert in assembler code nor a compiler developer, I don't even know what I would be looking for, or how this could help me to resolve the performance issue... :thinking:

The aes_armv8 feature was removed a while ago in RustCrypto/block-ciphers Pull Request #395 ("aes: enable ARMv8 backend by default; MSRV 1.72", by tarcieri), but it looks like this has not quite been released yet. You could try the latest release candidate, 0.9.0-rc.2, to see if it makes any difference.

I see!

Actually, I'm already using version 0.9.0-rc.2, because version 0.8.x has a dependency on "generic-array" (version 0.x), which is now deprecated and causes many deprecation warnings.