I'm a college kid learning Rust and diving into implementing high-performance math.
I tried implementing Gaussian elimination in both Rust and C++. The Rust code is ~4x slower on my computer (using -C target-cpu=native for Rust and -march=native for C++).
(With par_azip, as shown in the gist, it beats the C++, but that's using 16 threads vs. 1.)
I'm mostly curious whether there is something I could be doing to make it autovectorize the way GCC is able to for the C++.
80 ms for parallel Rust (Ryzen 5800X)
400 ms for serial Rust
100 ms for C++ (not parallel)
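
To make the question concrete, here is a minimal sketch of the kind of serial kernel being compared. It is not the code from the gist: it assumes a flat row-major n x n matrix and no pivoting, and the name forward_eliminate is hypothetical. The inner update is written over two zipped slices of equal length, a form that usually lets the compiler drop bounds checks, which tends to be a prerequisite for autovectorization.

```rust
/// Forward elimination without pivoting on an n x n row-major matrix
/// stored in a flat slice. Hypothetical helper, not the code from the gist.
/// Assumes every pivot is nonzero.
pub fn forward_eliminate(a: &mut [f64], n: usize) {
    assert_eq!(a.len(), n * n);
    for k in 0..n {
        // Split so the pivot row (read-only) and the rows below it
        // (mutable) can be borrowed at the same time.
        let (upper, lower) = a.split_at_mut((k + 1) * n);
        let pivot_row = &upper[k * n..(k + 1) * n];
        let pivot = pivot_row[k];
        for row in lower.chunks_exact_mut(n) {
            let factor = row[k] / pivot;
            // Element-wise update over two equal-length slices:
            // no indexing inside the hot loop.
            for (x, &p) in row[k..].iter_mut().zip(&pivot_row[k..]) {
                *x -= factor * p;
            }
        }
    }
}
```

Calling forward_eliminate(&mut a, n) on a Vec<f64> of length n * n reduces it to upper-triangular form in place. Whether this actually vectorizes depends on the compiler version and flags, so it is worth checking the generated assembly rather than assuming.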
Project setup for running everything. I don't use an external tool for benchmarking; I just use std::chrono in C++ and tempus_fugit in Rust.
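
For anyone reproducing the numbers without tempus_fugit, a rough standard-library equivalent might look like the sketch below. The helper name best_of is hypothetical and this is not the project's actual harness. It takes the fastest of several runs to reduce noise and passes the result through std::hint::black_box so the optimizer is less likely to discard the work being timed.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` `runs` times and returns the fastest wall-clock duration.
/// Hypothetical helper; stands in for the tempus_fugit macro used in the project.
/// Returns Duration::MAX if `runs` is 0.
pub fn best_of<T>(runs: usize, mut f: impl FnMut() -> T) -> Duration {
    let mut best = Duration::MAX;
    for _ in 0..runs {
        let start = Instant::now();
        // black_box discourages the optimizer from removing the computation.
        black_box(f());
        best = best.min(start.elapsed());
    }
    best
}
```

A call such as best_of(10, || run_kernel()) would time a closure ten times, where run_kernel is a placeholder for whatever is being measured. Build with --release, otherwise the timings say little about optimized code.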