Benchmark to compare performance of Rust with C/C++?

Dear Rustaceans,

I need to prepare a benchmarking result which compares the relative performance (running time) of Rust with C/C++. One suggestion I got was the "CoreMark" benchmark for C.
However, CoreMark seems to be a benchmark aimed at testing the performance of the CPU hardware, not the language itself.

My experience with Rust is not long (3 months), so I'd really appreciate any advice here! What would be a good target benchmark to reasonably compare the performance of Rust code with C/C++ code? Also, what properties should a benchmark have in order to highlight the weaknesses of Rust in terms of running time (compared to C/C++)?

I don't exactly understand what you want to do, but it already exists here:

That might give you some ideas.


In general, you should get about the same speed, especially for "hot" parts of the code that are hand-tuned. Like C/C++, Rust is compiled to native code and gives fine-grained control over memory usage. It uses LLVM, and is more-or-less on par with the performance-related features of C and C++.

However, it's hard to generalize a performance comparison to the whole language, because idiomatic Rust and idiomatic C have different programming styles. For example, Rust has many more high-level constructs than C, so users may accidentally write programs with many small overheads (e.g. allocating a new String instead of writing bytes to some fixed-size buffer). Rust has generics, which generate more code, but also optimize better than in C (and should be comparable to C++ templates).
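To make the String point concrete, here is a hedged sketch (the function names `label_alloc` and `label_buf` are made up for illustration): one version allocates a fresh String on every call, the other writes into a caller-provided fixed-size buffer, closer to what idiomatic C with snprintf would do.

```rust
use std::io::{Cursor, Write};

// Allocation-heavy: builds a fresh String on every call.
fn label_alloc(n: u32) -> String {
    format!("item-{n}")
}

// Allocation-free: writes into a caller-provided fixed-size buffer
// and returns the number of bytes written.
fn label_buf(n: u32, buf: &mut [u8]) -> usize {
    let mut cur = Cursor::new(buf);
    write!(cur, "item-{n}").expect("buffer too small");
    cur.position() as usize
}

fn main() {
    let s = label_alloc(7);
    let mut buf = [0u8; 16];
    let len = label_buf(7, &mut buf);
    // Both versions produce the same bytes; only the allocation differs.
    assert_eq!(s.as_bytes(), &buf[..len]);
    println!("{s}");
}
```

In a tight loop, the buffer version avoids a heap allocation per iteration, which is exactly the kind of small overhead that is easy to write accidentally in high-level Rust.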

OTOH, Rust offers much more mature and safer libraries for parallelism, so you could expect idiomatic Rust programs to take advantage of multi-core CPUs more often than in C/C++, where this is a risky proposition.
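As a hedged sketch of that point, here is a minimal parallel sum using scoped threads from the standard library (stable since Rust 1.63); `parallel_sum` is a hypothetical name, and a real program might use a crate like rayon instead. The compiler guarantees at compile time that the borrowed slice is shared without data races, which is what makes this a low-risk proposition compared to hand-rolled C threading.

```rust
use std::thread;

// Sum chunks of a slice on several threads. Scoped threads let us
// borrow `data` from the caller; the borrow checker rejects data races.
fn parallel_sum(data: &[u64], workers: usize) -> u64 {
    let chunk = ((data.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk)
            .map(|part| s.spawn(move || part.iter().sum::<u64>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}

fn main() {
    let data: Vec<u64> = (1..=1000).collect();
    // 1 + 2 + ... + 1000 = 500500, regardless of worker count.
    assert_eq!(parallel_sum(&data, 4), 500_500);
    println!("sum = {}", parallel_sum(&data, 4));
}
```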


Thank you for your answer :slight_smile: I'm in a situation where I have to verify and explain the pros and cons of using embedded Rust (compared to C/C++). I am aware of 'Benchmarksgame', but I wanted to find other benchmarks that can compare the performance of Rust and C/C++ on an embedded device. I should definitely look into the benchmark programs on benchmarksgame. Again, thanks for the response!

In this case I suggest comparing C/C++ with C/C++ :slight_smile: If you don't already use clang, compile your existing embedded C/C++ project with both clang and your usual compiler. That will give you an idea of how fast Rust code will be on exactly your device with exactly your company's code, because the Rust compiler and clang share the same backend (LLVM), which is responsible for optimization and code generation.


Wow, that is actually very interesting advice!
It sounds quite plausible (to me at least)!
Thanks a lot for your response :slight_smile:

Thanks for the response.
I am considering using Rust for a project which must be competitive in its runtime.
Can you share some insight as to the specific language features that make Rust slower on the following benchmarks: fasta, fannkuch-redux, reverse-complement, mandelbrot, regex-redux, and k-nucleotide?
That will help me understand whether Rust is a good fit for my project.

I've written up my thoughts about it here:


@kornel Thank you for sharing your insights! Your blog post is a true gem! :smiley:

The below paragraph is from the blog post:

Rust strongly prefers register-sized usize rather than 32-bit int. While Rust can use i32 just as C can use size_t, the defaults affect how the typical code is written. usize is easier to optimize on 64-bit platforms without relying on undefined behavior, but the extra bits may put more pressure on registers and memory.

Does this mean that APIs in Rust's std generally take usize parameters, while C's standard library APIs generally take int parameters?

What also came to mind at this paragraph was that Rust mandates the use of usize for array indices. I thought this policy could carry runtime overhead (compared to using int as an array index in C programs), but an experiment on Godbolt seems to show there is no extra runtime overhead. (Someone with good knowledge of low-level assembly probably wouldn't have thought that in the first place :slightly_frowning_face:)
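For anyone who wants to repeat that kind of experiment, here is a minimal sketch (function names are made up): the same summation loop indexed with usize and with an i32 cast to usize. On 64-bit targets both typically compile to the same tight loop, which matches what the Godbolt experiment above suggests.

```rust
// Indexing with the native usize type -- no cast needed.
fn sum_usize(v: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..v.len() {
        total += v[i];
    }
    total
}

// Indexing with i32, as C code often does; Rust requires an
// explicit cast back to usize at the indexing site.
fn sum_i32(v: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0i32..v.len() as i32 {
        total += v[i as usize];
    }
    total
}

fn main() {
    let v: Vec<u64> = (1..=100).collect();
    // Both loops compute the same result: 1 + 2 + ... + 100 = 5050.
    assert_eq!(sum_usize(&v), sum_i32(&v));
    println!("{}", sum_usize(&v));
}
```

Pasting both functions into Godbolt with `-C opt-level=3` is an easy way to compare the generated loops directly.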

Interesting. In an effort to optimize one Rust program, I arranged for it to be buildable using u32 or u64 for the bulk of its work, with usize for all array indexing though.

It runs over 3 times faster using u32 than u64, on an x86_64 machine.

$ RUSTFLAGS="-C opt-level=3 -C debuginfo=0 -C target-cpu=native" cargo build --release --features=use_u32 --features=serial
$ time ./target/release/tatami_rust 200
Using 32 bit integers.
real    0m0.652s
user    0m0.625s
sys     0m0.000s

$ RUSTFLAGS="-C opt-level=3 -C debuginfo=0 -C target-cpu=native" cargo build --release --features=use_u64 --features=serial
$ time ./target/release/tatami_rust 200
Using 64 bit integers.
real    0m2.312s
user    0m2.281s
sys     0m0.016s

Code is here if anyone wants to play:

Of course when I want to run that code on a bigger problem it needs 64 bit integers to get the right results.

As your quote hints, selecting integer size for optimal performance should be done carefully for each application. Assuming one does not need 64 bits to actually get the job done.
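One way to keep that choice in a single place is a type alias for the working integer. The real tatami_rust project gates the width behind Cargo features (`use_u32` / `use_u64`); the sketch below uses a plain alias and a made-up `triangular` function just to show the idea.

```rust
// Choose the working integer width in one place. Switch to u64
// when the problem size would overflow u32 -- as the experiment
// above notes, bigger inputs need the wider type to be correct.
type Int = u32;

fn triangular(n: Int) -> Int {
    // n * (n + 1) can overflow the chosen width for large n,
    // which is exactly why the width must match the problem size.
    n * (n + 1) / 2
}

fn main() {
    assert_eq!(triangular(10), 55);
    println!("{}", triangular(10));
}
```

With the alias, retargeting the program to u64 is a one-line change instead of a sweep through every signature.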


C stdlib tends to use size_t, but because of implicit conversions int is usable pretty much everywhere. Array indices generally don't care what type you use.

size_t is not "infectious" in C, so user code and libraries use whatever they like (which may be int, or unsigned int, or long, or sometimes they're nice enough to use size_t).
