I have repeatedly found that code translated from C to Rust performs pretty much the same on my x86-64 PC, and sometimes even better.
However, on the ARM processor the Rust version performs substantially worse than the C.
Let's take a simple example, the recursive Fibonacci algorithm.
In C:
#include <stdio.h>

/* Naive recursive Fibonacci; recursion stops when n < 2. */
int fibonacci(int n) {
    if (n < 2) {
        return n;
    }
    return fibonacci(n - 1) + fibonacci(n - 2);
}

int main(void) {
    int n = 24;
    printf("fibo(%d) = %d\n", n, fibonacci(n));
    return 0;
}
In Rust:
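// Naive recursive Fibonacci. Note that recursion stops at n == 2 here,
// one level earlier than in the C version, which stops at n < 2.
fn fibonacci(n: i32) -> i32 {
    match n {
        0 => 0,
        1..=2 => 1,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

fn main() {
    let n = 24;
    println!("fibo({}) = {}", n, fibonacci(n));
}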
For which I get run times like this on the PC:
$ rustc -C opt-level=3 fibo.rs
$ time ./fibo
fibo(24) = 46368
real 0m0.021s
user 0m0.000s
sys 0m0.000s
$ gcc -Wall -O3 -o fibo_c fibo.c
$ time ./fibo_c
fibo(24) = 46368
real 0m0.027s
user 0m0.000s
sys 0m0.016s
Meanwhile on the ARM of the Raspberry Pi 3 I get this:
pi@aalto-1:~ $ rustc -C opt-level=3 fibo.rs
pi@aalto-1:~ $ time ./fibo
fibo(24) = 46368
real 0m0.011s
user 0m0.011s
sys 0m0.000s
pi@aalto-1:~ $ gcc -Wall -O3 -o fibo_c fibo.c
pi@aalto-1:~ $ time ./fibo_c
fibo(24) = 46368
real 0m0.008s
user 0m0.000s
sys 0m0.009s
Note how C and Rust change places in run time.
I know that it's rude, crude and a bit silly to measure such short execution times with "time", but I have played with this in other ways and with different programs and seen the same phenomenon.
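One of those other ways can be sketched as follows: time the call inside the program, so that process startup and printing are excluded, and keep the fastest of several runs. This is only a minimal sketch. The helper name `best_of` and the run count of 100 are my own choices for illustration, and it assumes a toolchain recent enough to have `std::hint::black_box`.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Same function as in the Rust listing above.
fn fibonacci(n: i32) -> i32 {
    match n {
        0 => 0,
        1..=2 => 1,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

/// Hypothetical helper: runs `fibonacci(n)` `runs` times and returns the
/// fastest wall-clock time together with the result of the last call.
fn best_of(runs: u32, n: i32) -> (Duration, i32) {
    let mut best = Duration::MAX;
    let mut result = 0;
    for _ in 0..runs {
        let start = Instant::now();
        // black_box discourages the optimizer from folding the call away.
        result = black_box(fibonacci(black_box(n)));
        best = best.min(start.elapsed());
    }
    (best, result)
}

fn main() {
    let n = 24;
    let (best, result) = best_of(100, n);
    println!("fibo({}) = {}, best of 100 runs: {:?}", n, result, best);
}
```

Taking the minimum rather than the mean reduces the influence of scheduler noise, which matters on a small board like the Pi.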
The latest case is a much more substantial C to Rust translation, described in this thread: Transcribing C code to Rust - #10 by H2CO3. In that case Rust is 30% faster than C on x86-64 but 30% slower on the ARM.
This is a huge difference. What is going on?