Not optimized properly?

I was just messing around to see which is optimized more and saw the following results.

Overall (ns)
C++ with -O3: 6037338
C++ with -Os: 11049853
Rust with alias: 11361628
Rust with noalias: 11535222

Average (ns)
C++ with -O3: 6037.338
C++ with -Os: 11049.853
Rust with alias: 11361.628
Rust with noalias: 11535.222

Standard deviation (ns)
C++ with -O3: 7285.095556803356
C++ with -Os: 6220.168290439657
Rust with alias: 14224.509900436498
Rust with noalias: 7729.57509380147

Rust code
use std::{io, time::Instant};
fn main() {
    let mut s = String::new();
    io::stdin().read_line(&mut s).expect("Parsing error");
    let inp = s.trim().parse::<i32>().unwrap();
    let mut a: [i32; 100_000] = [0; 100_000];
    let mut b: [i32; 100_000] = [0; 100_000];
    let mut c: [i32; 100_000] = [0; 100_000];
    let n = Instant::now();
    for i in 0..inp {
        a[i as usize] = i;
    }
    for i in 0..inp {
        b[i as usize] = i;
    }
    for i in 0..inp {
        c[i as usize] = a[i as usize] + b[i as usize];
    }
    let now = n.elapsed();
    println!("{:?} {}", now.as_nanos(), c[inp as usize - 1]);
}

Is there some reason why the alias version runs at the same or a similar speed as the noalias version? And why is C++ so much faster even without any aliasing disabled? The program fills 100,000-element integer arrays, and I used equivalent code in Rust and C++, so shouldn't the Rust noalias build at least be somewhat faster?

We can't really say anything without the code

But a few thoughts:

C++ has type-based alias analysis, so it's not technically correct to say it has "no aliasing disabled"

Whether noalias has any impact at all really depends on your program (which again we have no idea what yours is)

Did you use cargo run/build --release?

With no code whatsoever, it's impossible for anyone to say. My best guess would be that your two programs are not actually doing the same thing.


I used cargo run --release with noalias enabled for the rustnoalias build, and plain cargo run --release for the rustalias build


inp may be bigger than 100_000, meaning those array accesses have to be bounds-checked. Bounds checks prevent auto-vectorization, which can speed things up significantly. Auto-vectorization may also be disabled at -Os, which would explain why C++ with -Os performs like Rust.

Usually performance issues like this can be solved in 2 ways:

  • By moving the assertion before the loops and relying on LLVM to remove the bounds checks from the loop bodies. This approach, however, is not always the best and can often fail to trigger the optimization, so I wouldn't rely on it. In fact I would expect LLVM to be able to optimize away the bounds checks in this code, but it doesn't.
    let inp = inp as usize;
    assert!(inp <= 100_000);
    for i in 0..inp {
        a[i] = i as i32;
    }
    for i in 0..inp {
        b[i] = i as i32;
    }
    for i in 0..inp {
        c[i] = a[i] + b[i];
    }
  • By using iterators. This is the most idiomatic way and often they optimize better than plain indexing. Here I use for-loops because it doesn't make a difference, but in general internal iteration (e.g. for_each and fold) can often be optimized even better.
    let inp = inp as usize;
    for (i, ai) in a[0..inp].iter_mut().enumerate() {
        *ai = i as i32;
    }
    for (i, bi) in b[0..inp].iter_mut().enumerate() {
        *bi = i as i32;
    }
    for ((ci, ai), bi) in c[0..inp].iter_mut().zip(&a).zip(&b) {
        *ci = *ai + *bi;
    }

Don't use [] in Rust. It's generally slow. Use iterators instead.

  • [i] in Rust is the equivalent of vector.at(i) in C++.
  • .get_unchecked(i) is the equivalent of [i] in C++.

This code with the same benchmarks was posted to /r/cpp before it got removed, if you want a comparison. The reserves were recommended to speed it up.

#include <iostream>
#include <vector>
#include <chrono>
#include <unistd.h>
using namespace std;

int main() {
    vector<int> a, b, c;
    int n;
    cin >> n;
    //int n = 1000000000;
    //a.reserve(n);
    //b.reserve(n);
    //c.reserve(n);
    auto start = chrono::steady_clock::now();
    for (int i = 0; i < n; i++) {
        a.push_back(i);
    }
    for (int i = 0; i < n; i++) {
        b.push_back(i);
    }
    for (int i = 0; i < n; i++) {
        c.push_back(a[i] + b[i]);
    }
    auto end = chrono::steady_clock::now();
    cout << chrono::duration_cast<chrono::milliseconds>(end - start).count()
         << " ";
    cout << c[n - 1] << "\n";
    return 0;
}

This was a Rust version of the code that someone said should be fast, and it's pretty standard code.

use std::time::Instant;

fn main() {
    let now = Instant::now();
    let n = 1000000000;
    let a: Vec<_> = (0..n).collect();
    let b: Vec<_> = (0..n).collect();
    let c: Vec<_> = a.iter().zip(b.iter()).map(|(a, b)| *a + *b).collect();
    println!("{}", now.elapsed().as_millis());
    println!("{:?}",c.len());
}

On my machine the Rust code takes 2.2 seconds and the C++ code (using -O2) takes 6.7 seconds, where both are hardcoded with an input of 1000000000 and the C++ one reserves the vectors.

I also tried switching on noalias with this, but it's slower.

RUSTFLAGS="-Z mutable-noalias=yes" cargo +nightly run --release

Is there any reason why it is slower when noalias is enabled? I thought it should be at least a tiny bit faster, since it doesn't have the extra aliasing assumptions to work around.

Not 100% sure why, but I'm getting this:

cargo run --release = 2467ms
cargo +nightly run --release = 2372ms
RUSTFLAGS="-Z mutable-noalias=yes" cargo +nightly run --release = 2504ms