For crate sha2, comparing after Deref or .get_unchecked(), which is faster?

bilibili_UID_1702461 · May 13, 2023, 6:51am

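use sha2::{Digest, Sha256};

fn main() {
    let a = [0u8; 32];
    let h = Sha256::new_with_prefix(&a).finalize();
    // 1st check: `*h` derefs the GenericArray to a slice, so this is array == slice.
    assert!(a == *h);
    // 2nd check: byte by byte, with unchecked indexing into `h`.
    for (i, v) in a.iter().enumerate() {
        // SAFETY: `h` holds 32 bytes and `i < a.len() == 32`.
        unsafe { assert!(*v == *h.get_unchecked(i)) }
    }
}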

The code will panic, but that's not the question. The question is: I don't know how GenericArray works, since it comes from a third-party crate. The 1st assert! goes through the Deref operator, the 2nd assert! uses unchecked indexing. Which is faster?
If using unchecked indexing for both a and h:

for i in 0..32 {
    unsafe { assert!(*a.get_unchecked(i) == *h.get_unchecked(i)) }
}

Can it run faster?

Michael-F-Bryan · May 13, 2023, 7:37am

The best way to answer this question is to just measure it. Google something like "Rust benchmarks" or check out Criterion for a nice benchmarking framework.
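
As a rough illustration of "just measure it", here is a minimal std-only timing sketch. It is not a substitute for Criterion: the helper names (eq_slice, eq_loop, time_it) are made up for this example, a plain byte slice stands in for the sha2 digest, and a single Instant-based run is noisy, so treat the printed timings as a hint at best.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Array-vs-slice equality, the same shape as `a == *h` in the question.
fn eq_slice(a: &[u8; 32], h: &[u8]) -> bool {
    *a == *h
}

/// Per-byte comparison written with safe iterators (hypothetical helper).
fn eq_loop(a: &[u8; 32], h: &[u8]) -> bool {
    h.len() == a.len() && a.iter().zip(h).all(|(x, y)| x == y)
}

/// Runs `f` `iters` times, prints the elapsed time, returns the last result.
fn time_it(label: &str, iters: u32, mut f: impl FnMut() -> bool) -> bool {
    let start = Instant::now();
    let mut last = false;
    for _ in 0..iters {
        // black_box keeps the optimizer from discarding the call.
        last = black_box(f());
    }
    println!("{label}: {:?}", start.elapsed());
    last
}

fn main() {
    let a = [0u8; 32];
    let digest = vec![0u8; 32]; // assumption: stand-in for the real digest
    let h: &[u8] = &digest;
    let r1 = time_it("slice ==", 1_000_000, || eq_slice(black_box(&a), black_box(h)));
    let r2 = time_it("byte loop", 1_000_000, || eq_loop(black_box(&a), black_box(h)));
    assert_eq!(r1, r2);
}
```

Build with --release before reading anything into the numbers. A proper benchmark harness handles warm-up and statistics that this sketch ignores.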

You'll probably find that going through Deref with assert_eq!(a, *h) is a lot faster than your unsafe version. You are checking each byte one by one and possibly triggering a panic after each check, which isn't terribly efficient.

The standard library will implement PartialEq for [u8] in a way that compares multiple bytes at a time using SIMD, which lets you process data a lot faster.

vague · May 13, 2023, 7:43am

assert!(a == *h); might be faster: Compiler Explorer

Since GenericArray (see the generic_array docs) derefs to a slice, you're actually comparing array == slice against a naive iterator loop.

They're not exactly equivalent here, because

when lengths differ: the first assertion will fail, but the second might not
the array's PartialEq<slice> impl first tries to convert the slice into an array, then compares the entire array at once if the element type is BytewiseEq
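
The length difference can be shown with a small std-only sketch. The function names are hypothetical, a 4-byte array keeps it short, and safe get replaces get_unchecked so that the too-short case is a plain false instead of undefined behavior.

```rust
/// What `a == *h` does: array-vs-slice equality, false when lengths differ.
fn eq_whole(a: &[u8; 4], h: &[u8]) -> bool {
    *a == *h
}

/// Checks only the indices of `a`, like the loop in the question,
/// but with safe `get` instead of `get_unchecked`.
fn eq_prefix(a: &[u8; 4], h: &[u8]) -> bool {
    a.iter().enumerate().all(|(i, v)| h.get(i) == Some(v))
}

fn main() {
    let a = [1u8, 2, 3, 4];

    // `h` longer than `a`: whole-value equality fails, the loop passes.
    let longer = [1u8, 2, 3, 4, 5];
    assert!(!eq_whole(&a, &longer));
    assert!(eq_prefix(&a, &longer));

    // `h` shorter than `a`: both fail here, but with get_unchecked
    // the loop would read out of bounds, which is undefined behavior.
    let shorter = [1u8, 2, 3];
    assert!(!eq_whole(&a, &shorter));
    assert!(!eq_prefix(&a, &shorter));
}
```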

H2CO3 · May 13, 2023, 10:31am

You shouldn't use unsafe for such a trivial comparison. Don't mind "which is faster". This is probably completely meaningless outside a more specific context anyway. Just use slice equality and move on.

kornel · May 14, 2023, 1:49am

If you use it in the context of encrypted data, a fast comparison may be inappropriate (it is a potential timing attack/oracle), and you might need constant_time_eq.
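
To illustrate the idea behind a constant-time comparison, here is a hand-rolled sketch with a hypothetical name (ct_eq_sketch). It only shows the "no early exit" shape. It is not the constant_time_eq crate's implementation, and an optimizer is free to rewrite it, so real code should rely on a vetted crate instead.

```rust
/// Illustration only: accumulate all byte differences instead of
/// returning at the first mismatch. Lengths are assumed to be public.
fn ct_eq_sketch(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        // XOR is zero only for equal bytes; OR keeps any nonzero bit.
        diff |= x ^ y;
    }
    diff == 0
}

fn main() {
    let expected = [0xabu8; 32];
    let mut candidate = expected;
    assert!(ct_eq_sketch(&expected, &candidate));

    // Flip one bit in the last byte: every byte is still visited.
    candidate[31] ^= 1;
    assert!(!ct_eq_sketch(&expected, &candidate));
}
```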

For seeing which construct generates nicer code I recommend rust.godbolt.org, but remember to add -O -C target-cpu=native to the flags field, because default debug code is always terrible.

bilibili_UID_1702461 · May 18, 2023, 8:21am

let mut a = [0u8; 65536];
for i in &mut a {
    *i += 1
}

let mut a = [0u8; 65536];
for i in 0..a.len() {
    unsafe { *a.get_unchecked_mut(i) = *a.get_unchecked(i) + 1 }
}

I built both in release mode with "cargo build --offline --release". The unsafe version is smaller, at 4429088 bytes. The output file of the safe version is a little larger.

bjorn3 · May 18, 2023, 10:10am

Both variants compile to the exact same code for me with optimizations enabled. In fact LLVM merges both functions together if both are put in the same file: Compiler Explorer
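
For anyone who wants to reproduce this, a self-contained sketch of the two variants as functions is below. The function names are made up, and wrapping_add replaces + 1 so the behavior is the same in debug and release builds. The sketch only checks that both produce the same result; whether the generated code is identical still has to be checked in the compiler output for your target.

```rust
/// Safe variant: iterate by mutable reference.
fn inc_safe(a: &mut [u8]) {
    for v in a.iter_mut() {
        *v = (*v).wrapping_add(1);
    }
}

/// Unchecked variant, mirroring the loop from the post above.
fn inc_unchecked(a: &mut [u8]) {
    for i in 0..a.len() {
        // SAFETY: `i < a.len()` is guaranteed by the loop bound.
        unsafe {
            let v = *a.get_unchecked(i);
            *a.get_unchecked_mut(i) = v.wrapping_add(1);
        }
    }
}

fn main() {
    let mut x = vec![0u8; 65536];
    let mut y = vec![0u8; 65536];
    inc_safe(&mut x);
    inc_unchecked(&mut y);
    // Both variants must leave the buffers in the same state.
    assert_eq!(x, y);
    assert!(x.iter().all(|&v| v == 1));
}
```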

bilibili_UID_1702461 · May 19, 2023, 7:54am

I compiled them on an ARM platform and found that they're different.

zirconium-n · May 19, 2023, 8:01am

What's the exact target you are using? I tried some typical ARM targets and the outputs are the same.

system · August 17, 2023, 8:01am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
