For crate sha2, comparing after Deref or .get_unchecked(), which is faster?

use sha2::*;

fn main() {
    let a = [0u8; 32];
    let h = Sha256::new_with_prefix(&a).finalize();
    assert!(a == *h);
    for (i, v) in a.iter().enumerate() {
        unsafe { assert!(*v == *h.get_unchecked(i)) }
    }
}

The code will panic, but I don't care about that; it's not the question. The question is: I don't know how GenericArray works, since it's in a third-party crate. The first assert! goes through the Deref operator and the second uses unchecked indexing; which is faster?
If using unchecked indexing for both a and h:

for i in 0..32 {
    unsafe { assert!(*a.get_unchecked(i) == *h.get_unchecked(i)) }
}

Can it run faster?

The best way to answer this question is to just measure it. Google something like "Rust benchmarks" or check out Criterion for a nice benchmarking framework.
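For a quick smoke test without pulling in Criterion, something like the following rough sketch can give a first impression. It uses std::time::Instant for timing and std::hint::black_box to keep the optimizer from deleting the work; the array contents, the 32-byte size, and the iteration count are arbitrary choices for illustration:

```rust
use std::hint::black_box;
use std::time::Instant;

// Whole-slice equality: roughly what `a == *h` boils down to after Deref.
fn eq_slice(a: &[u8; 32], b: &[u8; 32]) -> bool {
    a[..] == b[..]
}

// Byte-by-byte unchecked indexing, like the unsafe version in the question.
fn eq_unchecked(a: &[u8; 32], b: &[u8; 32]) -> bool {
    for i in 0..32 {
        unsafe {
            if *a.get_unchecked(i) != *b.get_unchecked(i) {
                return false;
            }
        }
    }
    true
}

fn main() {
    let a = [0xABu8; 32];
    let b = [0xABu8; 32];
    let iters = 1_000_000;

    let t = Instant::now();
    for _ in 0..iters {
        assert!(eq_slice(black_box(&a), black_box(&b)));
    }
    println!("slice eq:  {:?}", t.elapsed());

    let t = Instant::now();
    for _ in 0..iters {
        assert!(eq_unchecked(black_box(&a), black_box(&b)));
    }
    println!("byte loop: {:?}", t.elapsed());
}
```

A one-shot timing like this is noisy and easily fooled by the optimizer; for anything you intend to rely on, build it in release mode and prefer a real harness such as Criterion.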

You'll probably find that going through Deref with assert_eq!(a, *h) will be a lot faster than your unsafe version. You are checking each byte one by one and potentially branching into a panic after each check, which isn't terribly efficient.

The standard library implements PartialEq for [u8] in a way that compares multiple bytes at a time (it lowers to a memcmp-style comparison, which can use SIMD), which lets you process the data a lot faster.


assert!(a == *h); might be faster: Compiler Explorer

Since GenericArray (from the generic_array crate) derefs to a slice, you're actually comparing array == slice, which falls back to a naive element-by-element iterator.

They're not exactly equivalent here, because

  • when lengths differ: the first assertion will fail, but the second might not
  • array: PartialEq<slice> first tries to convert the slice into an array reference, and can then compare the entire array at once if the element type is BytewiseEq
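The length-handling difference is visible in safe code. A small sketch, independent of sha2 (the byte values are arbitrary):

```rust
fn main() {
    let a = [1u8, 2, 3];
    let b = [1u8, 2, 3, 4];

    // Slice equality compares lengths first: unequal lengths are simply
    // unequal, with no panic and no out-of-bounds read.
    assert!(a[..] != b[..]);

    // Equal-length byte slices compare their contents all at once
    // (a memcmp-style comparison).
    assert!(a[..] == b[..3]);
}
```

Per-index get_unchecked access with a wrong assumption about the length, by contrast, is undefined behavior rather than a clean mismatch.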

You shouldn't use unsafe for such a trivial comparison. And don't worry about "which is faster": it's probably completely meaningless outside a more specific context anyway. Just use slice equality and move on.


If you use it in the context of encrypted data, "fast" may even be inappropriate (a potential timing attack/oracle), and you might need constant_time_eq instead.
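For reference, the core idea behind constant-time comparison can be sketched by hand. This is an illustration of the technique, not the constant_time_eq crate's actual implementation: fold all byte differences together so the loop never exits early on a mismatch, and the timing doesn't leak where the first differing byte is.

```rust
// Hand-rolled constant-time equality sketch (illustrative only; in real
// code use a vetted crate such as constant_time_eq or subtle).
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    // OR-accumulate the XOR of every byte pair; no early return,
    // so the loop runs the same number of iterations either way.
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

fn main() {
    assert!(ct_eq(&[1, 2, 3], &[1, 2, 3]));
    assert!(!ct_eq(&[1, 2, 3], &[1, 2, 4]));
}
```

Note that even this sketch is not guaranteed constant-time: the compiler is free to transform it, which is one reason to reach for a dedicated crate.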

For seeing which construct generates nicer code I recommend Compiler Explorer, but remember to add -O -C target-cpu=native to the flags field, because default debug code is always terrible.

let mut a = [0u8; 65536];
for i in &mut a {
    *i += 1
}

let mut a = [0u8; 65536];
for i in 0..a.len() {
    // (body reconstructed: the unchecked-indexing variant being compared)
    unsafe { *a.get_unchecked_mut(i) += 1 }
}

Use "cargo build --offline --release" to build them in release mode. The unsafe version is smaller, at 4429088 bytes; the output file of the safe version is a little larger.

Both variants compile to the exact same code for me with optimizations enabled. In fact LLVM merges both functions together if both are put in the same file: Compiler Explorer


I compiled them on an ARM platform, and found they're different.

What's the exact target you are using? I tried some typical ARM targets and the outputs are the same.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.