Hash digest performance (Rust vs. Python)

Hi, I recently began to learn Rust (after using mostly Python for the last couple of years). As part of my training, I translate some system scripts to Rust. Among these is a script which traverses a directory tree, creates hash digests for each file, and compares the digest to a baseline. I noticed that the Rust version of this script takes 3x longer to execute, and tracked the difference down to the hashlib / sha256 performance. The Python (test) script looks like this:

#!/usr/local/bin/python3
import hashlib

def get_hash(file_path):
	myhash = hashlib.sha256()
	with open(file_path,'rb') as file:
		content = file.read()
		myhash.update(content)
		return myhash.hexdigest()

if __name__ == '__main__':
	filepath = "/path/to/file"
	get_hash(filepath)

The Rust code looks almost identical:

use sha256::try_digest;
use std::path::Path;

fn get_hash(filepath: &str) -> String {
    let input = Path::new(filepath);
    let hash_digest = try_digest(input).unwrap();
    hash_digest
}

fn main() {
    let filepath = "/path/to/file";
    get_hash(filepath);
}

The test file at /path/to/file is 229 MB (to make the difference more pronounced). A release build of the above Rust code takes 0.99 seconds to complete, while the Python code finishes in 0.19 seconds. I am aware that Python relies on a C implementation for hashlib, but I still wonder why Rust takes more than 5x longer (in this isolated example) and 3x longer (in the real-world scenario with many small files described above).

I am grateful for any hints. This was posted twice, as Akismet hid my first attempt after an initial reply pointing to caching. Unfortunately, while Rust does profit a bit from caching (0.83 after the initial 0.99), Python remains much faster (0.19) even if executed first.

What if you read the file once, then iterate hundreds of times hashing, and output the total time? This would reduce the measurement noise (from program loading and file reading), and make it easier to find precisely what is slower.

Edit: and even better, measure time inside the program instead of using an external tool.
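
For the Python side, a minimal sketch of that kind of in-process measurement could look like the following. The function name time_hashing and the zero-filled buffer are illustrative assumptions, not part of the original script; in practice the data would be read once from the real file.

```python
import hashlib
import time

def time_hashing(data: bytes, iterations: int = 100) -> float:
    """Return the mean seconds per SHA-256 pass over `data`, measured in-process."""
    start = time.perf_counter()
    for _ in range(iterations):
        hashlib.sha256(data).hexdigest()
    return (time.perf_counter() - start) / iterations

if __name__ == "__main__":
    # Hypothetical stand-in for the file contents. To time the real file,
    # read it once up front: data = open("/path/to/file", "rb").read()
    data = b"\x00" * (8 * 1024 * 1024)
    print(f"{time_hashing(data, 10):.4f} s per pass")
```

Because the file is read before the timer starts, the number reflects only the hashing work and not process startup or disk I/O.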

The Rust version reads the entire file into memory then calculates the hash. 229 MiB is a big chunk of memory. I suspect the Python version uses a fixed size buffer. That's going to make a difference.
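
For reference, the Python script posted above calls file.read() without a size argument, so it also loads the whole file at once. A fixed-size-buffer variant would look roughly like this sketch; get_hash_chunked and the 1 MiB default are assumptions chosen for illustration.

```python
import hashlib

def get_hash_chunked(file_path, chunk_size=1024 * 1024):
    """Hash a file with SHA-256, reading at most `chunk_size` bytes at a time."""
    myhash = hashlib.sha256()
    with open(file_path, "rb") as file:
        # iter() with a sentinel keeps calling file.read(chunk_size)
        # until it returns b"" at end of file.
        for chunk in iter(lambda: file.read(chunk_size), b""):
            myhash.update(chunk)
    return myhash.hexdigest()
```

The digest is identical to hashing the whole file in one call; only the peak memory use differs.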

The sha256 crate is just a wrapper around sha2::Sha256. That type supports io::Write, so you could use it with io::copy to stream the data.

Tried that, but this created a different problem. sha256::try_digest handles reading the file and calculating the digest, while sha256::digest requires a string (or byte string). So I tried

    let buf = std::fs::read(filepath).unwrap();
    let content = std::str::from_utf8(&buf).unwrap();

which causes a Utf8Error (the file in question is a gzipped database dump, which I usually would not convert to a string, but I found no way to derive a digest from the Vec directly). I still need to learn a lot about Rust.

It does not require a string.

I made a quick comparison by the way, reading the file into memory as a single blob, and measuring only the computation of the hash in-process. The results: Python 0.4s, Rust's sha256 crate: 0.6…0.7s.

I think this sha256 crate is simply a slower implementation. Try using another crate and see if it's faster. There's no intrinsic reason why the Rust version should be slower than the C underlying Python's hashlib.

They meant something like

use sha2::{Sha256, Digest};
use std::path::Path;
use std::fs::File;
use std::io;

fn get_hash(filepath: &Path) -> String {
    let mut file = File::open(filepath).unwrap();
    let mut hasher = Sha256::new();
    io::copy(&mut file, &mut hasher).unwrap();
    let digest = hasher.finalize();
    format!("{:x}", digest)
}

But I tried both this and the "read the whole file" approach and the Python hashlib version is still faster. I think it's just a more optimized implementation (for the architecture I tried it on at least). RUSTFLAGS="-C target-cpu=native", LTO, and a single codegen unit helped some, but not enough to catch up.

What OS, CPU and rustc version is this on? Are you running in release mode?

On my system (Ryzen 5900X, Arch Linux with export LD_PRELOAD=/usr/lib/libjemalloc.so and rustc 1.69.0-nightly) Rust is slightly faster:

➜  hash_perf git:(main) ✗ hyperfine ./hash.py          
Benchmark 1: ./hash.py
  Time (mean ± σ):     138.6 ms ±   2.7 ms    [User: 110.9 ms, System: 27.5 ms]
  Range (min … max):   133.7 ms … 142.2 ms    21 runs
 
➜  hash_perf git:(main) ✗ cargo build --release; hyperfine ./target/release/hash_perf
    Finished release [optimized] target(s) in 0.00s
Benchmark 1: ./target/release/hash_perf
  Time (mean ± σ):     125.5 ms ±   1.5 ms    [User: 98.7 ms, System: 26.5 ms]
  Range (min … max):   121.9 ms … 128.4 ms    23 runs

Fixed my mistake:

let bytes = std::fs::read(filepath).unwrap();
let hash = sha256::digest_bytes(&*bytes); // returns a String, so there is nothing to unwrap

Thanks! I found a post on StackOverflow which provides both a solution for my misguided attempt at using sha256 with Vec and a working example for using io::copy in this context.

But as others pointed out in the meantime, the baseline (in terms of speed) is still the same.

sha2 also has an "asm" feature, but I don't know how much difference that makes.

Yes, I tried using sha2::Sha256 directly, without any significant improvement (0.77s instead of 0.83s).

Yes, release mode, on a Mac (MacBookPro M1), rustc 1.67.1.

Also note that the sha256 crate depends on an older version of sha2.

You should try using sha2 directly with the aarch64 asm feature. Give me a second to whip up the PoC.

Makes the Rust version pretty much the same, on my box.

The asm feature apparently doesn't work with MSVC currently.

Try this one: GitHub - ambiso/rust-sha2-example-with-aarch64-asm

Notice the asm features and the updated code:

Thank you! Now I am down to 0.17-0.19s, on par with Python's C-based lib (and learned quite a lot along the way).

FWIW, someone on the duplicate post (now unlisted) suggested using ring instead of sha2. The code is indeed slightly faster – 0.15-0.16s – (and a bit simpler) with ring:

let hash = ring::digest::digest(&ring::digest::SHA256, &bytes);
println!("{}", hex::encode(hash));

Thanks again to everyone who replied to my post (and for not berating me for my ignorance).
