Read and hash (SHA1) at the same time?

Is there a crate (or builtin way) that can wrap a Reader, be a Reader itself and hash (SHA1) the data while it is being read?

I feel like this must be a common task but a quick search didn't turn up anything.

Maybe it's easy to implement (I haven't tried yet) and that's why there's no crate?

It should be easy to write a small wrapper around any writer using a SHA-1 crate that supports incremental updating, e.g. sha-1:

use std::io::{ Write, Result as IoResult };
// `Digest` must be in scope for `Sha1::new()`, `update()` and `finalize()`.
use sha1::{ Digest, Sha1 };

pub struct Sha1Writer<W> {
    writer: W,
    hasher: Sha1,
}

impl<W> Sha1Writer<W> {
    fn new(writer: W) -> Self {
        Sha1Writer { writer, hasher: Sha1::new() }
    }

    fn into_digest(self) -> impl AsRef<[u8]> {
        self.hasher.finalize()
    }
}

impl<W: Write> Write for Sha1Writer<W> {
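    fn write(&mut self, buf: &[u8]) -> IoResult<usize> {
        // Hash only the bytes the inner writer actually accepted. The caller
        // retries the rest, so hashing all of `buf` would count them twice.
        let n = self.writer.write(buf)?;
        self.hasher.update(&buf[..n]);
        Ok(n)
    }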

    fn flush(&mut self) -> IoResult<()> {
        self.writer.flush()
    }
}

You might want to override the provided methods of io::Write as well (write_vectored, write_all and so on) so that they forward to the wrapped writer, for the sake of performance.

Furthermore, if you can choose the hashing algorithm, don't choose SHA-1 – it's not considered secure anymore. Try at least SHA-256, or SHA-3 (Keccak), or BLAKE2 (a successor of the SHA-3 finalist BLAKE) instead.

Thanks for the solution. I have a few questions:

You implemented a Write, I was talking about a Read. That's just for example purposes, right? Or are you suggesting I should use a Write and somehow copy the read data into it?

Just to make sure I get this right: In my Sha1Reader I should implement for example read_vectored() that calls read_vectored() of the wrapped reader. If I don't do that, by implementing Read for Sha1Reader I get the default implementation, but that would just call read() instead of read_vectored(). Correct?

I'm reading a file from a remote service and they provide a SHA1 hash for error detection. So I can't choose it, no, and I think for error detection SHA1 is still fine anyway?

As with MD5, a SHA1 checksum can no longer be relied on to detect a malicious man-in-the-middle. But it should still be fine for detecting network anomalies or a cosmic-ray bit flip.

I think I just misread that – the implementation for an io::Read wrapper would be very similar.

Yes, correct.
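
To make the read_vectored() point concrete, here is a minimal sketch of a reader wrapper that forwards read_vectored() and hashes only the bytes that were actually filled. It is an illustration under assumptions, not the implementation from this thread: Fnv1a is a tiny stand-in hasher so the snippet compiles without external crates, and HashingReader is a hypothetical name. With the sha-1 crate you would store a sha1::Sha1 instead and call its update()/finalize().

```rust
use std::io::{self, IoSliceMut, Read};

/// Tiny FNV-1a hasher. Assumption: a stand-in for `sha1::Sha1`, used only so
/// that this sketch has no external dependencies.
pub struct Fnv1a(u64);

impl Fnv1a {
    pub fn new() -> Self {
        Fnv1a(0xcbf29ce484222325)
    }

    pub fn update(&mut self, data: &[u8]) {
        for &b in data {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }

    pub fn finalize(self) -> u64 {
        self.0
    }
}

/// Hypothetical wrapper: reads from `reader` and hashes everything it returns.
pub struct HashingReader<R> {
    reader: R,
    hasher: Fnv1a,
}

impl<R> HashingReader<R> {
    pub fn new(reader: R) -> Self {
        HashingReader { reader, hasher: Fnv1a::new() }
    }

    pub fn into_digest(self) -> u64 {
        self.hasher.finalize()
    }
}

impl<R: Read> Read for HashingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.reader.read(buf)?;
        self.hasher.update(&buf[..n]);
        Ok(n)
    }

    // Without this override, the provided method would fall back to `read()`
    // on a single buffer and never use the inner reader's vectored path.
    fn read_vectored(&mut self, bufs: &mut [IoSliceMut<'_>]) -> io::Result<usize> {
        let n = self.reader.read_vectored(bufs)?;
        // The `n` bytes fill the buffers in order, so hash that prefix only.
        let mut remaining = n;
        for buf in bufs.iter() {
            if remaining == 0 {
                break;
            }
            let filled = remaining.min(buf.len());
            self.hasher.update(&buf[..filled]);
            remaining -= filled;
        }
        Ok(n)
    }
}
```

The same pattern should carry over to the other provided methods: forward to the wrapped reader, then feed exactly the bytes that came back into the hasher.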

Indeed in this case, you don't control the algorithm, and for detection of random errors, SHA-1 will probably be fine.

Something like this could be done with the crate sha-1 (which probably shouldn't be used). It could easily be made more generic (through the Digest trait), or you could just change the wrapped hasher. The other advice above applies.

use std::{fs::File, io, path::Path};
use sha1::Digest;

#[derive(Default)]
struct HashWriter(sha1::Sha1);

fn hash_file(file: &Path) -> io::Result<[u8; 20]> {
    let mut hasher = HashWriter::default();
    io::copy(&mut File::open(file)?, &mut hasher)?;
    // `finalize()` returns a fixed-size byte array type; convert it to `[u8; 20]`.
    Ok(hasher.0.finalize().into())
}

impl io::Write for HashWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.0.update(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
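
For what "more generic" could look like, here is a rough sketch. To keep it free of external crates it uses a hypothetical IncrementalHash trait and a toy ByteSum checksum; both are assumptions made for illustration. With the real crate you would bound on sha1::Digest instead and drop the stand-ins. hash_reader is likewise a made-up name, taking any reader rather than a file path.

```rust
use std::io::{self, Read, Write};

/// Hypothetical stand-in for the `Digest` trait: anything that can absorb
/// bytes incrementally and produce a result at the end.
pub trait IncrementalHash: Default {
    type Output;
    fn update(&mut self, data: &[u8]);
    fn finalize(self) -> Self::Output;
}

/// Toy checksum (wrapping sum of all bytes), for demonstration only.
#[derive(Default)]
pub struct ByteSum(u32);

impl IncrementalHash for ByteSum {
    type Output = u32;

    fn update(&mut self, data: &[u8]) {
        for &b in data {
            self.0 = self.0.wrapping_add(u32::from(b));
        }
    }

    fn finalize(self) -> u32 {
        self.0
    }
}

/// Write sink that feeds everything into the hasher and stores nothing.
#[derive(Default)]
pub struct HashWriter<H>(H);

impl<H: IncrementalHash> Write for HashWriter<H> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // The sink always accepts the whole buffer, so hashing all of it is correct.
        self.0.update(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Streams `reader` through the hasher `H` and returns the digest.
pub fn hash_reader<H: IncrementalHash, R: Read>(mut reader: R) -> io::Result<H::Output> {
    let mut sink = HashWriter::<H>::default();
    io::copy(&mut reader, &mut sink)?;
    Ok(sink.0.finalize())
}
```

The caller picks the algorithm with a type parameter, e.g. hash_reader::<ByteSum, _>(file), so switching hashers does not touch the I/O code.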

From the std::io::copy documentation:

This function will continuously read data from reader and then write it into writer in a streaming fashion until reader returns EOF.

PS: note that std::io::copy does not return ErrorKind::Interrupted errors to the caller; it retries the interrupted read or write and carries on.
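
A small sketch of that retry behaviour, under stated assumptions: FlakyReader is a hypothetical test helper that fails every other read() with ErrorKind::Interrupted, and copy_all is a made-up convenience function. A hand-written read loop would have to handle that error kind itself.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical reader that reports `Interrupted` on every other call,
/// starting with the first one, and otherwise delegates to `inner`.
pub struct FlakyReader<R> {
    inner: R,
    interrupt_next: bool,
}

impl<R> FlakyReader<R> {
    pub fn new(inner: R) -> Self {
        FlakyReader { inner, interrupt_next: true }
    }
}

impl<R: Read> Read for FlakyReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let interrupt = self.interrupt_next;
        self.interrupt_next = !interrupt;
        if interrupt {
            // Simulates a read that was interrupted before any data arrived.
            return Err(io::Error::from(ErrorKind::Interrupted));
        }
        self.inner.read(buf)
    }
}

/// Copies everything from `reader` into a `Vec<u8>` via `io::copy`.
pub fn copy_all<R: Read>(mut reader: R) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    // `io::copy` retries the interrupted reads instead of returning the error.
    io::copy(&mut reader, &mut out)?;
    Ok(out)
}
```

Calling read() directly on a FlakyReader surfaces the Interrupted error on the first call, while copy_all still returns the complete payload.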

MD5 and SHA-1 are both still secure in practice against preimage attacks. Collisions can't be used for MITM unless the attacker manages to trick the sender into sending the colliding half of a pair generated by the attacker, and even then they can only be used to flip certain bytes, not to replace the payload. So these functions are broken for digital signatures, but integrity verification of data that is not attacker-controlled is as strong as ever.

For posterity, here's the reader implementation (the rest is identical to the writer implementation above):

impl<R: Read> Read for Sha1Reader<R> {
	fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
		let n = self.reader.read(buf)?;
		self.hasher.update(&buf[..n]);
		Ok(n)
	}
}
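
For the use case from the question (a remote service that publishes a checksum), a reader like this could be used roughly as follows. This is a sketch under assumptions: Fnv1a is a dependency-free stand-in for the SHA-1 hasher, and read_verified is a hypothetical helper name. With the sha-1 crate you would compare the 20-byte digest instead of a u64.

```rust
use std::io::{self, ErrorKind, Read};

/// FNV-1a, standing in for SHA-1 so the sketch needs no external crate.
pub struct Fnv1a(u64);

impl Fnv1a {
    pub fn new() -> Self {
        Fnv1a(0xcbf29ce484222325)
    }

    pub fn update(&mut self, data: &[u8]) {
        for &b in data {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }

    pub fn finalize(self) -> u64 {
        self.0
    }
}

/// Same shape as the reader above, with the stand-in hasher.
pub struct HashingReader<R> {
    reader: R,
    hasher: Fnv1a,
}

impl<R: Read> Read for HashingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.reader.read(buf)?;
        self.hasher.update(&buf[..n]);
        Ok(n)
    }
}

/// Reads `reader` to the end and fails with `InvalidData` if the digest of
/// the received bytes differs from `expected`.
pub fn read_verified<R: Read>(reader: R, expected: u64) -> io::Result<Vec<u8>> {
    let mut hashing = HashingReader { reader, hasher: Fnv1a::new() };
    let mut data = Vec::new();
    hashing.read_to_end(&mut data)?;
    if hashing.hasher.finalize() != expected {
        return Err(io::Error::new(ErrorKind::InvalidData, "checksum mismatch"));
    }
    Ok(data)
}
```

The digest is only meaningful once the reader has been drained to EOF, which is why the comparison happens after read_to_end().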