How to read a gzip-compressed binary file?

Hello,
I'm new to Rust. I open a binary file and process its contents byte by byte with the following code:

fn main() {
    let file = "file.bin";
    let bytes = std::fs::read(file).unwrap();
    // for example, decode bytes[0..2] as u16
    let id = u16::from_be_bytes([bytes[0], bytes[1]]);
    println!("id = {}", id);
}

1: Is this approach correct, and is there a method with better performance?

2: How should the above code change when the file is gzip-compressed (.gz)?
I tried to rewrite the snippet using flate2:

use flate2::read::GzDecoder;

fn main() {
    let file = "file.bin.gz";
    let bytes = std::fs::read(file).unwrap();
    let mut gz = GzDecoder::new(&bytes[..]);
}

But I don't know how to turn gz into a vector or an array so that I can process it the same way as the uncompressed file.

You should use one of the methods of the Read trait, probably for byte in file.bytes() { ... } in your case.
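
To make the bytes() suggestion concrete, here is a minimal sketch. The helper name checksum is made up for illustration and is not part of std or flate2. It accepts anything that implements Read, so a File, a BufReader, or a GzDecoder could be passed in the same way; here it simply adds up every byte.

```rust
use std::io::{self, Read};

/// Hypothetical helper: visits every byte of any reader and
/// returns the wrapping sum of those bytes.
fn checksum<R: Read>(reader: R) -> io::Result<u8> {
    let mut sum = 0u8;
    // `bytes()` yields one `io::Result<u8>` per byte, so each item
    // has to be checked for an I/O error before it is used.
    for byte in reader.bytes() {
        sum = sum.wrapping_add(byte?);
    }
    Ok(sum)
}
```

When the source is a plain File, wrapping it in a BufReader first is usually worthwhile, since bytes() otherwise asks the underlying reader for one byte at a time.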

I think I found the solution:

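use flate2::read::GzDecoder;
use std::io::Read; // brings read_to_end into scope

fn main() {
    let file = "file.bin.gz";
    let bytes = std::fs::read(file).unwrap();
    let mut gz = GzDecoder::new(&bytes[..]);

    // Decompress the whole stream into d.
    let mut d = Vec::new();
    gz.read_to_end(&mut d).unwrap();
}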

d contains the decoded contents of file.bin.gz.
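
From here the decompressed bytes can be handled exactly like the contents of the uncompressed file. As a small sketch, the hypothetical helper parse_id below (the name is an assumption, not from any library) decodes the first two bytes as a big-endian u16, as in the original question, and returns None instead of panicking when the input is shorter than two bytes.

```rust
/// Hypothetical helper: decodes the first two bytes of a slice
/// as a big-endian u16, or returns None if the slice is too short.
fn parse_id(bytes: &[u8]) -> Option<u16> {
    match bytes {
        // The slice pattern matches only when at least two bytes exist.
        [hi, lo, ..] => Some(u16::from_be_bytes([*hi, *lo])),
        _ => None,
    }
}
```

With the vector from the snippet above, this could be called as parse_id(&d).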

You should probably use:

use flate2::read::GzDecoder;
use std::fs::File;
use std::io::{BufReader, Read};

let file = File::open("path").unwrap();
let file = BufReader::new(file);
let mut file = GzDecoder::new(file);
let mut bytes = Vec::new();
file.read_to_end(&mut bytes).unwrap();

That uses allocations more wisely: the BufReader allocates a buffer so that the file is read with few syscalls, and you avoid the intermediate allocation that std::fs::read makes for the compressed data.
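
If only a few leading fields are needed, the data does not have to be collected into a Vec at all. This is a sketch under the assumption that the file starts with the big-endian u16 from the question; read_u16_be is a hypothetical helper, and it works with any Read implementor, so it could be handed the GzDecoder directly.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads exactly two bytes from the reader
/// and decodes them as a big-endian u16.
fn read_u16_be<R: Read>(reader: &mut R) -> io::Result<u16> {
    let mut buf = [0u8; 2];
    // `read_exact` returns an UnexpectedEof error if fewer than
    // two bytes are left in the reader.
    reader.read_exact(&mut buf)?;
    Ok(u16::from_be_bytes(buf))
}
```

Each call consumes two bytes, so repeated calls walk through consecutive fields of the stream.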
