How to read a gzipped tar archive to `Vec<u8>` without decompressing it?

I'm trying to upload a byte vector to cloud storage.

This byte vector should be a compressed archive. To get it, I need to read the compressed archive I have created into a `Vec<u8>`. As far as I know, gzipped files do not contain their size, and when I try to read the file normally I don't get all the bytes.

It seems that it only reads the header because the resulting vector is 10 bytes.

Example

use std::io::Read;
    
fn main() {
    
    // Creates the archive and compresses it.
    let file = std::fs::File::create("example.tar.gz").unwrap();
    let encoder = flate2::write::GzEncoder::new(file, flate2::Compression::default());
    let mut archive = tar::Builder::new(encoder);
    archive.append_dir_all("example_dir", "path/to/example_dir").unwrap();
    archive.finish().unwrap();

    // This does not work: it reads the wrong number of bytes,
    // and I don't know how to do it correctly.
    let example_bytes : Vec<u8> = std::fs::read("example.tar.gz").unwrap();
    dbg!(example_bytes.len());
    
    // The rewritten file is corrupt.
    std::fs::write("rewritten.tar.gz", example_bytes).unwrap();
}

If I try with `BufReader`:

    // Needs `use std::io::{Read, Seek};` for `read_to_end` and `rewind`.
    let file = std::fs::File::open("example.tar.gz").unwrap();
    let mut reader = std::io::BufReader::new(file);
    let mut bytes = Vec::new();
    reader.rewind().unwrap();
    reader.read_to_end(&mut bytes).unwrap();
    // Corrupt: this time the resulting file is not 10 bytes long,
    // but 392 bytes shorter than the original.
    // The corrupt file ends with the byte sequence
    // FF D3 E5 FF 3B F6 5F A3 F8, if that means anything.
    std::fs::write("rewritten.tar.gz", bytes).unwrap();

Is there a way to get the raw bytes of this compressed archive so I can upload it to cloud storage?

Resolved on Stack Overflow.

This should not be the case. You should be able to read any file fully without further ado. Files do not need to "contain their length". Most files don't – the file system of the OS knows the size of each file.
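A short read like this usually means the writer has not pushed all of its data to the file yet, not that the read is wrong. The sketch below illustrates this with the standard library only. It uses `BufWriter` as a stand-in for the `GzEncoder` in the question (an assumption made so the example needs no external crates), and the helper name `read_before_and_after_flush` and the file name `flush_demo.bin` are hypothetical.

```rust
use std::fs::{self, File};
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Writes `payload` through a buffering writer and reads the file back twice:
/// once while the writer still holds the data, and once after it is flushed.
/// Returns the two lengths as `(before, after)`.
fn read_before_and_after_flush(path: &Path, payload: &[u8]) -> io::Result<(usize, usize)> {
    let file = File::create(path)?;
    // A buffer larger than the payload, so nothing is written through early.
    let mut writer = BufWriter::with_capacity(64 * 1024, file);
    writer.write_all(payload)?;

    // The payload is still sitting in the writer's buffer at this point.
    let before = fs::read(path)?.len();

    // Flush and drop the writer, then read again.
    writer.flush()?;
    drop(writer);
    let after = fs::read(path)?.len();

    Ok((before, after))
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("flush_demo.bin");
    let payload = vec![0xAB_u8; 1000];

    let (before, after) = read_before_and_after_flush(&path, &payload)?;
    println!("before flush: {before} bytes, after flush: {after} bytes");

    // Once the writer is done, `fs::read` returns exactly what the
    // file system reports as the file size.
    assert_eq!(after as u64, fs::metadata(&path)?.len());

    fs::remove_file(&path)?;
    Ok(())
}
```

If the same thing is happening in the question's code, the likely cause is that `archive.finish()` only completes the tar stream, while the `GzEncoder` underneath is still alive and has not written its remaining data and trailer. Calling `archive.into_inner()` and then `finish()` on the returned encoder before `std::fs::read` should make the full file available.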
