flate2::GzEncoder does not compress data, because of a typo

I was going to post a question, but then found the cause of the bug myself. Here's the post anyway, in case someone else runs into this too.

Task: create a .gz file and write compressed data into it. I used fs::File, csv::Writer and GzEncoder from flate2. The code ran without errors, but the file contained plain text instead of gzip-compressed bytes.
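
A quick way to confirm this symptom is to look at the first two bytes of the file: every gzip stream starts with 0x1f 0x8b. Below is a minimal sketch using only std; the helper names `looks_like_gzip` and `file_looks_like_gzip` are made up for illustration.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// The two magic bytes that open every gzip stream (RFC 1952).
const GZIP_MAGIC: [u8; 2] = [0x1f, 0x8b];

/// Hypothetical helper: true if `bytes` starts with the gzip magic number.
fn looks_like_gzip(bytes: &[u8]) -> bool {
    bytes.starts_with(&GZIP_MAGIC)
}

/// Hypothetical helper: reads at most two bytes from `path` and checks them.
fn file_looks_like_gzip(path: &Path) -> io::Result<bool> {
    let mut header = Vec::with_capacity(2);
    File::open(path)?.take(2).read_to_end(&mut header)?;
    Ok(looks_like_gzip(&header))
}

fn main() -> io::Result<()> {
    // In-memory checks: a gzip-style header versus plain CSV text.
    assert!(looks_like_gzip(&[0x1f, 0x8b, 0x08, 0x00]));
    assert!(!looks_like_gzip(b"a,b,c\n1,2,3\n"));

    // Check the file from this post, if it happens to exist.
    let path = Path::new("my_file.csv.gz");
    if path.exists() {
        println!("gzip header present: {}", file_looks_like_gzip(path)?);
    }
    Ok(())
}
```

This only checks the header, not that the rest of the stream is valid, but it is enough to tell "plain CSV with a .gz name" apart from real gzip output.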

I googled, found other topics, tried wrapping GzEncoder with BufWriter, but it changed nothing.

I know what was wrong now, but try to guess it yourself -- the answer is hidden below the code.

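use csv::Writer;
use std::{error::Error, fs::File, io::BufWriter};
use flate2::{read::{GzDecoder, GzEncoder}, Compression};
use serde::{Deserialize, Serialize};

fn main() -> Result<(), Box<dyn Error>> {
    let ds = vec![
        (1, 2, 3),
        (4, 5, 6),
        (7, 8, 9),
    ];
    // Buffer the writes going into the gzip encoder that wraps the output file.
    let zip = BufWriter::with_capacity(128 * 1024, GzEncoder::new(File::create("my_file.csv.gz")?, Compression::new(5)));
    let mut wr = Writer::from_writer(zip);
    // Header row, then one CSV record per tuple.
    wr.serialize(("a", "b", "c"))?;
    for row in ds.iter() {
        wr.serialize(row)?;
    }
    // Dropping the writer flushes the buffered data and closes the file.
    drop(wr);
    Ok(())
}

Answer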

The offender was this import:

use flate2::{read::{GzDecoder, GzEncoder}, Compression};

Instead, it should have been

use flate2::{read::GzDecoder, write::GzEncoder, Compression};

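It turned out that flate2 has more than one GzEncoder. The one in the read module is meant for reading: it wraps a reader and compresses the data as you read from it. It also implements the Write trait, but writes are passed straight through to the wrapped object without compression, which is why the code compiled and the file ended up as plain text. There's also a third GzEncoder in the bufread module.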
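
To illustrate how a wrapper can implement Write and still leave the bytes untouched, here is a std-only sketch. `PassThrough` is a hypothetical stand-in, not flate2's actual type; it assumes the read-side encoder's Write impl simply forwards to the inner object.

```rust
use std::io::{self, Cursor, Write};

/// Hypothetical stand-in for a wrapper whose `Write` impl only forwards
/// to the inner object. This is not flate2's real type.
struct PassThrough<W> {
    inner: W,
}

impl<W: Write> Write for PassThrough<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // No transformation happens here: the bytes go in unchanged.
        self.inner.write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Writes `data` through the wrapper and returns what reached the sink.
fn write_through(data: &[u8]) -> io::Result<Vec<u8>> {
    let mut w = PassThrough { inner: Cursor::new(Vec::new()) };
    w.write_all(data)?;
    w.flush()?;
    Ok(w.inner.into_inner())
}

fn main() -> io::Result<()> {
    let out = write_through(b"a,b,c\n1,2,3\n")?;
    // The sink holds exactly the input bytes, with no compression applied.
    assert_eq!(out, b"a,b,c\n1,2,3\n".to_vec());
    Ok(())
}
```

Because the wrapper satisfies the Write bound, the compiler has no reason to complain when it is handed to BufWriter or csv::Writer, so the mistake only shows up in the output file.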

By the way, here are three better ways to write the loop:

// dereference the reference to tuple
for &row in ds.iter() { 
    wr.serialize(row)?;
}

// if the row contained `String`s or other non-`Copy` types
for row in ds.iter().cloned() { 
    wr.serialize(row)?;
}

// to avoid cloning entirely, if `ds` is not needed further
for row in ds.into_iter() {
    wr.serialize(row)?;
}

Correct. The tuple was just a leftover from the real code, which had more fields.