Streaming a reqwest::get payload into a flate2 decoder

I'm writing a toy utility that downloads an OCI image, layer by layer, and constructs a tar that can be used with docker load.

The layers are transported gzipped, and I always want to decompress them.

Currently I'm reading the entire layer into memory and then running it through flate2::read::GzDecoder, which works fine.

I am wondering, though, whether it is somehow possible to not load the entire image into memory.

Reqwest offers chunk(), so in theory I can feed the decoder with data piece by piece as I get it from the network, but in practice the decoder requires an input Read, which doesn't compose well with async.

The only solution I thought of was to use a unix pipe, but this feels a bit hackish.

Can this be achieved cleanly?

There's a crate for async compression. Does this work for you?

Haven't noticed it :)

reqwest doesn't seem to implement AsyncBufRead, but that is much more solvable on my end.

Thank you!

For what it's worth, I have an example of transforming it from a while back:

It is possible to use write::GzDecoder to perform the gzip decoding as the file is streamed, but tar doesn't support the Write interface that would allow it to be forwarded directly to the tar file.

So you would need to collect the uncompressed data into an intermediate Vec<u8>.

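use std::io::Write;
use flate2::write::GzDecoder;

// `file` (a `Write`) and `header` (a `tar::Header`) are assumed to exist.
let mut builder = tar::Builder::new(file);
// Fictitious interface to `tar::Builder` that would replace the `Vec`:
// let writer = builder.append_write(&header);
let mut decoder = GzDecoder::new(Vec::new());
let mut res = reqwest::get("").await?;
while let Some(chunk) = res.chunk().await? {
    // Decompress each chunk as it arrives from the network.
    decoder.write_all(&chunk)?;
}
// Flush the decoder and take back the uncompressed bytes.
let data = decoder.finish()?;
// Existing interface to tar (the size in `header` must equal `data.len()`):
builder.append(&header, data.as_slice())?;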

In your case, it is better to hold the compressed data in memory, as you do now, than to hold the uncompressed data like this.

But this solves the problem for anyone whose destination implements Write.
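
For a consumer that only accepts Read, such as the flate2::read::GzDecoder from the question, one possible in-process stand-in for the unix pipe is a bounded channel wrapped in a Read adapter. This is only a minimal sketch using std: ChannelReader and demo are hypothetical names, not part of reqwest, flate2, or tar, and the producer thread merely stands in for the loop over res.chunk().

```rust
use std::io::{self, Read};
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical adapter: exposes chunks received over a channel as a blocking `Read`.
struct ChannelReader {
    rx: Receiver<Vec<u8>>,
    current: Vec<u8>, // chunk currently being drained
    pos: usize,       // read offset into `current`
}

impl ChannelReader {
    fn new(rx: Receiver<Vec<u8>>) -> Self {
        ChannelReader { rx, current: Vec::new(), pos: 0 }
    }
}

impl Read for ChannelReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Fetch the next chunk once the current one is exhausted
        // (the loop also skips over empty chunks).
        while self.pos == self.current.len() {
            match self.rx.recv() {
                Ok(chunk) => {
                    self.current = chunk;
                    self.pos = 0;
                }
                // All senders were dropped: report end of stream.
                Err(_) => return Ok(0),
            }
        }
        let n = buf.len().min(self.current.len() - self.pos);
        buf[..n].copy_from_slice(&self.current[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}

/// Demo: a thread plays the downloader, the caller plays the `Read` consumer.
fn demo() -> Vec<u8> {
    // Bounded channel: at most 4 chunks are buffered at any time.
    let (tx, rx) = sync_channel::<Vec<u8>>(4);
    let producer = thread::spawn(move || {
        for chunk in [&b"hello "[..], &b"streamed "[..], &b"world"[..]] {
            tx.send(chunk.to_vec()).unwrap();
        }
        // `tx` is dropped here, which ends the stream.
    });
    let mut out = Vec::new();
    ChannelReader::new(rx).read_to_end(&mut out).unwrap();
    producer.join().unwrap();
    out
}
```

In real use the reader would be handed to the decoder on a dedicated thread, since both recv() here and send() on a full sync_channel block, which is not something to do directly on an async executor thread. Memory stays bounded by the channel capacity times the chunk size.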