Process stream in separate thread


#1

I need to process a file, for example to gunzip it. I’d like to do that in a parallel process to make use of multiple cores.
In raw Linux/glibc I would open() the file, dup() the descriptors, fork() a separate process, and then either exec() gunzip or use libz.
The parent process then just reads the uncompressed content from a descriptor.

How do I do this in Rust? Is there a useful library for piping?
I’d like to do something like this:

// decompress() is the hypothetical function I am looking for.
let gz_file = File::open("compressed.gz")?;
let mut ungz_file = decompress(gz_file);
// read uncompressed content directly from ungz_file.

#2

What you’re describing is two different things:

  1. decompressing in another process/thread to avoid blocking the current thread
  2. decompressing using multiple cores at once (which may block the current thread).

For case 1 in Rust you can use std::thread::spawn() and join the thread later, or use a std::sync::mpsc channel to communicate with it.
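
A minimal sketch of case 1, using only the standard library. The function name read_in_background, the chunk_size parameter and the channel capacity of 4 are made up for this example. The decompression itself is not shown: the idea is that you pass in a decoder that wraps the file, and the worker thread does the reading (and therefore the decoding).

```rust
use std::io::{self, Read};
use std::sync::mpsc::{self, Receiver};
use std::thread::{self, JoinHandle};

/// Reads `source` to the end on a background thread and hands the data to
/// the caller in chunks of at most `chunk_size` bytes.
///
/// `source` can be any reader. To decompress, pass a decoder that wraps the
/// file (assumption: a gzip decoder from a third-party crate).
pub fn read_in_background<R>(
    mut source: R,
    chunk_size: usize,
) -> (Receiver<Vec<u8>>, JoinHandle<io::Result<()>>)
where
    R: Read + Send + 'static,
{
    // Bounded channel: the worker blocks once 4 chunks are waiting,
    // so memory use stays limited if the consumer is slow.
    let (tx, rx) = mpsc::sync_channel::<Vec<u8>>(4);
    let handle = thread::spawn(move || -> io::Result<()> {
        let mut buf = vec![0u8; chunk_size];
        loop {
            let n = source.read(&mut buf)?;
            if n == 0 {
                // End of stream. Returning drops `tx`, which closes the channel.
                return Ok(());
            }
            if tx.send(buf[..n].to_vec()).is_err() {
                // The receiver was dropped, so nobody wants the rest.
                return Ok(());
            }
        }
    });
    (rx, handle)
}
```

The caller iterates over the receiver to get the chunks in order, then joins the handle to find out whether reading failed. To get decompression, wrap the File in a decoder before passing it in, for example the gzip decoder from the libflate crate linked below.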

For case 2, you can’t do that with standard libz. Decompressing a gzip stream is an inherently sequential problem. Parallel decompression is possible only if the stream was compressed in a special way (as separate blocks) and you have a special decoder that can find these blocks. Are you sure you have this rather unusual case?

If you have many gzipped files, then using multiple cores is easy: you can use the rayon crate to decode them all in parallel.
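
A rough sketch of the many-files idea. To keep it free of dependencies it uses std::thread::scope rather than rayon, so it starts one thread per item instead of using a thread pool. The name process_all is made up, and the per-file work (open the file, decode it) is passed in as a closure rather than shown.

```rust
use std::thread;

/// Runs `work` on every item, one scoped thread per item, and returns the
/// results in the same order as the input.
pub fn process_all<T, R, F>(items: &[T], work: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    // Share the closure by reference so every thread can call it.
    let work = &work;
    thread::scope(|scope| {
        // Start all workers first ...
        let handles: Vec<_> = items
            .iter()
            .map(|item| scope.spawn(move || work(item)))
            .collect();
        // ... then wait for them in input order.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Here items would be the list of file paths and work would decompress one file. With rayon the same shape is usually written as a parallel iterator over the paths, which also caps the number of threads for you.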

https://crates.io/crates/libflate
https://crates.io/crates/libz-sys