Sharing access to an output stream

I'm not sure what the most idiomatic solution is to a problem I'm facing.

I have a set of four or five instances of different structs, which take turns producing data that is mostly appended to the end of a Vec, but occasionally one will need to backpatch a byte it produced earlier.

At the moment I have implementations of two of the structs, which just borrow a reference to the output vector for the duration of their existence. However, I want to start interleaving output from several of them without constantly creating new instances. So far as I can tell, I need to do one of the following:

  • move the output vector into each instance at the start of each "period of ownership", and back out again when that period is over.
    or
  • have all the instances share a reference to the output vector through a RefCell or similar.
    or
  • just pass the output vector down through the chain of methods that is doing the writing, rather than have it owned by the encoder instances.
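For comparison, the second option might look roughly like the following. This is only a minimal sketch: `SharedOut`, `ByteEncoder` and `RunEncoder` are hypothetical names made up for illustration, not types from the actual project.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical shared handle to the output vector.
type SharedOut = Rc<RefCell<Vec<u8>>>;

/// Hypothetical encoder that only ever appends bytes.
struct ByteEncoder {
    out: SharedOut,
}

/// Hypothetical encoder that writes a placeholder byte and backpatches it later.
struct RunEncoder {
    out: SharedOut,
    len_index: Option<usize>,
}

impl ByteEncoder {
    fn push(&mut self, b: u8) {
        // The borrow lasts only for this statement.
        self.out.borrow_mut().push(b);
    }
}

impl RunEncoder {
    fn begin_run(&mut self) {
        let mut out = self.out.borrow_mut();
        self.len_index = Some(out.len());
        out.push(0); // placeholder, patched in end_run
    }

    fn end_run(&mut self, len: u8) {
        if let Some(i) = self.len_index.take() {
            self.out.borrow_mut()[i] = len; // backpatch the earlier byte
        }
    }
}

/// Interleaves output from both encoders through the shared handle.
fn shared_demo() -> Vec<u8> {
    let out: SharedOut = Rc::new(RefCell::new(Vec::new()));
    let mut bytes = ByteEncoder { out: Rc::clone(&out) };
    let mut runs = RunEncoder { out: Rc::clone(&out), len_index: None };

    runs.begin_run();
    bytes.push(0xAA);
    bytes.push(0xBB);
    runs.end_run(2);

    let result = out.borrow().clone();
    result
}
```

The borrow checking moves to run time here, so holding a `borrow_mut()` across a call into another encoder would panic rather than fail to compile.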

Context:

I'm updating a file compression utility I've written that produces executables for the Commodore 64 that decompress themselves after they've been loaded. I have a few different compression codecs with different space/time trade-offs, which also vary in the type of data they are good at compressing. While some of them are byte oriented, others produce a mixed stream of bytes, bits, bitpairs, and nybbles (separating out the latter three into three separate streams means the decoder can spend fewer cycles watching for byte boundaries when decoding the stream).

Currently there's just one codec used for the bulk of the file, with a second used for a minor cleanup task at the end. But I want to start switching between several of them dynamically, depending on which best suits the current portion of the data.

Most of the codecs are LZ variants (with different encodings for the copy lengths and offsets), so the output phase consists of a run of calls to encode_token on the current codec, which call things like push_interleaved_exp_golomb_k, which in turn then call lower level functions for outputting bits and bytes. So, I'd have to pass the output vector down through a few layers of methods if I wanted to avoid the encoder holding a reference to the output stream.

From the bitstream writer:

if stream.bits_written == 8 {
    stream.bits_index = self.encoded_stream.len();
    self.encoded_stream.push(b << 7);
    stream.bits_written = 1;
} else {
    self.encoded_stream[stream.bits_index] += b * (128 >> stream.bits_written);
    stream.bits_written += 1;
}

Note the access to self.encoded_stream[stream.bits_index]: that is the backpatch into a byte that was pushed earlier.
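To make the fragment easier to discuss, here is a self-contained sketch built around it. The field names come from the fragment; everything else (`BitStream`, `BitEncoder`, `push_byte`, `push_bit`, `bit_demo`, and the choice to start with `bits_written == 8`) is assumed for illustration and may not match the real code.

```rust
/// Bookkeeping for one sub-stream of bits (field names follow the fragment above).
struct BitStream {
    bits_index: usize, // index of the byte currently being filled with bits
    bits_written: u8,  // how many bits of that byte are already in use
}

impl BitStream {
    fn new() -> Self {
        // Assumption: starting "full" forces the first bit to allocate a byte.
        BitStream { bits_index: 0, bits_written: 8 }
    }
}

/// Hypothetical encoder that owns the output vector, as in the current design.
struct BitEncoder {
    encoded_stream: Vec<u8>,
}

impl BitEncoder {
    fn push_byte(&mut self, b: u8) {
        self.encoded_stream.push(b);
    }

    fn push_bit(&mut self, stream: &mut BitStream, b: u8) {
        debug_assert!(b <= 1);
        if stream.bits_written == 8 {
            // Start a new byte at the end of the output and remember where it is.
            stream.bits_index = self.encoded_stream.len();
            self.encoded_stream.push(b << 7);
            stream.bits_written = 1;
        } else {
            // Backpatch: the byte may no longer be the last one in the vector.
            self.encoded_stream[stream.bits_index] += b * (128u8 >> stream.bits_written);
            stream.bits_written += 1;
        }
    }
}

/// Interleaves one whole byte between bits of the same bit stream.
fn bit_demo() -> Vec<u8> {
    let mut enc = BitEncoder { encoded_stream: Vec::new() };
    let mut bits = BitStream::new();
    enc.push_bit(&mut bits, 1); // allocates byte 0 as 0b1000_0000
    enc.push_byte(0x42);        // appended after the bit byte
    enc.push_bit(&mut bits, 0); // patches byte 0, no change in value
    enc.push_bit(&mut bits, 1); // patches byte 0 to 0b1010_0000
    enc.encoded_stream
}
```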

Any thoughts?

Looks like I'll be going with the first option, using something like this:

pub fn set_v(&mut self, v: Vec<u8>) {
    self.v = v;
}
pub fn retrieve_v(&mut self) -> Vec<u8> {
    std::mem::take(&mut self.v)
}

You may want to consider storing an Option<Vec<u8>>, and using Option::take() to take the vec out. This might be a bit better because if you forget to put the vec back in place, then you'll presumably panic later on when trying to get a vec out of the option, expecting it to be there. That seems better than silently getting a different vec out.
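A minimal sketch of that idea, keeping your `set_v` and `retrieve_v` names. The `HandoffEncoder` type, its `push` method and `handoff_demo` are hypothetical and only there to make the example complete.

```rust
/// Hypothetical encoder that holds the output vec only during its "period of ownership".
struct HandoffEncoder {
    v: Option<Vec<u8>>,
}

impl HandoffEncoder {
    fn new() -> Self {
        HandoffEncoder { v: None }
    }

    pub fn set_v(&mut self, v: Vec<u8>) {
        // Option::replace returns the previous value, so a double hand-in is caught.
        let previous = self.v.replace(v);
        assert!(previous.is_none(), "set_v called while a vec was already held");
    }

    pub fn retrieve_v(&mut self) -> Vec<u8> {
        // Option::take leaves None behind; a missing vec panics instead of
        // silently handing back an empty one.
        self.v.take().expect("retrieve_v called without a preceding set_v")
    }

    fn push(&mut self, b: u8) {
        self.v
            .as_mut()
            .expect("no output vec held; call set_v first")
            .push(b);
    }
}

/// Hands the same vec from one encoder to the next.
fn handoff_demo() -> Vec<u8> {
    let mut a = HandoffEncoder::new();
    let mut b = HandoffEncoder::new();

    let mut out = Vec::new();
    a.set_v(out);
    a.push(1);
    out = a.retrieve_v();

    b.set_v(out);
    b.push(2);
    b.retrieve_v()
}
```

Writing through an encoder that has not been given the vec panics at the `expect`, which is the failure mode described above.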

However, you also mentioned the approach of just passing the output vector down through the chain of methods that is doing the writing:

It might seem more laborious, but is also more straightforward and prevents mistakes in the vec ownership transfer dance. So I'd personally not give up on it quickly if its only (perceived) issue is having to thread it through.
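Roughly what I have in mind, as a sketch only. `encode_token` is the name from your post; `Codec`, `push_nybbles`, `push_byte`, `literal_count` and `threaded_demo` are made up here, and the "encoding" is a placeholder rather than anything your codecs actually do.

```rust
/// Hypothetical stand-in for a codec. It keeps its own state but no output vec.
struct Codec {
    literal_count: usize,
}

impl Codec {
    /// Top layer: receives the output vec from the caller for this call only.
    fn encode_token(&mut self, out: &mut Vec<u8>, value: u8) {
        self.literal_count += 1;
        self.push_nybbles(out, value);
    }

    /// Middle layer: passes the same borrow further down.
    fn push_nybbles(&self, out: &mut Vec<u8>, value: u8) {
        push_byte(out, value >> 4);
        push_byte(out, value & 0x0F);
    }
}

/// Lowest layer: a free function that only needs the vec.
fn push_byte(out: &mut Vec<u8>, b: u8) {
    out.push(b);
}

/// Two codecs take turns writing to one vec without any ownership hand-off.
fn threaded_demo() -> (Vec<u8>, usize) {
    let mut out = Vec::new();
    let mut a = Codec { literal_count: 0 };
    let mut b = Codec { literal_count: 0 };

    a.encode_token(&mut out, 0x12);
    b.encode_token(&mut out, 0x34);
    a.encode_token(&mut out, 0x56);

    (out, a.literal_count)
}
```

Each call borrows the vec only for its own duration, so switching codecs between tokens needs no extra bookkeeping.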

Hmm, good points. I'd been considering setting a boolean in set_v to later test if there's a valid vec available, but an Option would indeed be more idiomatic, and less error-prone. Thanks for the hint about take() - I've not had to do much "move"ing in my various tiny Rust projects thus far.

But also yes, I'm probably being overly avoidant about the threading. I noticed a day or two ago that I already posted here in 2016 about another issue I was having with the same project, for which the solution was to pass the vector rather than the encoder's "self" into the bit writing functions…

(I've recently resurrected the project to add some extra bits and bobs).