Hello everybody!

Today I'm trying to figure out how to make some Rust code suitable for parallelization on a single processor (that is, instruction-level parallelism). I'm building a decoder of u64s and I want to make each symbol's decoding process independent of the others.

This is the current code implementation:

```
/// Decodes the whole sequence given as input.
pub fn decode_all(&self) -> Vec<RawSymbol> {
    let mut states = self.states; // is a [u64; 4]
    // Zero-filled: calling `assume_init()` on an uninitialized integer is
    // undefined behavior, and every slot gets overwritten below anyway.
    let mut decoded = vec![0; self.sequence_length as usize];
    let mut norm_bits = self.normalized_bits.iter(); // is a Iter<u32>
    let mut current_symbol_index: usize = 0;
    while ... { // some condition
        decoded[current_symbol_index] = self.decode_sym(&mut states[3], &mut norm_bits);
        decoded[current_symbol_index + 1] = self.decode_sym(&mut states[2], &mut norm_bits);
        decoded[current_symbol_index + 2] = self.decode_sym(&mut states[1], &mut norm_bits);
        decoded[current_symbol_index + 3] = self.decode_sym(&mut states[0], &mut norm_bits);
        current_symbol_index += 4;
    }
    decoded
}
```

As I understand it, I have to isolate each `decode_sym` call, but the `&mut norm_bits` reference shared by every call prevents instruction pipelining. The problem is that I can't simply drop it, since the logic inside `decode_sym` sometimes involves sequentially pulling elements out of that vector to use them in the method itself.
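
To make the dependency concrete, here is a minimal stand-alone sketch. The body of `decode_sym` below is made up for illustration (the real one is not shown above): I'm only assuming that the symbol comes from the lane's own state and that the state is occasionally refilled from the shared iterator. The threshold, the 8-bit symbol width, and the free-function signatures are all hypothetical.

```rust
use std::slice::Iter;

/// Hypothetical stand-in for the real `decode_sym`.
/// The symbol depends only on this lane's state; the refill step is the
/// part that touches the iterator shared by all four lanes.
fn decode_sym(state: &mut u64, norm_bits: &mut Iter<'_, u32>) -> u64 {
    let symbol = *state & 0xFF; // independent per lane
    *state >>= 8;
    if *state < (1 << 16) {
        // Serial part: whichever lane gets here first consumes the next
        // element, so later calls depend on earlier ones.
        if let Some(&bits) = norm_bits.next() {
            *state = (*state << 32) | u64::from(bits);
        }
    }
    symbol
}

/// Same call pattern as the original loop, written as a free function.
/// Any trailing `len % 4` slots are left at zero in this sketch.
fn decode_all(mut states: [u64; 4], normalized_bits: &[u32], len: usize) -> Vec<u64> {
    let mut decoded = vec![0u64; len];
    let mut norm_bits = normalized_bits.iter();
    for chunk in decoded.chunks_exact_mut(4) {
        chunk[0] = decode_sym(&mut states[3], &mut norm_bits);
        chunk[1] = decode_sym(&mut states[2], &mut norm_bits);
        chunk[2] = decode_sym(&mut states[1], &mut norm_bits);
        chunk[3] = decode_sym(&mut states[0], &mut norm_bits);
    }
    decoded
}

fn main() {
    let states = [0x0403_0201, 0x0807_0605, 0x0C0B_0A09, 0x100F_0E0D];
    let bits = [0xAABB_CCDD, 0x1122_3344];
    // prints [13, 9, 5, 1, 14, 10, 6, 2]
    println!("{:?}", decode_all(states, &bits, 8));
}
```

In this toy version the four `symbol` computations never touch each other's data. Only the occasional `norm_bits.next()` ties the calls together, because which element a lane receives depends on how many refills the previous lanes performed.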