I have a legacy system which I'm writing a new Rust utility around. One of the many items to support is a binary database format consisting of little-endian unsigned 32-bit integers.
The file format in a simple table:
+-------+-----------+
| count | offset... |
+-------+-----------+
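To make the layout concrete, here is a minimal sketch of a reader for it. The `parse_offsets` name is hypothetical and not part of the legacy system; the sketch assumes the file is nothing more than the count followed by exactly that many offsets, all little-endian `u32` values, and it ignores any trailing bytes.

```rust
// Hypothetical decoder for the count-prefixed layout described above.
// Returns None if the count is missing or fewer offsets are present
// than the count promises.
fn parse_offsets(bytes: &[u8]) -> Option<Vec<u32>> {
    // Split the input into 4-byte groups and decode each as a little-endian u32.
    let mut words = bytes
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]));
    // The first word is the number of offsets that follow.
    let count = words.next()? as usize;
    let offsets: Vec<u32> = words.take(count).collect();
    if offsets.len() == count {
        Some(offsets)
    } else {
        None
    }
}
```

With this sketch, the sixteen bytes `03 00 00 00 | 00 00 00 00 | 08 00 00 00 | 10 00 00 00` decode to the offsets 0, 8 and 16.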
Here's some example code where I'm trying to write the serialization logic:
use std::io;
use std::mem::size_of;
type Offset = u32;
const OFFSET_SIZE: usize = size_of::<u32>();
struct Offsets(Vec<Offset>);
impl Offsets {
fn serialize(&self, output: &mut impl io::Write) -> io::Result<usize> {
// write out the length as u32 in little-endian, then each u32 in the
// vector in little-endian
output.write([self.0.len() as u32]
.iter()
.chain(self.0.as_slice())
.map(|i| i.to_le_bytes())
.flatten()) // FIXME can I get a &[u8] out of this?
}
}
fn main() {
let offsets = Offsets(vec![0, 8, 16]);
let mut output: Vec<u8> = Vec::new();
let bytes_written = offsets.serialize(&mut output).expect("certain death");
assert_eq!(16, bytes_written);
assert_eq!(16, output.len());
assert_eq!(3u32.to_le_bytes(), output.as_slice()[0..4]);
assert_eq!(0u32.to_le_bytes(), output.as_slice()[4..8]);
assert_eq!(8u32.to_le_bytes(), output.as_slice()[8..12]);
assert_eq!(16u32.to_le_bytes(), output.as_slice()[12..16]);
}
Essentially, I'd like to take the little-endian bytes of the length and of each entry in the internal vector and write them to the output: &mut impl io::Write with as little overhead as possible.
This fails to compile with the following error message:
error[E0277]: `[u8; 4]` is not an iterator
--> src/main.rs:19:14
|
19 | .flatten()) // FIXME can I get a &[u8] out of this?
| ^^^^^^^ borrow the array with `&` or call `.iter()` on it to iterate over it
|
= help: the trait `std::iter::Iterator` is not implemented for `[u8; 4]`
= note: arrays are not iterators, but slices like the following are: `&[1, 2, 3]`
= note: required because of the requirements on the impl of `std::iter::IntoIterator` for `[u8; 4]`
error[E0308]: mismatched types
--> src/main.rs:15:22
|
15 | output.write([self.0.len() as u32]
| ______________________^
16 | | .iter()
17 | | .chain(self.0.as_slice())
18 | | .map(|i| i.to_le_bytes())
19 | | .flatten()) // FIXME can I get a &[u8] out of this?
| |______________________^ expected `&[u8]`, found struct `std::iter::Flatten`
|
= note: expected reference `&[u8]`
found struct `std::iter::Flatten<std::iter::Map<std::iter::Chain<std::slice::Iter<'_, u32>, std::slice::Iter<'_, u32>>, [closure@src/main.rs:18:18: 18:37]>>`
error: aborting due to 2 previous errors
Some errors have detailed explanations: E0277, E0308.
For more information about an error, try `rustc --explain E0277`.
error: could not compile `playground`.
To learn more, run the command again with --verbose.
How can I get a Vec<u8> out of these values, or, better yet, a &[u8] that I can pass to the write method of the output? How can I convert the array results of u32::to_le_bytes into slices that can be joined into a &[u8]?
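One possible shape for the Vec<u8> route, as a sketch rather than a definitive answer: flatten the per-value byte arrays with `flat_map` and collect them. The free function `serialize_offsets` is a hypothetical stand-in for the `serialize` method above. The sketch assumes Rust 1.53 or later, where arrays implement `IntoIterator` by value, and it assumes the vector length fits in a `u32`.

```rust
use std::io::{self, Write};

// Hypothetical free function (not from the original code): builds the whole
// byte image in memory, then hands it to the writer in one call.
fn serialize_offsets(offsets: &[u32], output: &mut impl Write) -> io::Result<usize> {
    let count = offsets.len() as u32; // assumption: the length fits in a u32
    let bytes: Vec<u8> = std::iter::once(count)
        .chain(offsets.iter().copied())
        // Each u32 becomes a [u8; 4]; flat_map iterates the array by value.
        .flat_map(|value| value.to_le_bytes())
        .collect();
    // write_all keeps writing until every byte is accepted; plain write may
    // stop after a partial write.
    output.write_all(&bytes)?;
    Ok(bytes.len())
}
```

Using `write_all` instead of `write` matters here because `write` is allowed to accept only part of the buffer, which would silently truncate the table.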
Bonus Points: I've been thinking through it, and I'm not sure if there's a way to do this without allocating.
If I'm on a little-endian system, in theory I could do some unsafe sorcery to create a slice that would reference the underlying data in the Vec, probably using Vec::as_ptr, but I would still need to prefix this with the length.
At the very least, if I have to allocate a temporary vector, I could allocate it with Vec::with_capacity to get the exact size that I need.
Can anyone see a way to optimize this? I'm assuming that zero-copy optimization isn't possible.
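The two allocation strategies mentioned above could be sketched roughly as follows. Both function names are hypothetical, and both assume the vector length fits in a `u32`. The first allocates once at the exact size; the second allocates nothing itself and writes each value's four bytes directly, which may only be cheap if the caller passes a buffered writer such as `std::io::BufWriter`.

```rust
use std::io::{self, Write};
use std::mem::size_of;

// Hypothetical helper: one exact-size allocation, then a single write_all.
fn serialize_buffered(offsets: &[u32], output: &mut impl Write) -> io::Result<usize> {
    // Room for the count plus every offset, four bytes each.
    let mut bytes = Vec::with_capacity((offsets.len() + 1) * size_of::<u32>());
    bytes.extend_from_slice(&(offsets.len() as u32).to_le_bytes());
    for offset in offsets {
        bytes.extend_from_slice(&offset.to_le_bytes());
    }
    output.write_all(&bytes)?;
    Ok(bytes.len())
}

// Hypothetical helper: no temporary vector; the [u8; 4] for each value lives
// on the stack and goes straight to the writer.
fn serialize_streaming(offsets: &[u32], output: &mut impl Write) -> io::Result<usize> {
    output.write_all(&(offsets.len() as u32).to_le_bytes())?;
    for offset in offsets {
        output.write_all(&offset.to_le_bytes())?;
    }
    Ok((offsets.len() + 1) * size_of::<u32>())
}
```

Neither variant depends on the host's endianness, since `to_le_bytes` produces little-endian output on any platform, so no unsafe code is needed for correctness.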