Populating an uninitialized buffer and turning it into a Vec<u8>

I need to receive packets, which are used to populate chunks of memory. Once the entire chunk (all packets) has been received, the chunk is converted to a Vec<u8>. Sort of like this:

struct Chunk {
  bmap: Bitmap,
  size: usize,
  buf: *mut u8
}

impl Chunk {
  fn add_part(&mut self, idx: usize, part: &[u8]) {
    // .. stick `part` into `buf` at the appropriate offset ..

    self.bmap.set(idx);
  }
  fn is_done(&self) -> bool {
    self.bmap.is_done()
  }
  fn finalize(self) -> Result<Vec<u8>, Error> {
    if !self.is_done() {
      Err(Error::Incomplete("Incomplete chunk".into()))
    } else {
      Ok(unsafe { Vec::from_raw_parts(self.buf, self.size, self.size) })
    }
  }
}

There's one caveat with this: Chunk sizes can be zero. While I could make zero-sized chunks a special case for the caller, it would be a great deal easier if Chunk allowed zero-sized buffers. I've read that Vec can avoid allocating any buffer for the actual data array, so I figured that Vec is compatible with that.

However, trying to pass in a std::ptr::null_mut() to from_raw_parts() makes miri upset. RawVec apparently uses NonNull::dangling() to represent the "no buffer" case, but that seems to be an implementation detail that's out of reach to mere Vec mortals.

Should I just go ahead and allocate a dummy buffer for the "no buffer" case when turning it into a Vec, or is there some way I can construct a truly empty Vec<u8> without generating a UB warning from miri?

You can get a pointer that definitely works for a capacity-zero vector like this:

let ptr = ManuallyDrop::new(Vec::new()).as_mut_ptr();

The vector returned by Vec::new() is guaranteed to have capacity zero.
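As a minimal sketch of how that pointer could be fed to from_raw_parts() for the zero-size case (the function name `empty_from_raw_parts` is made up for illustration and is not from the original post):

```rust
use std::mem::ManuallyDrop;

/// Builds an empty `Vec<u8>` from raw parts, borrowing the pointer of a
/// capacity-zero vector as the "no buffer" pointer.
/// (Hypothetical helper name, for illustration only.)
fn empty_from_raw_parts() -> Vec<u8> {
    // ManuallyDrop stops the temporary vector's destructor from running.
    // With capacity zero there is no allocation to free anyway.
    let mut tmp = ManuallyDrop::new(Vec::<u8>::new());
    let ptr: *mut u8 = tmp.as_mut_ptr();

    // SAFETY: length and capacity are both zero, and `ptr` was taken from
    // a capacity-zero Vec<u8>, so it is non-null and aligned for u8.
    unsafe { Vec::from_raw_parts(ptr, 0, 0) }
}
```

The resulting vector behaves like any other empty Vec: it can be dropped or pushed to as usual.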


Wow, I'm such an idiot. I was so hyper-focused on using from_raw_parts() that not even while typing all that out did it occur to me that I don't actually need to use it for the null case. :grimacing:

Thank you -- I just needed to see that reply to get my brain out of the track it was stuck in.

NonNull::dangling() is just an example of a non-null, well-aligned pointer that is easy to construct. It serves no special purpose in representing an empty vector; emptiness is encoded in the zero length and capacity. When the capacity is zero, the only safety requirements Vec::from_raw_parts places on the pointer are that it is non-null and properly aligned. (For a non-zero capacity, the pointer must additionally have been allocated by the global allocator with a matching layout.) So you can manually pass in NonNull::dangling() as the pointer argument. Alternatively, if you ensure that your self.buf pointer is always non-null and properly aligned (you do ensure that, right?), you can just pass it in unconditionally.
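A short sketch of the NonNull::dangling() route described above, assuming a length and capacity of zero (the function name `empty_vec_from_dangling` is invented for this example):

```rust
use std::ptr::NonNull;

/// Constructs an empty `Vec<u8>` whose pointer is `NonNull::dangling()`.
/// (Hypothetical helper name, for illustration only.)
fn empty_vec_from_dangling() -> Vec<u8> {
    // A dangling pointer is non-null and aligned for u8, but must never
    // be dereferenced.
    let ptr: *mut u8 = NonNull::<u8>::dangling().as_ptr();

    // SAFETY: with capacity zero the pointer only has to be non-null and
    // aligned; the Vec never reads from, writes to, or frees it.
    unsafe { Vec::from_raw_parts(ptr, 0, 0) }
}
```

Because the capacity is zero, a later push() allocates a fresh buffer instead of trying to grow the dangling one.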

A simple solution would also be to check for zero buffer length, and return Vec::new() in this case.
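Roughly, that check could look like the sketch below. It is written as a free function rather than as Chunk::finalize() so that it stands on its own; the name `into_vec` and the exact safety contract are assumptions of this sketch, and the bitmap check from the original code is left out.

```rust
/// Turns a raw buffer into a `Vec<u8>`, special-casing the empty chunk.
/// (Hypothetical helper, not the original `Chunk::finalize`.)
///
/// # Safety
/// If `size > 0`, `buf` must have been allocated by the global allocator
/// for exactly `size` bytes (alignment 1), and all `size` bytes must be
/// initialized. If `size == 0`, `buf` is ignored and may be null.
unsafe fn into_vec(buf: *mut u8, size: usize) -> Vec<u8> {
    if size == 0 {
        // No from_raw_parts call at all, so a null `buf` is harmless.
        return Vec::new();
    }
    // SAFETY: guaranteed by the caller, see the contract above.
    unsafe { Vec::from_raw_parts(buf, size, size) }
}
```

With this shape, the "no buffer" representation inside the chunk can stay a null pointer, since it is never handed to Vec.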

Yet another option is to return Option<Vec<u8>>. Because a Vec's buffer pointer is never null, the compiler can store None in that niche, so Option<Vec<u8>> takes the same space as Vec<u8>. The exact bit pattern used for None is an implementation detail and should not be relied on.
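To illustrate the Option<Vec<u8>> idea, here is a small sketch. Mapping an empty chunk to None is only one possible reading of the suggestion, and both function names are invented for this example.

```rust
use std::mem::size_of;

/// Extra bytes that `Option<Vec<u8>>` needs compared to `Vec<u8>`.
/// (Hypothetical helper, used only to inspect the layout.)
fn option_overhead() -> usize {
    size_of::<Option<Vec<u8>>>() - size_of::<Vec<u8>>()
}

/// Maps an empty buffer to `None` and anything else to `Some`.
/// (Hypothetical helper; the original post does not define this.)
fn non_empty(data: Vec<u8>) -> Option<Vec<u8>> {
    if data.is_empty() {
        None
    } else {
        Some(data)
    }
}
```

The trade-off is that callers now have to handle None explicitly, which is the kind of special-casing the question was hoping to avoid.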


Irrelevant-to-the-original-question aside: If the capacity and the size are always equal, you could return Box<[u8]> instead to communicate that fact to your callers.

That makes for a smaller return value (pointer and length, with no separate capacity field), and turning a boxed slice back into a vector is O(1) with no reallocation.
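A brief sketch of that round trip, using only standard library conversions (the function names `seal` and `unseal` are made up for illustration):

```rust
/// Converts a finished buffer into a boxed slice. If length and capacity
/// are already equal, no reallocation is needed.
/// (Hypothetical helper name.)
fn seal(buf: Vec<u8>) -> Box<[u8]> {
    buf.into_boxed_slice()
}

/// Converts a boxed slice back into a vector, reusing the same allocation.
/// (Hypothetical helper name.)
fn unseal(boxed: Box<[u8]>) -> Vec<u8> {
    boxed.into_vec()
}
```

An empty Vec converts to an empty boxed slice without allocating, so the zero-size chunk needs no special handling here either.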
