Why does Vec::from_raw_parts require same size and not same size*capacity?

The Vec::from_raw_parts documentation outlines some requirements for safe use:

  • ptr needs to have been previously allocated via String/Vec<T> (at least, it’s highly likely to be incorrect if it wasn’t).
  • T needs to have the same size and alignment as what ptr was allocated with. (T having a less strict alignment is not sufficient, the alignment really needs to be equal to satisfy the dealloc requirement that memory must be allocated and deallocated with the same layout.)
  • length needs to be less than or equal to capacity.
  • capacity needs to be the capacity that the pointer was allocated with.

I understand most of them, but I'm curious about the "T needs to have the same size" requirement. Why isn't the requirement that size*capacity is the same?

For example, if I have a:

#[repr(C)]
struct MyStruct {
  a: f32,
  b: f32,
}

what would be the problem with reinterpreting Vec<MyStruct> as Vec<f32>, if we double len and capacity? Both should have the same Layout when deallocated (and we can assert at runtime that alignment and size indeed matches our expectations, to be sure). What exactly would break / cause UB when doing this?

Thanks!

I think the wording is stricter than necessary and there isn't a problem with your usage. It would be best to send a PR to update the docs.

Doesn't the Layout also include the size of the element, meaning that it cannot be changed?

The alignment and size checks can be at compile time. But you need a run-time check to ensure the product of size and capacity matches, as there is no guaranteed way to get a precise capacity.

Vec allocates a contiguous blob, not a blob per element. The Layout of the blob can't change. The argument is that element size times capacity being the same is enough to ensure this.
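To make that argument concrete, here is a minimal sketch of the Vec<MyStruct> to Vec<f32> conversion from the question. The function name into_f32_vec is made up for illustration, and the sketch assumes the relaxed reading discussed in this thread (equal alignment, and element size times capacity unchanged) rather than the stricter wording quoted above:

```rust
use std::mem::{align_of, size_of, ManuallyDrop};

#[repr(C)]
struct MyStruct {
    a: f32,
    b: f32,
}

// Hypothetical helper: reinterpret the buffer of a Vec<MyStruct> as f32s.
fn into_f32_vec(v: Vec<MyStruct>) -> Vec<f32> {
    // Compile-time checks: the alignment must be equal, and one MyStruct
    // must be exactly two f32s, so the block layout stays the same.
    const _: () = assert!(align_of::<MyStruct>() == align_of::<f32>());
    const _: () = assert!(size_of::<MyStruct>() == 2 * size_of::<f32>());

    // Keep the original Vec from freeing the buffer we are about to reuse.
    let mut v = ManuallyDrop::new(v);
    let ptr = v.as_mut_ptr();
    let len = v.len();
    let cap = v.capacity();

    // SAFETY (under the assumption stated above): same allocation, same
    // alignment, and size_of::<f32>() * (cap * 2) equals
    // size_of::<MyStruct>() * cap. Every f32 bit pattern is a valid f32.
    unsafe { Vec::from_raw_parts(ptr.cast::<f32>(), len * 2, cap * 2) }
}
```

Wrapping the input in ManuallyDrop before taking the pointer avoids a double free if anything between the two steps were to panic.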

Ah, so the relevant Layout is the size and alignment of the entire memory block. I'll keep that in mind; it might be useful later if I ever have to do some type punning.
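A small sketch of that observation, using std::alloc::Layout directly. The function name block_layouts_match is hypothetical, and this only compares the layouts one would compute for the whole block; it does not inspect what Vec does internally:

```rust
use std::alloc::Layout;

#[repr(C)]
struct MyStruct {
    a: f32,
    b: f32,
}

// Hypothetical check: does a block of `capacity` MyStructs have the same
// Layout (total size and alignment) as a block of `capacity * 2` f32s?
fn block_layouts_match(capacity: usize) -> bool {
    let as_structs = Layout::array::<MyStruct>(capacity);
    let as_floats = capacity.checked_mul(2).map(Layout::array::<f32>);
    match (as_structs, as_floats) {
        (Ok(a), Some(Ok(b))) => a == b,
        // Overflow while computing either layout: treat as "no match".
        _ => false,
    }
}
```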

This is a great point — so in general it makes more sense to go from Vec<MyStruct> to Vec<f32>, but not the other way around.

After all when you create a Vec<f32> you're not guaranteed that its capacity will be properly divisible to turn it into a Vec<MyStruct>. But turning Vec<MyStruct> into Vec<f32> should generally always work, since in that case you're multiplying capacity, not dividing it. Am I thinking about this correctly?
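For the reverse direction, a hedged sketch with the run-time checks mentioned earlier in the thread. The name try_into_struct_vec is invented for this example; it hands the original Vec back when the length or capacity is odd instead of guessing:

```rust
use std::mem::{align_of, size_of, ManuallyDrop};

#[repr(C)]
struct MyStruct {
    a: f32,
    b: f32,
}

// Hypothetical helper: reinterpret a Vec<f32> as a Vec<MyStruct>, but only
// when both len and capacity divide evenly by two.
fn try_into_struct_vec(v: Vec<f32>) -> Result<Vec<MyStruct>, Vec<f32>> {
    const _: () = assert!(align_of::<MyStruct>() == align_of::<f32>());
    const _: () = assert!(size_of::<MyStruct>() == 2 * size_of::<f32>());

    // Run-time check: an odd capacity would change the block's total size.
    if v.len() % 2 != 0 || v.capacity() % 2 != 0 {
        return Err(v);
    }

    let mut v = ManuallyDrop::new(v);
    let ptr = v.as_mut_ptr();
    let len = v.len();
    let cap = v.capacity();

    // SAFETY (assuming the relaxed rule discussed in this thread):
    // size_of::<MyStruct>() * (cap / 2) equals size_of::<f32>() * cap,
    // the alignment is equal, and any pair of f32s is a valid MyStruct.
    Ok(unsafe { Vec::from_raw_parts(ptr.cast::<MyStruct>(), len / 2, cap / 2) })
}
```

Returning the untouched Vec<f32> in the Err case lets the caller fall back to copying, or call shrink_to_fit and retry, rather than losing the data.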

Thanks all for the help. I might make a PR to clarify the docs a bit.

I generally find that it is more robust to transmute only slices into the vector than to transmute the vector itself. For example, to view a Vec<f32> as an &[MyStruct], you could do this:

#[repr(C)]
struct MyStruct {
  a: f32,
  b: f32,
}

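fn from_f32_slice(slice: &[f32]) -> &[MyStruct] {
    let len = slice.len();
    let ptr = slice.as_ptr();

    // SAFETY: MyStruct is #[repr(C)] with two f32 fields, so it has the same
    // alignment as f32 and twice the size. `len / 2` rounds down, so a
    // trailing odd f32 is simply left out of the view.
    unsafe { std::slice::from_raw_parts(ptr.cast(), len / 2) }
}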

The above doesn't care about the capacity being even at all.

Thanks, I'm aware and I've been doing that already, but for my particular use case it's really better to change the Vec itself.

That said, the documentation for Vec::from_raw_parts should probably include something like your comment, because I agree that in many cases transmuting slices is preferable. I'll include that when making a PR!

Thanks all, I posted a PR here: https://github.com/rust-lang/rust/pull/95016 Feedback welcome!

This is an interesting point, by the way. I dug a little deeper, and it seems that it is guaranteed by GlobalAlloc; it's just that the new unstable Allocator API allows the allocator to give you a bigger buffer than you asked for. So I suspect that we can cast in the reverse direction if we check that the allocator_api unstable feature is disabled (is there a way to do that?) or if the global allocator is used (from the type). Does that sound right?

That's ok, though, because it just needs to "fit".

You can always deallocate with the same layout that you passed to allocate, even if the allocator actually gave you more capacity than you originally requested.

Is the GlobalAlloc's guarantee stable or just "currently existing", on the other hand?

Huh, very interesting. Without a stronger Vec-specific guarantee, I would at least keep an assert around. Although...

...if it were clarified in the Vec docs that

  • something you passed to reserve_exact satisfies "the same size as the pointer was allocated with"

then I would be confident. Perhaps roll it into your PR?

This is all fascinating. Here's something more: I wonder whether the reserve_exact docs are correct. They say "Note that the allocator may give the collection more space than it requests. Therefore, capacity can not be relied upon to be precisely minimal."

However, as far as I can tell, even if the allocator gives you a bigger buffer than you need, this has no effect on Vec::capacity — it's still the same as you requested. We seem to completely throw away the information of how big the actual buffer was that was allocated. If that is indeed so, then the "Therefore, capacity can not be relied upon to be precisely minimal" might be wrong — perhaps you can rely on Vec::capacity to be precisely minimal, you just can't rely on the actual memory usage to be precisely minimal.

I'm probably missing something here though! :)

For those interested, these are the slice and Vec casting functions that I came up with. Feedback welcome! See "Casting improvements" by janpaul123, Pull Request #144 in Zaplib/zaplib on GitHub.

I strongly suggest using one of the existing libraries, since I immediately see unsoundness here.

For example, cast_slice::<u8, bool>(&[3, 4, 5]) is unsound because of the validity invariants involved (see Two Kinds of Invariants: Safety and Validity).

Here's the error from MIRI, which you can see for yourself in playground (tools -> miri):

error: Undefined Behavior: type validation failed: encountered 0x04, but expected a boolean
  --> src/main.rs:35:8
   |
35 |     if x[1] {
   |        ^^^^ type validation failed: encountered 0x04, but expected a boolean
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
           
   = note: inside `main` at src/main.rs:35:8
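
As a hedged illustration of the validity invariant at play, one way to make this particular cast sound is to validate every byte before reinterpreting it. The function name try_cast_bools is made up for this sketch and is not taken from the PR or from any library:

```rust
// Hypothetical helper: view bytes as bools only if every byte is 0 or 1,
// which are the only valid bit patterns for bool.
fn try_cast_bools(bytes: &[u8]) -> Option<&[bool]> {
    if bytes.iter().all(|&b| b <= 1) {
        // SAFETY: bool has size 1 and alignment 1, the same as u8, and we
        // just checked that every byte is a valid bool (0x00 or 0x01).
        Some(unsafe { std::slice::from_raw_parts(bytes.as_ptr().cast::<bool>(), bytes.len()) })
    } else {
        None
    }
}
```

With this check, the input &[3, 4, 5] is rejected up front instead of producing a slice that holds invalid bools.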

Example of an existing library: bytemuck.

You'll still need some logic to extend the transformations to Vec, as the bytemuck version doesn't support differently sized types yet. Your PR is basically about clarifying the validity of doing so.

Do look at their current implementation and read the comments about, for example, the perils of std::mem::forget in this scenario.

I know that is unsound, see the comment further down. I think Rust should implement something like an Arbitrary trait so we can catch this in the type system. I'll probably implement a derive macro or so instead in the meantime.
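
A rough sketch of what catching this in the type system could look like, using an unsafe marker trait. The names PlainData and cast_slice here are hypothetical and written only for this example; this is not the code from the PR and not the API of any existing crate:

```rust
use std::mem::{align_of, size_of, size_of_val};

/// Hypothetical marker trait for this sketch.
///
/// # Safety
/// Implementors must have no padding bytes, and every bit pattern must be
/// a valid value of the type.
unsafe trait PlainData: Copy {}

unsafe impl PlainData for u8 {}
unsafe impl PlainData for u32 {}
unsafe impl PlainData for f32 {}
// Deliberately no impl for bool: only 0x00 and 0x01 are valid bools, so
// cast_slice::<u8, bool> fails to compile instead of being unsound.

fn cast_slice<A: PlainData, B: PlainData>(src: &[A]) -> Option<&[B]> {
    if size_of::<A>() == 0 || size_of::<B>() == 0 {
        return None;
    }
    let bytes = size_of_val(src);
    // The byte length must split evenly into elements of B.
    if bytes % size_of::<B>() != 0 {
        return None;
    }
    // The start of the slice must be suitably aligned for B.
    let ptr = src.as_ptr();
    if (ptr as usize) % align_of::<B>() != 0 {
        return None;
    }
    // SAFETY: size and alignment were checked above, and PlainData promises
    // that any bytes form valid values of B.
    Some(unsafe { std::slice::from_raw_parts(ptr.cast::<B>(), bytes / size_of::<B>()) })
}
```

The unsafe burden moves to whoever writes an impl of the marker trait, which is also where a derive macro could do its checking.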