Why does `Vec::from_raw_parts<T>` require T to have the same size and alignment as it was allocated with?


#1

In the from_raw_parts docs it says:

This is highly unsafe, due to the number of invariants that aren't checked:

1. ptr needs to have been previously allocated via String/Vec<T> (at least, it's highly likely to be incorrect if it wasn't).
2. ptr's T needs to have the same size and alignment as it was allocated with.
3. length needs to be less than or equal to capacity.
4. capacity needs to be the capacity that the pointer was allocated with.

I understand requirements #3 and #4 but not #1 and #2.

Here’s a Rust playground example where I use from_raw_parts to turn a Vec<u8> into a Vec<big_struct> (where big_struct has a different size from u8).

Questions:

  • Is the usage of from_raw_parts in the snippet I linked to safe? If not, why not?
  • Why does T need to have the same size and alignment as it was allocated with? Is there a simple example I can use to see how I could run into problems if T doesn’t have the same size?
  • Why does it say “ptr needs to have been previously allocated via String/Vec”?
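
For reference, a minimal sketch of the case the quoted documentation does allow: taking a Vec apart and rebuilding it with the same element type, length and capacity. The function name round_trip is hypothetical and only for illustration.

```rust
use std::mem::ManuallyDrop;

/// Takes a Vec<u32> apart and rebuilds it from its raw parts without
/// changing the element type, the length or the capacity.
fn round_trip(v: Vec<u32>) -> Vec<u32> {
    // ManuallyDrop stops the original Vec from freeing the buffer.
    let mut v = ManuallyDrop::new(v);
    let (ptr, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());
    // SAFETY: ptr, len and cap all come from a Vec<u32> that will never be
    // dropped, and T is still u32, so all four documented rules hold.
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}

fn main() {
    let v = round_trip(vec![1, 2, 3]);
    println!("{:?} (capacity {})", v, v.capacity());
}
```

Anything that changes T between the two halves of this function is where the questions above start to matter.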

#2

I’m guessing this is because de/allocations go through routines that take a Layout to compute de/allocation requirements, and layout is based on the T.

In your example, I think you’re likely leaking memory. It allocates capacity for 560 elements of type i32. Then you’re telling the allocator that you have seven 80-byte values, but that’s a smaller total allocation. When those Vecs are dropped, the excess bytes (perhaps rounded up to some allocation size, depending on the allocator) are likely leaked.
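
To make the Layout point concrete, here is a small sketch comparing the layout of 560 u8 values with that of seven 80-byte values. BigStruct is a hypothetical stand-in for the struct in the playground example (assumed here to be ten u64 fields); only the standard Layout::array call is used.

```rust
use std::alloc::Layout;

// Hypothetical stand-in for the 80-byte struct from the playground example.
#[allow(dead_code)]
#[repr(C)]
struct BigStruct {
    fields: [u64; 10],
}

/// Returns the layout of 560 bytes and the layout of 7 BigStruct values.
fn layouts() -> (Layout, Layout) {
    let as_bytes = Layout::array::<u8>(560).unwrap();
    let as_structs = Layout::array::<BigStruct>(7).unwrap();
    (as_bytes, as_structs)
}

fn main() {
    let (a, b) = layouts();
    // Same number of bytes, but the alignment the allocator is told differs.
    println!("u8 x 560:      size {} align {}", a.size(), a.align());
    println!("BigStruct x 7: size {} align {}", b.size(), b.align());
}
```

The total size matches, but the two layouts are not equal because the alignment differs, so memory allocated with one would be handed back to the allocator described by the other.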


#3

Hmm, I think my original example was unclear. I updated it so that it allocates a 560-element Vec<u8> instead (so it shouldn’t be leaking memory). Here’s the updated version.


#4

I do not believe there is any way to sensibly meet #4 while changing the size of T. And sure enough, your second example completely disregards the capacity of the original vector.

You used with_capacity, but the capacity you request there is only a lower bound on what you actually get.
…I think.
…Or perhaps I am wrong.
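
One way to sidestep the uncertainty is to read the capacity back from the Vec instead of assuming it equals what was requested. A minimal sketch (actual_capacity is a hypothetical helper name):

```rust
/// Returns the capacity a Vec<u8> actually reports after asking for `requested`.
fn actual_capacity(requested: usize) -> usize {
    let v: Vec<u8> = Vec::with_capacity(requested);
    v.capacity()
}

fn main() {
    // The documentation promises "at least" the requested capacity, so any
    // raw-parts code should use the reported value rather than the request.
    println!("requested 560, got {}", actual_capacity(560));
}
```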


Addendum: While I can’t claim to know the reasoning behind that odd rule, as a data point, I basically chickened out of writing Vec functions when I wrote slice_of_array. I figured slices are good enough for most things, and for those rare cases where you need to own the transformed data, I doubt that allocating a new vector with to_vec() is really that big a deal…

BUT… take a look at what Box<[T]>::into_vec does…


#5

This is a good point! I guess I don’t understand #4 either, then :).


#6

As to 1, it means that you can’t use, say, malloc or mmap to get the memory, because Rust’s allocator may be different from malloc or mmap. The reason that the size and alignment need to be the same (although you can kinda wonk around with the size, as long as size * capacity is the same) is that jemalloc may use alignment to do fun optimizations.


#7

Vec’s docs actually promise quite a few things: https://doc.rust-lang.org/std/vec/struct.Vec.html#guarantees


#8

thanks all!

I wrote another gist to explain to myself why alignment is important: https://play.rust-lang.org/?gist=1f70b0b98d1717b5959c8aca6dea1c53&version=stable.

my understanding now is that if you take a byte-aligned Vec<u8> and cast it to a Vec<u64>, you can end up with undefined behaviour because u64s need a stricter alignment than u8s (typically 8 bytes, I think??)
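
A small sketch of that alignment concern, separate from the gist above: it checks whether a byte buffer happens to be aligned for u64, and uses the standard slice method align_to to get a view that is safe to read as u64. The helper names are hypothetical.

```rust
use std::mem::align_of;

/// Splits a byte slice into an unaligned head, a u64-aligned body and a tail,
/// returning the three lengths (head bytes, body u64s, tail bytes).
fn split_for_u64(bytes: &[u8]) -> (usize, usize, usize) {
    // SAFETY: every bit pattern is a valid u64, and align_to only puts
    // correctly aligned memory in the middle slice.
    let (head, body, tail) = unsafe { bytes.align_to::<u64>() };
    (head.len(), body.len(), tail.len())
}

/// True if the slice's start address satisfies u64's alignment requirement.
fn is_u64_aligned(bytes: &[u8]) -> bool {
    bytes.as_ptr() as usize % align_of::<u64>() == 0
}

fn main() {
    let bytes = vec![0u8; 64];
    // The two start addresses differ by one byte, so at most one is aligned.
    println!("buffer aligned for u64:      {}", is_u64_aligned(&bytes));
    println!("buffer[1..] aligned for u64: {}", is_u64_aligned(&bytes[1..]));
    println!("(head, body, tail) = {:?}", split_for_u64(&bytes[1..]));
}
```

Nothing guarantees that a Vec<u8> buffer starts on a u64 boundary, which is why the pointer cannot simply be reinterpreted.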


#9

Right. I think there are roughly two areas here:

  1. Playing nice with the allocator in terms of the raw storage. This is what’s mostly been discussed here thus far.
  2. The transmute-like behavior of reinterpreting the bits in the storage as another type. Even with the same alignment, you can run into UB by, e.g., reinterpreting bits as an enum value that doesn’t exist. Then endianness can come into play, and all the other transmute fun.
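
As an illustration of the second point, here is a sketch of the checked alternative to reinterpreting bytes as an enum. Color and decode are hypothetical names; the idea is only that each byte is validated before it becomes an enum value.

```rust
use std::convert::TryFrom;

#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(u8)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

impl TryFrom<u8> for Color {
    type Error = u8;

    /// Accepts only the three discriminants that actually exist.
    fn try_from(b: u8) -> Result<Self, u8> {
        match b {
            0 => Ok(Color::Red),
            1 => Ok(Color::Green),
            2 => Ok(Color::Blue),
            other => Err(other),
        }
    }
}

/// Converts bytes to colors, returning the first invalid byte as the error.
fn decode(bytes: &[u8]) -> Result<Vec<Color>, u8> {
    bytes.iter().map(|&b| Color::try_from(b)).collect()
}

fn main() {
    println!("{:?}", decode(&[0, 1, 2]));
    // Reinterpreting this buffer as Vec<Color> would be UB because of the 7.
    println!("{:?}", decode(&[0, 7]));
}
```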

There’s certainly UB to be had if the compiler believes storage is aligned one way but in reality it’s aligned differently. Here’s a cool example of that.


#10

I’m still curious about alignment-preserving transforms though. I.e. Vec<T> to Vec<[T; 3]>, or to Vec<Newtype<[T; 3]>>. (Notice that arrays always have the same alignment as the element type, and newtypes always preserve both size and align).

From the looks of how functions like the Box<[T]> conversions are implemented, and how Layout is ultimately nothing more than a (size, align) pair, it seems that this conversion should not cause trouble under the current implementation of Vec (assuming that you adjust cap appropriately to a conservative value). The question is whether the explicit guarantees of Vec and its API are strong enough for us to prove that this conversion is supported and that it will not break in the foreseeable future.
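
The "(size, align) pair" observation can be checked directly. A minimal sketch, using u32 as an example element type and a hypothetical helper name, that compares the layout of 3n elements with the layout of n three-element arrays:

```rust
use std::alloc::Layout;

/// True if 3*n u32 values and n [u32; 3] values describe the same allocation.
fn same_layout(n: usize) -> bool {
    let flat = Layout::array::<u32>(3 * n).unwrap();
    let grouped = Layout::array::<[u32; 3]>(n).unwrap();
    // Layout equality compares exactly the size and the alignment.
    flat == grouped
}

fn main() {
    println!("layouts match for n = 7: {}", same_layout(7));
}
```

This only shows that the allocator would be given identical information in both cases; it says nothing about whether Vec promises to keep computing its layout this way.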


#11

https://doc.rust-lang.org/alloc/raw_vec/struct.RawVec.html#method.with_capacity mentions that it creates heap storage with exact alignment and capacity for [T; cap] - that seems pretty precise.


#12

RawVec::with_capacity is unstable, however, and Vec makes no guarantees about how it uses RawVec.


#13

That’s true, although my understanding is that RawVec will become stable at some point (probably when the allocator API does). After all, a systems language needs, at some point, to allow more flexible/controlled memory management without resorting to unsafe :).

However, Vec docs have this:

The fact you can turn it into a Box<[T]> without any layout adjustments implies the same thing. I don’t know whether this counts as a guarantee, however.


#14

I am also not certain what that guarantees. IMO it very strongly suggests that Vec::from_raw_parts(p, len, len) is a valid operation, and I think it’s likely that unsafe code authors may have read it that way (in which case it could be dangerous to let that change!).

I find it interesting how it explicitly says len == capacity. This makes me wonder if it is still unsafe to convert a vec that has any unused remainder. (perhaps due to its potential use as scratch space).

But in any case, now that I think more about it, I think the final piece of the puzzle is simply to show that Box<[T]> can safely be turned into Box<[[T; 3]]>. There are already safe APIs to handle the rest.
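
A sketch of what that route could look like, going through Box<[T]> so that the capacity is exactly the length. This assumes the from_raw_parts requirements are met whenever the alignment and the total allocation size in bytes are unchanged, which is precisely the point the thread has not settled. group_by_three is a hypothetical name and u32 is just an example element type.

```rust
/// Regroups a Vec<u32> into a Vec<[u32; 3]> without copying the elements.
/// Returns the input unchanged if its length is not a multiple of 3.
fn group_by_three(v: Vec<u32>) -> Result<Vec<[u32; 3]>, Vec<u32>> {
    if v.len() % 3 != 0 {
        return Err(v);
    }
    if v.is_empty() {
        // Nothing is allocated, so there is nothing to reinterpret.
        return Ok(Vec::new());
    }
    // into_boxed_slice drops excess capacity, so the allocation now holds
    // exactly len elements.
    let boxed: Box<[u32]> = v.into_boxed_slice();
    let n = boxed.len() / 3;
    let ptr = Box::into_raw(boxed) as *mut [u32; 3];
    // SAFETY (under the assumption stated above): [u32; 3] has the alignment
    // of u32, and n * 12 bytes equals the len * 4 bytes that were allocated.
    Ok(unsafe { Vec::from_raw_parts(ptr, n, n) })
}

fn main() {
    println!("{:?}", group_by_three(vec![1, 2, 3, 4, 5, 6]));
    println!("{:?}", group_by_three(vec![1, 2]));
}
```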


#15

If it has a remainder then this will leak it since the Box won’t know about that memory. Otherwise that remainder is dead space - Box can’t make use of it, and since the Vec is gone, nothing will use it.

I’m pretty sure layout-wise this is guaranteed.
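
For comparison, the safe conversion avoids the remainder question altogether: Vec::into_boxed_slice discards excess capacity before handing the buffer to the Box. A small sketch (the helper name is hypothetical):

```rust
/// Returns the capacity of a Vec after a round trip through Box<[i32]>.
fn capacity_after_boxing(v: Vec<i32>) -> usize {
    // into_boxed_slice shrinks the allocation to exactly v.len() elements.
    let boxed: Box<[i32]> = v.into_boxed_slice();
    boxed.into_vec().capacity()
}

fn main() {
    let mut v = Vec::with_capacity(10);
    v.extend([1, 2, 3].iter().copied());
    println!("capacity before: {}", v.capacity());
    println!("capacity after:  {}", capacity_after_boxing(v));
}
```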


#16

(oops, up until now I just assumed allocators were responsible for remembering how much memory was allocated for any given pointer; but that would be kind of silly, wouldn’t it?)


#17

I think the truth is slightly nuanced - it depends on whether sized deallocation is used, which I believe Rust uses as much as possible. I might be wrong on this - will need to look into it (or someone more familiar with internals of how de/allocation is hooked up may clarify).
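
For what it’s worth, the stable std::alloc functions do take the layout again on deallocation, so the caller, not the allocator, is the one expected to remember it. A minimal sketch of that API shape (alloc_write_free is a hypothetical name; how Vec uses these functions internally is not shown here):

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Allocates room for one u64, writes and reads it, then frees the memory.
fn alloc_write_free(value: u64) -> u64 {
    let layout = Layout::new::<u64>();
    // SAFETY: the layout has non-zero size, the pointer is checked for null,
    // and dealloc receives the same pointer and the same layout as alloc.
    unsafe {
        let ptr = alloc(layout) as *mut u64;
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        ptr.write(value);
        let read_back = ptr.read();
        // Passing a different size or alignment here would be undefined behaviour.
        dealloc(ptr as *mut u8, layout);
        read_back
    }
}

fn main() {
    println!("{}", alloc_write_free(42));
}
```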