Is this code sound for arbitrary implementations of Transmutable? Are there any types where this would not be a valid transformation, or any known API where this would break the safety assumptions?
Whether a mutable pointer is invalidated or not is defined at the memory model level (e.g. stacked borrows), and could cause problems. The library consideration is, as I understand it, does Vec have the same strong ownership qualities (noalias) as Box does?[1] As far as I know that's an undecided question, and thus it would be unsound to rely on it not being true in the implementation today.
More discussion here and here, though they're sort of lengthy.
I see, as_mut_ptr()/from_raw_parts() definitely seems sounder than an actual transmute.
Anyway, I'm now realizing a simpler and strictly more general version would be:
pub fn transmute_vec<T, U>(mut vec: Vec<T>) -> Vec<U> {
    const { assert!(std::mem::size_of::<T>() == std::mem::size_of::<U>()) };
    const { assert!(std::mem::align_of::<T>() == std::mem::align_of::<U>()) };
    vec.clear();
    let mut vec = std::mem::ManuallyDrop::new(vec);
    let len = vec.len();
    let cap = vec.capacity();
    let ptr = vec.as_mut_ptr();
    unsafe { Vec::from_raw_parts(ptr.cast(), len, cap) }
}
Per the docs of Vec::from_raw_parts, really the core requirement is that the size and alignment of the allocation are correct. The actual type doesn't even matter. Not sure how I didn't think of this earlier.
Edit: technically it's unclear whether the pointer returned from as_mut_ptr() is valid to pass to from_raw_parts() if the vector didn't allocate, but I imagine it's probably fine.
EDIT: this part is wrong; see the discussion below. It turns out that in the source of Vec::clear, a panic in T::drop will cause the remainder of the elements to be leaked. However, the Vec will still free the buffer, making the change worthwhile.
When I said "problems" in the first version I really meant segfaults etc., but that didn't really make sense since the rest of the sentence was about UB. I edited my message (mostly to make it defer to yours), and thanks for the description and the links.
Well, the pointer if there's no allocation is just the alignment transmuted into a pointer type (RawVecInner::new_in). That means that the pointer was definitely not allocated with the global allocator. The global allocator can definitely be changed to one that will never accept[1] pointers in, say, the zero page and just never return them from allocating methods.
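A small sketch of the no-allocation case, relying only on what the Vec documentation states: Vec::new() does not allocate, and the pointer it hands out is still non-null and aligned. The helper name empty_vec_facts is made up for illustration. The exact address of the dangling pointer is an implementation detail, so it is deliberately not checked here.

```rust
/// Hypothetical helper: reports the capacity of an unallocated `Vec<u64>`
/// and whether its pointer is non-null and aligned.
fn empty_vec_facts() -> (usize, bool, bool) {
    let v: Vec<u64> = Vec::new();
    let ptr = v.as_ptr();
    // No allocation happened, so `ptr` did not come from the global allocator.
    (v.capacity(), !ptr.is_null(), ptr.is_aligned())
}

fn main() {
    let (cap, non_null, aligned) = empty_vec_facts();
    assert_eq!(cap, 0);
    assert!(non_null);
    assert!(aligned);
}
```

A capacity of 0 for a non-zero-sized element type is therefore a cheap way to detect that there is no buffer to hand to from_raw_parts.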
You can simply fix this by adding an assert!(vec.capacity() > 0), but take care to put it before you move the Vec into the ManuallyDrop.[2]
Also, ZSTs could be a problem since Vec<ZST>'s capacity is set to usize::MAX and the pointer is dangling. assert!(size_of::<T>() != 0) should work here. Since the whole purpose of your function is to reuse the capacity of the vector, then ZSTs should never be used with it (since then the allocation would be zero-sized).
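Putting the two suggested checks together, a hardened variant might look like the sketch below. This is not a reviewed implementation: the name transmute_vec_checked is made up, the ZST check is done at compile time here rather than with a runtime assert!, and panicking on an unallocated vector is just one possible policy (returning Vec::new() would be another).

```rust
use std::mem::{align_of, size_of, ManuallyDrop};

/// Sketch: drops the elements of `vec` and reuses its buffer for a `Vec<U>`.
pub fn transmute_vec_checked<T, U>(mut vec: Vec<T>) -> Vec<U> {
    const { assert!(size_of::<T>() == size_of::<U>()) };
    const { assert!(align_of::<T>() == align_of::<U>()) };
    // Reject ZSTs: a Vec<ZST> never allocates, so there is nothing to reuse.
    const { assert!(size_of::<T>() != 0) };
    vec.clear();
    // Checked before the move into ManuallyDrop, so a failed assert
    // still lets `vec` be dropped normally.
    assert!(vec.capacity() > 0, "vector has no allocation to reuse");
    let mut vec = ManuallyDrop::new(vec);
    let cap = vec.capacity();
    let ptr = vec.as_mut_ptr();
    // SAFETY (as argued in this thread): the buffer was allocated by the
    // global allocator with the size and alignment of `cap` elements of T,
    // which match those of U, and the new length is 0.
    unsafe { Vec::from_raw_parts(ptr.cast::<U>(), 0, cap) }
}

fn main() {
    let mut v: Vec<u32> = Vec::with_capacity(8);
    v.extend([1, 2, 3]);
    let cap = v.capacity();
    let w: Vec<f32> = transmute_vec_checked(v);
    assert!(w.is_empty());
    assert_eq!(w.capacity(), cap);
}
```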
I find this point quite interesting. Can we provably assume this? I had sketched a crate for the remainder of the operation well before these kinds of associated types were possible. If such a safe trait is sound then it'd be easy to construct the token that the crate relies on for the Vec's clear-then-reassemble operation. The reuse of the storage only depends on type layout, that trait is the only question where lifetimes play a role specifically.
EDIT: the point I'm trying to make here is wrong, see the discussion below
pub fn clear(&mut self) {
    let elems: *mut [T] = self.as_mut_slice();
    // SAFETY:
    // - `elems` comes directly from `as_mut_slice` and is therefore valid.
    // - Setting `self.len` before calling `drop_in_place` means that,
    //   if an element's `Drop` impl panics, the vector's `Drop` impl will
    //   do nothing (leaking the rest of the elements) instead of dropping
    //   some twice.
    unsafe {
        self.len = 0;
        ptr::drop_in_place(elems);
    }
}
This is the source code of Vec::clear. Vec::drop will still drop a slice, but one of zero length. clear() has to set the length to zero before dropping the elements to prevent them from being potentially dropped twice (which is what the second bullet of the SAFETY comment says).
Is the problem that the comment in clear is misleading? It is talking about dropping of the elements by the Vec Drop impl, but this makes no sense to me.
I tried (Playground) and it doesn't work, but now I don't know why. Let me walk through what I thought my playground would do:
vec.clear() is called
It creates a pointer *mut [T] to the buffer
vec.len is set to 0
The slice pointer is drop_in_placed, which starts by dropping the first element. It prints "Dropping Printer #1", and then panic!s.
That panic triggers an unwind, which unwinds up out of Printer::drop, [T]::drop, Vec::clear, and then main, causing the Vec to be dropped.
Since the len was previously set to zero, all Vec::drop does is free the underlying buffer of the Vec without dropping any elements.
main unwinds, exiting the program.
In this model, Printer(2) is never dropped. But in reality, it is dropped while unwinding and causes an abort. What did I not think of above? Please correct me.
(Since elems in Vec::clear is a raw pointer, it won't drop the slice. The backtrace reports Printer(2) being dropped in Vec::clear though, is there some strange magic in [T]::drop/drop_in_place?)
The slice pointer is drop_in_placed, which starts by dropping the first element. It prints "Dropping Printer #1", and then panic!s.
(Differences start here)
Unwinding is caught
Attempt to drop the slice continues, dropping the second element. It prints "Dropping Printer #2", and then panic!s
A backtrace is printed and the process is aborted (not unwound) due to panicking a second time
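The steps above can be observed without an abort if only the first element panics. The sketch below is a variation on the playground idea, not the original playground: Printer follows the naming used in this thread, while the DROPPED counter and the function name are made up for illustration. It counts how many destructors ran and checks the length that clear() left behind.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter of how many Printer destructors have run.
static DROPPED: AtomicUsize = AtomicUsize::new(0);

struct Printer(u32);

impl Drop for Printer {
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
        // Only the first element panics, so there is no double panic.
        if self.0 == 1 {
            panic!("panic while dropping Printer #1");
        }
    }
}

/// Returns (number of destructors run, vec.len() after the failed clear).
fn clear_with_panicking_drop() -> (usize, usize) {
    DROPPED.store(0, Ordering::SeqCst);
    let mut vec = vec![Printer(1), Printer(2), Printer(3)];
    let result = catch_unwind(AssertUnwindSafe(|| vec.clear()));
    assert!(result.is_err()); // the panic from Printer #1 propagated out of clear()
    (DROPPED.load(Ordering::SeqCst), vec.len())
}

fn main() {
    let (dropped, len) = clear_with_panicking_drop();
    // All three destructors ran, and the length was already 0.
    assert_eq!((dropped, len), (3, 0));
}
```

So the remaining elements are dropped by the slice's drop glue while unwinding, not by Vec's Drop impl; setting len to 0 first only guarantees that nothing gets dropped a second time.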
I don't know where the implementation is, but see the description in the PR I linked above (which is basically the bullet points above). After looking for a while: this is probably the code, which I think came from here.
Dropping things is the main thing that happens during unwind, so where's the safety concern? The panic happened while dropping one of the slice elements. That element isn't touched again[1], but the logic continues by dropping the remaining elements, whose destruction hadn't even started yet.
The only possible "bad" effect is that another panic will abort then, but panicking destructors are discouraged anyway, and apparently the choice here is that it's more desirable to risk an abort than unnecessarily leaking values.
It's the same everywhere. On panic, local variables continue to be dropped. On panic of one field in a struct, other fields continue to be dropped. On panic in a hash map, other elements continue to be dropped. Exceptions are very rare.
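A minimal sketch of the struct-field and local-variable cases, assuming the default panic=unwind setting. The types Noisy and Pair and the log are made up for illustration; each destructor records its name before possibly panicking.

```rust
use std::cell::RefCell;
use std::panic::{catch_unwind, AssertUnwindSafe};

// Hypothetical type that logs its name when dropped and optionally panics.
struct Noisy<'a> {
    name: &'static str,
    panics: bool,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
        if self.panics {
            panic!("panic while dropping {}", self.name);
        }
    }
}

#[allow(dead_code)]
struct Pair<'a> {
    first: Noisy<'a>,
    second: Noisy<'a>,
}

/// Returns the names in the order their destructors ran.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    let result = catch_unwind(AssertUnwindSafe(|| {
        let _local = Noisy { name: "local", panics: false, log: &log };
        let pair = Pair {
            first: Noisy { name: "first", panics: true, log: &log },
            second: Noisy { name: "second", panics: false, log: &log },
        };
        drop(pair); // `first` panics; `second` and `_local` are dropped while unwinding
    }));
    assert!(result.is_err());
    log.into_inner()
}

fn main() {
    assert_eq!(drop_order(), ["first", "second", "local"]);
}
```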