The title really says it all. The layout of char and u32 should be identical so it would make sense if this were possible, but I know that there is no guarantee that the same struct with different generics will have the same field order, so I could also believe that it's UB. I tried it and it worked, but that doesn't prove anything.
Even if it is safe, I know transmute should only be used if there is no reasonable alternative. Is there a better way to convert a Vec<Vec<char>> to Vec<Vec<u32>> in O(1) time and O(1) space? If not, what about O(n) time and O(1) space?
I can't think of something that uses constant space. You can convert from Vec<char> for Vec<u32>, using the technique suggested in the docs for Vec::from_raw_parts.
x.into_iter().map(|v| v.into_iter().map(|c| c as u32).collect()).collect()
manages to run in O(1) space [1], and optimize out the inner loop, so it’s O(n) where n is the size of the outer vec only. E.g., you can see that these two functions generate the same assembly:
I don't know the answer to your question. I would think the sane thoughts you are and have the same question.
But I'm wondering whether there is any other approach that doesn't involve doubt the conversation at all? If there is an external interface that requires specifically Vec<Vec<u32>> I guess you have no choice, but if you have some level of control, could you maybe adjust the interface to use iterators or to be generic in T: Copy + Into<u32>?
I don’t think we are, no. As I’ve noted (after an edit) in the footnote, there’s specialization on .collect() that makes chains such as .into_iter().map(…).collect() avoid allocating a new Vec (as long as the size and alignment of the item types match, which they should between Vec<char> and Vec<u32>).
The assembly generated for this function really confuses me, because an otherwise identical function will compile very differently when x: Vec<u32>. Any idea why one of them appears to be O(1) and the other appears to be O(n)? They are both equivalent to just returning their parameter, and I don't see why the compiler can figure that out for some types but not others.
No, it isn't. It's not safe to transmute any Vec. The layout of Vec is #[repr(Rust)], so it's not guaranteed that Vec<U> has the same layout as Vec<T> for U != T.
In general, you should basically never transmute pointer-like types, and instead use pointer casting to perform a single level of type punning. (I don't believe there is ever a good reason and/or a sound way to hop through multiple levels of pointers while type punning. Just do the naïve thing first, and try to be clever and unsafe only when it makes a measurable difference. More often than not, it doesn't.)
Assuming I didn’t make any mistakes, this code should be sound, and it optimizes properly
use std::mem::ManuallyDrop;
pub fn convert(x: Vec<Vec<char>>) -> Vec<Vec<u32>> {
// fallback implementation in case of weird future changes to (unstable) `Vec` layout
if std::alloc::Layout::new::<Vec<u32>>() != std::alloc::Layout::new::<Vec<char>>() {
return x.into_iter().map(|v| v.into_iter().map(|c| c as u32).collect()).collect();
}
// inline the definition of (still unstable) Vec::into_raw_parts:
let mut x = ManuallyDrop::new(x);
let (x_ptr, len, cap) = (x.as_mut_ptr(), x.len(), x.capacity());
// ^ end of Vec::into_raw_parts
let end = unsafe {
x_ptr.add(len)
};
let mut x_ptr_i = x_ptr;
while x_ptr_i < end {
unsafe {
let v = x_ptr_i.read();
// inline the definition of (still unstable) Vec::into_raw_parts:
let mut v = ManuallyDrop::new(v);
let (v_ptr, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());
// ^ end of Vec::into_raw_parts
let v_ptr = v_ptr.cast::<u32>();
let v = Vec::from_raw_parts(v_ptr, len, cap);
x_ptr_i.cast::<Vec<u32>>().write(v);
x_ptr_i = x_ptr_i.add(1);
}
}
unsafe {
Vec::from_raw_parts(x_ptr.cast::<Vec<u32>>(), len, cap)
}
}