Is it safe to recreate an empty vector with `Vec::from_raw_parts` with a different type of the same size and alignment?

The documentation of Vec::from_raw_parts mentions some invariants that must be upheld:

  • ptr needs to have been previously allocated via String/Vec<T> (at least, it's highly likely to be incorrect if it wasn't).
  • ptr's T needs to have the same size and alignment as it was allocated with.
  • length needs to be less than or equal to capacity.
  • capacity needs to be the capacity that the pointer was allocated with.

I think this list is incomplete because the following code upholds all the invariants listed above, but is clearly undefined behaviour:

fn get_vec_with_dangling_pointer<'a>() -> Vec<&'a u32> {
    let x : u32 = 1;
    let mut vec = Vec::new();
    vec.push(&x);
    
    let ptr = vec.as_mut_ptr() as usize;
    let length = vec.len();
    let capacity = vec.capacity();
    std::mem::forget(vec);
    unsafe {
        Vec::from_raw_parts(ptr as *mut &u32, length, capacity)
    }
}


fn main() {
    let y = 2;
    let mut vec = get_vec_with_dangling_pointer();
    vec.push(&y);
    let _no_good = *vec[0];   // Dereference a variable that is already out of scope.
    println!("{}", _no_good); // Prints a different number each time on my machine.
}

I think the requirement "ptr's T needs to have the same size and alignment as it was allocated with" is not strict enough. Because Vec::from_raw_parts can be used to transmute, there is at least the additional requirement that it must be safe to transmute the first length elements of the original vector to the type T.

I would guess that this is also sufficient. But I have read that compiler optimization can have unexpected side effects when dealing with references, so I am not sure.

I want to re-use the memory allocated by a Vec for another Vec that holds elements of the same size and alignment, but not necessarily of the same type. Can I safely do this if I clear the original Vec before extracting the ptr, length (which is then 0) and capacity?

Miri seems to be ok as long as you clear the Vec. Your example triggers an error that disappears if you add a clear before pushing &y.

Also, I don't think the doc has to change. Your code is UB not because of from_raw_parts but because of the cast: without it you would keep the lifetime and get a compile error. This is called an unbounded lifetime.

Yes, this requirement, I think, was deemed so obvious that it hasn't been written:

  • ptr needs to point to a valid [T; length]

This implies (is equivalent to?) another requirement, which is missing from the docs:

  • the first length elements that ptr points to must have been initialized; in other words, length has the same requirements as the argument of Vec::set_len (with old_len = 0)

Else we would be able to define

use std::mem;

fn non_unsafe_uninitialized<T> ()
    -> T
{
    let mut v = Vec::<T>::with_capacity(1);
    let (ptr, capacity) = (v.as_mut_ptr(), v.capacity());
    assert!(capacity >= 1);
    mem::forget(v);
    let mut v = unsafe { Vec::from_raw_parts(ptr, 1, capacity) };
    v.pop().unwrap() // yields an uninitialized T, which is UB
}
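
For contrast, here is a minimal sketch of the same round trip with the length requirement upheld: the slot is written before a length of 1 is claimed. The function name init_then_rebuild and the value 42 are only illustrative, not something from the docs.

```rust
use std::mem::ManuallyDrop;

// Hypothetical helper: same shape as the function above, but the element
// is initialized before `from_raw_parts` is told that it exists.
fn init_then_rebuild() -> u32 {
    // ManuallyDrop keeps the first Vec from freeing the buffer.
    let mut v = ManuallyDrop::new(Vec::<u32>::with_capacity(1));
    let (ptr, capacity) = (v.as_mut_ptr(), v.capacity());
    unsafe {
        // Initialize slot 0 through the raw pointer.
        ptr.write(42);
        // SAFETY: `ptr` and `capacity` come from a Vec<u32>, and the first
        // element has just been initialized, so a length of 1 is valid.
        let mut v = Vec::from_raw_parts(ptr, 1, capacity);
        v.pop().unwrap()
    }
}
```

Calling init_then_rebuild() hands back the value that was written, and the rebuilt Vec frees the buffer when it goes out of scope.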

Can I conclude that if length == 0, I do not need to worry about what data type was stored before in the vector?

In particular, can you tell if the following code is free from undefined behaviour?

fn main() {
    let mut x : u32 = 1;
    let mut mut_vec : Vec<&mut u32> = Vec::new();
    mut_vec.push(&mut x);
    mut_vec.clear(); // now mut_vec.len() == 0
    
    let mut_ptr = mut_vec.as_mut_ptr() as usize;
    let length = mut_vec.len();
    let capacity = mut_vec.capacity();
    std::mem::forget(mut_vec);
    
    let ptr = mut_ptr as *mut &u32;
    let mut vec : Vec<&u32> = unsafe {
        Vec::from_raw_parts(ptr, length, capacity)
    };
    
    let y = 1;
    vec.push(&y);
    let other = &y;
}

(Note: I start with a Vec<&mut u32> and end with a Vec<&u32>.)

My concern is that the "uniqueness" of a pointer cannot change, and hence this code may translate to something like the following LLVM IR, which I write as "pseudo-LLVM".

let x : int = 1;
let v : unique pointer to unique pointer = malloc(4);
*v = &x;
let y : int = 1;
*v = &y;
let other = &y; // maybe undefined behaviour?
                // `other` and `*v` both refer to the same memory, but `*v` is supposed to be a unique pointer

You change its type; from that point onward you have a Vec<&u32> and play by its rules. LLVM doesn't really matter here: as long as you follow the rules of Rust's abstract machine you're fine, and if something still goes wrong, that is a bug in Rust or LLVM.
You might also want to use transmute; it would be "cleaner" in my opinion (playground).

Using transmute is UB here -- Vec is a repr(Rust) struct, so there's no guarantee that Vec<A> and Vec<B> have compatible layouts (unless A = B, obviously).

Using from_raw_parts is the correct approach.
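
A minimal sketch of that approach for the "clear first, then reuse" case discussed above. The helper name recycle is made up for illustration, and the asserts are just one way to enforce the size and alignment requirement.

```rust
use std::mem::{self, ManuallyDrop};

/// Hypothetical helper: reuse the allocation of `v` for an empty `Vec<U>`.
fn recycle<T, U>(mut v: Vec<T>) -> Vec<U> {
    // The allocator must later see the same layout it handed out.
    assert_eq!(mem::size_of::<T>(), mem::size_of::<U>());
    assert_eq!(mem::align_of::<T>(), mem::align_of::<U>());

    // Drop every T, so that no old element is ever read as a U.
    v.clear();

    // Keep the destructor of `v` from freeing the buffer.
    let mut v = ManuallyDrop::new(v);
    let (ptr, capacity) = (v.as_mut_ptr(), v.capacity());

    // SAFETY: the buffer was allocated by a Vec<T>, T and U have the same
    // size and alignment, and with a length of 0 nothing has to be initialized.
    unsafe { Vec::from_raw_parts(ptr.cast::<U>(), 0, capacity) }
}
```

With this, a cleared Vec<&mut u32> can be turned into a Vec<&u32> of the same capacity, and the pointer never takes a detour through usize.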

Due to allocator restrictions, you must also check that capacity == 0 or that the size and alignment of T are the same as those of U before converting between Vec<T> and Vec<U>.

i.e.

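fn try_convert<T, U>(vec: Vec<T>) -> Vec<U> {
    use std::alloc::Layout;

    // Reuse the buffer only if there is one and the allocator would see the same layout.
    if vec.capacity() > 0 && Layout::new::<T>() == Layout::new::<U>() {
        // ManuallyDrop keeps the buffer from being freed when `vec` goes out of scope.
        let mut vec = std::mem::ManuallyDrop::new(vec);
        let len = vec.len();
        let cap = vec.capacity();
        // The pointer has to be cast, otherwise this builds a Vec<T>, not a Vec<U>.
        let ptr = vec.as_mut_ptr() as *mut U;
        // Only sound if the first `len` values of T are also valid values of U;
        // a public API would have to be `unsafe fn` or clear the vector first.
        unsafe {
            Vec::from_raw_parts(ptr, len, cap)
        }
    } else {
        Vec::new()
    }
}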
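
To get a feel for what the Layout comparison accepts and rejects, here is a small sketch. The helper same_layout is hypothetical; it only wraps the comparison used above.

```rust
use std::alloc::Layout;

// Hypothetical helper: true if T and U have the same size and alignment,
// which is what the allocator cares about when the buffer is freed or resized.
fn same_layout<T, U>() -> bool {
    Layout::new::<T>() == Layout::new::<U>()
}
```

For example, same_layout::<&u32, &mut u32>() holds, while same_layout::<u32, [u8; 4]>() does not on the usual targets: the sizes match but the alignments differ. Equal layouts only satisfy the allocator; they say nothing about whether existing elements can be reinterpreted.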

Thanks all for your answers!

I experimented with miri as @leudz suggested. Everything goes fine as long as I do not try to reallocate. When I try to reallocate, miri gives an error:

Code:

fn main() {
    let ptr: usize;
    let capacity: usize;
    {
        let mut x = 1;
        let mut v = Vec::<&u32>::new();
        v.push(&mut x);
        v.clear();
        assert_eq!(0, v.len());
        ptr = v.as_mut_ptr() as usize;
        capacity = v.capacity();
        std::mem::forget(v);
    }
    {
        let a = 1;
        let new_ptr = ptr as *mut &u32;
        let mut vec = unsafe {
            Vec::from_raw_parts(new_ptr, 0, capacity)
        };
        vec.push(&a);
        vec.push(&a);
    }
}

Error:

pieter@machine:/tmp/testje$ cargo +nightly-2019-09-03-x86_64-unknown-linux-gnu miri run
   Compiling testje v0.1.0 (/tmp/testje)
error[E0080]: Miri evaluation error: a memory access tried to interpret some bytes as a pointer
   --> [...]/src/liballoc/alloc.rs:128:5
    |
128 |     __rust_realloc(ptr, layout.size(), layout.align(), new_size)
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Miri evaluation error: a memory access tried to interpret some bytes as a pointer
    |
    = note: inside call to `std::alloc::realloc` at [...]/src/liballoc/alloc.rs:187:22
    = note: inside call to `<std::alloc::Global as std::alloc::Alloc>::realloc` at [...]/src/rust/src/liballoc/raw_vec.rs:668:21
    = note: inside call to `alloc::raw_vec::RawVec::<&u32>::reserve_internal` at [...]/src/liballoc/raw_vec.rs:495:15
    = note: inside call to `alloc::raw_vec::RawVec::<&u32>::reserve` at [...]/src/liballoc/vec.rs:459:9
    = note: inside call to `std::vec::Vec::<&u32>::reserve` at [...]/src/liballoc/vec.rs:1104:13
note: inside call to `std::vec::Vec::<&u32>::push` at src/main.rs:21:9
   --> src/main.rs:21:9
    |
21  |         vec.push(&a);
    |         ^^^^^^^^^^^^

Now for my use case, I do not need to reallocate. But it may point at a larger problem. Any ideas?

Edit: miri does not complain when ptr has type *mut i8 instead of usize. Maybe that's a better idea.
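
A sketch of that variant, under the assumption that the only change is how the pointer is stored between the two halves: it stays a raw pointer (*mut i8) and is cast back, so it is never turned into an integer. The function name reuse_and_grow is made up, and the loop pushes one element more than the old capacity to make sure the buffer has to be reallocated.

```rust
use std::mem::ManuallyDrop;

// Hypothetical repro: returns the final length of the rebuilt Vec.
fn reuse_and_grow() -> usize {
    // Stash the buffer as a raw pointer instead of a usize.
    let (ptr, capacity): (*mut i8, usize) = {
        let mut x = 1;
        let mut v = Vec::<&mut u32>::new();
        v.push(&mut x);
        v.clear();
        // ManuallyDrop plays the role of mem::forget here.
        let mut v = ManuallyDrop::new(v);
        (v.as_mut_ptr().cast::<i8>(), v.capacity())
    };

    let a = 1;
    // SAFETY: the buffer comes from a Vec of references, which have the same
    // size and alignment as &u32, and a length of 0 needs no initialized data.
    let mut vec: Vec<&u32> =
        unsafe { Vec::from_raw_parts(ptr.cast::<&u32>(), 0, capacity) };

    // One push more than the old capacity forces a reallocation.
    for _ in 0..=capacity {
        vec.push(&a);
    }
    vec.len()
}
```

I have not listed Miri output for this version; the point is only that the pointer keeps its pointer type all the way through.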