Typecasting a Vector without allocating a new one

Hi everyone!

I'm working on a legacy program that commonly makes use of a struct containing a Vec<usize> field. I have to integrate some interface to a C library, which uses arrays of int, so I'd need a Vec<i32> eventually. The solution I currently have is the following:

use std::os::raw::c_int;

extern "C" {
    fn fn_from_c_lib(v1: *const c_int, v2: *const c_int, length: c_int) -> c_int;
}

pub fn call_external_fn(vec_v1: Vec<usize>, vec_v2: Vec<usize>, length: usize) -> usize {
    // Each collect() allocates a fresh Vec<c_int>; these are the allocations I'd like to avoid.
    let tmp_v1: Vec<c_int> = vec_v1.iter().map(|x| *x as c_int).collect();
    let tmp_v2: Vec<c_int> = vec_v2.iter().map(|x| *x as c_int).collect();

    let v1_ptr = tmp_v1.as_ptr();
    let v2_ptr = tmp_v2.as_ptr();

    // The C function takes and returns c_int, so convert at the boundary.
    let result_from_lib = unsafe { fn_from_c_lib(v1_ptr, v2_ptr, length as c_int) };

    result_from_lib as usize
}

The problem is that the input can be quite long and call_external_fn is called very frequently, so I'd like to avoid unnecessary allocations as much as possible and get rid of the allocations for tmp_v1 and tmp_v2. How could I go about that?

Thanks!

You can use my vec-utils crate to do just this. It will try to reuse the allocation wherever possible (i.e. if the layouts of the input and output types match).

There is an RFC for transmuting a Vec<_> in this way, but it hasn't been accepted yet. (My crate was spawned as part of the discussion of this RFC.)

https://github.com/rust-lang/rfcs/pull/2756
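
For the simplest case, where the two element types have exactly the same size and alignment, the idea can be sketched with the standard library alone. This is only an illustration and not how vec-utils is implemented; the helper name u32_vec_into_i32_vec is made up for this example.

```rust
use std::mem::{align_of, size_of, ManuallyDrop};

/// Hypothetical helper: reinterpret a Vec<u32> as a Vec<i32> without allocating.
/// This is only sound because u32 and i32 have the same size and alignment,
/// so the buffer is later freed with the layout it was allocated with.
fn u32_vec_into_i32_vec(v: Vec<u32>) -> Vec<i32> {
    assert_eq!(size_of::<u32>(), size_of::<i32>());
    assert_eq!(align_of::<u32>(), align_of::<i32>());

    // Keep the original Vec from freeing the buffer we are about to hand over.
    let mut v = ManuallyDrop::new(v);
    let (ptr, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());

    // Same pointer, length and capacity; only the element type changes.
    unsafe { Vec::from_raw_parts(ptr as *mut i32, len, cap) }
}
```

Values above i32::MAX come out negative, since the bits are reinterpreted rather than converted.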


Oh sweet, just what I needed. Thanks a lot!

On platforms where usize and c_int have the same layout, you don't even need to map over the vec; you can just cast pointers from one type to the other. So if your program is running on platforms like 32-bit Linux or Windows, where c_int and usize are both 32 bits, you can do something like this:

use std::alloc::Layout;
use std::os::raw::c_int;

pub fn call_external_fn(v1: Vec<usize>, v2: Vec<usize>, len: usize) -> usize {
    if Layout::new::<c_int>() == Layout::new::<usize>() {
        unsafe {
            fn_from_c_lib(
                v1.as_ptr() as *const c_int,
                v2.as_ptr() as *const c_int,
                len as c_int) as usize
        }
    } else {
        // TODO: insert conversions here if you care about other platforms
        unimplemented!()
    }
}
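
For the fallback branch, one possible shape for the conversion is sketched below. It still allocates, so it only covers platforms where the layouts differ, and the helper name to_c_ints is an assumption for this example. It uses try_from instead of `as` so that values that don't fit in a c_int are reported rather than silently truncated.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;
use std::os::raw::c_int;

/// Hypothetical fallback: convert each usize to c_int, failing on overflow.
/// Unlike the pointer cast, this allocates a new Vec<c_int>.
fn to_c_ints(values: &[usize]) -> Result<Vec<c_int>, TryFromIntError> {
    values.iter().map(|&x| c_int::try_from(x)).collect()
}
```

The caller can then pass `converted.as_ptr()` to the C function, or handle the error however suits the program.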

Unfortunately I'm on a 64-bit platform, but this could certainly be useful in the future.

No, it is not. Vec<_> uses the global allocator, which is controlled through the GlobalAlloc trait. Note the docs on the dealloc method:

Safety

This function is unsafe because undefined behavior can result if the caller does not ensure all of the following:

  • ptr must denote a block of memory currently allocated via this allocator,
  • layout must be the same layout that was used to allocate that block of memory,

Specifically point 2: the layouts must match, where "layout" means core::alloc::Layout. Importantly, this includes alignment.
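
To make the alignment point concrete, here is a small sketch comparing the layout of a buffer of n u32 values with the layout of the same number of bytes viewed as u8. The function name buffer_layouts is made up for this example, and the alignment of 4 for u32 is an assumption that holds on mainstream targets.

```rust
use std::alloc::Layout;

/// Hypothetical helper: the layout of `n` u32s and of the same bytes as u8s.
fn buffer_layouts(n: usize) -> (Layout, Layout) {
    let as_u32 = Layout::array::<u32>(n).unwrap();
    let as_u8 = Layout::array::<u8>(n * 4).unwrap();
    (as_u32, as_u8)
}
```

Both layouts have the same size, but their alignments differ, so they are not the same Layout and dealloc must not be called with one when the block was allocated with the other.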

You can use my crate to do something like this:

#[repr(C, align(4))]
struct Bytes([u8; 4]);

let x: Vec<u32> = vec![...];

let x = unsafe {
    x.map(|x| std::mem::transmute::<u32, Bytes>(x))
};

This reuses the allocation.

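It is never possible to go from a Vec<u32> to a Vec<u8>, even if you make sure the length and capacity are corrected, because of the alignment requirements: the buffer was allocated with the alignment of u32 and would be deallocated with the alignment of u8.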

Edit: as @pcpthm noted, you can temporarily reuse the Vec<u32> as a Vec<u8>, but it is extremely dangerous to do so, as even a push can reallocate your vector and break the invariants set by GlobalAlloc.

Aha, I want to extend this question a little more.

If I have a VecDeque (or a Vec) and I want to transform it into a Vec (or a VecDeque), what's the most efficient way?

I agree with this point.
However, you can reuse an allocation temporarily, if my understanding is correct.


// Let's restrict to non-Drop types for simplicity
fn with_reused_buffer<T: Copy, U: Copy, R>(
    mut v: Vec<T>,
    mut f: impl FnMut(T) -> U,
    callback: impl FnOnce(&mut [U]) -> R,
) -> R {
    assert!(std::mem::size_of::<U>() <= std::mem::size_of::<T>());
    assert!(std::mem::align_of::<T>() % std::mem::align_of::<U>() == 0);

    let ptr_t = v.as_mut_ptr();
    let len = v.len();
    let cap = v.capacity();
    std::mem::forget(v);

    let ptr_u = ptr_t as *mut U;

    for i in 0..len {
        unsafe {
            // Because size_of<U> <= size_of<T>, ptr_t[i] is still not overwritten
            let t = ptr_t.add(i).read();
            let u = f(t);
            ptr_u.add(i).write(u);
        }
    }

    let ret = callback(unsafe { std::slice::from_raw_parts_mut(ptr_u, len) });

    unsafe {
        // is it okay?
        drop(Vec::from_raw_parts(
            ptr_t as *mut std::mem::MaybeUninit<T>,
            len,
            cap,
        ))
    }

    ret
}

type c_int = i32;
extern "C" {
    fn fn_from_c_lib(v1: *const c_int, v2: *const c_int, length: c_int) -> c_int;
}

pub fn call_external_fn(vec_v1: Vec<usize>, vec_v2: Vec<usize>, length: usize) -> usize {
    with_reused_buffer(
        vec_v1,
        |x| x as c_int,
        move |v1| {
            with_reused_buffer(
                vec_v2,
                |x| x as c_int,
                move |v2| unsafe {
                    fn_from_c_lib(v1.as_ptr(), v2.as_ptr(), length as c_int)
                } as usize,
            )
        },
    )
}

An RAII guard could probably be used instead of the callback-based API.
Is this really okay?
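
For what it's worth, the RAII variant might look roughly like the sketch below. The type name ReusedBuffer is made up for this example, it keeps the same Copy restriction and the same size and alignment asserts as the callback version, and it has not been audited for soundness. It only hands out a shared slice, so the buffer cannot be swapped out or grown through the guard.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of, ManuallyDrop, MaybeUninit};
use std::ops::Deref;

/// Hypothetical guard: owns a buffer allocated as Vec<T> but exposes it as [U].
pub struct ReusedBuffer<T, U> {
    ptr: *mut T,
    len: usize,
    cap: usize,
    _marker: PhantomData<(Vec<T>, U)>,
}

impl<T: Copy, U: Copy> ReusedBuffer<T, U> {
    pub fn new(v: Vec<T>, mut f: impl FnMut(T) -> U) -> Self {
        assert!(size_of::<U>() <= size_of::<T>());
        assert!(align_of::<T>() % align_of::<U>() == 0);

        let mut v = ManuallyDrop::new(v);
        // Build the guard first so the buffer is still freed if `f` panics.
        let guard = ReusedBuffer {
            ptr: v.as_mut_ptr(),
            len: v.len(),
            cap: v.capacity(),
            _marker: PhantomData,
        };
        let ptr_u = guard.ptr as *mut U;
        for i in 0..guard.len {
            unsafe {
                // U is no larger than T, so element i of T is read before
                // any write can reach it.
                let t = guard.ptr.add(i).read();
                ptr_u.add(i).write(f(t));
            }
        }
        guard
    }
}

impl<T, U> Deref for ReusedBuffer<T, U> {
    type Target = [U];
    fn deref(&self) -> &[U] {
        unsafe { std::slice::from_raw_parts(self.ptr as *const U, self.len) }
    }
}

impl<T, U> Drop for ReusedBuffer<T, U> {
    fn drop(&mut self) {
        // Free the buffer with the layout of T, which it was allocated with.
        unsafe {
            drop(Vec::from_raw_parts(
                self.ptr as *mut MaybeUninit<T>,
                self.len,
                self.cap,
            ));
        }
    }
}
```

With this, the call site would create two guards and pass `buf.as_ptr()` for each to the C function, and both buffers are freed when the guards go out of scope.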

Yes, you can temporarily reuse the allocation. But this is hard to track and almost impossible to encapsulate in general, so I didn't make it part of vec-utils (because as soon as you can get a unique reference to the vec, you can call std::mem::replace and drop it).

When sizeof(usize) = 8 and sizeof(c_int) = 4, the conversion is not a no-op, even assuming all elements are less than c_int::MAX.

[0x12345678, 0x12345678] : [usize] is stored as (assuming little-endian)
[78 56 34 12 00 00 00 00 78 56 34 12 00 00 00 00]
while
[0x12345678, 0x12345678] : [c_int] is stored as
[78 56 34 12 78 56 34 12]
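
The two byte sequences above can be reproduced with to_le_bytes. In this sketch u64 stands in for a 64-bit usize and i32 for c_int so that the result does not depend on the platform; the function names are made up for this example.

```rust
/// Little-endian bytes of a slice of u64 (standing in for 64-bit usize).
fn le_bytes_of_u64s(values: &[u64]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Little-endian bytes of a slice of i32 (standing in for c_int).
fn le_bytes_of_i32s(values: &[i32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}
```

The first produces 16 bytes with four zero bytes after each value, the second 8 bytes with no padding, which is why a plain pointer cast would make the C side read the wrong elements.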

The best way to convert from one to the other is Vec::from(vec_deque) and VecDeque::from(vec).
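
A small sketch of the round trip follows; the function name prepend_via_deque is made up for this example. As far as I know, the standard library docs describe VecDeque::from(Vec) as reusing the Vec's buffer, and Vec::from(VecDeque) as never needing to reallocate, although it may have to move elements if the ring buffer does not start at the beginning of the allocation.

```rust
use std::collections::VecDeque;

/// Hypothetical helper: push one element to the front of a Vec by going
/// through a VecDeque and back.
fn prepend_via_deque(v: Vec<i32>, front: i32) -> Vec<i32> {
    // Vec -> VecDeque
    let mut dq = VecDeque::from(v);
    // push_front itself may grow the buffer if it is already full.
    dq.push_front(front);
    // VecDeque -> Vec; elements come out in logical (front-to-back) order.
    Vec::from(dq)
}
```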

