Typecasting a Vector without allocating a new one

Magalame · October 29, 2019, 1:42am

Hi everyone!

I'm working on a legacy program that commonly makes use of a struct containing a Vec<usize> field. I have to integrate some interface to a C library, which uses arrays of int, so I'd need a Vec<i32> eventually. The solution I currently have is the following:

use std::os::raw::c_int;

extern "C" {
    fn fn_from_c_lib(v1: *const c_int, v2: *const c_int, length: c_int) -> c_int;
}

pub fn call_external_fn(vec_v1: Vec<usize>, vec_v2: Vec<usize>, length: usize) -> usize {
    // Each collect() allocates a fresh Vec<c_int>; these are the allocations I'd like to avoid.
    let tmp_v1: Vec<c_int> = vec_v1.iter().map(|x| *x as c_int).collect();
    let tmp_v2: Vec<c_int> = vec_v2.iter().map(|x| *x as c_int).collect();

    let v1_ptr = tmp_v1.as_ptr();
    let v2_ptr = tmp_v2.as_ptr();

    let result_from_lib = unsafe { fn_from_c_lib(v1_ptr, v2_ptr, length as c_int) };

    result_from_lib as usize
}

The problem is that the inputs can be quite long and call_external_fn is called very frequently, so I'd like to avoid unnecessary allocations as much as possible and get rid of the temporaries tmp_v1 and tmp_v2. How could I go about that?

Thanks!

RustyYato · October 29, 2019, 2:13am

You can use my vec-utils crate to do just this. It will try to reuse the allocation wherever possible (i.e. if the layouts of the input and output types match).

There is an RFC for transmuting a Vec<_> in this way, but it hasn't been accepted yet. (My crate was spawned as part of the discussion of this RFC.)

https://github.com/rust-lang/rfcs/pull/2756

Magalame · October 29, 2019, 2:22am

Oh sweet, just what I needed. Thanks a lot

mbrubeck · October 29, 2019, 2:39am

On platforms where usize and c_int have the same layout, you don't even need to map over the vec; you can just cast pointers from one type to the other. So if your program is running on platforms like 32-bit Linux or Windows, where c_int and usize are both 32 bits, you can do something like this:

use std::alloc::Layout;
use std::os::raw::c_int;

pub fn call_external_fn(v1: Vec<usize>, v2: Vec<usize>, len: usize) -> usize {
    // Same size and alignment: the usize buffers can be read as c_int in place.
    if Layout::new::<c_int>() == Layout::new::<usize>() {
        unsafe {
            fn_from_c_lib(
                v1.as_ptr() as *const c_int,
                v2.as_ptr() as *const c_int,
                len as c_int) as usize
        }
    } else {
        // TODO: insert conversions here if you care about other platforms
        unimplemented!()
    }
}
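
One way the TODO branch above could be filled in is sketched below. This is only an illustration, not part of the original reply: the helper name as_c_ints is hypothetical, and it assumes a truncating `as` cast is acceptable when the layouts differ. It borrows the existing buffer when usize and c_int share a layout and falls back to an allocating conversion otherwise, so on 64-bit targets it still allocates.

```rust
use std::alloc::Layout;
use std::borrow::Cow;
use std::os::raw::c_int;

/// Hypothetical helper: view a `usize` slice as `c_int`s without copying
/// when the two types have the same layout, and convert otherwise.
fn as_c_ints(v: &[usize]) -> Cow<'_, [c_int]> {
    if Layout::new::<c_int>() == Layout::new::<usize>() {
        // SAFETY: size and alignment match, and every bit pattern is a
        // valid integer, so the same memory can be read as `c_int`.
        let ints = unsafe { std::slice::from_raw_parts(v.as_ptr() as *const c_int, v.len()) };
        Cow::Borrowed(ints)
    } else {
        // Layouts differ (e.g. 64-bit targets): fall back to an allocating,
        // truncating conversion, as in the original question.
        Cow::Owned(v.iter().map(|&x| x as c_int).collect())
    }
}
```

The returned Cow dereferences to a &[c_int], so its as_ptr() can be passed to the C function in either case.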

Magalame · October 29, 2019, 2:52am

Unfortunately I'm on a 64-bit platform, but it can certainly be useful in the future.

RustyYato · October 29, 2019, 3:34am

No, it is not. Vec<_> uses the global allocator, which is controlled through the GlobalAlloc trait. Note the docs on the dealloc method:

Safety

This function is unsafe because undefined behavior can result if the caller does not ensure all of the following:

ptr must denote a block of memory currently allocated via this allocator,

layout must be the same layout that was used to allocate that block of memory,

Specifically point 2: the layouts must match. Here "layout" means core::alloc::Layout, which, importantly, includes alignment.

You can use my crate to do something like this:

#[repr(C, align(4))]
struct Bytes([u8; 4]);

let x: Vec<u32> = vec![...];

let x = unsafe {
    x.map(|x| std::mem::transmute::<u32, Bytes>(x))
};

This reuses the allocation, because u32 and Bytes have the same size and alignment.

It is never possible to go from a Vec<u32> to a Vec<u8>, even if you make sure the length and capacity are corrected, because of the alignment requirements.

Edit: as @pcpthm noted, you can temporarily reuse the Vec<u32> as a Vec<u8>, but it is extremely dangerous to do so, as even a push can reallocate your vector and break the invariants set by GlobalAlloc.
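
The alignment point can be checked directly with Layout. The sketch below is an illustration only: the helper names buffer_layout and same_buffer_layout are hypothetical, and it assumes a target where u32 has 4-byte alignment (true on common platforms). It compares the layout the allocator would see for a buffer of u32 with the layouts for Bytes and for u8.

```rust
use std::alloc::Layout;

#[allow(dead_code)]
#[repr(C, align(4))]
struct Bytes([u8; 4]);

/// Layout the allocator sees for a buffer of `n` elements of `T`.
fn buffer_layout<T>(n: usize) -> Layout {
    Layout::array::<T>(n).expect("layout size overflow")
}

/// True when a buffer of `n_a` elements of `A` has exactly the same
/// size and alignment as a buffer of `n_b` elements of `B`.
fn same_buffer_layout<A, B>(n_a: usize, n_b: usize) -> bool {
    buffer_layout::<A>(n_a) == buffer_layout::<B>(n_b)
}
```

Under the stated assumption, same_buffer_layout::<u32, Bytes>(8, 8) holds, while same_buffer_layout::<u32, u8>(8, 32) does not: the byte sizes agree, but the alignments (4 versus 1) differ, which is what dealloc would object to.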

AurevoirXavier · October 29, 2019, 3:44am

Aha, I'd like to extend this question a little.

If I have a VecDeque and want to turn it into a Vec (or the other way around), what's the most efficient way?

pcpthm · October 29, 2019, 3:51am

I agree with this point.
However, you can reuse an allocation temporarily, if my understanding is correct.


// Let's restrict to non-Drop types for simplicity
fn with_reused_buffer<T: Copy, U: Copy, R>(
    mut v: Vec<T>,
    mut f: impl FnMut(T) -> U,
    callback: impl FnOnce(&mut [U]) -> R,
) -> R {
    assert!(std::mem::size_of::<U>() <= std::mem::size_of::<T>());
    assert!(std::mem::align_of::<T>() % std::mem::align_of::<U>() == 0);

    let ptr_t = v.as_mut_ptr();
    let len = v.len();
    let cap = v.capacity();
    std::mem::forget(v);

    let ptr_u = ptr_t as *mut U;

    for i in 0..len {
        unsafe {
            // Because size_of<U> <= size_of<T>, ptr_t[i] is still not overwritten
            let t = ptr_t.add(i).read();
            let u = f(t);
            ptr_u.add(i).write(u);
        }
    }

    let ret = callback(unsafe { std::slice::from_raw_parts_mut(ptr_u, len) });

    unsafe {
        // is it okay?
        drop(Vec::from_raw_parts(
            ptr_t as *mut std::mem::MaybeUninit<T>,
            len,
            cap,
        ))
    }

    ret
}

type c_int = i32;
extern "C" {
    fn fn_from_c_lib(v1: *const c_int, v2: *const c_int, length: c_int) -> c_int;
}

pub fn call_external_fn(vec_v1: Vec<usize>, vec_v2: Vec<usize>, length: usize) -> usize {
    with_reused_buffer(
        vec_v1,
        |x| x as c_int,
        move |v1| {
            with_reused_buffer(
                vec_v2,
                |x| x as c_int,
                move |v2| unsafe {
                    fn_from_c_lib(v1.as_ptr(), v2.as_ptr(), length as c_int)
                } as usize,
            )
        },
    )
}

An RAII guard could probably be used instead of a callback-based API.
Is this really okay?
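
A minimal sketch of what such an RAII variant might look like follows. It is not from the original post: the type name ReusedBuffer is hypothetical, it keeps the same restriction to Copy types and the same size and alignment assertions, and it only exposes the converted values as a shared slice. The guard frees the buffer with its original element type when it is dropped.

```rust
use std::marker::PhantomData;
use std::mem::{self, ManuallyDrop};
use std::ops::Deref;

/// Hypothetical guard: a buffer allocated as `Vec<T>` that currently holds
/// `len` values of type `U`. It is never handed out as a `Vec<U>`.
pub struct ReusedBuffer<T: Copy, U: Copy> {
    ptr: *mut T,
    len: usize,
    cap: usize,
    _holds: PhantomData<U>,
}

impl<T: Copy, U: Copy> ReusedBuffer<T, U> {
    pub fn new(v: Vec<T>, mut f: impl FnMut(T) -> U) -> Self {
        assert!(mem::size_of::<U>() <= mem::size_of::<T>());
        assert!(mem::align_of::<T>() % mem::align_of::<U>() == 0);

        let mut v = ManuallyDrop::new(v);
        // Build the guard before calling `f`, so the buffer is still freed
        // if `f` panics part-way through.
        let guard = ReusedBuffer {
            ptr: v.as_mut_ptr(),
            len: v.len(),
            cap: v.capacity(),
            _holds: PhantomData,
        };
        let ptr_u = guard.ptr as *mut U;
        for i in 0..guard.len {
            unsafe {
                // Because size_of::<U>() <= size_of::<T>(), writing U number i
                // never touches a T slot that has not been read yet.
                let t = guard.ptr.add(i).read();
                ptr_u.add(i).write(f(t));
            }
        }
        guard
    }
}

impl<T: Copy, U: Copy> Deref for ReusedBuffer<T, U> {
    type Target = [U];
    fn deref(&self) -> &[U] {
        unsafe { std::slice::from_raw_parts(self.ptr as *const U, self.len) }
    }
}

impl<T: Copy, U: Copy> Drop for ReusedBuffer<T, U> {
    fn drop(&mut self) {
        // Rebuild the Vec with its original element type so the allocator
        // sees the original layout. The length is 0 because the slots no
        // longer hold valid `T` values (and `T: Copy` needs no drop anyway).
        unsafe { drop(Vec::from_raw_parts(self.ptr, 0, self.cap)) }
    }
}
```

With U = c_int, the guard's as_ptr() (through Deref) yields a *const c_int that stays valid for as long as the guard is alive, so two guards could be created before the unsafe call and dropped after it.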

RustyYato · October 29, 2019, 3:53am

Yes, you can temporarily reuse the allocation. But this is hard to track and almost impossible to encapsulate in general, so I didn't put it in vec-utils: as soon as you can get a unique reference to the vec, you can call std::mem::replace and drop it.

pcpthm · October 29, 2019, 4:00am

When sizeof(usize) = 8 and sizeof(c_int) = 4, the conversion is not a no-op, even assuming all elements are less than c_int::MAX.

[0x12345678, 0x12345678] : [usize] is stored as (assuming little-endian)
[78 56 34 12 00 00 00 00 78 56 34 12 00 00 00 00]
while
[0x12345678, 0x12345678] : [c_int] is stored as
[78 56 34 12 78 56 34 12]
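
The two byte sequences above can be reproduced with a small sketch. The helper names are hypothetical, u64 stands in for a 64-bit usize and i32 for c_int, and to_le_bytes is used so the result is little-endian regardless of the host.

```rust
/// Little-endian bytes of a slice of 64-bit values (stand-in for 64-bit `usize`).
fn le_bytes_of_u64s(values: &[u64]) -> Vec<u8> {
    let mut out = Vec::with_capacity(values.len() * 8);
    for v in values {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Little-endian bytes of a slice of 32-bit values (stand-in for `c_int`).
fn le_bytes_of_i32s(values: &[i32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(values.len() * 4);
    for v in values {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}
```

The 64-bit version carries four zero bytes after each value, so the elements have to be moved closer together; a pointer cast alone would make the C side read those zero bytes as separate elements.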

RustyYato · October 29, 2019, 4:19am

The best way to convert from one to the other is Vec::from(vec_deque) and VecDeque::from(vec).
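
A short sketch of the round trip, for illustration (the function name push_front_via_deque is hypothetical). As far as I know, both From conversions reuse the existing buffer rather than allocating a new one, although converting a VecDeque back into a Vec may need to move elements within the buffer if the ring does not start at the beginning.

```rust
use std::collections::VecDeque;

/// Hypothetical example: go Vec -> VecDeque -> Vec using the `From` impls.
fn push_front_via_deque(v: Vec<i32>, front: i32) -> Vec<i32> {
    let mut deque = VecDeque::from(v); // takes ownership of the Vec's buffer
    deque.push_front(front);           // may grow the buffer if it is full
    Vec::from(deque)                   // elements come out in logical (front-to-back) order
}
```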

system · January 27, 2020, 4:19am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
