Passing vector of vectors buffer to C

I'm trying to pass a vector of vectors buffer from Rust to C, adapting a Stack Overflow answer ("How do I return an vector of dynamic length in a pub extern "C" fn?") that covers a single vector. Somewhere my mental model of what's going on is obviously wrong, and I get a segmentation fault when I try to access the second vector. Can anyone point me in the right direction on how to solve the issue?

I'm new to Rust and only very occasionally dabble in C so apologies if I'm missing something obvious:

lib.rs

#[repr(C)]
pub struct DynArray {
    array: *mut *mut i32,
    length: libc::size_t,
    v_lengths: *mut i32,
}

#[no_mangle]
pub extern "C" fn rust_alloc() -> DynArray {
    let mut v: Vec<Vec<i32>> = vec![vec![1, 2, 3], vec![1, 2, 3]];
    let mut l: Vec<i32> = vec![3, 3];

    let result = DynArray {
        array: v.as_mut_ptr() as *mut *mut i32,
        length: v.len() as _,
        v_lengths: l.as_mut_ptr(),
    };

    std::mem::forget(v);
    std::mem::forget(l);

    result
}

#[no_mangle]
pub extern "C" fn rust_free(array: DynArray) {
    if !array.array.is_null() {
        unsafe {
            Box::from_raw(array.array);
        }
    }
}

main.c

#include <stdio.h>
#include <stdint.h>

struct DynArray {
    int32_t** array;
    size_t length;
    int32_t* v_lengths;
};

struct DynArray rust_alloc();
void rust_free(struct DynArray);

int main () {
  struct DynArray vec = rust_alloc();
  size_t n = vec.length;
  printf("length of containing vector = %zu\n", n);
  int32_t **values = vec.array;
  int32_t *lengths = vec.v_lengths;
  for (size_t i = 0; i < n; i++) {
    printf("length of component %zu = %d\n", i, lengths[i]);
  }
  printf("first component entry 0 = %d\n", values[0][0]);
  printf("first component entry 1 = %d\n", values[0][1]);
  printf("first component entry 2 = %d\n", values[0][2]);
  
  //segfault here
  printf("second component entry 0 = %d\n", values[1][0]);
  rust_free(vec);
}

Unfortunately, a Vec<i32> is not represented in the same way as a *mut i32, so the v.as_mut_ptr() as *mut *mut i32 cast is incorrect. v.as_mut_ptr() points at an array of Vec<i32> values, each of which stores a pointer, a length and a capacity, so you probably ended up using the length or capacity of a vector as a pointer. Note that Vec<i32> does not have #[repr(C)], so its layout is unspecified and you can't rely on the fields being in any particular order (Rust is allowed to reorder the fields of structs).

You would have to create an extern function that C can call to ask Rust to convert a pointer to a Rust vector into a pointer to the contents. Alternatively create a Vec<*mut i32>.
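A minimal sketch of the Vec<*mut i32> alternative, assuming each inner vector is first converted to a boxed slice so that its length alone is enough to free it later. The names into_row_pointers and free_row_pointers are hypothetical helpers, not part of the original code:

```rust
/// Hypothetical helper: leaks every inner vector and returns one raw
/// pointer and one length per row. Nothing here is freed automatically.
pub fn into_row_pointers(rows: Vec<Vec<i32>>) -> (Vec<*mut i32>, Vec<usize>) {
    let mut ptrs = Vec::with_capacity(rows.len());
    let mut lens = Vec::with_capacity(rows.len());
    for row in rows {
        // A boxed slice has no spare capacity, so (pointer, length)
        // fully describes the allocation.
        let boxed: Box<[i32]> = row.into_boxed_slice();
        lens.push(boxed.len());
        ptrs.push(Box::into_raw(boxed) as *mut i32);
    }
    (ptrs, lens)
}

/// Hypothetical counterpart that rebuilds and drops every row.
///
/// Safety: `ptrs` and `lens` must come from one call to
/// `into_row_pointers` and must not have been freed already.
pub unsafe fn free_row_pointers(ptrs: Vec<*mut i32>, lens: Vec<usize>) {
    for (ptr, len) in ptrs.into_iter().zip(lens) {
        // Rebuild the fat pointer, then let Box free the row.
        let raw: *mut [i32] = std::ptr::slice_from_raw_parts_mut(ptr, len);
        unsafe { drop(Box::from_raw(raw)) };
    }
}
```

The outer Vec<*mut i32> would still have to be handed to C (and reclaimed) the same way as any other vector, but its elements are now plain pointers that C can index as values[i][j].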

Thank you. That makes sense. I'll play around and see where I get. Cheers.

I couldn't work out exactly how to implement the above, so in the end I've flattened the nested vector into a single one with concat() and passed this along with the individual component lengths, i.e.

#[repr(C)]
pub struct DynArray {
    array: *mut i32,
    length: libc::size_t,
    v_lengths: *mut i32,
}

#[no_mangle]
pub extern "C" fn rust_alloc() -> DynArray {
    let v: Vec<Vec<i32>> = vec![vec![1, 2, 3], vec![1, 2, 3]];
    let mut v2: Vec<i32> = v.concat();
    let mut l: Vec<i32> = vec![3, 3];

    let result = DynArray {
        array: v2.as_mut_ptr(),
        length: v.len() as _,
        v_lengths: l.as_mut_ptr(),
    };

    std::mem::forget(v2);
    std::mem::forget(l);

    result
}

#[no_mangle]
pub extern "C" fn rust_free(array: DynArray) {
    if !array.array.is_null() {
        unsafe {
            Box::from_raw(array.array);
        }
    }
}
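For reference, a small sketch of how a consumer can recover the individual components from the flattened buffer and the list of lengths. It is written in Rust for consistency with the rest of the thread; the function name components is a hypothetical helper, and the same offset arithmetic applies on the C side:

```rust
use std::convert::TryFrom;

/// Hypothetical helper: splits `flat` into consecutive slices whose
/// sizes are given by `lengths`. Returns None if a length is negative
/// or the lengths ask for more elements than `flat` holds.
pub fn components<'a>(flat: &'a [i32], lengths: &[i32]) -> Option<Vec<&'a [i32]>> {
    let mut rest = flat;
    let mut out = Vec::with_capacity(lengths.len());
    for &len in lengths {
        let len = usize::try_from(len).ok()?;
        if len > rest.len() {
            return None;
        }
        // The next component is the first `len` elements of what remains.
        let (head, tail) = rest.split_at(len);
        out.push(head);
        rest = tail;
    }
    Some(out)
}
```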

Note that your deallocation is incorrect. You need to use Vec::from_raw_parts to deallocate memory you get from a vector. In particular, you need to remember the capacity too! If you give it the wrong capacity, it will deallocate incorrectly and cause memory corruption. If you wish to deallocate it using a Box, you should call Vec::into_boxed_slice when creating the buffer, which tells the allocator to shrink the allocation so that the capacity is equal to the length (and may even reallocate).

You also forgot to deallocate the array of lengths. I'm also not sure how array can be null.
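A minimal sketch of the round trip through Vec::from_raw_parts, assuming the pointer, length and capacity are all kept together. The names leak_vec and reclaim_vec are hypothetical:

```rust
use std::mem::ManuallyDrop;

/// Hypothetical helper: gives up ownership of `v` and returns the three
/// values needed to rebuild it later: pointer, length and capacity.
pub fn leak_vec(v: Vec<i32>) -> (*mut i32, usize, usize) {
    // ManuallyDrop prevents the destructor from running, like mem::forget.
    let mut v = ManuallyDrop::new(v);
    (v.as_mut_ptr(), v.len(), v.capacity())
}

/// Hypothetical counterpart: rebuilds the vector so it can be dropped.
///
/// Safety: the triple must come from `leak_vec`, unchanged, and must be
/// reclaimed at most once.
pub unsafe fn reclaim_vec(ptr: *mut i32, len: usize, cap: usize) -> Vec<i32> {
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}
```

Dropping the vector returned by reclaim_vec frees the buffer with the capacity it was allocated with, which is the part the Box::from_raw version gets wrong.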

Thank you. A lot of that was just copied from the original stackoverflow answer. I'll try and correct it now.

Appreciate the help. Does the following now correctly deallocate the memory?

lib.rs

use libc;

#[repr(C)]
pub struct DynArray {
    array: *mut i32,
    array_length: libc::size_t,
    array_capacity: libc::size_t,
    component_sizes: *mut i32,
    component_sizes_length: libc::size_t,
    component_sizes_capacity: libc::size_t,
}

#[no_mangle]
pub extern "C" fn rust_alloc() -> DynArray {
    let v: Vec<Vec<i32>> = vec![vec![1, 2, 3], vec![1, 2, 3]];
    let mut v2: Vec<i32> = v.concat();
    let mut l: Vec<i32> = vec![3, 3];

    let result = DynArray {
        array: v2.as_mut_ptr(),
        array_length: v2.len() as _,
        array_capacity: v2.capacity() as _,
        component_sizes: l.as_mut_ptr(),
        component_sizes_length: l.len() as _,
        component_sizes_capacity: l.capacity() as _,
    };

    std::mem::forget(v2);
    std::mem::forget(l);

    result
}

#[no_mangle]
pub extern "C" fn rust_free(array: DynArray) {
    unsafe {
        Vec::from_raw_parts(
            array.array, 
            array.array_length, 
            array.array_capacity);
        Vec::from_raw_parts(
            array.component_sizes,
            array.component_sizes_length,
            array.component_sizes_capacity,
        );
    }
}

main.c

#include <stdio.h>
#include <stdint.h>

struct DynArray {
    int32_t* array;
    size_t array_length;
    size_t array_capacity;
    int32_t* component_sizes;
    size_t component_sizes_length;
    size_t component_sizes_capacity;
};

struct DynArray rust_alloc();
void rust_free(struct DynArray);

int main () {
  struct DynArray vec = rust_alloc();
  rust_free(vec);
}

It looks ok to me.

Know that a Vec is a growable heap-allocated buffer.

If you "just" need a heap-allocated buffer (i.e., you do not intend to make it grow anymore), which ought to be the case when doing FFI, then you can use a Box<[T]> instead of a Vec<T> and no longer have to carry capacity around:

lib.rs

use ::libc::size_t;
use ::core::{convert::TryInto, slice};
use ::scopeguard::defer_on_unwind;

#[repr(C)]
pub struct DynArray {
    array: *mut i32,
    array_len: size_t,
    component_sizes: *mut size_t,
    component_sizes_len: size_t,
}

#[no_mangle] pub extern "C"
fn rust_alloc() -> DynArray
{
    defer_on_unwind!({ ::std::process::abort(); });
    let v: Vec<Vec<i32>> = vec![vec![1, 2, 3], vec![1, 2, 3]];
    let l: Vec<size_t> = v.iter().map(|v| v.len().try_into().expect("Integer Overflow")).collect();
    let l: Box<[size_t]> = l.into_boxed_slice();

    let v: Box<[i32]> = v.concat().into_boxed_slice();

    DynArray {
        array_len: v.len().try_into().expect("Integer Overflow"),
        array: Box::into_raw(v) as _,
        component_sizes_len: l.len().try_into().expect("Integer Overflow"),
        component_sizes: Box::into_raw(l) as _,
    }
}

#[no_mangle] pub unsafe extern "C"
fn rust_free (array: DynArray)
{
    defer_on_unwind!({ ::std::process::abort(); });
    let DynArray { array, array_len, component_sizes, component_sizes_len } = array;
    drop::<Box<[i32]>>(Box::from_raw(slice::from_raw_parts_mut(
        array, array_len.try_into().expect("Integer Overflow"),
    )));
    drop::<Box<[usize]>>(Box::from_raw(slice::from_raw_parts_mut(
        component_sizes, component_sizes_len.try_into().expect("Integer Overflow"),
    )));
}

Also, note how the unsafe has moved from within the function body to the function signature, since the former means that you assert that no matter the input, your code is sound (which it isn't when the arg is DynArray { array: ptr::null_mut(), .. }), whereas the latter does express that there may exist inputs for which your code is not sound.

main.c

#include <stdint.h>
#include <stdio.h>

typedef struct {
    int32_t * array;
    size_t array_length;
    size_t * component_sizes;
    size_t component_sizes_length;
} DynArray_t;

DynArray_t rust_alloc (void);
void rust_free (DynArray_t);

int main ()
{
  DynArray_t vec = rust_alloc();
  rust_free(vec);
}

Excellent that's really helpful. One question - what is the reason for the defer_on_unwind! macro?

"into_boxed_slice will never reallocate"

Yes, it will. As the docs say: "Note that this will drop any excess capacity."

The implementation calls shrink_to_fit which calls RawVec::shrink_to_fit which calls Global::realloc.
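A small sketch that makes the dropped capacity observable, assuming a vector deliberately created with spare room. The function name shrink_demo is hypothetical. Whether the buffer actually moves is up to the allocator, so the sketch only inspects lengths and capacities:

```rust
/// Hypothetical demo: returns (capacity before boxing, length of the
/// boxed slice, capacity after converting the box back into a Vec).
pub fn shrink_demo() -> (usize, usize, usize) {
    let mut v: Vec<i32> = Vec::with_capacity(16);
    v.extend_from_slice(&[1, 2, 3]);
    let before = v.capacity();

    // Excess capacity is dropped here; the allocation may be resized.
    let boxed: Box<[i32]> = v.into_boxed_slice();
    let len = boxed.len();

    // A Vec built from a boxed slice starts with capacity == length.
    let after = boxed.into_vec().capacity();
    (before, len, after)
}
```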

Panicking (unwinding) across an FFI boundary (e.g., out of an extern "C" function) is Undefined Behavior. The macro calls make it so that if a panic! were to happen, the process aborts instead of unwinding out of the function, thus preventing that UB.

Another way to achieve this is to compile the crate with panic = "abort" [profile...] in the Cargo.toml, but I'm personally not fond of making the soundness of the code rely on the compilation environment. In this case at least, using the macro instead makes the abort-on-unwind property more local, and thus more visible and more resilient to refactoring and copy/pastes (such as into a forum post).
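A different technique from the macro above, sketched with only the standard library: std::panic::catch_unwind stops the unwind inside the function and turns it into an error value. The function checked_div and the constant ERR_PANIC are hypothetical, and this only helps when panics unwind (it does nothing under panic = "abort"):

```rust
use std::panic;

/// Hypothetical error code reported to the C caller when the body panics.
pub const ERR_PANIC: i32 = -1;

/// Hypothetical FFI entry point (a real library would also export it
/// under an unmangled name). Integer division panics when `b` is zero,
/// so the body runs inside catch_unwind.
pub extern "C" fn checked_div(a: i32, b: i32) -> i32 {
    // catch_unwind returns Err if the closure panicked, so the unwind
    // never reaches the extern "C" boundary.
    let result = panic::catch_unwind(|| a / b);
    result.unwrap_or(ERR_PANIC)
}
```

The panic message is still printed by the default panic hook; only the unwinding is contained. Unlike aborting, this leaves it to the C caller to check the error code.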
