How to return a byte array from a Rust function to C over FFI?

I’m integrating my Rust library with multiple languages. Basic types like int and bool work great, but how do I pass something like &[u8] or Vec<u8>?
For example, I have a function similar to this:

pub fn generate_data() -> Vec<u8> {
   // returning Vec<u8>
}

I need to call this from C/C++ and receive a byte array. My main concern: is this going to mess with Rust’s memory model, or is it doable without unsafe?
Thanks!

You can’t take a data pointer from C and hand it to Rust as a Vec. Only Rust can allocate a Vec, because a Vec is always freed using Rust’s own allocator. If you want a Vec, you’ll have to copy the data into it first.
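
As a rough sketch of that copy, assuming the C side hands over a pointer and a length: the helper name `copy_from_c` and its contract are hypothetical, not part of any library. The foreign memory is only borrowed for the duration of the copy, so C remains responsible for freeing its own buffer.

```rust
use std::slice;

/// Hypothetical helper: copies `len` bytes that C owns into a Vec that Rust owns.
///
/// # Safety
/// `data` must be null or point to at least `len` readable bytes.
pub unsafe fn copy_from_c(data: *const u8, len: usize) -> Vec<u8> {
    if data.is_null() || len == 0 {
        return Vec::new();
    }
    // Borrow the C memory just long enough to copy it.
    let borrowed = unsafe { slice::from_raw_parts(data, len) };
    // `to_vec` allocates with Rust's allocator, so the Vec can be dropped normally.
    borrowed.to_vec()
}
```

After the call, the Vec and the C buffer are completely independent allocations.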

There’s CVec for allowing Rust to use malloc-allocated data.

&[u8] is a type that means “you never ever have to worry about freeing it”, so you can return it from a function only as &'static [u8] if C leaked that memory or it’s from a global/static variable in C, but that’s rather rare.

Thanks for the reply, @kornel.
So I can write something like this:

pub fn generate_data() -> &'static [u8] {
   // returning &[u8]
}

And use it as a byte buffer pointer from C/C++?
If I understand correctly, with a static lifetime the C/C++ side would own the data and be responsible for cleaning it up, right?

If you want to return Rust allocated memory, then you’ll need to export a function to free it as well, which the C code can call. Here’s a quick example:

#[repr(C)]
struct Buffer {
    data: *mut u8,
    len: usize,
}

extern "C" fn generate_data() -> Buffer {
    let mut buf = vec![0; 512].into_boxed_slice();
    let data = buf.as_mut_ptr();
    let len = buf.len();
    std::mem::forget(buf);
    Buffer { data, len }
}

extern "C" fn free_buf(buf: Buffer) {
    let s = unsafe { std::slice::from_raw_parts_mut(buf.data, buf.len) };
    let s = s.as_mut_ptr();
    unsafe {
        Box::from_raw(s);
    }
}

You may want to consider having the Rust code take an externally allocated buffer instead, so that its (de)allocation is handled elsewhere.

As noted upthread, the important thing is to not mix up the different allocators.

Sorry, I misread which direction you want to pass the data.

C doesn’t understand Rust slices, so you can’t give them to C at all. For passing data to C you have to use raw C pointers, like *const u8.

But be careful with raw pointers, because they are unsafe, just like in C. Use-after-free and dangling pointers to stack variables are possible. So when you get a pointer to a Rust object, you must ensure it’s not a temporary on the stack (i.e. use Box to allocate it on the heap), and make sure Rust won’t free it while C is still using it (that’s why @vitalyd’s example has mem::forget()).

Box::into_raw() and Box::from_raw() are a good pair for giving pointers to C and getting them back to release the memory.
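
A minimal sketch of that pairing, using a made-up `Counter` type and hypothetical function names (`counter_new`, `counter_increment`, `counter_free`). A real export would also need `#[no_mangle]` (spelled `#[unsafe(no_mangle)]` in the 2024 edition), which is left out here.

```rust
/// Hypothetical type that C only ever sees through an opaque pointer.
pub struct Counter {
    value: u32,
}

/// Allocates on the Rust heap and hands ownership to the caller.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { value: 0 }))
}

/// # Safety
/// `ptr` must come from `counter_new` and must not have been freed yet.
pub unsafe extern "C" fn counter_increment(ptr: *mut Counter) -> u32 {
    let counter = unsafe { &mut *ptr };
    counter.value += 1;
    counter.value
}

/// # Safety
/// `ptr` must be null or come from `counter_new`, and must be freed only once.
pub unsafe extern "C" fn counter_free(ptr: *mut Counter) {
    if !ptr.is_null() {
        // Rebuild the Box so Rust's allocator releases the memory.
        drop(unsafe { Box::from_raw(ptr) });
    }
}
```

The C side would call `counter_free` exactly once when it is done, and never call free() on the pointer itself.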

Lifetimes don’t pass ownership. Lifetimes don’t do anything in a running program. Lifetimes only describe to the compiler what would happen to the memory anyway (they’re like assert()). 'static informs the compiler that nobody will free this memory, it’s leaked and there’s no cleanup.
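
To illustrate the “leaked, no cleanup” meaning of 'static, here is a small sketch using Box::leak from the standard library. The function name `leak_bytes` is made up for the example.

```rust
/// Hypothetical helper: turns an owned Vec into a slice that lives for the
/// rest of the program by deliberately leaking its heap allocation.
pub fn leak_bytes(v: Vec<u8>) -> &'static [u8] {
    // Box::leak gives up ownership without running the destructor, so
    // nothing will ever free this memory unless the Box is rebuilt manually.
    Box::leak(v.into_boxed_slice())
}
```

The 'static here does not hand ownership to anyone; it only records that no one is going to free the bytes.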

I was just curious about the code presented by @vitalyd, so let me ask a question.

In generate_data(), std::mem::forget(buf) prevents drop() from being called, so the slice in the Box is leaked at that point. In free_buf(), a slice is created in an unsafe block, but I don’t think it is the same one as before. So this example seems to cause a memory leak. Is my understanding wrong?

No, the allocation and free match up correctly.

Thanks.

But why? I would think the length field of the slice inside buf (the Box) in generate_data() is lost by the time free_buf() runs. When is that field released?

The length field is not part of the allocation.

buf.len() returns the length of the slice. Is the length field placed in the stack memory of generate_data() by Box<[T]>?

Yes. Box<[T]> consists of a pointer and a length. If you ask Rust for its size, you’ll see this: it is 16 bytes on a 64-bit target.
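
A quick way to check this yourself, written in terms of usize so it holds on any pointer width. The two function names are only for illustration.

```rust
use std::mem::size_of;

/// Box<[u8]> is a "fat" pointer: data pointer plus length.
pub fn boxed_slice_is_two_words() -> bool {
    size_of::<Box<[u8]>>() == 2 * size_of::<usize>()
}

/// Box<u8> is a "thin" pointer: just the data pointer.
pub fn boxed_value_is_one_word() -> bool {
    size_of::<Box<u8>>() == size_of::<usize>()
}
```

The length lives in the Box value itself (on the stack or wherever the Box is stored), not inside the heap allocation it points to.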

I understand. Thank you very much!

Finally, I tried the example code, and it caused a segmentation fault. But because the explanation given by @alice was clear, I immediately realized the cause. as_mut_ptr() on a slice returns *mut T, which should not be given to Box::from_raw() in this case, because doing so unintentionally creates a Box<T>. What we want to create is a Box<[T]>. Here is the corrected code:

extern "C" fn free_buf(buf: Buffer) {
    let s = unsafe { std::slice::from_raw_parts_mut(buf.data, buf.len) };
-   let s = s.as_mut_ptr();
-   unsafe { Box::from_raw(s); }
+   unsafe { Box::from_raw(s as *mut [u8]); }
}
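
For reference, here is one way the whole example could look with that fix applied. This is a sketch, not the original author’s code: it uses Box::into_raw instead of mem::forget, rebuilds the fat pointer with std::ptr::slice_from_raw_parts_mut, and adds a null check. A real export would also need `#[no_mangle]` (spelled `#[unsafe(no_mangle)]` in the 2024 edition), which is omitted here.

```rust
use std::ptr;

#[repr(C)]
pub struct Buffer {
    pub data: *mut u8,
    pub len: usize,
}

/// Allocates 512 zeroed bytes and hands ownership to the caller.
pub extern "C" fn generate_data() -> Buffer {
    let boxed: Box<[u8]> = vec![0u8; 512].into_boxed_slice();
    let len = boxed.len();
    // into_raw gives up ownership; cast the fat pointer down to a thin one for C.
    let data = Box::into_raw(boxed) as *mut u8;
    Buffer { data, len }
}

/// # Safety
/// `buf` must come from `generate_data`, unmodified, and be freed only once.
pub unsafe extern "C" fn free_buf(buf: Buffer) {
    if buf.data.is_null() {
        return;
    }
    // Rebuild the same pointer-plus-length pair, then let Box<[u8]> free it.
    let fat: *mut [u8] = ptr::slice_from_raw_parts_mut(buf.data, buf.len);
    drop(unsafe { Box::from_raw(fat) });
}
```

The key point is that the Box rebuilt in free_buf has the same type, Box<[u8]>, and the same length as the one that was given away.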

Typical C functions that return a byte sequence whose size is not statically known look like this:

int generate_data(char* buf, int buflen) { ... }

The function assumes that the buflen bytes starting at buf are usable, and it returns the number of bytes actually written to the buffer. If the function fails, including the case where the given buffer is not large enough, it returns a negative integer that represents the error code.
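
A Rust function following that convention might look roughly like the sketch below. The payload and the error codes `ERR_INVALID` and `ERR_TOO_SMALL` are placeholders invented for the example, and `#[no_mangle]` (or `#[unsafe(no_mangle)]` in the 2024 edition) is omitted.

```rust
use std::os::raw::{c_char, c_int};
use std::slice;

// Hypothetical error codes; any negative values would do.
pub const ERR_INVALID: c_int = -1;
pub const ERR_TOO_SMALL: c_int = -2;

/// Writes the data into a caller-allocated buffer and returns the number of
/// bytes written, or a negative error code.
///
/// # Safety
/// `buf` must be null or point to at least `buflen` writable bytes.
pub unsafe extern "C" fn generate_data(buf: *mut c_char, buflen: c_int) -> c_int {
    // Placeholder payload standing in for whatever the library really produces.
    let payload: &[u8] = b"example payload";
    if buf.is_null() || buflen < 0 {
        return ERR_INVALID;
    }
    if (buflen as usize) < payload.len() {
        return ERR_TOO_SMALL;
    }
    // View only the part of the caller's buffer that will be written.
    let out = unsafe { slice::from_raw_parts_mut(buf as *mut u8, payload.len()) };
    out.copy_from_slice(payload);
    payload.len() as c_int
}
```

Since the caller both allocates and frees the buffer, no Rust allocation ever crosses the FFI boundary and no free function has to be exported.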
