How to return byte array from Rust function to FFI C?

I'm integrating my Rust library with multiple languages. For basic types like int and bool it works great, but how do I pass something like &[u8] or Vec<u8>?
For example, I have a function like this:

pub fn generate_data() -> Vec<u8> {
   // returning Vec<u8>
}

I need to call this from C/C++ and receive a byte array. My main concern: will this break Rust's memory model, or is it doable without unsafe?
Thanks!

You can't take a data pointer from C and hand it to Rust as a Vec. Only Rust can allocate a Vec, because a Vec is always freed through Rust's own private allocator. If you want to return a Vec, you'll have to copy the data into it first.

There's CVec for allowing Rust to use malloc-allocated data.

&[u8] is a type that means "you never ever have to worry about freeing it", so you can return it from a function only as &'static [u8] if C leaked that memory or it's from a global/static variable in C, but that's rather rare.
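
A minimal sketch of that rare case, where memory owned by C is viewed as a &'static [u8]. The helper name static_bytes_from_c is hypothetical, and a Rust static stands in for the C global here; the sketch is only sound if the memory really is never freed or mutated.

```rust
use std::slice;

// Stand-in for a global defined on the C side (an assumption for this sketch).
pub static C_GLOBAL_TABLE: [u8; 4] = [1, 2, 3, 4];

/// Views memory owned by C as a `&'static [u8]` (hypothetical helper).
///
/// # Safety
/// `ptr` must be non-null, valid for reads of `len` bytes, and the memory
/// must never be freed or mutated for the rest of the program.
pub unsafe fn static_bytes_from_c(ptr: *const u8, len: usize) -> &'static [u8] {
    // SAFETY: the caller upholds the contract documented above.
    unsafe { slice::from_raw_parts(ptr, len) }
}
```

Nothing here frees anything: the 'static lifetime is a promise made by the caller, not a transfer of ownership.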

Thanks for the reply, @kornel.
So I can write something like this:

pub fn generate_data() -> &'static [u8] {
   // returning &[u8]
}

And use it as a byte buffer pointer from C/C++?
If I got it right, ownership of data passed with a static lifetime would lie with C/C++ for memory cleanup, right?

If you want to return Rust allocated memory, then you’ll need to export a function to free it as well, which the C code can call. Here’s a quick example:

#[repr(C)]
struct Buffer {
    data: *mut u8,
    len: usize,
}

extern "C" fn generate_data() -> Buffer {
    let mut buf = vec![0; 512].into_boxed_slice();
    let data = buf.as_mut_ptr();
    let len = buf.len();
    std::mem::forget(buf);
    Buffer { data, len }
}

extern "C" fn free_buf(buf: Buffer) {
    let s = unsafe { std::slice::from_raw_parts_mut(buf.data, buf.len) };
    let s = s.as_mut_ptr();
    unsafe {
        Box::from_raw(s);
    }
}

You may want to consider having the Rust code take an externally allocated buffer instead, so that its (de)allocation is handled elsewhere.

As noted upthread, the important thing is to not mix up the different allocators.

Sorry, I misread which direction you want to pass the data.

C doesn't understand Rust slices, so you can't give them to C at all. For passing data to C you have to use raw C pointers, like *const u8.

But careful with raw pointers, because they are unsafe, just like in C. Use-after-free and dangling pointers to stack variables are possible. So when you get a pointer to a Rust object, you must ensure it's not a temporary on stack (i.e. use Box to allocate it on the heap), and make sure Rust won't free it while C is still using it (that's why @vitalyd's example has mem::forget()).

Box::into_raw() and Box::from_raw() are a good pair for giving pointers to C and getting them back to release the memory.

Lifetimes don't pass ownership. Lifetimes don't do anything in a running program. Lifetimes only describe to the compiler what would happen to the memory anyway (they're like assert()). 'static informs the compiler that nobody will free this memory, it's leaked and there's no cleanup.
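
The Box::into_raw() / Box::from_raw() pairing mentioned above can be sketched roughly as follows. The Generator type and the generator_* function names are hypothetical, made up for illustration. A real export would also need a no_mangle attribute, omitted here to keep the sketch short.

```rust
/// A hypothetical Rust object handed to C as an opaque pointer.
pub struct Generator {
    seed: u8,
}

/// Moves the object to the heap and gives C an owning raw pointer.
/// Box::into_raw does not run the destructor, so the object stays alive.
pub extern "C" fn generator_new(seed: u8) -> *mut Generator {
    Box::into_raw(Box::new(Generator { seed }))
}

/// Borrows the object through the pointer; ownership stays with the C side.
///
/// # Safety
/// `g` must come from `generator_new` and must not have been freed yet.
pub unsafe extern "C" fn generator_seed(g: *const Generator) -> u8 {
    unsafe { (*g).seed }
}

/// Takes ownership back and releases the memory with Rust's allocator.
///
/// # Safety
/// `g` must be null or come from `generator_new`, and must be freed only once.
pub unsafe extern "C" fn generator_free(g: *mut Generator) {
    if !g.is_null() {
        // Rebuilding the Box lets its destructor free the allocation.
        drop(unsafe { Box::from_raw(g) });
    }
}
```

The point of the pairing is that the same allocator that created the object also destroys it; C only carries the pointer around in between.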

I was curious about the code presented by @vitalyd, so let me ask a question.

In generate_data(), std::mem::forget(buf) prevents drop() from being called, so the boxed slice is leaked there. In free_buf(), a slice is created in the unsafe block, but I think it is not the same one as before. So this example seems to leak memory. Is my understanding wrong?

No, the allocation and free match up correctly.

Edit for future readers: there's a type mismatch here, actually.

Thanks.

But why? As far as I can tell, the length field of the slice held by buf (the Box) in generate_data() is lost by the time free_buf() runs. When is this field released?

The length field is not part of the allocation.

buf.len() returns the length of the slice. Is the length field kept by Box<[T]> in the stack memory of generate_data()?

Yes. Box<[T]> consists of a pointer and a length. If you ask Rust for its size, you'll see that it is 16 bytes on a 64-bit target.
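
A quick way to check this yourself, as a small sketch (the function name pointer_sizes is made up for illustration):

```rust
use std::mem::size_of;

/// Returns (size of a thin pointer, size of `Box<[u8]>`) for the current target.
pub fn pointer_sizes() -> (usize, usize) {
    (size_of::<*mut u8>(), size_of::<Box<[u8]>>())
}
```

The second value is twice the first, because the box carries the data pointer plus the length. Only the bytes behind the data pointer live in the heap allocation.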

I understand. Thank you very much!

Finally, I tried the example code, and it caused a segmentation fault. Because the explanation given by @alice was clear, I immediately realized the cause: as_mut_ptr() on a slice returns *mut T, which should not be passed to Box::from_raw() in this case, because doing so unintentionally creates a Box<T>. What we want to create is a Box<[T]>. Here is the corrected code.

extern "C" fn free_buf(buf: Buffer) {
    let s = unsafe { std::slice::from_raw_parts_mut(buf.data, buf.len) };
-   let s = s.as_mut_ptr();
-   unsafe { Box::from_raw(s); }
+   unsafe { Box::from_raw(s as *mut [u8]); }
}
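
Putting the fix together, a complete version might look like the sketch below. It keeps the Buffer layout from the earlier post but swaps mem::forget for Box::into_raw and rebuilds the slice pointer with std::ptr::slice_from_raw_parts_mut. Marking free_buf as unsafe and adding the null check are additions of this sketch, not part of the original example.

```rust
use std::ptr;

#[repr(C)]
pub struct Buffer {
    pub data: *mut u8,
    pub len: usize,
}

/// Allocates 512 zeroed bytes in Rust and hands them to the caller.
pub extern "C" fn generate_data() -> Buffer {
    let buf: Box<[u8]> = vec![0u8; 512].into_boxed_slice();
    let len = buf.len();
    // Box::into_raw gives up ownership without running the destructor;
    // the cast drops the length part of the fat pointer.
    let data = Box::into_raw(buf) as *mut u8;
    Buffer { data, len }
}

/// Frees a buffer produced by `generate_data`.
///
/// # Safety
/// `buf` must come from `generate_data`, be unmodified, and be freed only once.
pub unsafe extern "C" fn free_buf(buf: Buffer) {
    if buf.data.is_null() {
        return;
    }
    // Rebuild the fat pointer (*mut [u8]) so the Box has the same type
    // and length it was created with.
    let slice = ptr::slice_from_raw_parts_mut(buf.data, buf.len);
    drop(unsafe { Box::from_raw(slice) });
}
```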

A typical C function that returns a byte sequence whose size is not statically known looks like this:

int generate_data(char* buf, int buflen) { ... }

The function assumes that the buffer starting at buf with length buflen is usable, and returns the number of bytes actually written. If the function fails, including when the given buffer is not large enough, it returns a negative integer representing the error code.
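
The same convention can be implemented on the Rust side. The following is a rough sketch: the error constant ERR_BUFFER_TOO_SMALL and the four-byte payload are placeholders invented for illustration.

```rust
use std::os::raw::c_int;
use std::ptr;

/// Hypothetical error code for "buffer missing or too small".
pub const ERR_BUFFER_TOO_SMALL: c_int = -1;

/// Writes the generated bytes into a caller-provided buffer and returns
/// the number of bytes written, or a negative error code.
///
/// # Safety
/// If `buf` is non-null it must be valid for writes of `buflen` bytes.
pub unsafe extern "C" fn generate_data(buf: *mut u8, buflen: c_int) -> c_int {
    // Placeholder payload standing in for the real generated data.
    let payload: [u8; 4] = [0xDE, 0xAD, 0xBE, 0xEF];

    if buf.is_null() || buflen < 0 || (buflen as usize) < payload.len() {
        return ERR_BUFFER_TOO_SMALL;
    }
    // SAFETY: `buf` is non-null and, per the contract, has room for at
    // least `payload.len()` bytes; the two regions cannot overlap.
    unsafe { ptr::copy_nonoverlapping(payload.as_ptr(), buf, payload.len()) };
    payload.len() as c_int
}
```

With this shape Rust never allocates anything the caller has to free, so the question of which allocator owns the memory does not come up.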

When calling free_buf, is it necessary to pass in a whole Buffer or is it sufficient to just pass in a raw pointer?

A raw pointer to what?

  • if you mean a pointer to the Buffer (pointer to the .data and .len fields), then you'll have to dereference-read that pointer to get the values of .data and .len, but it is otherwise equivalent (i.e., you could feature a free_buf(buf: *mut Buffer) function).

  • If you mean the .data pointer, know that it is not enough to free memory that was allocated from within Rust: Rust's allocators demand / require they be given the size of the allocation in order to properly free it.

    So a trick in this situation could be not to allocate the memory within Rust but using libc's malloc, for instance:

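    fn slice_to_malloc_buf (xs: &'_ [u8]) -> *mut u8
    {
        use ::core::mem::MaybeUninit as MU;
    
        // malloc returns *mut c_void; cast it to the byte pointer we return.
        let ptr = unsafe { ::libc::malloc(xs.len()) }.cast::<u8>();
        if ptr.is_null() { return ptr; }
        // SAFETY: ptr is non-null and valid for xs.len() bytes. Viewing both
        // sides as MaybeUninit<u8> avoids treating fresh memory as initialized.
        let dst = unsafe { ::core::slice::from_raw_parts_mut(
            ptr.cast::<MU<u8>>(),
            xs.len(),
        ) };
        let src = unsafe { ::core::slice::from_raw_parts(
            xs.as_ptr().cast::<MU<u8>>(),
            xs.len(),
        ) };
        dst.copy_from_slice(src);
        ptr
    }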
    

    The idea is that once you have a Vec, you return the pointer obtained from slice_to_malloc_buf(&vec): such a pointer can be simply and directly freed by C with free().

  • Finally, for the case of exporting functions to C from within Rust, I highly recommend a crate such as ::safer_ffi be used:

    With it, the code becomes:

    use ::safer_ffi::prelude::*;
    
    #[ffi_export]
    fn generate_data () -> repr_c::Vec<u8>
    {
        let mut buf = vec![…];
        …
        buf.into() // And that's it!
    }
    
    #[ffi_export]
    fn free_buf (vec: repr_c::Vec<u8>)
    {
        drop(vec); // And that's it!
    }
    

    The ::safer_ffi guide explains not only how this will Just Work™ without requiring you to write unsafe on the Rust side, but also how ::safer_ffi itself takes care of generating the .h header to be #included by C.
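
For the first option in the list above, a free_buf that takes a pointer to the Buffer could be sketched like this. The helper buffer_new and the choice to null out the fields after freeing are assumptions of this sketch, not something from the earlier posts.

```rust
use std::ptr;

#[repr(C)]
pub struct Buffer {
    pub data: *mut u8,
    pub len: usize,
}

/// Hypothetical helper: allocates `len` zeroed bytes in Rust.
pub extern "C" fn buffer_new(len: usize) -> Buffer {
    let boxed: Box<[u8]> = vec![0u8; len].into_boxed_slice();
    let len = boxed.len();
    let data = Box::into_raw(boxed) as *mut u8;
    Buffer { data, len }
}

/// Frees the allocation described by `*buf` and resets its fields.
///
/// # Safety
/// `buf` must be null or point to a `Buffer` produced by `buffer_new`
/// whose fields have not been modified by the caller.
pub unsafe extern "C" fn free_buf(buf: *mut Buffer) {
    if buf.is_null() {
        return;
    }
    // Dereference the pointer to read both `.data` and `.len`.
    let b = unsafe { &mut *buf };
    if b.data.is_null() {
        return;
    }
    // The length is needed to rebuild the Box<[u8]> with its original size.
    let slice = ptr::slice_from_raw_parts_mut(b.data, b.len);
    drop(unsafe { Box::from_raw(slice) });
    // Resetting the fields turns an accidental second call into a no-op.
    b.data = ptr::null_mut();
    b.len = 0;
}
```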
