Expressing `Vec<Vec<u8>>` and `Vec<String>` in FFI

I'm working on an FFI project, and I need to share with the C code lists of byte arrays and lists of strings. Is this a solved problem? Are there existing crates that do a good job of this?

Here's what I've got so far. I'll use lists of byte arrays as an example.

Vec is not #[repr(C)], of course, so I have to build a struct that is #[repr(C)] and provides a view of the data inside an existing Vec. For instance:

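// Assumption: `size_t` comes from the `libc` crate.
use libc::size_t;
use std::marker::PhantomData;

/// A borrowed, C-compatible view of a byte slice; `'a` ties it to the source data.
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct rustls_slice_bytes<'a> {
    data: *const u8,
    len: size_t,
    phantom: PhantomData<&'a [u8]>,
}

impl<'a> From<&'a [u8]> for rustls_slice_bytes<'a> {
    // Naming `'a` on the argument makes the borrow the view depends on explicit.
    fn from(s: &'a [u8]) -> Self {
        rustls_slice_bytes {
            data: s.as_ptr(),
            len: s.len() as size_t,
            phantom: PhantomData,
        }
    }
}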

To represent a list of such arrays from a Vec<Vec<u8>>, I have to create a separate Vec of views, because the inner Vec<u8> is not #[repr(C)], so I can't pass the outer Vec's buffer to C directly.

pub(crate) struct VecSliceBytes<'a>(Vec<rustls_slice_bytes<'a>>);

impl<'a> VecSliceBytes<'a> {
    // Taking a slice also accepts `&Vec<Vec<u8>>` through deref coercion.
    fn new(input: &'a [Vec<u8>]) -> Self {
        // Build one borrowed view per inner Vec; no byte data is copied.
        VecSliceBytes(
            input
                .iter()
                .map(|v| rustls_slice_bytes::from(v.as_slice()))
                .collect(),
        )
    }
}

And then I need to define the view of that Vec:

#[allow(non_camel_case_types)]
#[repr(C)]
pub struct rustls_slice_slice_bytes<'a> {
    data: *const rustls_slice_bytes<'a>,
    len: size_t,
    phantom: PhantomData<&'a [&'a [u8]]>,
}

impl<'a> From<&'a VecSliceBytes<'a>> for rustls_slice_slice_bytes<'a> {
    fn from(input: &'a VecSliceBytes<'a>) -> Self {
        rustls_slice_slice_bytes {
            data: input.0.as_ptr(),
            len: input.0.len() as size_t,
            phantom: PhantomData,
        }
    }
}
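To check that the pieces fit together, here is a minimal, std-only sketch of the whole round trip. The names SliceBytes, SliceSliceBytes, sum_all_bytes and call_with_views are made up for illustration, usize stands in for size_t, and the extern "C" function is written in Rust only as a stand-in for a real C callee.

```rust
use std::marker::PhantomData;

// Simplified stand-ins for the view types above (hypothetical names).
#[repr(C)]
pub struct SliceBytes<'a> {
    data: *const u8,
    len: usize,
    phantom: PhantomData<&'a [u8]>,
}

#[repr(C)]
pub struct SliceSliceBytes<'a> {
    data: *const SliceBytes<'a>,
    len: usize,
    phantom: PhantomData<&'a [SliceBytes<'a>]>,
}

// Stand-in for the C side: walks both levels and sums every byte.
pub extern "C" fn sum_all_bytes(list: *const SliceSliceBytes<'_>) -> u64 {
    if list.is_null() {
        return 0;
    }
    // Safety (assumed contract): `list` points to a valid view whose
    // pointers and lengths describe live, initialized memory.
    let list = unsafe { &*list };
    let items = unsafe { std::slice::from_raw_parts(list.data, list.len) };
    let mut total = 0u64;
    for item in items {
        let bytes = unsafe { std::slice::from_raw_parts(item.data, item.len) };
        total += bytes.iter().map(|&b| u64::from(b)).sum::<u64>();
    }
    total
}

pub fn call_with_views(input: &[Vec<u8>]) -> u64 {
    // The intermediate Vec of views must stay alive for the whole call.
    let views: Vec<SliceBytes<'_>> = input
        .iter()
        .map(|v| SliceBytes {
            data: v.as_ptr(),
            len: v.len(),
            phantom: PhantomData,
        })
        .collect();
    let outer = SliceSliceBytes {
        data: views.as_ptr(),
        len: views.len(),
        phantom: PhantomData,
    };
    sum_all_bytes(&outer)
}
```

The part that is easy to get wrong is the lifetime of the intermediate Vec of views: it has to outlive the call, and the C side must not keep any of the pointers after returning.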

I think that roughly covers it, though it's somewhat error-prone. Have I got roughly the right idea? Is there an already established pattern I should be using?

Playground link:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=1a004b202aaca0829204f539c275704e

You can decompose a Vec into a (*mut T, usize, usize) triple of pointer, length, and capacity, and recompose it with Vec::from_raw_parts. Using this method you can build a C-compatible type, call it CVec, and use it directly to represent a vector.
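A minimal sketch of that idea, assuming a CVec specialized to u8. The field names and the from_vec / into_vec methods are made up for illustration, not taken from any crate.

```rust
use std::mem::ManuallyDrop;

// Hypothetical owning, C-compatible representation of a Vec<u8>.
#[repr(C)]
pub struct CVec {
    pub ptr: *mut u8,
    pub len: usize,
    pub cap: usize,
}

impl CVec {
    /// Takes ownership of `v` without freeing its buffer.
    pub fn from_vec(v: Vec<u8>) -> Self {
        // ManuallyDrop prevents the Vec destructor from running here.
        let mut v = ManuallyDrop::new(v);
        CVec {
            ptr: v.as_mut_ptr(),
            len: v.len(),
            cap: v.capacity(),
        }
    }

    /// Rebuilds the Vec so that Rust frees the buffer again.
    ///
    /// # Safety
    /// `self` must come from `from_vec`, its fields must be unchanged,
    /// and this must be called at most once per buffer.
    pub unsafe fn into_vec(self) -> Vec<u8> {
        unsafe { Vec::from_raw_parts(self.ptr, self.len, self.cap) }
    }
}
```

Unlike the borrowed views in the question, this hands ownership across the boundary. The C side has to pass the struct back to Rust to be freed; calling free() on the pointer would not be correct, because the buffer came from Rust's allocator.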

You might want to check out the SaferFFI crate, which has some helpers for passing vectors and strings:

https://getditto.github.io/safer_ffi/

A related question: Right now I'm representing &str to C with a different type than &[u8]. That's because &str comes with the useful invariant that it contains UTF-8. But what would be really useful is a slightly more constrained &str: one that is additionally guaranteed not to contain any NUL bytes. That way, C code can interpolate it into a C-style NUL-terminated string without worrying that an embedded NUL could cut the string short.

Is this a common thing for FFI code? Am I going about this wrong?
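One way to get that invariant, sketched under assumptions: a borrowed view that is only constructible through a checked conversion. StrNoNul and InteriorNul are hypothetical names, and usize stands in for size_t.

```rust
use std::convert::TryFrom; // already in the prelude from the 2021 edition on
use std::marker::PhantomData;
use std::os::raw::c_char;

/// Hypothetical error for input that contains a NUL byte.
#[derive(Debug, PartialEq)]
pub struct InteriorNul;

/// Hypothetical borrowed view: UTF-8, no NUL bytes, not NUL-terminated.
#[repr(C)]
pub struct StrNoNul<'a> {
    data: *const c_char,
    len: usize,
    phantom: PhantomData<&'a str>,
}

impl<'a> StrNoNul<'a> {
    pub fn len(&self) -> usize {
        self.len
    }
}

impl<'a> TryFrom<&'a str> for StrNoNul<'a> {
    type Error = InteriorNul;

    fn try_from(s: &'a str) -> Result<Self, Self::Error> {
        // &str already guarantees UTF-8, so only the NUL check is needed.
        if s.bytes().any(|b| b == 0) {
            return Err(InteriorNul);
        }
        Ok(StrNoNul {
            data: s.as_ptr() as *const c_char,
            len: s.len(),
            phantom: PhantomData,
        })
    }
}
```

Because the fields are private and the only constructor performs the check, any StrNoNul that reaches C carries both guarantees. The data is still not NUL-terminated, so the C side has to use the length.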

Check out CString and CStr for intermediate types. They guarantee NUL termination with no interior NULs, but they don't require UTF-8, so you'll need a check as part of the conversion in either direction: that the String contains no NULs going one way, and that the CString contains valid UTF-8 going the other. You can bypass the checks with some unsafe code, though.
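For reference, the two checked conversions look roughly like this. The wrapper function names are made up; CString::new and CStr::to_str are the standard library calls that do the checking.

```rust
use std::ffi::{CStr, CString, NulError};
use std::str::Utf8Error;

/// Rust to C: fails if the string contains a NUL byte anywhere.
/// On success the result owns a copy with a trailing NUL appended.
pub fn to_c_string(s: &str) -> Result<CString, NulError> {
    CString::new(s)
}

/// C to Rust: fails if the bytes are not valid UTF-8.
/// On success the &str borrows from the CStr without copying.
pub fn from_c_str(c: &CStr) -> Result<&str, Utf8Error> {
    c.to_str()
}
```

Note that CString::new allocates and copies, so it suits strings that C needs NUL-terminated, while a pointer-plus-length view avoids the copy at the cost of C having to respect the length.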