Allocating a C string in Rust and passing to C

There are plenty of questions and examples out there about either passing a Rust string to FFI or recreating a Rust string from an FFI pointer. However, there are only a few examples of
allocating a buffer on the Rust side and passing a pointer to it to FFI.
So far the recommended way seems to be:

let v = vec![0; size];
let s = CString::from_vec_unchecked(v);
let ptr = s.into_raw();
let err = call_to_ffi(ptr, size);
assert_eq!(err, 0);
let s = CString::from_raw(ptr);

Is it safe to replace vec![0; size] with Vec::<u8>::with_capacity(size)? With the latter, the vector will be uninitialized, but the FFI function will fill it in anyway.

Actually, I might even get by without a CString before FFI:

let mut v = Vec::<u8>::with_capacity(size);
let ptr = v.as_mut_ptr() as *mut i8;
call_to_ffi(ptr, size);
let s = unsafe { CString::from_raw(ptr) };

Is this safe?

What is the most performant way to do this, compared to C's:

char * buf = (char *) malloc(num);

vec![0; size] should be fast enough most of the time, and it is also the safest way to do this. You probably don't need the conversion to CString in between; using Vec::as_mut_ptr() should be enough.

First the safe way:

// Allocating zeroed memory is generally fast.
// Over-allocating by one isn't strictly necessary,
// but it allows us to catch a misbehaving FFI.
// The explicit u8 keeps the element type from defaulting to i32.
let mut v = vec![0u8; size + 1];
let ptr = v.as_mut_ptr() as *mut i8;
unsafe { call_to_ffi(ptr, size); }
// Find the null terminator.
let len = v.iter().position(|&c| c == 0).expect("a foreign function overflowed the buffer");

What you do next depends on what you really need. If it's a UTF-8 string you can keep the vec and use &str or turn it into a String.

v.truncate(len);
// Reference it as a `&str`
let s = str::from_utf8(&v).expect("TODO: Handle invalid UTF-8");
// or convert the vec to an owned String.
let s: String = String::from_utf8(v).expect("TODO: Handle invalid UTF-8");

If it's not UTF-8 or you need to use it as bytes for other reasons then simply truncate the Vec to the new size:

// Use +1 if you want to keep the null termination.
v.truncate(len + 1);

I don't think you need CString here. That's usually for ensuring a string is null terminated before passing it to FFI.
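Putting the safe steps above together, here is a minimal sketch. The call_to_ffi below is a hypothetical stand-in written in Rust so that the example compiles on its own; its signature (buffer pointer, size, error code return) is an assumption based on the question, and read_string is just an illustrative name.

```rust
use std::os::raw::c_char;

// Hypothetical stand-in for the real foreign function (assumption).
// It writes a nul-terminated string into `buf` and returns 0 on success.
unsafe extern "C" fn call_to_ffi(buf: *mut c_char, size: usize) -> i32 {
    let msg = b"hello\0";
    if buf.is_null() || size < msg.len() {
        return -1;
    }
    unsafe { std::ptr::copy_nonoverlapping(msg.as_ptr().cast::<c_char>(), buf, msg.len()) };
    0
}

// Allocate a zeroed buffer, let the callee fill it, then convert to a String.
fn read_string(size: usize) -> Result<String, String> {
    // One extra zero byte guarantees a terminator even if the callee
    // fills all `size` bytes without writing one.
    let mut v = vec![0u8; size + 1];
    let err = unsafe { call_to_ffi(v.as_mut_ptr().cast::<c_char>(), size) };
    if err != 0 {
        return Err(format!("call_to_ffi failed with code {err}"));
    }
    // Every byte is initialized, so scanning for the terminator is safe code.
    let len = v.iter().position(|&b| b == 0).ok_or("missing nul terminator")?;
    v.truncate(len);
    String::from_utf8(v).map_err(|e| e.to_string())
}
```

With this stand-in, read_string(16) yields the String "hello", and a buffer that is too small comes back as an Err instead of a panic.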


The unsafe way is trickier. If the API doesn't report how many bytes were written, you have to find the null terminator yourself. Reading uninitialized memory is undefined behavior in Rust, so this has to be done with care. It might be best to read each byte through a raw pointer, stopping at the end of the capacity at the latest, and then call set_len on the Vec. To be honest, though, I doubt all this effort will make any noticeable performance difference.
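As a rough sketch of the uninitialized variant: instead of a hand-written byte loop, this uses CStr::from_ptr from the standard library to locate the terminator, which is a swapped-in technique rather than the loop described above. It is only sound under the assumption that the callee always writes a null terminator within the first size bytes when it reports success. The ffi_write_cstr function is a hypothetical stand-in, and read_until_nul is an illustrative name.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Hypothetical stand-in for the foreign function (assumption).
// On success it always writes a nul-terminated string within `size` bytes.
unsafe extern "C" fn ffi_write_cstr(buf: *mut c_char, size: usize) -> i32 {
    let msg = b"hi there\0";
    if buf.is_null() || size < msg.len() {
        return -1;
    }
    unsafe { std::ptr::copy_nonoverlapping(msg.as_ptr().cast::<c_char>(), buf, msg.len()) };
    0
}

fn read_until_nul(size: usize) -> Option<Vec<u8>> {
    // Capacity is reserved but nothing is initialized yet; len() is 0.
    let mut v = Vec::<u8>::with_capacity(size);
    let ptr = v.as_mut_ptr().cast::<c_char>();
    if unsafe { ffi_write_cstr(ptr, size) } != 0 {
        return None;
    }
    // SAFETY (assumed contract): the callee wrote a terminator inside the
    // buffer, so the scan only touches bytes the callee initialized.
    let len = unsafe { CStr::from_ptr(ptr) }.to_bytes().len();
    // SAFETY: len < size <= capacity, and the first `len` bytes were written.
    unsafe { v.set_len(len) };
    Some(v)
}
```

If the callee can ever succeed without writing a terminator, this scan would run into uninitialized memory, so the zeroed buffer remains the safer default.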


Hey @chrisd,

You're right, I don't actually need CString here, and the FFI API tells me the exact size of the buffer, so I can just truncate as you suggested. I wasn't sure about the allocation part, because to me it looks a little hacky to use the vec! macro for this purpose. What if I need a 100k-byte array? The compiler should optimize away 100k pushes to the Vec, right? I guess I have to check.

Could you please tell if this is safe?

let mut v = Vec::<u8>::with_capacity(size); 
let ptr = v.as_mut_ptr();
// pass ptr to ffi.

Thank you!

Yes, it's safe to do. If call_to_ffi tells you the length (and you fully trust the function), then you can call set_len with the length it gives you.

But do note that vec![0u8; size] isn't hacky at all. It's basically a safe way to do alloc_zeroed. This is practically the same as with_capacity except the OS will return zeroed bytes instead of uninitialised memory.

Even if you're on an OS that doesn't offer the ability to allocate zeroed memory, zeroing bytes will still be a very efficient operation.
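For the with_capacity plus set_len route, a minimal sketch might look like the following. The fill_buffer function and its written out-parameter are hypothetical; they stand in for a foreign function that reports how many bytes it wrote, and read_bytes is an illustrative name.

```rust
use std::os::raw::c_char;

// Hypothetical stand-in for a foreign function that reports the number
// of bytes it wrote through `written` (assumption, not a real API).
unsafe extern "C" fn fill_buffer(buf: *mut c_char, size: usize, written: *mut usize) -> i32 {
    let data = b"abc";
    if buf.is_null() || written.is_null() || size < data.len() {
        return -1;
    }
    unsafe {
        std::ptr::copy_nonoverlapping(data.as_ptr().cast::<c_char>(), buf, data.len());
        *written = data.len();
    }
    0
}

fn read_bytes(size: usize) -> Option<Vec<u8>> {
    // Reserves memory without initializing it; len() stays 0.
    let mut v = Vec::<u8>::with_capacity(size);
    let mut written = 0usize;
    let err = unsafe { fill_buffer(v.as_mut_ptr().cast::<c_char>(), size, &mut written) };
    // Reject errors and any reported length that would exceed the allocation.
    if err != 0 || written > v.capacity() {
        return None;
    }
    // SAFETY: the callee reported initializing the first `written` bytes,
    // and `written` was checked against the capacity above.
    unsafe { v.set_len(written) };
    Some(v)
}
```

The capacity check costs almost nothing and guards against a callee that reports a bogus length; it cannot detect a callee that actually wrote past the end of the buffer.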


FFI is never "safe" in the sense of Rust's safe/unsafe distinction. As for "will this cause undefined behavior?": it may, if you read uninitialized memory. You need to know that all of the memory was written to, or have some other surefire way of knowing how much of it was written. You'll have to set the length (not the capacity) after your FFI call to reflect that.

Here's an example from the official documentation.

In my case I assume that the FFI writes all the bytes, because the same FFI gives me the size of the buffer it is going to fill, and it also returns an error code if it fails.

Since I don't know how many bytes it actually wrote, I assume it filled the buffer completely, so either truncate or set_len will do, I guess.

Thanks

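Take note that truncate can only shorten a Vec (because of the uninitialized values hazard); it does nothing when the requested length is greater than the current length, which is 0 right after with_capacity.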
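A small sketch of that difference, using only safe code (the lengths function is just an illustrative name):

```rust
// Returns (len of a zeroed vec, len after with_capacity, len after truncate).
fn lengths(size: usize) -> (usize, usize, usize) {
    // vec![0u8; size] has `size` initialized elements.
    let zeroed = vec![0u8; size];
    // with_capacity reserves room but the vector is still empty.
    let mut reserved = Vec::<u8>::with_capacity(size);
    let before = reserved.len();
    // truncate never grows a Vec, so this leaves the length untouched.
    reserved.truncate(size);
    (zeroed.len(), before, reserved.len())
}
```

So after with_capacity, truncate(size) leaves the vector empty, and only set_len (unsafe) or resize (safe, but it writes the fill value itself) will change the length.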