Converting *const c_char to &str


#1

I would like to use an unsafe C function that returns a raw pointer to a C string,

  extern "C" {
      pub fn unsafe_fun() -> *const ::std::os::raw::c_char;
  }

and create a safe wrapper function. Is this the right way to do it?

pub fn safe_fun() -> Result<&str, Utf8Error> {
    let mut char_ptr = std::ptr::null() as *const c_char;
    unsafe { unsafe_fun(&mut char_ptr) };
    let c_str = unsafe { CStr::from_ptr(char_ptr) };
    c_str.to_str()
}

#2

This doesn’t look quite right:

  • If your FFI function returns a char*, then you must capture that return value instead of passing an out-parameter
  • You need to tell Rust what the lifetime of the output string is

Assuming that said lifetime is 'static (meaning the C code promises never to deallocate the string, which holds if it is a global variable, for example), you can try this…

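use std::ffi::CStr;
use std::str::Utf8Error;

pub fn safe_fun() -> Result<&'static str, Utf8Error> {
    // Capture the returned pointer instead of passing an out-parameter.
    let char_ptr = unsafe { unsafe_fun() };
    // Sound only if the pointer is non-null and the string is never freed or mutated.
    let c_str = unsafe { CStr::from_ptr(char_ptr) };
    c_str.to_str()
}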

…but another thing to keep in mind is that Rust assumes that data reached via &str will not change, whereas C does not make this assumption about const char*. So you need to check that the C API guarantees the string will never be mutated after the first access.

If either of the above requirements ('static lifetime and immutability) is not met by the C API, then you will need to use a heavier FFI style in order to work around it, for example by making an “owned” copy of the C string into a String.

Finally, you may want to check if the C API is allowed to return a null pointer. If so, you will need to test for it before calling CStr::from_ptr.
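
As a rough sketch of that null check, assuming the 'static and immutability guarantees above do hold: c_str_or_none is a hypothetical helper name, and it takes the pointer as a parameter only so that it can be exercised without linking the real C library.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;
use std::str::Utf8Error;

/// Hypothetical helper: borrows the C string behind `ptr`, or returns `None`
/// when the C side handed back a null pointer.
///
/// # Safety
/// If `ptr` is non-null, it must point to a NUL-terminated string that is
/// never freed or mutated (the 'static assumption discussed above).
pub unsafe fn c_str_or_none(
    ptr: *const c_char,
) -> Option<Result<&'static str, Utf8Error>> {
    if ptr.is_null() {
        // CStr::from_ptr on a null pointer is undefined behavior, so stop here.
        return None;
    }
    let c_str: &'static CStr = unsafe { CStr::from_ptr(ptr) };
    // UTF-8 validation can still fail, hence the inner Result.
    Some(c_str.to_str())
}
```

A wrapper would then pass the result of unsafe_fun() to this helper and decide how to report the None case to its caller.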


#3

Thanks for correcting my example. So, since I don’t know how long the string exists, I should make an owned copy.
Can you tell me what lifetime would be assigned to the reference if I don’t use 'static?


#4

If you decide to make an owned copy, you will simply return a String like this:

pub fn safe_fun() -> Result<String, Utf8Error> {
    let char_ptr = unsafe { unsafe_fun() };
    let c_str = unsafe { CStr::from_ptr(char_ptr) };
    c_str.to_str().map(|s| s.to_owned())
}

In this case, no lifetimes are involved, but you pay the price of a memory allocation and copy.
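
To illustrate why the copy sidesteps both the lifetime and the mutation concerns, here is a minimal sketch. copy_c_string is a hypothetical helper (not something from the C API in question) that takes the pointer as an argument, so that a plain byte array can stand in for the C-owned buffer.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;
use std::str::Utf8Error;

/// Hypothetical helper: copies the NUL-terminated string at `ptr` into an
/// owned `String`.
///
/// # Safety
/// `ptr` must be non-null and point to a NUL-terminated buffer that stays
/// valid and unmodified for the duration of this call (and only this call).
pub unsafe fn copy_c_string(ptr: *const c_char) -> Result<String, Utf8Error> {
    // The borrow of the C buffer ends when this function returns.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    // Validate UTF-8, then allocate and copy.
    c_str.to_str().map(str::to_owned)
}
```

Once the call has returned, the String no longer refers to the original buffer, so the C side may overwrite or free it without affecting the Rust value.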

Lifetimes could get involved, however, if you later found out that the C API provides some guarantee about the lifetime of the output string (e.g. that it is valid as long as a certain other API object is valid), and decided to model that in the API.

Here is an example, with the lifetimes annotated explicitly for clarity:

pub fn safe_fun<'a>(_session: &'a SomeSessionType) -> Result<&'a str, Utf8Error> {
    let char_ptr = unsafe { unsafe_fun() };
    let c_str = unsafe { CStr::from_ptr(char_ptr) };
    c_str.to_str()
}

In this example, I happen to know that the string emitted by unsafe_fun() will remain valid and constant as long as a certain object of type SomeSessionType is in scope and is not modified. So I encode that into my FFI API, and can then safely go back to zero-cost string slices.


#5

Hmm, this may or may not be sound depending on how the C library works. From what I can see, there are two issues you want to look out for:

  • Mutability: const can’t be enforced across the FFI boundary, so if your C library mutates the string you’re gonna have a bad time. (e.g. your &str reference thinks it’s pointing to 10 characters, but now it only points to 8).
  • Lifetimes: What lifetime can you assign to the &str you get out? Unless it’s a string actually embedded in the C library chances are it can’t be 'static. Likewise, your function doesn’t take any input parameters so there’s nothing to bound the reference’s lifetime to. Instead we’re just conjuring a lifetime out of thin air (see Unbounded Lifetimes).
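
The lifetime point can be sketched as follows. The names MESSAGE, conjured and caller_picks_static are made up for illustration, and a Rust static stands in for a string constant embedded in the C library, which is the only reason this particular sketch is sound.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Stand-in for the C side: a NUL-terminated constant that lives for the
// whole program and is never mutated.
static MESSAGE: &[u8] = b"from C\0";

/// Hypothetical wrapper whose output lifetime `'a` is tied to no input.
pub fn conjured<'a>() -> &'a CStr {
    // CStr::from_ptr hands back a reference whose lifetime the caller picks;
    // the compiler cannot check that choice against the real buffer.
    unsafe { CStr::from_ptr(MESSAGE.as_ptr().cast::<c_char>()) }
}

/// Compiles without complaint: nothing stops a caller from choosing 'static.
pub fn caller_picks_static() -> &'static CStr {
    conjured()
}
```

If the buffer were instead freed by the C library at some point, the same code would still compile, and the reference would dangle.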

It also looks like you’re using the function wrong in your example (the extern "C" declaration is fn() -> *const c_char, whereas you call it as fn(&mut *const c_char)), but I’m going to assume that was just a typo.


#6

Thanks, that makes it much clearer to me. I actually have such an object: &self. I guess if I omit the lifetime parameters, then the lifetime of self is used?


#7

Yes. When a function’s output needs a lifetime and there is exactly one input reference, the lifetime elision rules make the compiler use that reference’s lifetime for the output automatically. For methods, the lifetime of &self (or &mut self) is used even if there are other reference parameters.
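
A small sketch of what that looks like for a method. Session is a hypothetical type that owns the string buffer itself (via CString) so the example needs no C library; in a real binding, the pointer would come from the C API and the validity guarantee from its documentation.

```rust
use std::ffi::{CStr, CString};
use std::str::Utf8Error;

/// Hypothetical session object; here it owns the buffer the string lives in.
pub struct Session {
    name: CString,
}

impl Session {
    pub fn new(name: &str) -> Self {
        Session {
            name: CString::new(name).expect("name must not contain NUL"),
        }
    }

    /// Elided form: the returned &str borrows from `&self`.
    pub fn name(&self) -> Result<&str, Utf8Error> {
        let ptr = self.name.as_ptr();
        // Sound because the buffer is owned by `self` and outlives the borrow.
        let c_str = unsafe { CStr::from_ptr(ptr) };
        c_str.to_str()
    }

    /// The same signature with the lifetime written out.
    pub fn name_explicit<'a>(&'a self) -> Result<&'a str, Utf8Error> {
        self.name()
    }
}
```

With either signature, the borrow checker rejects code that drops or mutably borrows the session while the returned &str is still in use.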