Correct way to implement a function which returns a C string


#1

I would like to implement a function which is exposed as a C function via FFI, returning a C string. What would be the correct memory management model to use?

My first try was something along these lines:

// In .h
extern const char* const get_string();

// In .rs
#[no_mangle]
pub extern fn get_string() -> *const c_char {
    let s = String::from_str("Hello!");
    let cs = CString::from_slice(s.as_slice().as_bytes());
    cs.as_ptr()
}

But the pointer returned by as_ptr() is only valid while cs is alive, and cs is dropped as soon as it goes out of scope at the end of the function, so that cannot be the correct way, right? (It seems to compile just fine, though.) What is the recommended way to do this?


#2

Hm, that’s a complicated question. Basically, like this: http://is.gd/b0mrwV

extern crate libc;

use libc::c_char;
use std::ffi::CString;
use std::sync::{Once, ONCE_INIT};
use std::mem::transmute;
use std::ptr::read;

static START: Once = ONCE_INIT;
static mut data:*const CString = 0 as *const CString;

#[no_mangle]
pub extern fn get_string() -> *const c_char {
    START.call_once(|| {
        unsafe {
            let boxed = Box::new(CString::from_slice("Hello World".as_bytes()));
            data = transmute(boxed);
        }
    });
    unsafe {
        return (&*data).as_ptr();
    }
}

pub fn free_resources() {
    unsafe {
        let _ = transmute::<*const CString, Box<CString>>(data);
        data = 0 as *const CString;
    }
}

fn main() {
    println!("{:?}\n{:?}\n{:?}", get_string(), get_string(), get_string());
    free_resources();
}

You need to do two things:

  1. Move the value on to the heap so it is not lost when the stack frame is dropped.

  2. Move the value out of the rust memory management model.

Every Box<T> is equivalent to a heap-allocated *const T, and calling transmute() consumes the box, safely turning it into a raw pointer that ‘exists forever’.

You don’t have to use a global variable for the value, but without one you will leak memory on every call.

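Note also that this code is not thread safe: call_once() guards the initialization, but free_resources() writes to the mutable static without any synchronization, so you’ll need a mutex guard to achieve that. For the simple case it’s not a big deal.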
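For comparison, here is a minimal sketch of the same "initialize once, hand out the same pointer forever" idea using only the current standard library. It assumes Rust 1.70 or later for std::sync::OnceLock, and the static name GREETING is made up for illustration. The string lives until the process exits, so this variant has no free function at all. (In edition 2024 the attribute is spelled #[unsafe(no_mangle)].)

```rust
use std::ffi::{c_char, CString};
use std::sync::OnceLock;

// Hypothetical name: holds the string for the rest of the process lifetime.
// OnceLock makes the one-time initialization thread safe.
static GREETING: OnceLock<CString> = OnceLock::new();

#[no_mangle]
pub extern "C" fn get_string() -> *const c_char {
    GREETING
        // The closure runs at most once, even with concurrent callers.
        .get_or_init(|| CString::new("Hello World").expect("literal has no interior NUL"))
        // The CString is owned by the static, so this pointer never dangles.
        .as_ptr()
}
```

Every call returns the same pointer, and the C side must treat it as read-only and must never free it.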


#3

Thanks for very clear answer.


#4

Why not just return the transmuted pointer and instruct the caller to free it later by passing it to a free function?

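#[no_mangle]
pub extern "C" fn my_module_free_string(ptr: *mut c_char) {
    // ptr must be the pointer obtained earlier from the Box<CString>;
    // the cast back to *mut CString is needed for from_raw to type-check.
    let _ = unsafe { Box::from_raw(ptr as *mut CString) };
}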

Note that it’s better to use mutable pointers as a hint in C APIs that the value is owned.


#6

Honestly I’m not really sure what you’re suggesting.

If you want to create a pointer that can be free’d from the C code, you must allocate it using libc::malloc, not Box.

However, CString is often deeply annoying to use; you may instead wish to write a function that takes a caller-supplied buffer and populates it with memcpy:

pub extern fn get_string(ptr:*mut c_char, maxlen:c_int) { 
  let src = "Hello".as_bytes().as_ptr();
  let len = ...; // Check length of str vs maxlen - 1
  unsafe {
    copy_memory(ptr, src, len);
    (*ptr.offset(len as isize)) = 0;
  }
}

NB. Spelling out extern “C” is optional; a bare extern defaults to the C ABI.
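A runnable sketch of the buffer-filling approach described above might look like the following. The function name get_string_into and the integer return value are assumptions added for illustration; the original snippet returns nothing and leaves the length check as a comment. The copy is truncated so that the terminating NUL always fits.

```rust
use std::ffi::{c_char, c_int};
use std::ptr;

/// Copies "Hello" into `buf` as a NUL-terminated string, truncating to fit.
/// Returns the number of bytes written (excluding the NUL), or -1 if the
/// buffer is null or has no room for even the terminator.
#[no_mangle]
pub extern "C" fn get_string_into(buf: *mut c_char, maxlen: c_int) -> c_int {
    let src = b"Hello";
    if buf.is_null() || maxlen <= 0 {
        return -1;
    }
    // Reserve one byte of the buffer for the terminating NUL.
    let len = src.len().min(maxlen as usize - 1);
    unsafe {
        // The caller promises that buf points to at least maxlen writable bytes.
        ptr::copy_nonoverlapping(src.as_ptr() as *const c_char, buf, len);
        *buf.add(len) = 0;
    }
    len as c_int
}
```

With this shape the caller owns the buffer, so no allocator ever crosses the FFI boundary.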


#7

The idea is that he provides a custom free function that needs to be called to free his particular structure. You don’t need libc::malloc there.


#8

What’s wrong with providing a free function to a Rust-allocated string?

Note also that libc::malloc and the free function used by a C library may be linked to different allocators.


#9

I just didn’t understand what you were suggesting.

If you mean providing an API like:

fn get_str() -> *mut c_char
fn release_str(sp:*mut c_char)

Which you’d use from c as:

char *sp = get_str();
do_something(sp);
release_str(sp);

Then there’s nothing wrong with that, except that it allocates and frees on every call, which is slow and wasteful if you’re doing it often.

“libc::malloc and the free function used by a C library may be linked to different allocators.”

Entirely true. It only works if the C program is using the standard allocator (which is guaranteed to be what libc::malloc uses); the point I was making is that using free() from C on a boxed value is undefined behaviour (and almost certainly a segfault, since Rust uses jemalloc).


#10

Dredging up an old thread here, but it showed up in my Google search. CString now provides an into_raw method:
https://doc.rust-lang.org/std/ffi/struct.CString.html#method.into_raw

You can use that to return a char* to C code, and then provide a free method that calls from_raw:
https://doc.rust-lang.org/std/ffi/struct.CString.html#method.from_raw
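A minimal sketch of that pairing, reusing the get_str / release_str names from earlier in the thread (the "Hello!" payload and the null checks are my own additions):

```rust
use std::ffi::{c_char, CString};

/// Hands ownership of a freshly allocated C string to the caller.
/// Returns null if the string could not be built.
#[no_mangle]
pub extern "C" fn get_str() -> *mut c_char {
    match CString::new("Hello!") {
        // into_raw() gives up ownership, so Rust will not free the buffer.
        Ok(s) => s.into_raw(),
        Err(_) => std::ptr::null_mut(),
    }
}

/// Takes back a pointer previously returned by get_str() and frees it.
/// Passing any other pointer, or the same pointer twice, is undefined behaviour.
#[no_mangle]
pub extern "C" fn release_str(sp: *mut c_char) {
    if sp.is_null() {
        return;
    }
    // from_raw() rebuilds the CString, which is then dropped and deallocated
    // by the same allocator that created it.
    unsafe { drop(CString::from_raw(sp)) };
}
```

From C this is used exactly like the get_str() / do_something() / release_str() sequence shown above; the C side must never pass the pointer to free().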