SIGSEGV when binding *const c_char


#1

I am trying to write Rust bindings for Clp using bindgen. Clp is a C++ library but it also has a C interface. In the C interface there is the following function:

/** Read an mps file from the given filename */
COINLIBAPI int COINLINKAGE Clp_readMps(Clp_Simplex * model, const char *filename,
                                       int keepNames,
                                       int ignoreErrors);

Bindgen generates the following:

extern "C" {
    /// Read an mps file from the given filename
    #[link_name = "\u{1}_Clp_readMps"]
    pub fn Clp_readMps(
        model: *mut Clp_Simplex,
        filename: *const ::std::os::raw::c_char,
        keepNames: ::std::os::raw::c_int,
        ignoreErrors: ::std::os::raw::c_int,
    ) -> ::std::os::raw::c_int;
}

I am able to call the function with:

let status = Clp_readMps(
    lp,
    CString::new("./file.mps")
        .unwrap()
        .as_ptr(),
    0,
    0,
);
println!("status: {}", status);

but the following fails with (signal: 11, SIGSEGV: invalid memory reference):

let s = CString::new("./file.mps")
    .unwrap()
    .as_ptr();
let status = Clp_readMps(lp, s, 0, 0);
println!("status: {}", status);

It’s quite puzzling as the only difference between the two calls is using variable binding. Does anyone know how to debug that?


#2

Let us look at the documentation of CString::as_ptr():

Returns the inner pointer to this C string.

The returned pointer will be valid for as long as self is, and points to a contiguous region of memory terminated with a 0 byte to represent the end of the string.

WARNING

It is your responsibility to make sure that the underlying memory is not freed too early. […]

Hopefully that should clear things up: your CString gets freed before you use the pointer which was taken from it, which results in a use-after-free, a form of undefined behavior. As soon as you enter undefined behavior territory, the program’s operation becomes unpredictable: it can depend on compiler version, hardware architecture, host OS, surrounding instructions… pretty much everything.
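A minimal sketch of the usual fix: bind the CString itself to a variable so that it outlives the call, and take the pointer only at the call site. Since Clp is not available here, `path_len` is a hypothetical stand-in for a C function such as Clp_readMps that only reads the string; `demo_len` is likewise an illustrative name.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical stand-in for a C function that reads a NUL-terminated string
/// (such as Clp_readMps). It is not part of Clp.
///
/// Safety: `filename` must point to a valid, NUL-terminated C string.
unsafe fn path_len(filename: *const c_char) -> usize {
    unsafe { CStr::from_ptr(filename) }.to_bytes().len()
}

fn demo_len() -> usize {
    // Bind the CString, not the pointer: `path` owns the buffer and stays
    // alive until the end of this function.
    let path = CString::new("./file.mps").expect("path contains no NUL byte");

    // The pointer is taken while `path` is still alive, so it is valid for
    // the whole duration of the call.
    let len = unsafe { path_len(path.as_ptr()) };

    len
    // `path` is dropped here, after the last use of its pointer.
}
```

With the real binding, the same shape would be `let path = CString::new("./file.mps").unwrap();` followed by `Clp_readMps(lp, path.as_ptr(), 0, 0)`. Recent versions of rustc also warn when `as_ptr()` is called on a temporary CString.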

One useful tool for diagnosing memory-related undefined behaviour in C is Memcheck, which is part of the Valgrind suite. I’m not sure how well it works with Rust, but if the two play well together, it could be useful in this kind of FFI scenario.


#3

Thanks @HadrienG. It totally addresses the issue.


#4

The first relies on the compiler keeping the temporary value around until after the call.

I’m not sure it is wise to expect future compilers to behave the same (perhaps not in regular releases, but possibly after a major update). This may be linked to the NLL changes. I’m not a compiler dev, but it’s possible even they can’t predict every change. Breaking changes are undesirable, but they do happen.


#5

This reminds me of a CString::as_ptr() considered harmful thread from a while back. Like others have already mentioned, you’re getting a pointer to a temporary value which will be dropped at the end of the statement containing the as_ptr() call, so in the second snippet it is already gone by the time Clp_readMps runs.
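To see when the temporary is actually dropped, here is a small sketch that needs no FFI at all. `Noisy` is a hypothetical type whose `as_ptr` mimics the shape of CString::as_ptr(), and `use_ptr` stands in for the foreign call; both just record events in a log, and the pointer is never dereferenced.

```rust
use std::cell::RefCell;

thread_local! {
    // Event log, so the order of "use" and "drop" can be inspected.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

fn record(event: &'static str) {
    LOG.with(|log| log.borrow_mut().push(event));
}

/// Hypothetical type standing in for CString.
struct Noisy;

impl Noisy {
    fn as_ptr(&self) -> *const Noisy {
        self as *const Noisy
    }
}

impl Drop for Noisy {
    fn drop(&mut self) {
        record("drop");
    }
}

/// Stands in for the foreign call; it never dereferences the pointer.
fn use_ptr(_p: *const Noisy) {
    record("use");
}

fn temporary_in_call() -> Vec<&'static str> {
    LOG.with(|log| log.borrow_mut().clear());
    // The temporary lives until the end of this statement,
    // which is after `use_ptr` has returned.
    use_ptr(Noisy.as_ptr());
    LOG.with(|log| log.borrow().clone())
}

fn temporary_in_let() -> Vec<&'static str> {
    LOG.with(|log| log.borrow_mut().clear());
    // The temporary is dropped at the end of this `let` statement.
    let p = Noisy.as_ptr();
    // `p` is already dangling here.
    use_ptr(p);
    LOG.with(|log| log.borrow().clone())
}
```

`temporary_in_call()` yields `["use", "drop"]`, while `temporary_in_let()` yields `["drop", "use"]`, which matches the working and the crashing snippet from the first post respectively.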