(Why) is this safe Rust?

Hi,

I started using Rust's FFI support and got confused. This function is callable from C and works as expected. My question is: what is the lifetime of the return value?

extern crate libc;
use std::ffi::CString;

#[no_mangle]
pub extern fn call_me_from_c() -> *const libc::c_char
{
   let thecstring = CString::new("Hello, world!").unwrap();
   thecstring.as_ptr()
}

It is clear that C does not know about Rust's lifetimes, so I assume I need to free the value from C. But if that is so, why is this code safe? I thought I needed to mark code unsafe if I bypass Rust's lifetime system.

The lifetime of the pointer is the function body because thecstring gets destroyed at the end of it, so the function returns an invalid pointer. This is similar to the following which the compiler knows is wrong:

fn foo() -> &str {
    let s = String::new();
    let r: &str = &*s;
    r
}

It was argued that as_ptr functions don't need to be marked unsafe because you can't exploit any unsafety of the raw pointer in safe code. Your case seems like a counterexample to that.

Just like the example above can be fixed by making it return an owned String, you can get an owned pointer with the unstable CString::into_ptr. Once that pointer is no longer needed, it has to be handed back to Rust and freed by converting it back into a CString with from_ptr.
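For reference, a minimal sketch of the "return an owned String" fix for the foo example above. The string contents are just an assumption for illustration; the original used an empty String.

```rust
// Returns an owned String instead of a borrowed &str, so the data
// outlives the function body and nothing dangles.
fn foo() -> String {
    let s = String::from("Hello, world!"); // illustrative contents
    s // ownership moves to the caller; `s` is not dropped here
}
```

The caller now owns the buffer and decides when it gets dropped, which is the same idea an owned raw pointer expresses across the FFI boundary.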

Can you explain a bit more? The code in the OP seems perfectly safe. A consumer of that function gets a raw pointer, and the only way to dereference that raw pointer is by using unsafe. But the mere presence of an invalid pointer isn't in and of itself unsafe.

That is, IMO, a good intuition. But to answer your question, we need to be a bit more precise. I think the thing you're looking for are the "unsafe superpowers". Namely, unsafe permits you to:

  1. Access or update a static mutable variable.
  2. Dereference a raw pointer.
  3. Call unsafe functions. This is the most powerful ability.

Notice that creating a raw pointer is not one of unsafe's powers. In fact, you can do it in safe code just like your example shows. It is only the dereferencing of a raw pointer that is unsafe. If your program that includes your example function never invokes unsafe, then it must also never dereference the pointer returned by call_me_from_c (assuming you don't pass it to some other library that does). Therefore, even if the returned pointer is dangling, you never actually observe unsafe behavior!
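To make the split concrete, here is a small sketch (the function names are made up for illustration, and it uses std::os::raw::c_char instead of the libc crate so it has no dependencies). Creating the pointer compiles in safe code; reading through it only compiles inside an unsafe block.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Entirely safe code: building a raw pointer needs no `unsafe`,
// even though this particular pointer is useless to the caller.
fn make_dangling() -> *const c_char {
    let s = CString::new("Hello, world!").unwrap();
    s.as_ptr() // `s` is dropped on return, so the pointer dangles
}

// Reading through a raw pointer is the step that requires `unsafe`.
fn first_byte(s: &CString) -> u8 {
    let p: *const c_char = s.as_ptr();
    // SAFETY: `s` is borrowed for the whole call, so `p` points at live data.
    unsafe { *p as u8 }
}
```

Removing the unsafe block in first_byte is a compile error, whereas make_dangling is accepted as-is.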

C code calling Rust code can be compared to unsafe Rust code calling safe Rust code.
Just because the safe Rust code is used by unsafe code doesn't mean that it must be unsafe itself.

In your example, the call_me_from_c function does absolutely nothing dangerous.

ok,

I think I understand this.
Basically:

  • creating a raw pointer is not unsafe
  • dereferencing a raw pointer is unsafe
  • my function just creates a (dangling) raw pointer but does not dereference it

-> But if I call this function from C and dereference the pointer this will result in undefined behaviour (?)

If I want to return a valid pointer I would use forget?

#[no_mangle]
pub extern fn call_me_from_c() -> *const libc::c_char {
    let thecstring = CString::new("Hello, world!").unwrap().as_ptr();
    forget(thecstring);
    thecstring
}

[quote="BurntSushi, post:3, topic:2042"]
Can you explain a bit more? The code in the OP seems perfectly safe. A consumer of that function gets a raw pointer, and the only way to dereference that raw pointer is by using unsafe. But the mere presence of an invalid pointer isn't in and of itself unsafe.
[/quote]
Since the consumer of that extern fn is most likely not Rust (and Rust couldn't call it without unsafe), I guess we could just say: the code on the other side is unsafe by definition, so anything goes. But the ability to pass a dangling pointer across the language (library) boundary so easily, without a single unsafe, seems too foot-gunny to me.

Dereferencing the raw pointer is memory unsafe. Calling the function is not. The code that is dereferencing the pointer is at fault, which sounds like it happens in your C program. It would not be allowed in safe Rust.

The correct way may be to use CString::into_ptr, which is unfortunately quite new. I say "may" because handling resource ownership across FFI boundaries like this requires some careful choices, and I'm not sure what's best.

You need into_ptr:

#[no_mangle]
pub extern fn call_me_from_c() -> *const libc::c_char {
    let thecstring = CString::new("Hello, world!").unwrap();
    thecstring.into_ptr()
}

But either way this is a memory leak unless you take care of cleaning up somehow.
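One way the cleanup could look, as a sketch under a few assumptions: as far as I know, into_ptr and from_ptr were later stabilized under the names CString::into_raw and CString::from_raw, which is what this uses. The free_me_from_c function is a hypothetical name, and c_char comes from std::os::raw rather than the libc crate.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// To export these under their exact names, add #[no_mangle]
// (spelled #[unsafe(no_mangle)] in the 2024 edition).

// Hands ownership of the string to the caller as a raw pointer.
pub extern "C" fn call_me_from_c() -> *mut c_char {
    let thecstring = CString::new("Hello, world!").unwrap();
    thecstring.into_raw() // the buffer is NOT freed when this returns
}

// Hypothetical counterpart: C passes the pointer back so Rust can free it.
// The pointer must come from call_me_from_c and be freed at most once.
pub unsafe extern "C" fn free_me_from_c(ptr: *mut c_char) {
    if ptr.is_null() {
        return;
    }
    // SAFETY: per the contract above, `ptr` came from CString::into_raw.
    drop(unsafe { CString::from_raw(ptr) });
}
```

On the C side the string must not be released with free(), since it was allocated by Rust; every pointer from call_me_from_c would be passed to free_me_from_c exactly once.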

However as @bluss said, in some circumstances you may actually not want to return an owned pointer, depending on the actual needs and choices.
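As an illustration of the non-owned case, here is a sketch that returns a pointer into static data, so the caller never has to free anything. GREETING and borrow_me_from_c are made-up names, and this only fits when the string is known at compile time.

```rust
use std::os::raw::c_char;

// NUL-terminated bytes that live for the whole program.
static GREETING: &[u8] = b"Hello, world!\0";

// Add #[no_mangle] (#[unsafe(no_mangle)] in the 2024 edition) to export it.
pub extern "C" fn borrow_me_from_c() -> *const c_char {
    // The pointer stays valid forever; the caller must not free or modify it.
    GREETING.as_ptr() as *const c_char
}
```

The trade-off is that the contract ("do not free this") lives only in the documentation, just like the "pass it back to Rust" contract for an owned pointer.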

[quote="bluss, post:8, topic:2042"]
The code that is dereferencing the pointer is at fault
[/quote]
I find it difficult to accept this unless call_me_from_c is expected to return an invalid pointer.

It is how Rust works internally. We can create arbitrary raw pointers in safe Rust but only use raw pointers (offset or deref, or pass to ffi) by using unsafe blocks. We've simply decided that raw pointers have no guarantee of being valid pointers.

In Rust we have ways to express that a pointer is always valid: &T, &mut T, Box<T>, Rc<T> and so on. Unfortunately we can't use these across the ffi boundary. We also can't enforce Rust's rules in programs that are not written in Rust.

A raw pointer is just a strong typedef of a usize. There are no guarantees about the value of a number. If the function's doc says it returns a number that's actually the address of a valid object, then it's up to the implementor of the function to make it so. The implementor could just as well return 42 as *const libc::c_char, but he'd be breaking his own contract. As far as Rust is concerned, the function's contract says it returns a number between 0 and 2^32 or 2^64 depending on your platform.
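A tiny sketch of that point (bogus is a made-up name): the type checker accepts any integer cast to the pointer type, with no unsafe in sight, and a thin raw pointer takes up exactly as much space as a usize.

```rust
use std::mem::size_of;
use std::os::raw::c_char;

// Safe code: the signature promises a pointer-sized number, nothing more.
fn bogus() -> *const c_char {
    42 as *const c_char // compiles fine, but breaks any "valid string" contract
}

// A thin raw pointer occupies the same number of bytes as a usize.
fn pointer_is_usize_sized() -> bool {
    size_of::<*const c_char>() == size_of::<usize>()
}
```

Whether 42 is a meaningful address is purely a matter of the function's documented contract, which is what the C caller has to rely on.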
