Can Box::leak be complemented with Box::into_raw?

This is similar to Can Box::leak be complemented with Box::from_raw, but not quite the same. In that topic, leak is used to create the pointer.

The topic Passing mutable objects across ffi and back is slightly more in line with my goal, but there the object lives within Rust and is then passed as a reference to foreign code.

In my use case, I'm dealing with a generic struct that is opaque to the caller (not strings, as in the first topic). It is created by Rust and handed to the foreign side: Rust is the callee and Java, in my case, is the host.

The following code should help make it clearer (playground):

// (skipping https://doc.rust-lang.org/nomicon/ffi.html#representing-opaque-structs for simplicity)
use std::ffi::c_void;

#[derive(Debug)]
struct Foo {
    data: i32,
}

#[no_mangle]
pub extern "C" fn foo_create() -> *mut c_void {
    Box::into_raw(Box::new(Foo { data: 0 })) as *mut c_void
}

#[no_mangle]
pub extern "C" fn foo_increment(pointer: *mut c_void) {
    let foo = Box::leak(unsafe { Box::from_raw(pointer as *mut Foo) });
    foo.data += 1;
}

#[no_mangle]
pub extern "C" fn foo_print(pointer: *mut c_void) {
    let foo = Box::leak(unsafe { Box::from_raw(pointer as *mut Foo) });
    dbg!(foo);
}

#[no_mangle]
pub extern "C" fn foo_free(pointer: *mut c_void) {
    let _ = unsafe { Box::from_raw(pointer as *mut Foo) };
}

// pretend this main() function is Java code calling into the Rust code
// (in the real code the functions return i64 and not a pointer type,
//  but that should not be relevant to the question)
fn main() {
    let ptr = foo_create();
    foo_increment(ptr);
    foo_print(ptr);
    foo_free(ptr);
}

In essence, some opaque object Foo is put into the heap with Box::new in foo_create. It is then converted into a pointer with Box::into_raw.

Now, here's the question: both foo_increment and foo_print reconstruct the Box with Box::from_raw, but then immediately leak it with Box::leak to prevent it from being freed (essentially behaving like ManuallyDrop), since the Java code (main() in the example) is still holding the ownership of the object.

I ran the program with Miri in the playground, and it did not seem to complain. Is leaking the Box like this to prevent it from being freed safe to do? Or should the functions be implemented with something more akin to:

#[no_mangle]
pub extern "C" fn foo_increment(pointer: *mut c_void) {
    let mut foo = unsafe { Box::from_raw(pointer as *mut Foo) };
    foo.data += 1;
    let _ = Box::into_raw(foo);
}

(i.e. re-converting the Box into a raw pointer). This doesn't feel right to me for two reasons:

  • If foo_increment were to panic before Box::into_raw was called, unwinding would free the memory which Java could then incorrectly attempt to use.
  • This is "creating a new pointer" which is then discarded, under the assumption that it will be the same as the old one, so that the old one can keep being used.

Both the leak and the re-into_raw approach compile and run fine in playground under Miri.

Perhaps there is a third way I'm unaware of. I might be overthinking this, and maybe the solution is to not touch Box at all until I need to free the memory like so:

#[no_mangle]
pub extern "C" fn foo_increment(pointer: *mut c_void) {
    let foo = unsafe { &mut *(pointer as *mut Foo) };
    foo.data += 1;
}

It is. The Box::into_raw() and Box::from_raw() methods don't deal with pointers to the box; the returned and consumed pointers are pointing to the value inside the Box. So for accessing the value without changing its ownership, simply dereference that raw pointer, and don't even bring Box into the picture at all.
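To make that concrete, here is a minimal sketch (the function name `bump_through_raw` is made up for illustration) showing that the pointer returned by Box::into_raw addresses the value itself, so it can be read and written directly and only needs to become a Box again when it is time to free it:

```rust
/// Hypothetical helper: mutates a heap value through the raw pointer
/// returned by `Box::into_raw`, without rebuilding the `Box` to do so.
fn bump_through_raw() -> i32 {
    // `raw` points directly at the `i32` on the heap, not at a "box object".
    let raw: *mut i32 = Box::into_raw(Box::new(41));

    // SAFETY: `raw` was just produced by `Box::into_raw`, so it is non-null,
    // aligned, and points to a live `i32` that nothing else is accessing.
    unsafe { *raw += 1 };
    let value = unsafe { *raw };

    // SAFETY: `raw` came from `Box::into_raw` and has not been freed yet.
    // Rebuilding the `Box` is only needed here, to release the allocation.
    drop(unsafe { Box::from_raw(raw) });

    value
}
```

The Box only reappears at the very end, where ownership is actually taken back.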

Thanks, will do that then. I'm still curious if Box::leak(Box::from_raw()) is okay to do though, even if it's redundant.

To further clarify: the confusion came from the fact that Box seems, at a glance, to be a wrapper, so it would make sense that one would have to recover that Box to work with it. But since it's actually just a smart pointer (and so, just a pointer at the end of the day), there's no need to go through it to access the data.

I'm not sure about that. It seems problematic because the raw pointer is not unique and it doesn't represent ownership: IIUC the Java VM calling your Rust code is the owner and has a copy of it. This might well violate Box's assumption of unique ownership even if it's not destroyed. (The problem is not aliasing raw pointers, since raw pointers are explicitly allowed to alias; what seems fishy here is the double ownership, even if temporary.)

No, not at all. It's perfectly fine for a wrapper to hand out a pointer to its internals, and that's exactly what Box is doing, even if it's trivial. You might as well imagine another wrapper like this:

// (illustrative only: A, B, C, allocate_b, allocate_c and deallocate are placeholders)
struct MultiWrapper {
    a: *mut A,
    b: *mut B,
    c: *mut C,
}

impl MultiWrapper {
    fn into_a(self) -> *mut A {
        deallocate(self.b);
        deallocate(self.c);
        self.a
    }

    unsafe fn from_a(a: *mut A) -> Self {
        let b = allocate_b();
        let c = allocate_c();
        Self { a, b, c }
    }
}

If you obtained a pointer to a, it is perfectly fine to use that pointer-to-a without continuously re-creating a b and c just to also re-create a MultiWrapper. This is basically the same as the impl Deref<Target = T> for Box<T> implementation, except with raw pointers.

The point is: Rust's type system is locally compositional. If you have a pointer or reference to a value, you shouldn't need to care where it came from, whether it lives on the heap or the stack, whether it's in a static or a local variable, etc. If you have a *mut T or a &T or a &mut T, you should be able to treat the dereferenced value as a T, no matter where exactly it is or who created (or didn't create) it.

If this weren't the case, it would practically be impossible to write correct, non-trivial programs.

Gotcha, that makes perfect sense then.

During creation, Rust has ownership of the Box, and gives it up via into_raw. Once the pointer is returned, Java is the sole owner.

If Rust needs to access it while Java is still the owner, then it doesn't make sense to recover ownership via from_raw (even if only "temporarily"). Rust should instead use the value through a reference, because, as you point out, there would otherwise be two "logical" owners.

Once Java no longer needs it, it gives ownership back to Rust, which can recover the Box via from_raw and then let it drop to free the memory.
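For reference, the lifecycle described above could look roughly like the sketch below. It is only an illustration under a few assumptions: the `DROPS` counter and the `foo_data` getter are not part of the original code and exist only to make the single drop observable, and `#[no_mangle]` is left off so the snippet does not depend on the Rust edition (a real library would still need to export the symbols).

```rust
use std::ffi::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter, only here to observe when `Foo` is actually dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Foo {
    data: i32,
}

impl Drop for Foo {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Rust owns the `Box` here, then hands ownership to the caller via `into_raw`.
pub extern "C" fn foo_create() -> *mut c_void {
    Box::into_raw(Box::new(Foo { data: 0 })).cast()
}

/// The caller still owns the object; Rust only borrows it for this call.
///
/// # Safety
/// `pointer` must come from `foo_create`, must not have been passed to
/// `foo_free`, and must not be accessed elsewhere during the call.
pub unsafe extern "C" fn foo_increment(pointer: *mut c_void) {
    let foo = unsafe { &mut *pointer.cast::<Foo>() };
    foo.data += 1;
}

/// Hypothetical getter, borrowing the object immutably.
///
/// # Safety
/// Same requirements as `foo_increment`.
pub unsafe extern "C" fn foo_data(pointer: *const c_void) -> i32 {
    let foo = unsafe { &*pointer.cast::<Foo>() };
    foo.data
}

/// The caller gives ownership back; the `Box` is rebuilt once and dropped.
///
/// # Safety
/// `pointer` must come from `foo_create` and must not be used afterwards.
pub unsafe extern "C" fn foo_free(pointer: *mut c_void) {
    drop(unsafe { Box::from_raw(pointer.cast::<Foo>()) });
}
```

Box shows up only in `foo_create` and `foo_free`, the two places where ownership actually changes hands; everything in between works through plain references.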

I would like to point out a minor nitpick, which might be off-topic.

fn foo_print and its sibling functions should be marked unsafe, since they are obviously not safe for all inputs: the input pointer could be an invalid pointer to Foo.
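A possible shape for this, as a sketch only: the `bool` return value and the null check are additions for illustration and are not in the original code. Note that `unsafe` merely documents the contract; a null check catches one kind of bad input, while a dangling or mistyped pointer still cannot be detected.

```rust
use std::ffi::c_void;

struct Foo {
    data: i32,
}

/// Prints the `Foo` behind `pointer`.
/// Returns `true` if something was printed and `false` if `pointer` was null
/// (the return value is a hypothetical addition for this sketch).
///
/// # Safety
/// If non-null, `pointer` must point to a live, properly aligned `Foo` that is
/// not being mutated elsewhere for the duration of the call.
pub unsafe extern "C" fn foo_print(pointer: *const c_void) -> bool {
    // `as_ref` maps a null pointer to `None`; any other invalid pointer is
    // still undefined behavior, which is why the function is `unsafe`.
    match unsafe { pointer.cast::<Foo>().as_ref() } {
        Some(foo) => {
            println!("Foo {{ data: {} }}", foo.data);
            true
        }
        None => false,
    }
}
```

On the Rust side, callers then have to write `unsafe { foo_print(ptr) }`, which makes the obligation visible at every call site.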
