I'm trying to implement some Rust code which will be called from C/C++. There are some data structures on the Rust side, so the idea is to put them into a structure allocated on the heap and return that as an i64 to C/C++ and then back again.
The structure needs to contain an Arc - and I'm having trouble cleaning that up.
struct Thing {
    obj: Arc<SomeObject>
}
#[no_mangle]
pub extern "C" fn rust_init() -> i64 {
    // Allocate some Rust things
    let obj : Arc<SomeObject> = // Normal Rust object ....
    // Now we need a structure on the heap to hold it
    let layout = Layout::new::<Thing>();
    let ptr = unsafe { alloc(layout) as *mut Thing };
    unsafe {
        (*ptr).obj = obj;
    }
    return ptr as i64 // The value is opaque to C/C++
}
So far so good, unless someone has suggestions.
Now let's try to clean this up, we need to clear the Arc and deallocate the structure on the heap.
#[no_mangle]
pub extern "C" fn rust_destroy(obj: i64) {
    if obj != 0 {
        let thing = obj as *mut Thing;
        // 1 - reset the Arc so it frees the held object
        unsafe {
            drop((*thing).obj);
        }
        // 2 - deallocate memory
        let layout = Layout::new::<Thing>();
        unsafe {
            dealloc(thing as *mut u8, layout);
        }
    }
}
I'm having trouble with statement (1) above. Error message:
cannot move out of `thing.obj` which is behind a raw pointer
move occurs because `thing.obj` has type `Arc<SomeObject>`, which does not implement the `Copy` trait
Think I understand the "does not implement copy trait" part.
But what about the "cannot move" error? I'm quite lost, just trying to reset the Arc pointer - and since there is no reset or clear method, seems I need to call drop but unable to do so.
Any suggestions on cleaning up the Arc inside my Thing structure?
And more broadly, on the overall pattern? I just need to create some Rust things and return them as an opaque value to C/C++ - and to clean up at the end. Experience with Rust so far - 3 days so I'm open to suggestions!
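On the error itself: `drop((*thing).obj)` tries to move the Arc out of the struct, and the compiler refuses to move a non-Copy value out from behind a raw pointer because the field would be left in a moved-from state that it cannot track. There is a similar problem in `rust_init`: the assignment `(*ptr).obj = obj` first drops whatever was previously in that field, which for freshly allocated memory is uninitialized data. If the manual allocation is kept, a minimal sketch of one way to handle both sides uses `ptr::write` and `ptr::drop_in_place`. The helper names `manual_init` and `manual_destroy` and the empty `SomeObject` are assumptions made for illustration.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr;
use std::sync::Arc;

// Placeholder for the real object type from the question.
struct SomeObject;

struct Thing {
    obj: Arc<SomeObject>,
}

// Hypothetical helper: allocate a Thing by hand and initialize it.
fn manual_init(obj: Arc<SomeObject>) -> *mut Thing {
    let layout = Layout::new::<Thing>();
    unsafe {
        let thing = alloc(layout) as *mut Thing;
        if thing.is_null() {
            handle_alloc_error(layout);
        }
        // ptr::write moves the value in without dropping the old (uninitialized) contents.
        ptr::write(thing, Thing { obj });
        thing
    }
}

/// # Safety
/// `thing` must be null or a pointer returned by `manual_init` that has not been destroyed yet.
unsafe fn manual_destroy(thing: *mut Thing) {
    if thing.is_null() {
        return;
    }
    unsafe {
        // Runs Thing's destructor in place, which drops the Arc field.
        ptr::drop_in_place(thing);
        // Only then release the memory, using the same layout as the allocation.
        dealloc(thing as *mut u8, Layout::new::<Thing>());
    }
}
```

This is roughly what `Box` does internally, which is why the `Box`-based version is usually the simpler choice.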
fn rust_init() -> i64 {
    let obj : Arc<SomeObject> = // ...
    let thing : Box<Thing> = Box::new(
        Thing {
            obj
        }
    );
    let ptr = Box::into_raw(thing);
    ptr as i64
}
And then for cleanup:
fn rust_destroy(val : i64) {
    if val != 0 {
        // Rebuild the Box from the raw pointer; it is dropped when it goes out of scope
        let thing: Box<Thing> = unsafe { Box::from_raw(val as *mut Thing) };
    }
}
Am I understanding correctly that thing: Box<Thing> will be automatically dropped at the end of rust_destroy - and that it will destroy any members of Thing including the obj: Arc<SomeObject>?
Using Box is generally preferable to managing the allocation manually, yes. Just note that any other functions that aren't trying to deallocate the Box should manipulate the pointer directly rather than calling Box::from_raw as well. You can run into issues otherwise.
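To see that dropping the Box also drops the Arc inside `Thing` (and, when it is the last reference, the object it points to), here is a small sketch that counts destructor calls. The `DROPS` counter, the `Drop` impl on `SomeObject`, and the helper names `make_thing` and `destroy_thing` are assumptions added only for this demonstration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Counts how many SomeObject values have been destroyed (illustration only).
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct SomeObject;

impl Drop for SomeObject {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

struct Thing {
    obj: Arc<SomeObject>,
}

// Hypothetical helper: move a Thing to the heap and give up ownership of the Box.
fn make_thing() -> *mut Thing {
    Box::into_raw(Box::new(Thing { obj: Arc::new(SomeObject) }))
}

/// # Safety
/// `ptr` must be null or a pointer from `make_thing` that has not been destroyed yet.
unsafe fn destroy_thing(ptr: *mut Thing) {
    if ptr.is_null() {
        return;
    }
    // Dropping the Box drops Thing, which drops its Arc field; since this is the
    // only Arc pointing at the SomeObject, the SomeObject is dropped as well.
    drop(unsafe { Box::from_raw(ptr) });
}
```

If other clones of the Arc were still alive, dropping the Box would only decrement the reference count, and the `SomeObject` would live until the last clone is gone.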
It's also generally best to avoid casting pointers to integers. You can define your own opaque type, or just cast the pointer to *mut c_void if you don't want to bother with that.
use std::{ffi::c_void, sync::Arc};
struct SomeObject;
struct Thing {
    obj: Arc<SomeObject>,
}
#[no_mangle]
pub extern "C" fn rust_init() -> *mut c_void {
    // Allocate some Rust things
    let obj: Arc<SomeObject> = Arc::new(SomeObject);
    // Now we need a structure on the heap to hold it
    let boxed = Box::new(Thing { obj });
    Box::into_raw(boxed).cast()
}
#[no_mangle]
pub extern "C" fn rust_destroy(obj: *mut c_void) {
    if obj.is_null() {
        return;
    }
    // Calling drop explicitly so it's more obvious that our intent is to deallocate this Box
    drop(unsafe { Box::from_raw(obj.cast::<Thing>()) });
}
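For the "define your own opaque type" option, a possible sketch follows the zero-sized `#[repr(C)]` struct pattern described in the Rustonomicon's FFI chapter. The names `ThingHandle`, `thing_new`, and `thing_free` are made up for this example, and the functions would carry `#[no_mangle]` in a real library, as in the code above. The C side would only ever see a forward-declared struct and pointers to it.

```rust
use std::marker::{PhantomData, PhantomPinned};
use std::sync::Arc;

struct SomeObject;

struct Thing {
    obj: Arc<SomeObject>,
}

// Hypothetical opaque handle: zero-sized, never constructed on the Rust side,
// and only ever used behind a pointer. C sees `typedef struct ThingHandle ThingHandle;`.
#[repr(C)]
pub struct ThingHandle {
    _data: [u8; 0],
    _marker: PhantomData<(*mut u8, PhantomPinned)>,
}

// Would be exported with #[no_mangle] in a real library.
pub extern "C" fn thing_new() -> *mut ThingHandle {
    let boxed = Box::new(Thing { obj: Arc::new(SomeObject) });
    Box::into_raw(boxed).cast()
}

/// # Safety
/// `handle` must be null or a pointer from `thing_new` that has not been freed yet.
pub unsafe extern "C" fn thing_free(handle: *mut ThingHandle) {
    if handle.is_null() {
        return;
    }
    // Cast back to the real type before rebuilding the Box.
    drop(unsafe { Box::from_raw(handle.cast::<Thing>()) });
}
```

Compared with `*mut c_void`, a distinct handle type lets the C compiler catch a pointer of the wrong kind being passed in.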
And one more thing, let's say I have a Rust function that "does something" in between init and destroy. Guess I'll need to create a Box and call into_raw before returning even though I don't need the raw pointer again - to make sure that the Box doesn't deallocate the owned object.
#[no_mangle]
pub extern "C" fn rust_do_something(obj: *mut c_void) {
    // `box` is a reserved word in Rust, so the variable is named `boxed`
    let boxed = unsafe { Box::from_raw(obj.cast::<Thing>()) };
    // Do some work
    boxed.obj.doSomethingImportant();
    // Make sure box doesn't deallocate - right?
    let _ = Box::into_raw(boxed);
}
It's generally a bad idea to re-create the Box in functions that aren't trying to deallocate it. Counter-intuitively, manipulating the raw pointer directly is often safer.
#[no_mangle]
pub extern "C" fn rust_do_something(obj: *mut c_void) {
    let thing = obj.cast::<Thing>();
    // Do some work
    unsafe { (*thing).obj.doSomethingImportant() };
}
This has the advantage of not accidentally deallocating if you forget to call into_raw again, and it also avoids some potential undefined behavior, at the expense of possibly requiring you to write more unsafe code.
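Putting the pieces together, here is a sketch of the whole init / do-something / destroy lifecycle where the middle function only borrows through the raw pointer. The call counter inside `SomeObject`, the `usize` return value, and the null check via `as_ref` are assumptions added so the behavior is observable; the functions would carry `#[no_mangle]` in a real library.

```rust
use std::ffi::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Stand-in object that just counts how often it was used.
struct SomeObject {
    calls: AtomicUsize,
}

impl SomeObject {
    fn do_something_important(&self) -> usize {
        self.calls.fetch_add(1, Ordering::SeqCst) + 1
    }
}

struct Thing {
    obj: Arc<SomeObject>,
}

pub extern "C" fn rust_init() -> *mut c_void {
    let obj = Arc::new(SomeObject { calls: AtomicUsize::new(0) });
    Box::into_raw(Box::new(Thing { obj })).cast()
}

/// # Safety
/// `obj` must be null or a live pointer from `rust_init`.
pub unsafe extern "C" fn rust_do_something(obj: *mut c_void) -> usize {
    // as_ref turns the raw pointer into Option<&Thing>: a temporary borrow, no ownership taken.
    match unsafe { obj.cast::<Thing>().as_ref() } {
        Some(thing) => thing.obj.do_something_important(),
        None => 0,
    }
}

/// # Safety
/// `obj` must be null or a pointer from `rust_init` that has not been destroyed yet.
pub unsafe extern "C" fn rust_destroy(obj: *mut c_void) {
    if !obj.is_null() {
        // The only place where ownership is taken back and the Box is dropped.
        drop(unsafe { Box::from_raw(obj.cast::<Thing>()) });
    }
}
```

Because `rust_do_something` never owns the `Thing`, it can be called any number of times between `rust_init` and `rust_destroy`.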
Your Rust do_something_important should either take the Arc<T>, in which case it's moved out of the Box and the variable is considered uninitialized (and you won't be able to use it anymore), or it should take a &Arc<T> or &T or the like, in which case you still want to drop the Box yourself (and not use into_raw).
You can never have a usable Box pointing to already-deallocated memory; that's UB. So you never need to leak or into_raw a box to avoid a double free (if you're avoiding a double free in practice by doing so, you're already in UB territory and something has already gone wrong elsewhere).
But dropping the box will deallocate its member objects including obj - correct? I don't want that, let's say rust_do_something might get called multiple times.
// In C++ land
const int64_t rustObject = rust_init();
rust_do_something(rustObject);
rust_do_something(rustObject);
rust_do_something(rustObject);
// And finally
rust_destroy(rustObject);
If you don't mind exposing the Arc, you can make it a lot easier to handle on both the Rust and C sides. Firstly, adopt the convention that the Arc is passed by value. So you always use Arc::into_raw when returning it, and Arc::from_raw when having it passed as an argument. Secondly, expose Arc::increment_strong_count and Arc::decrement_strong_count to C. That way, consumers of your library can use the reference counting as necessary. Thirdly, remove any destroy_rust_obj function, as decrementing the strong count serves the same purpose. Finally, expose similar Weak-based functionality if you think it makes sense to do so.
Additionally, saving a heap allocation and just pretending the Arc is a Box can also work, without needing to leak implementation details.
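A rough sketch of what exposing the reference count could look like, assuming the C side is happy with retain/release style functions. The names `obj_new`, `obj_retain`, and `obj_release` are hypothetical, and the `DROPS` counter exists only to make the effect visible; the functions would carry `#[no_mangle]` in a real library.

```rust
use std::ffi::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Counts destroyed objects (illustration only).
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct SomeObject;

impl Drop for SomeObject {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Hands one strong reference to the caller as an opaque pointer.
pub extern "C" fn obj_new() -> *const c_void {
    Arc::into_raw(Arc::new(SomeObject)).cast()
}

/// # Safety
/// `ptr` must come from `obj_new` and still have at least one strong reference.
pub unsafe extern "C" fn obj_retain(ptr: *const c_void) {
    unsafe { Arc::increment_strong_count(ptr.cast::<SomeObject>()) }
}

/// # Safety
/// `ptr` must come from `obj_new`, and the caller must own the reference being released.
pub unsafe extern "C" fn obj_release(ptr: *const c_void) {
    // Drops the object when this was the last strong reference.
    unsafe { Arc::decrement_strong_count(ptr.cast::<SomeObject>()) }
}
```

With this shape there is no separate destroy function: the object goes away when the last reference is released.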
Isn't it a value object (as opposed to a pointer to a heap allocated structure)? If so, C/C++ won't have a way to represent it, unless I made a struct that is returned by value and has the same size as Arc. But that seems a lot more dangerous than passing around pointers disguised as ints (at least we know the size of those).
Arc is just a smart pointer to a heap object. Use Arc::into_raw to obtain a pointer to your object. Cast into something opaque to C, and then return that.
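As a sketch of that: `Arc::into_raw` returns a plain `*const SomeObject` pointing at the object itself, so C only ever holds an ordinary pointer, and Rust can borrow through it without touching the reference count. The `value` field and the `rust_get_value` function are assumptions for illustration, and the functions would carry `#[no_mangle]` in a real library.

```rust
use std::ffi::c_void;
use std::sync::Arc;

struct SomeObject {
    value: i32,
}

pub extern "C" fn rust_init() -> *const c_void {
    // into_raw gives up ownership of one strong reference and returns a pointer to the data.
    Arc::into_raw(Arc::new(SomeObject { value: 7 })).cast()
}

/// # Safety
/// `obj` must be null or a live pointer from `rust_init`.
pub unsafe extern "C" fn rust_get_value(obj: *const c_void) -> i32 {
    if obj.is_null() {
        return 0;
    }
    // Borrow the object directly; the strong count is left unchanged.
    let object: &SomeObject = unsafe { &*obj.cast::<SomeObject>() };
    object.value
}

/// # Safety
/// `obj` must be null or a pointer from `rust_init` that has not been destroyed yet.
pub unsafe extern "C" fn rust_destroy(obj: *const c_void) {
    if !obj.is_null() {
        // from_raw takes the strong reference back; dropping it frees the object.
        drop(unsafe { Arc::from_raw(obj.cast::<SomeObject>()) });
    }
}
```

This also saves the extra `Box<Thing>` allocation when the Arc is the only thing that needs to cross the boundary.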