Code review request. A crate to efficiently wrap a type in a #[repr(C)] wrapper

Background: This is something I wrote based on the need described here: A way to return stack-allocated #[repr(Rust)] types to C through FFI?

I plan to publish it once it gets some more eyes for safety / soundness.

I'm primarily looking for feedback on soundness, although I also want to get the efficiency as high as possible (ie. give the compiler the opportunity to avoid memcpy operations where possible, and I'm open to feedback on ergonomics (I know the macros are ugly and hopefully they won't be needed soon).

Here is the code:

//! An opaque `#[repr(C)]` wrapper for `#[repr(Rust)]` types that can be passed by value over FFI
//!
//! ```Rust
//! use repr_c_wrapper::*;
//! 
//! #[repr(C)]
//! pub struct OpaqueWrapper(repr_c_wrapper_t!(String));
//!
//! #[no_mangle]
//! pub extern "C" fn some_func() -> OpaqueWrapper {
//!    OpaqueWrapper("hello".to_string().into())
//! }
//! ```

use core::mem::{size_of, ManuallyDrop};

/// Wraps a type in an opaque `#[repr(C)]` wrapper with a size equal to the original `T`
#[repr(C)]
pub struct ReprCWrapper<const SIZE: usize, T> {
    bytes: [u8; SIZE],
    phantom: core::marker::PhantomData<T>,
}

impl<const SIZE: usize, T> From<T> for ReprCWrapper<SIZE, T> {
    fn from(val: T) -> Self {
        Self::new(val)
    }
}

impl<const SIZE: usize, T> Drop for ReprCWrapper<SIZE, T> {
    fn drop(&mut self) {
        let val = unsafe{ &mut *(&mut self.bytes as *mut [u8; SIZE]).cast::<ManuallyDrop<T>>() };
        unsafe{ ManuallyDrop::<T>::drop(val); }
    }
}

impl<const SIZE: usize, T> ReprCWrapper<SIZE, T> {
    /// Returns a `ReprCWrapper` from a `T`
    pub fn new(val: T) -> Self {
        assert_eq!(SIZE, size_of::<ManuallyDrop<T>>());

        let val = ManuallyDrop::<T>::new(val);
        Self {
            bytes: unsafe{ *(&val as *const ManuallyDrop<T>).cast() },
            phantom: core::marker::PhantomData
        }
    }
    /// Borrows the inner `T` from a `ReprCWrapper`
    pub fn borrow(&self) -> &T {
        unsafe{ &*(&self.bytes as *const [u8; SIZE]).cast::<T>() }
    }
    /// Mutably borrows the inner `T` from a `ReprCWrapper`
    pub fn borrow_mut(&mut self) -> &mut T {
        unsafe{ &mut *(&mut self.bytes as *mut [u8; SIZE]).cast::<T>() }
    }
    /// Consumes the `ReprCWrapper`, and returns the inner `T`
    pub fn into_inner(self) -> T {
        let val = unsafe{ core::ptr::read_unaligned((&self.bytes as *const [u8; SIZE]).cast::<ManuallyDrop<T>>()) };
        core::mem::forget(self);
        ManuallyDrop::<T>::into_inner(val)
    }
}

#[cfg(test)]
mod test {
    use super::*;

    derive_wrap_repr_c!(String);

    #[test]
    fn test_derive_wrap_repr_c() {

        let mut wrapper = "Hello".to_string().into_repr_c();
        assert_eq!(wrapper.borrow(), "Hello");

        *wrapper.borrow_mut() = "World".to_string();
        assert_eq!(wrapper.borrow(), "World");

        let the_string = wrapper.into_inner();
        assert_eq!(the_string, "World");
    }

    struct VecCWrapper(repr_c_wrapper_t!(Vec<usize>));

    #[test]
    fn test_repr_c_wrapper_t_macro() {

        let wrapped_vec = VecCWrapper(vec![3, 2, 1].into());

        assert_eq!(wrapped_vec.0.borrow(), &[3, 2, 1]);
    }

}

/// A `ReprCWrapper` type that corresponds to a wrapped version of `T`
///
/// NOTE: This macro is a stop-gap convenience to automatically get the correct type size,
/// until a future version of Rust stabilizes `generic_const_exprs`.  At that point, it will
/// be as simple as using `ReprCWrapper<T>`.
#[macro_export]
macro_rules! repr_c_wrapper_t {
    ( $t:ty ) => { $crate::ReprCWrapper<{core::mem::size_of::<$t>()}, $t> };
}

/// Derive this trait with [derive_wrap_repr_c] to conveniently get a [ReprCWrapper] for a type
pub trait WrapReprC {
    type WrapperT;

    /// Wraps the type in a [ReprCWrapper]
    fn into_repr_c(self) -> Self::WrapperT;
}

/// Derive the [WrapReprC] trait on a type
/// 
/// NOTE: this isn't a derive proc macro because you might want to explicitly implement this
/// trait on a built-in type, and also because this macro is a stop-gap until a future version
/// of Rust eliminates the need for the [WrapReprC] trait altogether.
#[macro_export]
macro_rules! derive_wrap_repr_c {
    ( $t:ty ) => {
        impl $crate::WrapReprC for $t {
            type WrapperT = $crate::ReprCWrapper<{core::mem::size_of::<Self>()}, Self>;

            fn into_repr_c(self) -> Self::WrapperT {
                Self::WrapperT::new(self)
            }
        }
    };
}

//FUTURE NOTE: This implementation will get much cleaner when generic_const_exprs
// is merged into stable.  https://github.com/rust-lang/rust/issues/76560
//When that happens, the trait, derive macro, and associated subcrate can be totally eliminated

// #![feature(generic_const_exprs)]

// #[repr(C)]
// pub struct ReprCWrapper<T>
//     where [(); size_of::<ManuallyDrop::<T>>()]:
// {
//     bytes: [u8; size_of::<ManuallyDrop::<T>>()],
// }

Thank you.

The creation of a reference in borrow() and borrow_mut() (and possibly elsewhere) is unsound because your byte array isn't guaranteed to be aligned.


There is a lot of complicated pointer manipulation going on, usually going through references. That's highly unsafe, because references must always point to valid data, and mutable references must be unique, and such complicated code usually violates either or both of these requirements, and is usually a sign of not understanding them in the first place.

To be entirely honest, I don't see the purpose of this code. You could just pass a regular raw pointer to C code that needs to read/write from/to the value, and avoid manual memory management and pointer casting shenanigans.

1 Like

Thanks for your feedback.

The creation of a reference in borrow() and borrow_mut() (and possibly elsewhere) is unsound because your byte array isn't guaranteed to be aligned.

Would it be enough to change the declaration to:

#[repr(C)]
#[repr(align(64))]
pub struct ReprCWrapper<const SIZE: usize, T> {
    bytes: [u8; SIZE],
    phantom: core::marker::PhantomData<T>,
}

My thinking is that 64 is the lowest (greatest) common denominator on today's machines. Or do some types need an even coarser alignment?

There is a lot of complicated pointer manipulation going on, usually going through references. That's highly unsafe, because references must always point to valid data, and mutable references must be unique, and such complicated code usually violates either or both of these requirements, and is usually a sign of not understanding them in the first place.

On the Rust side, I think those constraints hold. Ie. the ownership of T is given to the wrapper. borrow() and borrow_mut() are both protected by the fact that they require a reference / mutable reference to the wrapper. And into_inner() consumes the wrapper.

On the C side, all bets are off. But that's the fundamental nature of FFI.

To be entirely honest, I don't see the purpose of this code. You could just pass a regular raw pointer to C code that needs to read/write from/to the value, and avoid manual memory management and pointer casting shenanigans.

In this use case, C is at the top of the call stack. It would be valid to create placeholder structs in the C header file, and ask the C programmer to instantiate them, and then pass a pointer into the Rust FFI function to actually copy the Rust type into the buffer.

However, I think that has exactly the same safety issues. ie. it's still a raw ptr and I need to use unsafe to copy the Rust type(s) into that buffer.

Thanks again.

There is a lot of complicated pointer manipulation going on

I replaced all the reference coercion with addr_of! and now it's a lot clearer what's happening. ie. this:

unsafe{ *(&val as *const ManuallyDrop<T>).cast() },

turns into this:

unsafe{ *addr_of!(val).cast() },

You need to account for that exactly because Rust has explicit #[repr(align(…))]. If you want to use this as a wrapper around Rust types, then 64 is certainly not universal.

Structs can have ~arbitrary alignment.

error: Undefined Behavior: accessing memory with alignment 128, but alignment 256 is required
   --> src/main.rs:52:17
    |
52  |         unsafe{ &*(&self.bytes as *const [u8; SIZE]).cast::<T>() }
    |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ accessing memory with alignment 128, but alignment 256 is required
    |
    = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
    = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
    = note: BACKTRACE:
    = note: inside `ReprCWrapper::<256, test::WellAligned>::borrow` at src/main.rs:52:17: 52:65
note: inside `test::test_repr_c_well_aligned`
   --> src/main.rs:97:20
    |
97  |         assert_eq!(wrapped_wa.0.borrow(), &WellAligned([0; 256]));
    |                    ^^^^^^^^^^^^^^^^^^^^^
note: inside `main`
   --> src/main.rs:106:5
    |
106 |     test::test_repr_c_well_aligned();
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Reading uninitialized bytes like padding bytes is UB:

error: Undefined Behavior: constructing invalid value at [256]: encountered uninitialized bytes
   --> src/main.rs:46:28
    |
46  |             bytes: unsafe{ *addr_of!(val).cast() },
    |                            ^^^^^^^^^^^^^^^^^^^^^ constructing invalid value at [256]: encountered uninitialized bytes
    |
    = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
    = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
    = note: BACKTRACE:
    = note: inside `ReprCWrapper::<512, test::WellAligned>::new` at src/main.rs:46:28: 46:49
note: inside `<ReprCWrapper<512, test::WellAligned> as std::convert::From<test::WellAligned>>::from`
   --> src/main.rs:27:9
    |
27  |         Self::new(val)
    |         ^^^^^^^^^^^^^^
    = note: inside `<test::WellAligned as std::convert::Into<ReprCWrapper<512, test::WellAligned>>>::into` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/convert/mod.rs:716:9: 716:22
note: inside `test::test_repr_c_well_aligned`
   --> src/main.rs:96:46
    |
96  |         let wrapped_wa = WellAlignedCWrapper(WellAligned([0; 256]).into());
    |                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside `main`
   --> src/main.rs:106:5
    |
106 |     test::test_repr_c_well_aligned();
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2 Likes

Thanks for the insight and explanations!

So RE alignment:

Since my wrapper struct depends on the a generic, there is no alignment I could provide using a #[repr(align(??))] directive that would be guaranteed to cover every case.

So I have 3 ideas for work-arounds:
1.) Just pick a reasonable number, like 8, and if align_of::<T>() > 8 then panic. So this crate just won't work for those types.
2.) Make the wrapper type be a union between the [u8; SIZE] and T itself, and then provide methods to access the bytes array as a #[repr(C)] type. (I haven't prototyped this yet so there is probably a snag I'm not seeing there)
3.) Use something like https://crates.io/crates/elain (I would just use elain except the alignment types aren't #[repr(C)] :frowning:

BTW: what tool / build settings are you using to get that output? When I run your tests with a normal debug build they just pass.

Thanks again for all your feedback.

It's Miri.

1 Like

bytemuck is the go-to crate for safely converting types at a low-level. It will either return an Option or panic if you have the wrong alignment, for example, and there are traits for things like "has no padding".

You can run Miri in the playground under Tools, top-right. Or in your dev environment. Miri is the first stop for "is this UB" questions.

4 Likes

The returned bytes array wont be aligned if you return it by value. And you need MaybeUninit<u8> on the rust side.

How do you generate the header file?

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.