Creating stack space allocation from whole cloth in unsafe

I am wrapping an old C library and am therefore diving into the unsafe world. I want to be able to construct a rust object purely from the results of a C function, but the only solution I found requires that the object's type have Default, or otherwise be initialized somehow.

Here's what I mean.

I have a function which I have brought in via rust-bindgen which uses the typical C pattern of accepting a pointer and length argument, and constructing the object by writing to that memory, with the memory chunk being provided by the user. I want to take that memory chunk and trust that it's a valid rust struct once done.

Note: Not shown here, I use a proc macro to ensure the struct is a flat collection of f64s with #[repr(C)], which matches the layout expected. (I hope lol!)

I have a function provided by rust-bindgen with a signature similar to:

mod ffi {
  use std::ffi::c_void;

  // Foreign functions live in an extern block and are unsafe to call.
  unsafe extern "C" {
    pub fn read(dest: *mut c_void, size: usize) -> *mut c_void;
  }
}

I am wrapping this in the following code:

use std::ffi::c_void;

pub fn read<B: ReadableThingie>() -> Result<B, SomeError> {
  let size = size_of::<B>();

  let block: B = B::default();

  let ptr = &block as *const B;
  let ptr = ptr as *mut c_void;

  let result = { unsafe { ffi::read(ptr, size) } };

  if result.is_null() {
    Ok(block)
  } else {
    Err(SomeError)
  }
}

I believe this works as intended. I'm assuming, of course, that ReadableThingie meets the criteria of memory layout matching what's expected, and that it implements Default. By constructing the Default variant, I can be sure that the stack allocation has been made correctly, and that ptr is a good sane thing to send to C-land.

The thing is, and I admit this is kind of nitpicky, I don't want to require ReadableThingie to impl Default. I don't care about what was in memory before since I am positive that every bit of the new struct will be initialized correctly. Also, a user may introduce some shenanigans with an explicit impl Default that would be more complicated than just assigning each field to zero/some literal. Which is fine, but that's wasted work since I'm just going to clobber it inside of the ffi call. Also, though I don't enforce this, the only way a user should be constructing one of these is via this method anyways. There's no reason a user would want a default instance of one of these in practice.

Is there a way to get the stack space allocated uninitialized (or I would accept zero'd) and just transmute it?

Other techniques I've tried:

  • create an array of u8 with the size of B, and use one of the transmute variants to wish the object into being. Works fine if B is known, doesn't work in generics. Rust refuses to do the monomorphization here.
  • create a Vec and do something similar to above. Works even in a generic, but allocates heap which I feel is unnecessary.

I get why rust requires that block be initialized normally, but in this case I'm providing the constructor via the C function and don't necessarily need it initialized, just allocated.

I can also just live with requiring Default. But just for educational purposes, is there a way around this?

that's undefined behavior right there. writing through a & is always undefined behavior (unless the data sits inside an UnsafeCell). block is not declared mut, so any way you find to write to it will be undefined behavior.

what you are looking for though is called MaybeUninit :

use std::mem::MaybeUninit;

pub fn read<B: ReadableThingie>() -> Result<B, SomeError> {
  let size = size_of::<B>();

  let mut output: MaybeUninit<B> = MaybeUninit::uninit();

  // SAFETY: the pointer given is valid for writes for size_of::<B>() as it comes from an MU<B>
  let result = unsafe { ffi::read(output.as_mut_ptr().cast(), size) };

  if result.is_null() {
    // SAFETY: ffi::read guarantees that B must be properly initialized in this branch
    Ok(unsafe { output.assume_init() })
  } else {
    Err(SomeError)
  }
}
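on the "or I would accept zero'd" part of the question: MaybeUninit::zeroed() hands back zero-filled storage instead of uninitialized storage. a minimal sketch, assuming a struct made only of f64 fields (all-zero bytes are a valid f64, namely 0.0). Sample and fake_read are hypothetical stand-ins for a real struct and for the bindgen function; they are not part of the original code.

```rust
use std::ffi::c_void;
use std::mem::{size_of, MaybeUninit};

// Hypothetical flat struct of f64s, standing in for a ReadableThingie type.
#[repr(C)]
#[derive(Debug, PartialEq)]
pub struct Sample {
    pub a: f64,
    pub b: f64,
}

// Hypothetical stand-in for the C function: fills only the first f64 and
// returns null on success, mirroring the convention used in this thread.
unsafe fn fake_read(dest: *mut c_void, size: usize) -> *mut c_void {
    if size >= size_of::<f64>() {
        unsafe { dest.cast::<f64>().write(1.5) };
    }
    std::ptr::null_mut()
}

pub fn read_zeroed() -> Option<Sample> {
    // Zero-filled storage instead of uninitialized storage.
    let mut block: MaybeUninit<Sample> = MaybeUninit::zeroed();

    // SAFETY: the pointer is valid for writes of size_of::<Sample>() bytes.
    let result = unsafe { fake_read(block.as_mut_ptr().cast(), size_of::<Sample>()) };

    if result.is_null() {
        // SAFETY: all-zero bytes are a valid f64 (0.0), so every field holds
        // a valid value even where fake_read wrote nothing.
        Some(unsafe { block.assume_init() })
    } else {
        None
    }
}
```

zeroed() only helps when the all-zero bit pattern is valid for the type. it would not be for a struct holding a reference or a NonZero field, for example.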

i am very surprised you haven't learned about MaybeUninit before tbh


This is exactly what MaybeUninit is for.

You also have a second problem in your code: you're writing to the memory of block without permission. You haven't declared block as mut, and you're taking an & immutable reference to it to derive your pointer from. That’s UB and you must not do it.

Corrected and cleaned up version:

use std::ffi::c_void;
use std::mem::MaybeUninit;

pub fn read<B: ReadableThingie>() -> Result<B, SomeError> {
  let mut block: MaybeUninit<B> = MaybeUninit::uninit();

  let ptr: *mut c_void = block.as_mut_ptr().cast::<c_void>();
  let size = size_of::<B>();
  let result: *mut c_void = unsafe { ffi::read(ptr, size) };

  if result.is_null() {
    Ok(unsafe { block.assume_init() })
  } else {
    Err(SomeError)
  }
}
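If you want to try this without linking the C library, here is a self-contained sketch of the same function. Everything outside the body of read is an assumption for illustration: the ffi::read here is a mock written in Rust (not your bindgen output), Point is a made-up struct, and making ReadableThingie an unsafe marker trait is just one possible way to express the layout promise your proc macro is checking.

```rust
use std::ffi::c_void;
use std::mem::{size_of, MaybeUninit};

// Hypothetical marker trait. Implementing it is a promise that the type is a
// #[repr(C)] struct made only of f64 fields, so fully written bytes always
// form a valid value.
pub unsafe trait ReadableThingie: Sized {}

#[repr(C)]
#[derive(Debug, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

// SAFETY: Point is #[repr(C)] and contains only f64 fields.
unsafe impl ReadableThingie for Point {}

#[derive(Debug, PartialEq)]
pub struct SomeError;

mod ffi {
    use std::ffi::c_void;

    // Mock standing in for the bindgen-generated function: writes 1.0, 2.0, ...
    // into every f64 slot and returns null on success (the thread's convention).
    pub unsafe fn read(dest: *mut c_void, size: usize) -> *mut c_void {
        let slots = dest.cast::<f64>();
        let count = size / std::mem::size_of::<f64>();
        let mut value = 1.0_f64;
        for i in 0..count {
            unsafe { slots.add(i).write(value) };
            value += 1.0;
        }
        std::ptr::null_mut()
    }
}

pub fn read<B: ReadableThingie>() -> Result<B, SomeError> {
    let mut block: MaybeUninit<B> = MaybeUninit::uninit();

    // SAFETY: the pointer is valid for writes of size_of::<B>() bytes.
    let result = unsafe { ffi::read(block.as_mut_ptr().cast::<c_void>(), size_of::<B>()) };

    if result.is_null() {
        // SAFETY: the mock wrote every f64 slot of B.
        Ok(unsafe { block.assume_init() })
    } else {
        Err(SomeError)
    }
}
```

Calling read::<Point>() against this mock yields Point { x: 1.0, y: 2.0 }. With the real library, the assume_init call is only sound if the C side really does write every byte of B on the success path.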

Note also that in a situation where you weren’t using MaybeUninit, the modern cleanest way to get a raw pointer from a variable is to use the raw borrow operator &raw:

let ptr: *mut B = &raw mut block;

There is no longer any need to use as. as should generally be avoided whenever possible, as it does too many different things and makes it harder to understand code.
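A small sketch of what that looks like in a complete function. Pair and fill_via_raw are hypothetical names for illustration. The &raw syntax is stable as of Rust 1.82; on older toolchains std::ptr::addr_of_mut! does the same job.

```rust
use std::ffi::c_void;

#[repr(C)]
#[derive(Debug, Default, PartialEq)]
pub struct Pair {
    pub a: f64,
    pub b: f64,
}

pub fn fill_via_raw() -> Pair {
    // `mut` is required: the value is written through the pointer.
    let mut block = Pair::default();

    // Raw borrow: creates a *mut Pair directly, with no intermediate reference.
    let ptr: *mut Pair = &raw mut block;

    // `cast` changes only the pointee type, so it cannot silently change
    // mutability the way a chain of `as` casts can.
    let erased: *mut c_void = ptr.cast::<c_void>();

    // SAFETY: `erased` points to `block`, which is live, aligned for f64,
    // and large enough for two f64 values.
    unsafe {
        let fields = erased.cast::<f64>();
        fields.write(3.0);
        fields.add(1).write(4.0);
    }

    block
}
```

Here fill_via_raw() returns Pair { a: 3.0, b: 4.0 }, and at no point does a shared reference to block exist.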


Thank you. MaybeUninit is the solution, and as is often the case with the rust standard library the semantics match pretty much exactly what I was trying to do.

Good call out on forgetting to make it mut also. Such are the dangers of working in unsafe, and good oversight is appreciated! Unfortunately I'm working in a lab with very few other rust developers so I have to get code/concept review from outside.

Is there a good book/blog/etc on working in unsafe out there? I'm kind of just googling for stack overflow articles at this point, and there's some suspect info out there on this topic.

IDK, but the Rust standard library has some extensive documentation in several places:

Additionally, as you may already know, there is Miri, which is of somewhat limited use when FFI is involved. But for learning unsafe it is a very helpful tool.

A rule of thumb is:

If there are references to something, there should be no raw pointers pointing to it.

You should definitely read The Rustonomicon. It contains a lot of big-picture introductions to unsafe concepts and techniques.

That’s too narrow. There are lots of cases where you can and should either reborrow a raw pointer as a reference, or reborrow a reference as a raw pointer — either because the reference is what you started with, or because it helps you have less total unsafe code. The important thing is to make sure such uses are structured as a valid reborrow. For example, the original code was incorrect because it tried to make *mut from & and use it, and that is invalid for the same reasons and in the same cases[1] as making &mut from &, not because there was an & involved at all.


  1. except that it would be allowed to create but not use the raw pointer ↩︎
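To make "structured as a valid reborrow" concrete, here is a rough sketch of both directions. The function names bump, bump_raw, and demo are made up for illustration, and this is only one way to structure such code.

```rust
// Reference to raw pointer: the pointer is derived from a mutable reference,
// used while that reference is otherwise untouched, and then the reference
// is used again.
pub fn bump(value: &mut f64) {
    let ptr: *mut f64 = &raw mut *value; // reborrow `&mut` as `*mut`
    // SAFETY: `ptr` comes from a live `&mut f64`, so it is valid and writable.
    unsafe { *ptr += 1.0 };
    *value *= 2.0; // the original reference is usable again afterwards
}

// Raw pointer to reference: reborrowing early keeps the unsafe region small.
// Caller contract: `ptr` must be non-null, aligned, and point to a live f64
// with no other active references.
pub unsafe fn bump_raw(ptr: *mut f64) {
    // SAFETY: guaranteed by the caller contract above.
    let value: &mut f64 = unsafe { &mut *ptr };
    bump(value);
}

pub fn demo() -> f64 {
    let mut x = 1.0_f64;
    bump(&mut x); // (1 + 1) * 2 = 4
    // SAFETY: the pointer comes straight from the live local `x`.
    unsafe { bump_raw(&raw mut x) }; // (4 + 1) * 2 = 10
    x
}
```

The invalid shape from the original code would be deriving the pointer from `&x` and then writing through it. Creating that pointer is fine; the write is what makes it UB.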
