Whilst working on a userspace interface to a kernel driver, I stumbled upon some cumbersomeness.
When interacting with the kernel from userspace, there are times where a process must prepare an array of bytes and pass it through a syscall to be consumed by the kernel code. Sometimes, the structures expected in the byte array are defined as C/C++ structs. Sometimes, said structs actually have padding, which is easily replicated on the Rust side via #[repr(C)]. The issue arises when it comes to constructing a byte buffer, and as I see it there are 5 choices:
unsafe { std::slice::from_raw_parts(&my_struct as *const _ as *const u8, mem::size_of::<MyStruct>()) }, which is bad because it constructs a slice that contains uninitialized (padding) bytes. Reading those is UB, so the resulting slice cannot be copied into a different buffer whilst retaining the ability to sleep at night.
Do the same as above, but construct a slice of MaybeUninit<u8>. This avoids the UB but is somewhat more cumbersome: the result is no longer a byte slice, so it can't be copied into a slice of regular bytes and io::Write is out of the question. If a single struct that should be written into a buffer has padding, all the other types and the buffer itself need to be &[MaybeUninit<u8>], right up until the buffer gets converted to a *const u8 for the FFI call.
Construct the buffer beforehand, and instead of copying the struct into the buffer, cast a pointer into the buffer to the type you'd want to write and write into it through that. The bad part is that one needs to keep track of more pointer arithmetic, the buffer always has to be created before an instance of the struct, and this requires a fieldwise copy of the struct in question.
Copy the bytes from the struct to the buffer field by field, filling in the padding with zeroes where necessary. This might be doable with a macro. The only bad thing about this solution is that the macro needs to be written.
Instead of using #[repr(C)], use #[repr(packed)] and pad it manually.
All of the above solutions either introduce UB or are somewhat cumbersome, requiring a lot more code than I personally feel comfortable with. Hence I have a feeling that I'm unaware of a more optimal solution. For now, I'm choosing to use &[MaybeUninit<u8>] in cases where padded structs need to be written to a buffer. Maybe one should define the structs as C unions?
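To make the manual-padding option concrete, here is a minimal sketch. It assumes the C side accepts zeroed padding, and it keeps #[repr(C)] while spelling the padding out as a real field rather than switching to a packed representation. The names PaddedExplicit and as_bytes are made up for illustration. Once every byte of the struct belongs to an integer field, viewing it as &[u8] no longer touches uninitialized memory.

```rust
use std::mem;

/// Hypothetical variant of a padded struct where the padding byte is
/// declared explicitly, so the compiler inserts none of its own.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct PaddedExplicit {
    a: u8,
    _pad: u8, // takes the place of the implicit padding before `b`
    b: u16,
}

// Compile-time check: the size equals the sum of the field sizes,
// which means there is no implicit padding left.
const _: () = assert!(mem::size_of::<PaddedExplicit>() == 1 + 1 + 2);

/// Views the struct as plain bytes.
fn as_bytes(value: &PaddedExplicit) -> &[u8] {
    // SAFETY: the struct is #[repr(C)], has no implicit padding (checked
    // above), and consists only of integer fields, so every byte is
    // initialized. The slice borrows `value` and cannot outlive it.
    unsafe {
        std::slice::from_raw_parts(
            value as *const PaddedExplicit as *const u8,
            mem::size_of::<PaddedExplicit>(),
        )
    }
}
```

The cost is that every constructor has to fill in _pad, and the size assertion has to be kept in sync with the field list by hand.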
I'm not trying to make a value judgement about Rust's behavior here, but when compared to C, it's awfully difficult to achieve
A concrete example would be useful, since the best solution depends on the details of the structs and syscalls involved.
If the FFI function only reads from the provided buffer, I would typically construct a MyStruct in Rust, and then pass a raw pointer to that struct:
let my_struct = MyStruct::new();
unsafe { ffi_function(&my_struct as *const MyStruct as *const u8) };
If the FFI function only writes to the provided buffer, then I would typically construct a MaybeUninit<MyStruct>, and then use MaybeUninit::as_mut_ptr:
let mut my_struct: MaybeUninit<MyStruct> = MaybeUninit::uninit();
unsafe { ffi_function(my_struct.as_mut_ptr() as *mut u8) };
// Sound only if ffi_function has fully initialized the struct.
let my_struct = unsafe { my_struct.assume_init() };
Note that in both cases we convert directly from raw struct pointers to raw byte pointers, without creating any slice types, so we don't need to worry about invalid slices.
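For reference, the write-only pattern can be sketched as a self-contained program. Everything here is an assumption made for illustration: MyStruct is a made-up two-field struct, and ffi_fill is a stand-in written in Rust that plays the role of the foreign function, so the example can be compiled without a C library.

```rust
use std::mem::MaybeUninit;

#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct MyStruct {
    a: u8,
    b: u16,
}

/// Hypothetical stand-in for a foreign function that fills the buffer
/// it is given with one MyStruct.
unsafe extern "C" fn ffi_fill(out: *mut u8) {
    let out = out as *mut MyStruct;
    // Write a whole value through the raw pointer; no reference to
    // uninitialized memory is ever created.
    unsafe { out.write(MyStruct { a: 7, b: 513 }) };
}

fn read_from_ffi() -> MyStruct {
    let mut slot = MaybeUninit::<MyStruct>::uninit();
    unsafe {
        // Raw struct pointer straight to raw byte pointer, no slice involved.
        ffi_fill(slot.as_mut_ptr() as *mut u8);
        // SAFETY: ffi_fill initialized every field of the struct.
        slot.assume_init()
    }
}
```

Padding bytes may remain uninitialized after the call. That is fine for a value of type MyStruct; it only becomes a problem once the bytes are viewed as u8.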
Consider the case where the buffer needs to contain multiple structs, sometimes of different types.
Say the function is declared as extern "C" fn syscall(buffer: *const u8) -> u32, and the buffer is supposed to contain multiple copies of different structs.
The structures in question might also have variable size in C. In the code below, as_byte_slice() is pseudocode for whatever would be the best way of copying bytes from a struct to a slice/array/vector.
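use std::mem;

#[repr(C)]
#[derive(Clone, Copy)]
struct Unpadded {
    a: u32,
    b: u32,
}

#[repr(C)]
#[derive(Clone, Copy)]
struct Padded {
    a: u8,
    // #[repr(C)] puts one byte of padding here so that `b` is 2-byte aligned
    b: u16,
}

fn foo(unpadded: &[Unpadded], padded: &[Padded]) -> u32 {
    let buffer_size =
        unpadded.len() * mem::size_of::<Unpadded>() + padded.len() * mem::size_of::<Padded>();
    // Start empty: Vec's write/extend methods append, so a zero-filled
    // vector would end up with the zeroes in front of the data.
    let mut buffer: Vec<u8> = Vec::with_capacity(buffer_size);
    for u in unpadded {
        // as_byte_slice() is pseudocode, see above
        buffer.extend_from_slice(u.as_byte_slice());
    }
    for p in padded {
        buffer.extend_from_slice(p.as_byte_slice());
    }
    unsafe { syscall(buffer.as_ptr()) }
}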
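One way to give as_byte_slice() a concrete meaning without any unsafe code is the field-by-field option from the original list. The following is only a sketch under stated assumptions: the WriteBytes trait and build_buffer function are hypothetical names, the consumer is assumed to accept zeroed padding and native endianness, and the padding position in Padded is written out by hand for this specific layout (u8, one padding byte, u16).

```rust
use std::mem;

#[repr(C)]
#[derive(Clone, Copy)]
struct Unpadded {
    a: u32,
    b: u32,
}

#[repr(C)]
#[derive(Clone, Copy)]
struct Padded {
    a: u8,
    b: u16,
}

/// Hypothetical trait: append the C representation of `self` to `out`.
trait WriteBytes {
    fn write_bytes(&self, out: &mut Vec<u8>);
}

impl WriteBytes for Unpadded {
    fn write_bytes(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(&self.a.to_ne_bytes());
        out.extend_from_slice(&self.b.to_ne_bytes());
    }
}

impl WriteBytes for Padded {
    fn write_bytes(&self, out: &mut Vec<u8>) {
        out.push(self.a);
        out.push(0); // stands in for the padding byte between `a` and `b`
        out.extend_from_slice(&self.b.to_ne_bytes());
    }
}

fn build_buffer(unpadded: &[Unpadded], padded: &[Padded]) -> Vec<u8> {
    let size = mem::size_of_val(unpadded) + mem::size_of_val(padded);
    let mut buffer = Vec::with_capacity(size);
    for u in unpadded {
        u.write_bytes(&mut buffer);
    }
    for p in padded {
        p.write_bytes(&mut buffer);
    }
    // Catches a hand-written impl that drifted from the struct layout.
    debug_assert_eq!(buffer.len(), size);
    buffer
}
```

The resulting Vec<u8> contains only initialized bytes, so buffer.as_ptr() can be handed to the syscall directly. Note that a Vec<u8> is only guaranteed to be 1-byte aligned; whether that matters depends on how the receiving side reads the buffer.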
I wonder if the unstable MaybeUninit::write_slice and slice_as_mut_ptr methods will help. For example, this code is pretty verbose, but otherwise seems reasonable: