[FFI] Casting C void* to Rust structure (erratum)

Hi everyone,

I'm asking for help with an issue I'm facing. I've looked for a solution but nothing came up, so here it is.

I am writing Rust code that interfaces with C code. On the C side, I have this structure:

typedef struct {
  uint16_t event;
  uint8_t data[];
} HEADER;

The data field can have any size, so I represented this structure in Rust like this:

#[repr(C)]
pub struct Header<T: ?Sized> {
    pub event: u16,
    pub data: T,
}

To malloc this struct on the C side, I do it like this:

void foo() {
 Header* hdr = (Header*) external_malloc(size_of(smth));
}

But I actually need to do the cast in Rust because of an FFI-safety issue ([u8] is not FFI-safe). I am passing a void* to Rust instead, and I need to cast it to Header.

To handle the data of this structure in Rust, here is how I proceed:

extern "C" {
    // calling my personal C malloc
    fn external_malloc(size: usize) -> *mut c_void;
}

fn malloc_c_struct(size: usize) -> &'static mut Header<[u8]> {
    /* I am facing an issue here, the cast is not as straightforward as it is in C*/
    let p_header: &mut Header<[u8]> = unsafe { external_malloc(size) } as &mut Header<[u8]>;
    p_header
}

What is the best way to cast my *mut c_void to &mut Header<[u8]>, something similar to the following C code:

Header* hdr = (Header*) external_malloc(size_of(smth));

Thank you very much for your help :slightly_smiling_face: ! I hope I've been clear enough.

I have no idea at all about how correct or incorrect this is, but this compiles:

use std::ffi::c_void;
#[repr(C)]
pub struct Header<T: ?Sized> {
    pub event: u16,
    pub data: T,
}

extern "C" {
    // calling my personal C malloc
    fn external_malloc(size: usize) -> *mut c_void;
}

fn malloc_c_struct(size: usize) -> &'static mut Header<[u8]> {
    /* I am facing an issue here, the cast is not as straightforward as it is in C*/
    let p_header: *mut Header<[u8]> =
        unsafe { std::ptr::slice_from_raw_parts_mut(external_malloc(size) as *mut u8, size) } as _;
    unsafe { &mut *p_header }
}

Edit: It also compiles without the as *mut u8.

Regarding this line and the online documentation:

std::ptr::slice_from_raw_parts_mut(frs_osi_malloc(size) as *mut u8, size)

will reconstruct the raw pointer into a byte slice given the number of elements, here size. So we can say that this line gives us a [u8].

What exactly does this do?

as _;

I don't get how the following line:

[u8] as _;

can give us a Header<[u8]>. Do you know whether it reconstructs the Header struct properly?

So as _ is the same as as *mut Header<[u8]> in this case (the type of p_header that I’m assigning to). I just avoided writing this type twice.

In Rust there are fat pointers and ordinary pointers. Fat pointers are pointers or references to unsized values like:

  • slices
  • trait objects
  • structs that have an unsized type as their last field

Ordinary pointers are an address in memory, and fat pointers are twice as large, consisting of an ordinary pointer plus some extra data which is

  • the length of the slice,
  • the pointer to the vtable of a trait object,
  • or the data part that a pointer to the unsized type contained inside of the struct would have

for each of the three cases listed above respectively.

In Rust, an ordinary pointer cannot be cast to a fat pointer (the other direction works and simply drops the extra data). It is however possible to cast slice pointers to other kinds of slice pointers, or pointers to an unsized T to pointers to structs containing such a T. For these casts the data part just remains untouched.

In this case, this means that a pointer *mut Header<[u8]> is a fat pointer having as its data part the length of the data slice. You can create a *mut [u8] pointer from its parts, a *mut u8 pointer and a usize length, with std::ptr::slice_from_raw_parts_mut. Then this *mut [u8] can be cast into a *mut Header<[u8]>.

Without the as *mut u8, std::ptr::slice_from_raw_parts_mut takes a *mut c_void and returns a *mut [c_void]. Since that is a slice pointer, and Header<[u8]>’s unsized field is a slice too, this *mut [c_void] can be cast into a *mut Header<[u8]>.

Finally, casting a pointer into a reference is not possible in Rust. Instead you have to dereference the pointer and create a reference to that destination, as in the expression &mut *p_header.
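
To make the thin/fat distinction concrete, here is a minimal sketch that checks pointer sizes and shows the slice length surviving the cast. The helper name view_as_header and the u16 backing storage are assumptions for illustration only (the u16 storage is there to provide the 2-byte alignment Header needs). The "twice as large" check reflects current compilers rather than a documented guarantee.

```rust
use std::mem::{size_of, size_of_val};
use std::ptr;

#[repr(C)]
pub struct Header<T: ?Sized> {
    pub event: u16,
    pub data: T,
}

/// Hypothetical helper: view a non-empty u16 buffer as a Header<[u8]>.
/// The first u16 becomes `event`, the remaining bytes become `data`.
pub fn view_as_header(storage: &mut [u16]) -> &mut Header<[u8]> {
    assert!(!storage.is_empty(), "need room for the event field");
    // The fat pointer's length counts only the bytes of `data`, not `event`.
    let data_len = storage.len() * size_of::<u16>() - size_of::<u16>();
    let thin: *mut u8 = storage.as_mut_ptr().cast();
    // Thin pointer + length -> fat slice pointer.
    let fat_slice: *mut [u8] = ptr::slice_from_raw_parts_mut(thin, data_len);
    // Fat -> fat cast: the address and the length are carried over unchanged.
    let fat_header = fat_slice as *mut Header<[u8]>;
    // A pointer cannot be cast to a reference; dereference and re-borrow instead.
    unsafe { &mut *fat_header }
}

fn main() {
    // Ordinary pointers are one word, fat pointers are two (on current compilers).
    assert_eq!(size_of::<*mut u8>(), size_of::<usize>());
    assert_eq!(size_of::<*mut [u8]>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<*mut Header<[u8]>>(), 2 * size_of::<usize>());

    let mut storage = [0u16; 4]; // 8 bytes: 2 for `event`, 6 for `data`
    let header = view_as_header(&mut storage);
    header.event = 7;
    println!(
        "data len = {}, total size = {}",
        header.data.len(),
        size_of_val(&*header)
    );
}
```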

Alright, I will give it a try on my project !

Thanks a lot for your time, that enlightened me. Still a lot to learn in Rust :smiley:

This is probably not correct because size is not the length of the [u8]. Or maybe it is? I'm really not sure. What does the argument to external_malloc represent -- the actual size passed to malloc, or the size only of the data member?
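
If the argument turns out to be the total allocation size, the slice length has to be derived from it. Here is a rough sketch of that, assuming a hypothetical helper header_from_raw that receives the pointer and the total byte count from C. It rounds the byte budget down to Header's alignment, which is a conservative choice of mine so that the Rust value (whose size is rounded up to its alignment) never extends past the allocation.

```rust
use std::ffi::c_void;
use std::mem::{align_of, size_of};
use std::ptr;

#[repr(C)]
pub struct Header<T: ?Sized> {
    pub event: u16,
    pub data: T,
}

/// Hypothetical helper: interpret `total_size` bytes at `raw` as a Header<[u8]>.
///
/// # Safety
/// `raw` must be null, or point to `total_size` bytes that are initialized,
/// readable, writable, 2-byte aligned, and not accessed elsewhere during `'a`.
pub unsafe fn header_from_raw<'a>(
    raw: *mut c_void,
    total_size: usize,
) -> Option<&'a mut Header<[u8]>> {
    if raw.is_null() {
        return None;
    }
    // Round the budget down to a multiple of the alignment (2 for u16).
    let usable = total_size - total_size % align_of::<u16>();
    // The slice length excludes the `event` field; None if it does not fit.
    let data_len = usable.checked_sub(size_of::<u16>())?;
    let fat = ptr::slice_from_raw_parts_mut(raw.cast::<u8>(), data_len) as *mut Header<[u8]>;
    Some(unsafe { &mut *fat })
}

fn main() {
    // A Vec<u16> stands in for 12 bytes handed over by C (assumption for the demo).
    let mut storage = vec![0u16; 6];
    let raw = storage.as_mut_ptr() as *mut c_void;
    let header = unsafe { header_from_raw(raw, 11) }.expect("pointer is not null");
    header.event = 1;
    println!("{} data bytes", header.data.len());
}
```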

Most of the time in C you have a struct with a FAM (flexible array member) and the size is another field of the struct. So something like

struct Header {
    uint16_t event;
    size_t len;
    uint8_t data[];
};

And when you create it you have to make space for both parts and initialize the size field:

struct Header *header_new(size_t len_data) {
    struct Header *ret = calloc(sizeof *ret + len_data, 1);  // space for both the header and the FAM
    if (ret != NULL)
        ret->len = len_data;  // record how many data bytes follow the header
    return ret;
}

Translating this into Rust doesn't really work because unsized values in Rust make pointers fat. So the "usual" way to do the same thing in Rust is as you said,

struct Header<T: ?Sized> {
    event: u16,
    data: T,
}

But to create one you would usually create the Sized version and coerce it. There's not (as far as I know) any allowance in Rust for making a custom DST that isn't coerced from a Sized type. You can't even do it with unsafe. (For trait objects you can transmute a std::raw::TraitObject but there's no equivalent for slices.)

Casting *mut [u8] to *mut Header<[u8]> is the kind of thing that sounds like it ought to work, but I don't think it's actually guaranteed to do anything meaningful. Obviously the same length can't be correct for both of those types, because that would make them different sizes. But I guess it might work?

What I've seen elsewhere is just to wrap the C type in a Rust type that holds a pointer, and not expose the underlying type with the FAM at all.
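
As a rough sketch of that wrapper idea: the type below owns a pointer to the header-plus-length layout from the C example above and only hands out ordinary slices. The names RawHeader and HeaderBuf are hypothetical, and std::alloc stands in for the C allocator so the example runs on its own; with real FFI, new and drop would call the C allocation and free functions instead.

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout};
use std::mem::{align_of, size_of};
use std::ptr::{self, NonNull};
use std::slice;

/// Mirror of the fixed part of the C struct; the data bytes follow it in memory.
#[repr(C)]
struct RawHeader {
    event: u16,
    len: usize,
    data: [u8; 0], // marks where the flexible array member starts
}

/// Owning handle; callers never see the type with the FAM.
pub struct HeaderBuf {
    ptr: NonNull<RawHeader>,
}

impl HeaderBuf {
    fn layout(len: usize) -> Layout {
        Layout::from_size_align(size_of::<RawHeader>() + len, align_of::<RawHeader>())
            .expect("allocation size overflow")
    }

    pub fn new(event: u16, len: usize) -> Self {
        let layout = Self::layout(len);
        // Stand-in for the C allocator (assumption for this sketch).
        let raw = unsafe { alloc_zeroed(layout) } as *mut RawHeader;
        let ptr = NonNull::new(raw).unwrap_or_else(|| handle_alloc_error(layout));
        unsafe {
            ptr::addr_of_mut!((*raw).event).write(event);
            ptr::addr_of_mut!((*raw).len).write(len);
        }
        HeaderBuf { ptr }
    }

    pub fn event(&self) -> u16 {
        unsafe { (*self.ptr.as_ptr()).event }
    }

    pub fn data(&self) -> &[u8] {
        unsafe {
            let p = self.ptr.as_ptr();
            slice::from_raw_parts(ptr::addr_of!((*p).data) as *const u8, (*p).len)
        }
    }

    pub fn data_mut(&mut self) -> &mut [u8] {
        unsafe {
            let p = self.ptr.as_ptr();
            slice::from_raw_parts_mut(ptr::addr_of_mut!((*p).data) as *mut u8, (*p).len)
        }
    }
}

impl Drop for HeaderBuf {
    fn drop(&mut self) {
        unsafe {
            // Rebuild the layout from the stored length before freeing.
            let len = (*self.ptr.as_ptr()).len;
            dealloc(self.ptr.as_ptr() as *mut u8, Self::layout(len));
        }
    }
}

fn main() {
    let mut buf = HeaderBuf::new(7, 4);
    buf.data_mut().copy_from_slice(&[1, 2, 3, 4]);
    println!("event = {}, data = {:?}", buf.event(), buf.data());
}
```

Because the length lives in the allocation itself, the handle stays a single thin pointer, which is also what the C side expects to receive.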

This was what I heard in the past, too. But I’m also pretty certain that internally, a DST struct will just have the extra data of the contained DST-field’s pointer attached to its pointer type. This would mean that a *mut Header<[u8]> fat pointer would have the same size value as its data as the contained [u8]. Also the rust compiler is pretty strict about disallowing pointer casts where the “vtable types may not match”. Try e.g.

trait T1 {}
trait T2 {}

fn foo() {
    let x: *mut dyn T1 = todo!();
    let y: *mut dyn T2 = x as _;
}

which doesn’t work. That makes me feel that the pointer cast not being rejected is a good hint that my casts above may actually be legal.


So perhaps we need to update the idea that a non-generic type like

struct Foo {
    field1: i32,
    field2: [u8],
}

is impossible to construct, at least when the type is repr(C).


Also, to give some practical evidence, put this into the playground:

#[repr(C)]
#[derive(Debug)]
struct Struct1 {
    field1: i32,
    field2: [u8],
}

#[repr(C)]
struct Struct2 {
    field1: i32,
    field2: [u8; 10],
}

fn foo(x: &mut Struct2) -> &mut Struct1 {
    let y: *mut Struct1 = std::ptr::slice_from_raw_parts_mut(x, 10) as _;
    // EDIT2: curiously, Miri is much happier WITHOUT this drop(x)....
    // drop(x); // just to be sure this is under no circumstances UB because of
                // aliasing &mut references; who knows if this is even necessary...

    unsafe { &mut *y }
}

fn main() {
    let mut s = Struct2 {
        field1: 42,
        field2: [1,2,3,4,5,6,7,8,9,10],
    };
    let x = foo(&mut s);
    println!("{:?}", x);
}

Edit1: I just noticed that ptr::slice_from_raw_parts_mut is not even unsafe.

The issue here is that, according to stacked borrows, transferring ownership of x into the drop function invalidates any raw pointers previously derived from x. If you want to ensure x is no longer available, you need to use a construct that makes x leave scope without mentioning it after it’s produced the pointer:

let y: *mut Struct1 = {
    let x = x; // move `x` into this block
    std::ptr::slice_from_raw_parts_mut(x, 10) as _
};

Alright, some slight mindf**k ... I mean ... not 100% perfectly intuitive these rules, but so be it.

Unfortunately this doesn’t really answer the original question of whether any other way of trying to re-borrow a mutable reference through an intermediate pointer is UB because of aliasing, or just fine because ... I don’t know ... we don’t actually access x anymore and Miri doesn’t complain either.

Edit: I guess this explanation would answer my question with “no, it is fine to use the pointer like that when x is simply not used anymore”.

I'll respond to you with the same answer I met with when arguing something similar:

It seems to me that, for certain kinds of UB, the way the language is defined really allows only one reasonable interpretation, and the documentation should probably be updated to codify that "reasonable interpretation" as "how the language actually works". But until that happens it's still technically UB to rely on it, regardless of how sensible the explanation sounds.

(If there is a reading of any documentation that justifies casting *mut [u8] to *mut Header<[u8]> I don't know about it, but I'll be happy to eat my words if I'm wrong.)

As I recall, the C spec used to distinguish between “undefined behavior” and “implementation-defined” behavior. At some point both of these concepts got merged together, but it seems this may have been a mistake.

There’s code that’s unsound because computers are fickle, and there’s code that’s correct today, but considered unsound because the compiler developers might want to do something differently in the future. (std gets to use this latter category because it’s released in compiler-version-specific packages.) These seem like fundamentally different categories of unsafe-ness, and perhaps it’s a good idea to treat them differently in discussions.

It's tangential, but those are still distinct as of the C18 standard (and the latest draft AFAIK). There is some behavior that was previously undefined which has been "promoted" to unspecified or implementation-defined, and there's some other new stuff to do with bounds checking in the standard library that redefines some things that were previously UB, but the standard maintains the distinction between undefined and unspecified behaviors.

I beg to differ. (So long as the type is a slice-derived DST. Link is to a crate I wrote to do exactly that.) You're perfectly able to do raw allocation with the alloc module and then cast that into typed pointers, and even create regular Boxes.

And yes, doing this is defined behavior. The tricky parts are calculating the correct layout for the type and creating the correct pointer; actually doing the allocation and initialization is just annoying busywork (well, modulo unwind correctness).
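
A minimal sketch of that approach, using only std::alloc and not the linked crate (whose API is not shown in this thread). The function name boxed_header is hypothetical. The layout is computed with Layout::extend and pad_to_align so that it matches what Box will later use to free the value.

```rust
use std::alloc::{alloc_zeroed, handle_alloc_error, Layout};
use std::ptr;

#[repr(C)]
pub struct Header<T: ?Sized> {
    pub event: u16,
    pub data: T,
}

/// Hypothetical constructor: allocate a Header<[u8]> directly on the heap.
pub fn boxed_header(event: u16, data: &[u8]) -> Box<Header<[u8]>> {
    // Layout of `event` followed by `data.len()` bytes, padded to the struct alignment.
    let (layout, data_offset) = Layout::new::<u16>()
        .extend(Layout::array::<u8>(data.len()).expect("length overflow"))
        .expect("layout overflow");
    let layout = layout.pad_to_align();

    unsafe {
        // The size is at least 2 bytes, so this is never a zero-sized allocation.
        let raw = alloc_zeroed(layout);
        if raw.is_null() {
            handle_alloc_error(layout);
        }
        // Build the fat pointer: same address, metadata = length of `data`.
        let fat = ptr::slice_from_raw_parts_mut(raw, data.len()) as *mut Header<[u8]>;
        // Initialize both fields through raw pointers before creating the Box.
        ptr::addr_of_mut!((*fat).event).write(event);
        ptr::copy_nonoverlapping(data.as_ptr(), raw.add(data_offset), data.len());
        // Box frees with the layout of the pointee, which matches `layout` above.
        Box::from_raw(fat)
    }
}

fn main() {
    let boxed = boxed_header(42, &[1, 2, 3]);
    println!("event = {}, data = {:?}", boxed.event, &boxed.data);
}
```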

Indeed, size does not represent the length of the [u8] but another field.

I called it external malloc because it allocates space for the size passed as an argument plus other internal sizes added inside the function itself.

Indeed, especially when the Rust pointer is not pointing to the same address in memory as the C one, referring to what @steffahn said earlier:

Following what has been said in this discussion, I decided to keep the memory allocation logic in C and hold the resulting pointer inside a Rust struct, as @trentj mentioned in his first comment:

Hi guys,

Here is the final code that resolves the original question. The Rust pointer points to the same memory address as the C one, so the logic can finally be handled in Rust (Bye bye C :kissing_heart:)

C Code:

#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>

void *runtime(size_t len);

typedef struct t {
    uint16_t event;
    uint8_t data[];
} tt;

void * external_malloc(size_t len) {
  void * mem = malloc(len);
  tt *foo = mem;
  printf("Mem alloc is %p for %d\n", mem, (int)len);
  return mem;
}

void c_callback(void *mem) {
  printf("callback %p\n", mem);
}

int main() {
  int len = 11;
  tt *foo = runtime(len);
  printf("Got %p %hu\n", foo, foo->event);
  for (int i = 0; i < len - 2; ++i) {
    printf("[%d] %hhu\n", i, foo->data[i]);
  }
  free(foo);
}

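And the associated Rust code:

use std::ffi::c_void;
use std::mem::size_of;

#[repr(C)]
pub struct Header<T: ?Sized> {
    pub event: u16,
    pub data: T,
}

// These functions are defined on the C side.
extern "C" {
    fn external_malloc(size: usize) -> *mut c_void;
    fn c_callback(msg: *mut c_void);
}

fn alloc_memory(size: usize) -> *mut c_void {
    unsafe { external_malloc(size) }
}

fn malloc_c_struct(size: usize) -> &'static mut Header<[u8]> {
    // `size` is the total size in bytes, so the data slice is shorter by the `event` field.
    let data_len = size
        .checked_sub(size_of::<u16>())
        .expect("size must cover the event field");
    // Header<[u8]> has alignment 2, so Rust rounds its size up to an even number;
    // round the request up too so the reference never extends past the allocation.
    let p = alloc_memory(size + size % 2);
    assert!(!p.is_null(), "external_malloc returned null");
    let p_header: *mut Header<[u8]> =
        std::ptr::slice_from_raw_parts_mut(p.cast::<u8>(), data_len) as _;
    unsafe { &mut *p_header }
}

const EVENT: [u8; 4] = [4, 1, 8, 7]; // Just as example

#[no_mangle]
pub extern "C" fn runtime(size: usize) -> *mut c_void {
    let t = malloc_c_struct(size);
    t.event = size as u16;
    let offset = 5;
    t.data[offset..(offset + EVENT.len())].clone_from_slice(&EVENT);
    // Fat-to-thin cast: only the address is handed back to C.
    let ptr = t as *mut _ as *mut c_void;
    unsafe { c_callback(ptr); }
    ptr
}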

Thank you all for your help !
