Custom Memory Representation

Hi, I've recently been reading up on OS development and would like to implement something in Rust. I'd like to preface this by saying that I don't know much about OS development and memory in general, so my ideas may be fundamentally flawed. Please correct me if I've understood something wrong.

I want to have system calls that return Results, allowing for nice error handling. However, I would also like these system calls to be callable from other languages such as C.

Generally speaking (as far as I can tell) Linux system calls return numbers; a negative number represents an error, with the specific number signifying the type of error. My idea is for the syscall to return a Result<u32, u32>, but for it to be stored in memory like an i32. The first bit would signify whether it is an Ok or Err variant, and the other 31 bits would store the actual value of the call.

Rust code would use this custom Result enum as the return type of the function. However, other languages could also read the value in the same memory location, as just an i32.

Is this kind of custom representation in memory possible in Rust? It seems similar to serialisation and deserialisation but more performant and enforced by the type system.

Example

Given the following error type:

#[repr(u32)]
enum ErrorType {
    Foo = 1,
    Bar = 2
}

an Err(ErrorType::Bar) would be stored as 10000010 (shortened to 8 bits for readability; the leading 1 is the error flag). An Ok(5) would be stored as a regular five (i.e. 00000101).
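A minimal sketch of the packing described above, assuming the top bit of a full 32-bit word is the error flag and the low 31 bits hold the payload. The names `ERR_FLAG`, `encode` and `decode` are made up for illustration; they are not an existing API.

```rust
/// Error codes from the example above.
#[allow(dead_code)]
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ErrorType {
    Foo = 1,
    Bar = 2,
}

/// Hypothetical flag: the highest bit marks an error.
const ERR_FLAG: u32 = 1 << 31;

/// Pack a result into one 32-bit word.
/// Returns None when an Ok value does not fit in 31 bits.
fn encode(r: Result<u32, ErrorType>) -> Option<u32> {
    match r {
        Ok(v) if (v & ERR_FLAG) == 0 => Some(v),
        Ok(_) => None,
        Err(e) => Some(ERR_FLAG | e as u32),
    }
}

/// Unpack a 32-bit word into Ok(value) or Err(raw error code).
fn decode(raw: u32) -> Result<u32, u32> {
    if (raw & ERR_FLAG) == 0 {
        Ok(raw)
    } else {
        Err(raw & !ERR_FLAG)
    }
}
```

With this scheme a C caller that reads the word as a signed 32-bit integer sees every error as a negative number, since the flag bit is the sign bit.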

Generally speaking, Rust-specific types can't cross the C FFI boundary. The only types allowed to do that are types that C itself recognizes, e.g. #[repr(C)] structs, primitive integers, floating-point types, and pointers. C doesn't recognize enums with associated data, so there's no way you could tell C how to work with one. The ABI of Rust types is also not stable, apart from those few.

You can't force the compiler to choose a very specific representation for a type, either. There are some knobs you can turn on FFI-safe types (e.g. the very #[repr(C)] attribute), but that's not much.

What you usually want to do instead is to write a thin wrapper layer over your Rust types which only uses FFI-safe types, and expose that as a separate crate. This is basically the inverse of how you would wrap a C library to expose it behind a safe abstraction to Rust.
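As a rough sketch of such a wrapper layer: the Rust side keeps its Result, and a thin extern "C" function flattens it into a plain integer. Everything here (`SysError`, `lookup_inner`, `sys_lookup`, the Linux-style "negative means error" convention) is a hypothetical example, not something from a real crate. A function actually exported to C would also need a no_mangle attribute so the symbol name is predictable.

```rust
/// Hypothetical error codes for the internal Rust API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SysError {
    NotFound = 1,
    Overflow = 2,
}

/// Internal, idiomatic Rust function returning a Result.
fn lookup_inner(id: u32) -> Result<u32, SysError> {
    if id == 0 {
        return Err(SysError::NotFound);
    }
    // Made-up computation standing in for real work.
    id.checked_add(100).ok_or(SysError::Overflow)
}

/// FFI-safe wrapper: only primitive integers cross the boundary.
/// Non-negative return = success value, negative return = -(error code).
pub extern "C" fn sys_lookup(id: u32) -> i32 {
    match lookup_inner(id) {
        // Values that do not fit in an i32 are reported as an overflow error.
        Ok(v) => i32::try_from(v).unwrap_or(-(SysError::Overflow as i32)),
        Err(e) => -(e as i32),
    }
}
```

Rust callers would use `lookup_inner` directly and get the full Result; C callers would declare `int32_t sys_lookup(uint32_t id);` and check the sign.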

I want the Rust struct/enum to be read as a C integer. So theoretically something like this:

#[repr(transparent)]
struct Result {
    #[repr(bool)]
    is_error: bool,
    #[repr(u31)]
    value: u31,
}

Obviously, this example is impossible because u31s aren't a thing, and you can't #[repr(bool)], but the total size of this struct would be 32 bits.

If I understand correctly, you are saying that the only way to achieve this is to use something like:

#[repr(C)]
struct Result {
    is_error: bool,
    value: u32,
}

And then you would have to read it in as a struct in C, and it would have From<std::result::Result> and Into<std::result::Result> implementations. Is that right? This wouldn't be as elegant, as it would require conversions in both Rust and C, and it would take up more space: with alignment padding, the struct is 8 bytes rather than 4 (although that isn't really a concern). I guess that's fine if it's the only way to do it, though.
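For reference, here is a sketch of what that #[repr(C)] variant could look like with the conversions filled in. The name `TaggedResult` is made up to avoid clashing with std's Result, and the choice of Result<u32, u32> is an assumption. On typical targets the bool is followed by three bytes of padding so that the u32 stays 4-byte aligned.

```rust
/// Hypothetical C-compatible struct: a flag plus a 32-bit payload.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TaggedResult {
    pub is_error: bool,
    pub value: u32,
}

impl From<Result<u32, u32>> for TaggedResult {
    fn from(r: Result<u32, u32>) -> Self {
        match r {
            Ok(value) => TaggedResult { is_error: false, value },
            Err(value) => TaggedResult { is_error: true, value },
        }
    }
}

impl From<TaggedResult> for Result<u32, u32> {
    fn from(t: TaggedResult) -> Self {
        if t.is_error { Err(t.value) } else { Ok(t.value) }
    }
}

/// Size in bytes of the struct, including padding.
pub fn tagged_size() -> usize {
    std::mem::size_of::<TaggedResult>()
}
```

The matching C declaration would be a struct with a `bool` followed by a `uint32_t`.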

Another way to do it (I think) would be to just have a:

#[repr(C/transparent)]
struct Result(i32);

that implements From<std::result::Result> and Into<std::result::Result>. This would be the return type of the syscall in Rust which would ensure that you check it before doing anything with it, but it would also store it as an i32 in memory, allowing C to access it as an i32.

Would this use #[repr(transparent)] or #[repr(C)]?

It would use #[repr(transparent)] if the C-side treats it as an int, but #[repr(C)] if the C-side treats it as a struct containing a single int. I think this way is the best way.
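A minimal sketch of the #[repr(transparent)] newtype, assuming the sign bit is used as the error flag as proposed earlier in the thread. `SysResult`, its constructors, and `ERR_FLAG` are hypothetical names chosen for illustration. The constructors return an Option because only 31 bits are available for the payload.

```rust
/// Hypothetical flag: the highest bit marks an error.
const ERR_FLAG: u32 = 1 << 31;

/// Raw syscall return value with the same ABI as a C int32_t.
#[repr(transparent)]
#[must_use]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SysResult(i32);

impl SysResult {
    /// Wrap a success value; None if it needs more than 31 bits.
    pub fn ok(value: u32) -> Option<Self> {
        ((value & ERR_FLAG) == 0).then(|| SysResult(value as i32))
    }

    /// Wrap an error code; None if it needs more than 31 bits.
    pub fn err(code: u32) -> Option<Self> {
        ((code & ERR_FLAG) == 0).then(|| SysResult((code | ERR_FLAG) as i32))
    }

    /// The integer a C caller would see.
    pub fn raw(self) -> i32 {
        self.0
    }
}

impl From<SysResult> for Result<u32, u32> {
    fn from(r: SysResult) -> Self {
        let bits = r.0 as u32;
        if (bits & ERR_FLAG) == 0 {
            Ok(bits)
        } else {
            Err(bits & !ERR_FLAG)
        }
    }
}
```

Rust callers convert with `Result::from(value)` and handle the two cases; C callers just test whether the integer is negative.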

This could in theory be a valid optimization if you set the highest bit in every ErrorType discriminant (so Foo would be 1 with the top bit set) and Rust had a u31 type: Result<u31, ErrorType> could then be represented as a single u32, since the valid bit patterns of a u31 do not overlap at all with the valid bit patterns of such an ErrorType. However, u31 doesn't exist, so this can't currently be done.
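The same kind of optimization is already observable today with types that do have unused bit patterns. As a small illustration (not the u31 case itself), Option<NonZeroU32> fits in 4 bytes because the all-zero pattern is free to mean None, while Option<u32> needs extra room for a tag. The helper function names are made up for this sketch.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Size of an Option whose payload has a spare bit pattern (zero).
pub fn niche_size() -> usize {
    size_of::<Option<NonZeroU32>>()
}

/// Size of an Option whose payload uses every bit pattern.
pub fn plain_size() -> usize {
    size_of::<Option<u32>>()
}
```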
