Enum bounds in FFI?


#1

I’m trying to write an FFI involving some integer constants. At the moment I’m defining a bunch of static integers that are cast from enum variants, but I’d prefer to use those enum variants more or less directly. My question is: does Rust guarantee any bounds checking of values passed in through the FFI, and if not, how would I best perform such checks myself?

E.g. I’m currently doing this:

#[repr(u8)]
pub enum MyEnum {
  Variant1,
  Variant2,
}

#[no_mangle]
pub static FFI_CONSTANT_1: uint8_t = MyEnum::Variant1 as uint8_t;

#[no_mangle]
pub static FFI_CONSTANT_2: uint8_t = MyEnum::Variant2 as uint8_t;

#[no_mangle]
pub unsafe extern "C" fn ffi_function(ffi_constant: uint8_t) {
  ...
}

but I’d rather cut out the middle man and do this:

#[repr(u8)]
pub enum MyEnum {
  Variant1,
  Variant2,
}

#[no_mangle]
pub unsafe extern "C" fn ffi_function(ffi_constant: MyEnum) {
  ...
}

C header generators such as moz-cheddar and cbindgen suggest it can be done, because they can take the above and produce something like

enum MyEnum {
  Variant1 = 0,
  Variant2 = 1
};
typedef uint8_t MyEnum;

void ffi_function(MyEnum ffi_constant);

The trouble is, AFAIK a C client could call ffi_function(5), even though that isn’t a valid enum value. Pattern matching doesn’t seem to catch that; what seems to happen instead is that values beyond the valid range are clamped:

#[derive(Debug)]
#[repr(u8)]
pub enum MyEnum {
  Variant1,
  Variant2,
}

impl MyEnum {
    pub fn check(&self) -> Result<(), u32> {
        match *self {
            MyEnum::Variant1 | MyEnum::Variant2 => Ok(()),
            _ => Err(1) // The compiler says this is never called.
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    use std::mem::transmute;

    #[test]
    fn check_should_err_if_enum_value_is_not_in_bounds() {
        let id: MyEnum = unsafe { transmute(3u8) };

        println!("{:?}", id);  // This prints Variant2

        assert!(id.check().is_err());  // This assert fails, backing up the compiler's earlier message.
    }
}

I also tried a more complex case where the enum variants didn’t start at zero and had gaps in their range of allowed values, and all invalid values were “clamped” to the highest valid value. I also tried this with debug and release builds, with the same behaviour for both.

Am I missing something? Is this behaviour intended, and can I rely on it? I couldn’t find any documentation mentioning it…


#2

It’s undefined behavior for an enum value to exist that doesn’t map to a variant. Literally anything can happen if that occurs, and it will vary based on the context that it’s encountered.

Because of that, IMO it’s generally a good idea to avoid enums entirely in FFI signatures. An alternative approach is to define an opaque wrapper around the raw C type and define constants for each variant.


#3

I’m not sure what you mean by the raw C type, but isn’t defining constants for each variant what I’m already currently doing?


#4

You’d use pub struct MyEnum(u8) rather than #[repr(u8)] pub enum MyEnum { ... }. The API itself would need to take and receive just a u8 rather than a MyEnum until #[repr(transparent)] is stabilized.
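
For illustration, here is a minimal sketch of that wrapper approach. The constant names, the is_known helper, the ffi_function_wrapped name, and the 0 / -1 return codes are all assumptions made for the example, not anything taken from the posts above. The function takes a plain u8, as described, and wraps it on the Rust side; add #[no_mangle] as in the earlier snippets to export it.

```rust
/// Opaque wrapper around the raw integer that crosses the FFI boundary.
/// Every u8 is a valid value for this struct, so an unexpected number
/// coming from C is merely "unknown", not undefined behavior.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct MyEnum(u8);

impl MyEnum {
    // One constant per "variant" (names are hypothetical).
    pub const VARIANT_1: MyEnum = MyEnum(0);
    pub const VARIANT_2: MyEnum = MyEnum(1);

    /// Hypothetical helper: true only for values that have a constant above.
    pub fn is_known(self) -> bool {
        self == MyEnum::VARIANT_1 || self == MyEnum::VARIANT_2
    }
}

/// Takes the raw u8 and wraps it before doing anything else.
/// Returns 0 on success and -1 for an unknown value (assumed convention).
pub extern "C" fn ffi_function_wrapped(ffi_constant: u8) -> i32 {
    let value = MyEnum(ffi_constant);
    if value.is_known() {
        0
    } else {
        -1
    }
}
```

With this shape, ffi_function_wrapped(5) simply takes the error path, because MyEnum(5) is an ordinary struct value rather than an invalid enum.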


#5

I see. What’s the benefit of implementing it that way vs. matching to an enum inside my current ffi_function()? E.g.

#[no_mangle]
pub unsafe extern "C" fn ffi_function(ffi_constant: uint8_t) {
  let enum_value = match ffi_constant {
    x if x == FFI_CONSTANT_1 => MyEnum::Variant1,
    x if x == FFI_CONSTANT_2 => MyEnum::Variant2,
    _ => /* Pretend I have error handling. */
  };
  ...
}
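
The same check could also be packaged as a TryFrom<u8> impl, so the raw integer is validated once at the boundary and only a valid MyEnum is used afterwards. This is just a sketch under assumptions: the ffi_function_checked name, the use of the rejected u8 as the error type, and the 0 / -1 return codes are made up for the example.

```rust
use std::convert::TryFrom;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum MyEnum {
    Variant1,
    Variant2,
}

impl TryFrom<u8> for MyEnum {
    // On failure, hand back the rejected raw value (an assumed choice).
    type Error = u8;

    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        match raw {
            // Compare against the discriminants; no transmute is involved,
            // so an invalid MyEnum value is never created.
            x if x == MyEnum::Variant1 as u8 => Ok(MyEnum::Variant1),
            x if x == MyEnum::Variant2 as u8 => Ok(MyEnum::Variant2),
            other => Err(other),
        }
    }
}

/// Accepts the raw u8 from C and converts it before use.
/// Returns 0 on success and -1 for an out-of-range value (assumed convention).
pub extern "C" fn ffi_function_checked(ffi_constant: u8) -> i32 {
    match MyEnum::try_from(ffi_constant) {
        Ok(_enum_value) => 0, // real work with the valid enum would go here
        Err(_raw) => -1,
    }
}
```

The C side still passes a plain uint8_t, and the enum type itself never appears in the exported signature.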

#6

Sure, that also works. The wrapper type is more useful if it’s the Rust representation of a C “enum”, whereas this seems like the other way around.


#7

Ah, yes, this is providing an interface to a Rust lib to use in C(++). Thanks for answering my questions!