Option<T> for integer-repr enums with no 0-valued variant

rectang · October 21, 2021, 7:25pm

Rust's Option type guarantees that the size of Option<NonNull<T>> is the size of a pointer, because 0 can be used to represent the None variant.
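As a minimal sketch of that guarantee, the size relationship can be checked directly with `std::mem::size_of`. The helper function names here are made up for illustration. The second helper checks `Option<NonZeroU32>`, which the standard library documents as having the same size as `u32` for the same reason.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;
use std::ptr::NonNull;

/// Returns true if `Option<NonNull<T>>` occupies exactly one pointer.
/// (Hypothetical helper name, used only for this illustration.)
pub fn option_non_null_is_pointer_sized<T>() -> bool {
    size_of::<Option<NonNull<T>>>() == size_of::<*mut T>()
}

/// Returns true if `Option<NonZeroU32>` occupies exactly one `u32`.
/// (Hypothetical helper name, used only for this illustration.)
pub fn option_non_zero_u32_is_u32_sized() -> bool {
    size_of::<Option<NonZeroU32>>() == size_of::<u32>()
}
```

Both helpers should return true on any target, since the null pointer and the zero integer are the bit patterns reserved for `None`.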

How can I achieve a similar effect for a Rust enum with an integer-repr which has no zero-valued variant?

My motivation is to write a Rust binding for a C enum defined in a third-party C library which specifies values for all of its enumerators, none of which are 0. And yet, the value of a variable with such an enum type may legitimately be 0 in C code, because C's enum is pretty much an int under the hood.

#include <stdio.h>

typedef enum Positive {
    One = 1,
    Two = 2
} Positive;

int main() {
    Positive pos = 0;
    printf("A \"Positive\" value: %d\n", pos);
}

I would like to represent this C enum using a Rust enum — an application of the "newtype" design pattern, which adds semantics to an existing type, generally while preserving the underlying representation. In this case, the additional semantic constraint is that the Rust enum type can only take on the values of its variants, rather than any C int value.

However, we must also accommodate 0 values, as they are used legitimately in C code — for example, if a struct field has this type, its initial value may be set to 0.

(Bogus values which correspond neither to enum variants nor to 0 can be handled using careful validation and the num_enum crate.)

It is important that this type have the same memory representation in C and Rust, so that, for example, if it is used as a struct field it will have the right size.

I would like to solve this problem by representing values of this enum type which might be 0 using Option<T>. Here's code that I wish worked (unfortunately it is unsound because Option<T> isn't guaranteed to have an FFI-safe representation in this scenario):

#[derive(Debug, PartialEq, Eq)]
#[repr(u32)]
enum Positive {
    One = 1,
    Two = 2,
}

// An attempt to implement the following C function in unsafe Rust:
// `Positive foo(int value) { return (Positive)value; }`
unsafe extern "C" fn foo(value: u32) -> Option<Positive> {
    std::mem::transmute(value)
}

fn main() {
    let some_one = unsafe { foo(1) };
    assert_eq!(some_one, Some(Positive::One));
    let some_two = unsafe { foo(2) };
    assert_eq!(some_two, Some(Positive::Two));
    eprintln!("`Some` variants succeed");
    
    let none = unsafe { foo(0) };
    assert_eq!(none, None);
    eprintln!("`None` variant succeeds");
}

My current workaround is to add a zero-valued variant to the Rust version of the enum, but I'm unhappy with that because it intrudes on processing logic in Rust code such as exhaustive matching.
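To illustrate the intrusion, here is a rough sketch of that workaround. The variant name `Unset` and the function `describe` are placeholders chosen for this example, not names from the real binding.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
pub enum Positive {
    // Placeholder variant that exists only so that 0 is a valid value.
    Unset = 0,
    One = 1,
    Two = 2,
}

/// Every exhaustive match now has to handle `Unset`, even in code paths
/// where a zero value makes no sense.
pub fn describe(pos: Positive) -> &'static str {
    match pos {
        Positive::Unset => "unset",
        Positive::One => "one",
        Positive::Two => "two",
    }
}
```

Dropping the `Positive::Unset` arm is a compile error, so the placeholder leaks into every consumer of the type.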

I also tried defining my own type Opt<T, ReprT> which requires a pile of unsafe code I haven't yet figured out how to prove sound at compile time — but even if I solve that problem, the resulting API is going to be verbose and unsatisfactory.

Is there a better way?

scottmcm · October 21, 2021, 7:55pm

It's questionable whether this is something that can ever be nice on the Rust side without a conversion step, since it's not UB on the C side for the value to be outside defined variants, but it is on the Rust side.

Maybe just define TryFrom<u32> for your enum? Then you isolate your Rust code from weird values coming from C.
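A minimal hand-written sketch of that suggestion, assuming the `Positive` enum from the original post. Using the rejected `u32` as the error type is a choice made here for brevity; a dedicated error type would work just as well.

```rust
use std::convert::TryFrom;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
pub enum Positive {
    One = 1,
    Two = 2,
}

impl TryFrom<u32> for Positive {
    // Assumption for this sketch: hand the rejected raw value back to the caller.
    type Error = u32;

    fn try_from(value: u32) -> Result<Self, Self::Error> {
        match value {
            1 => Ok(Positive::One),
            2 => Ok(Positive::Two),
            // 0 and every other undefined value are rejected here.
            other => Err(other),
        }
    }
}
```

With this in place, `Positive::try_from(raw)` is called wherever a raw value arrives from C, and the rest of the Rust code only ever sees valid variants.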

rectang · October 21, 2021, 8:20pm

Thanks! I agree that TryFrom is an essential piece of the solution. Any time that a value crosses the FFI border from C to Rust, it will need to be validated. The num_enum crate makes this easy, as you can derive TryFromPrimitive.

However, there are still situations where we need to represent 0 — for example, when none of the defined variants make sense as a default value. Consider a Codec enum:

enum Codec {
    Mp3 = 1,
    AAC = 2,
}

struct MyFile {
    length: usize,
    codec: Codec,
}

It is wrong to choose either of the variants of Codec as the default value, because it will be wrong some of the time until it is overwritten. But we need a default value for the MyFile struct. It really ought to be represented like this:

struct MyFile {
    length: usize,
    codec: Option<Codec>,
}
impl Default for MyFile {
    fn default() -> Self {
        Self {
            length: 0,
            codec: None,
        }
    }
}

And theoretically it could be, because we know that Codec has no zero-valued variant. But Option doesn't make that guarantee.
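One possible middle ground, not proposed in the thread and offered only as a sketch: store the raw value as `Option<NonZeroU32>`, whose layout the standard library documents as compatible with `u32`, and convert to `Option<Codec>` in an accessor. The field name `raw_codec` and the methods `codec` and `set_codec` are hypothetical, and the sketch assumes the C enum is 32 bits wide.

```rust
use std::convert::TryFrom;
use std::num::NonZeroU32;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
pub enum Codec {
    Mp3 = 1,
    AAC = 2,
}

impl TryFrom<NonZeroU32> for Codec {
    // Assumption for this sketch: unknown nonzero values are handed back as the error.
    type Error = NonZeroU32;

    fn try_from(raw: NonZeroU32) -> Result<Self, Self::Error> {
        match raw.get() {
            1 => Ok(Codec::Mp3),
            2 => Ok(Codec::AAC),
            _ => Err(raw),
        }
    }
}

#[repr(C)]
#[derive(Debug, Default)]
pub struct MyFile {
    pub length: usize,
    // Hypothetical field: `None` is the all-zero bit pattern a C caller may leave here.
    raw_codec: Option<NonZeroU32>,
}

impl MyFile {
    /// `Ok(None)` for 0, `Ok(Some(_))` for a known codec,
    /// `Err(_)` for a nonzero value that matches no variant.
    pub fn codec(&self) -> Result<Option<Codec>, NonZeroU32> {
        match self.raw_codec {
            None => Ok(None),
            Some(raw) => Codec::try_from(raw).map(Some),
        }
    }

    /// Stores the codec as its raw discriminant, or 0 for `None`.
    pub fn set_codec(&mut self, codec: Option<Codec>) {
        self.raw_codec = codec.and_then(|c| NonZeroU32::new(c as u32));
    }
}
```

The trade-off is that validation moves into the accessor rather than disappearing, but the field itself never holds an invalid `Codec`, and `MyFile::default()` yields a zeroed codec without inventing a placeholder variant.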

system · January 19, 2022, 8:21pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
