Fast convert from any unsigned int to enum?

I'm trying to convert an unsigned integer to an enum. The enum will have #[repr(<size here>)] with any of the unsigned integer types.

I was hoping for a conversion like:

fn uint_to_enum<U, E>(uint: U) -> E {
    unsafe { std::mem::transmute(uint) }
}

where E is the enum and U is the unsigned type it uses in the repr attribute. In searching for solutions, I finally came across this thread. The original post is similar to the experience I've had searching for potential solutions.

In my case, the transmute seems the best option because there will be no way it can be called with an integer outside of the enum's range. For that same reason, a TryFrom impl would be unnecessary.

Is this possible?

Update:

Here's a Playground to convey why I'm trying to convert ints to enum variants.

Nothing has changed since my post. This is still unsound: it is undefined behavior whenever U holds a value that is not valid for E. This blanket transmute has no checks, so it's a really bad idea.
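For comparison, a checked conversion is only a few lines per enum. The following is a minimal sketch, assuming a hypothetical three-variant `Choice` enum with `#[repr(u8)]` (the enum and its variants are illustrative, not taken from the original post):

```rust
use std::convert::TryFrom;

// Hypothetical enum used only for illustration.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum Choice {
    A,
    B,
    C,
}

impl TryFrom<u8> for Choice {
    // On failure, hand the rejected integer back to the caller.
    type Error = u8;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(Choice::A),
            1 => Ok(Choice::B),
            2 => Ok(Choice::C),
            other => Err(other),
        }
    }
}
```

With this in place, `Choice::try_from(2u8)` yields a variant and an out-of-range integer yields an `Err` instead of an invalid enum value.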


In this case, uint_to_enum must be unsafe, with every call site explicitly stating why it is sound (i.e. why the number is indeed in range).


Unless you mean you have ReprType::MAX + 1 variants (one for every value of the repr type), that function can be called with an integer outside of the enum's range. So it should be an unsafe function that lists its requirements. If something else guarantees the range that some u8 takes, it should sit behind some sort of privacy barrier, with mutating methods that are either similarly unsafe or that check the value is in range.

// Invariant: `u8` value must correspond to a discriminant value of `E`
#[repr(transparent)]
pub struct S(u8);

// Safety: `value` must correspond to a discriminant value of `E`
unsafe fn uint_to_enum(value: u8) -> E {
    unsafe { std::mem::transmute(value) }
}

impl S {
    pub fn set(&mut self, value: u8) -> Result<(), ()> {
        if /* check the range ...*/ {
            self.0 = value;
            Ok(())
        } else {
            Err(())
        }
    }

    pub fn variant(&self) -> E {
        // Safety: `self.0` is a discriminant value of `E` via
        // our own invariant
        unsafe { uint_to_enum(self.0) }
    }
}

That's exactly right.

It would be used like this..

  1. Define the enum using a certain unsigned type with the repr attribute
  2. A function will take the enum as a type parameter, similar to how uint_to_enum() does.
  3. Said function will generate a fixed array of the same size of the enum
  4. Encapsulated in a single (separate) function there can be some indexes into the generated array, and those indexes will be within the enum's range. Nothing else touches the array.
  5. The return value can be a vector of these indexes, except they need to be converted to the enum variants to hide that implementation detail.

So, I'm more asking if there's a way to make that conversion without hard coding u8 or u16, etc. so that it can work with all of those.

If those are indexes, they should probably be usizes, no matter the enum representation. Unless the enum might have more than usize::MAX variants, of course.


But, why use usize when you can use u8 (as an example), and it's all encapsulated so API ergonomics isn't a concern?

Hmm, I don't think so automatically... you'd need to somehow get the repr into the type system. Some playing around that would require distinguishing implementations based on where-clause non-overlap.
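One way to "get the repr into the type system" is an unsafe marker trait with an associated type. This is only a sketch under stated assumptions: the trait `ReprEnum`, the enums `Choice` and `Wide`, and this generic `uint_to_enum` are hypothetical names, and every impl has to be written (or macro-generated) by hand, because the compiler does not check that `Repr` matches the `#[repr(..)]` attribute. `std::mem::transmute` cannot be used between generic types, so the sketch uses `std::mem::transmute_copy` with an explicit size check.

```rust
use std::mem;

/// Hypothetical trait that records an enum's repr integer type.
///
/// Safety: `Repr` must be exactly the integer type named in the
/// implementor's `#[repr(..)]` attribute.
unsafe trait ReprEnum: Copy {
    type Repr: Copy;
}

#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum Choice {
    A,
    B,
    C,
}

#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u16)]
enum Wide {
    X,
    Y,
}

unsafe impl ReprEnum for Choice {
    type Repr = u8;
}

unsafe impl ReprEnum for Wide {
    type Repr = u16;
}

/// Safety: `value` must be a valid discriminant of `E`.
unsafe fn uint_to_enum<E: ReprEnum>(value: E::Repr) -> E {
    // `transmute_copy` has no compile-time size check, so check here.
    assert_eq!(mem::size_of::<E>(), mem::size_of::<E::Repr>());
    // The caller guarantees `value` is a discriminant of `E`.
    unsafe { mem::transmute_copy::<E::Repr, E>(&value) }
}
```

The function stays `unsafe`: the trait only removes the need to hard-code `u8` or `u16`, it does nothing to validate the integer.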

But if it's all so encapsulated, why not write your own derive macro or similar that builds this for you?

Incidentally, knowing the size isn't actually enough for the scheme you outline (as best I understood it) without further restrictions, as this is a completely valid enum:

enum E {
    A = 13,
    B = 42,
}

But the discriminant values don't correspond to the indices of [E::A, E::B].
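To make the mismatch concrete, here is a small sketch using the same enum. The `VARIANTS` array and the `index_of` helper are hypothetical names added for illustration:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum E {
    A = 13,
    B = 42,
}

// Position in this array is the "index"; it is unrelated to the discriminant.
const VARIANTS: [E; 2] = [E::A, E::B];

/// Look up the array index of a variant (None if it is not listed).
fn index_of(e: E) -> Option<usize> {
    VARIANTS.iter().position(|v| *v == e)
}
```

Here `E::B as u8` is 42 while its index is 1, so transmuting the index 1 into `E` would produce an invalid value.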


I see what you're saying.. From the playground you made, it looks like there's still a need to write duplicate code for each type (u8, u16, etc.), so maybe it's not much different than just writing separate functions..

I'm just not certain a macro would work.. Say that such a macro would be part of the public API, and it takes in an enum definition. It generates uint_to_enum() and the array (both are for internal use). A function would take the generated array so it can do the work I mentioned in Step 4 of my previous comment. How would that function know about uint_to_enum() in order to call it?

knowing the size isn't actually enough

That's a good point I didn't think about. I guess there's no way to tell the user, "This enum is not supposed to have custom discriminants"?
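One possible way to enforce that is to have the user define the enum through a macro whose pattern simply has no place for `= value`. The following is a rough sketch, not a finished design: the macro name `plain_enum`, the generated `VARIANTS` constant and the `from_index` method are all hypothetical, and the repr is fixed to `u8` to keep the example short.

```rust
// Accepts only bare identifiers as variants, so `A = 13` fails to match
// and the enum definition is rejected at compile time.
macro_rules! plain_enum {
    ($vis:vis enum $name:ident { $($variant:ident),+ $(,)? }) => {
        #[derive(Clone, Copy, Debug, PartialEq, Eq)]
        #[repr(u8)]
        $vis enum $name { $($variant),+ }

        impl $name {
            /// Every variant, in declaration order.
            pub const VARIANTS: &'static [$name] = &[$($name::$variant),+];

            /// Safe index-to-variant lookup; None when out of range.
            pub fn from_index(i: usize) -> Option<Self> {
                Self::VARIANTS.get(i).copied()
            }
        }
    };
}

plain_enum! {
    pub enum Choice { A, B, C }
}
```

Because the macro generates both the enum and the lookup table, the two cannot drift apart, and the library side can call `from_index` without any unsafe code.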

Why are you using an enum here?

I mean, why not

struct EnumWrapper<U>(U);
impl EnumWrapper<u64> {
    const Item0:u64=0;
    const Item1:u64=1;
    const Item2:u64=2;
    const Error:u64=3;
    const Count:u64=4;
    // ... write whatever functions you need here.
}

Then EnumWrapper(uint) can be regarded as the enum.

Rust's enum is not the same thing as in other languages, since other languages might not allow things like:

enum Option<T> {
    Some(T),
    None,
}

@Neutron3529 I'm not sure I understand your reason for suggesting a struct. An enum is semantically what I want, both for how end users will define it and how they would use it, e.g. to be able to write:

// This one's a u8, but it could also be a u16, u32, usize
#[derive(PartialEq)]
#[repr(u8)]
enum Choice {A, B, C}

// Produced by the library, not written by the end user
let list_of_variants = [Choice::A, Choice::C];
if list_of_variants[1] == Choice::C {
    // Choice::C was available
}

It's just that, in order for the library code to create list_of_variants to provide to end users it needs to be able to convert a list of integers to a list of variants of the enum, which it doesn't know about.. One user might call it Choice, another might call it something else.

I updated my original post with a Playground for hopefully a better explanation of what I'm trying to do.

#[derive(Copy, Clone, PartialEq, Debug)] // and everything you want
struct Choice(u8); // as you wrote #[repr(u8)] here
impl Choice {
    const A: Choice = Choice(0);
    const B: Choice = Choice(1);
    const C: Choice = Choice(2);
}
// const list_of_variants: [A, C] // compiler error
if list_of_variants[1] == Choice::C {
    // Choice::C was available
}

Since your program didn't compile as posted, I modified your code a little bit:

/*
I want to provide a library that allows someone to define an enum
where the library would take the enum and, based on some criteria,
generate a `vec` of some of its variants.

The end user could then loop over the `vec` and ask:
Do I have Choice::One?
Do I have Choice::Two?
And so on.
*/

#[repr(u16)]
enum Choice {
    One,
    Two,
    Three,
    Four,
}
type E=Choice;
/////////////////////////////////////////////
//
// Library from some "Codebase A"
//
/////////////////////////////////////////////
pub mod library {
    use crate::E;
    /// Given a list of all possible variants in `E`, return a subset of
    /// some of its variants. To do that:
    /// - `E` must be an enum defined with a `repr` attribute
    /// of one of the unsigned int types
    /// - `U` must be the same as the `repr`
    /// - `LEN` must be the same size as `E`
    fn propose_choices<const LEN: usize, U>(possible_choice_indices: &[U; LEN]) -> Vec<E> {
        // The magic numbers are only for the Playground. In real
        // life, another process would generate indices that
        // are guaranteed to be valid for enum `E`
        let random_choices = vec![0u16, 3, 2];
        let mut choices: Vec<E> = Vec::new();

        for i in random_choices {
            choices.push(unsafe { std::mem::transmute(i) });
        }
        choices
    }
    pub struct VariantChooser<const LEN: usize, U> {
        choice_storage: [U; LEN]
    }

    impl<const LEN: usize, U> VariantChooser<LEN, U> {
        pub fn using(possible_choice_indices: [U; LEN]) -> Self {
            Self {
                choice_storage: possible_choice_indices
            }
        }
        pub fn choose(&self, do_something: impl Fn(&Vec<E>)) {
            let choices = propose_choices::<LEN, U>(&self.choice_storage);
            do_something(&choices);
        }
    }
}

/////////////////////////////////////////////
//
// Everything below is in some "Codebase B"
//
/////////////////////////////////////////////

fn do_something(choices: &Vec<Choice>) {
    for c in choices {
        match c {
            Choice::One => println!("One"),
            Choice::Two => println!("Two"),
            Choice::Three => println!("Three"),
            Choice::Four => println!("Four"),
        }
    }
}

fn main() {
    // I wish I could do:
    // ```
    //    const choices = library::VariantChooser::using(Choice);
    //    choices.choose(do_something);
    // ```
    // But, that doesn't seem possible.
    // The next best thing might be to create a macro and use like:
    // const chooser = library::variant_chooser!(
    //    enum Choice {
    //      One,
    //      Two,
    //      Three,
    //      Four
    //    }
    // )
    // chooser.choose(do_something);

    // Changes to the enum must be made here as well. It's not in
    // a macro because I'm just trying to convey the idea.
    const ChoiceIndices: [u16; 4] = [0, 1, 2, 3];

    library::VariantChooser::using(ChoiceIndices).choose(do_something);
}

In this case, changing the enum to a struct is simple.

Just modify

#[repr(u16)]
enum Choice {
    One,
    Two,
    Three,
    Four,
}

to

#[derive(Clone,Copy,Debug,PartialEq,Eq)]
struct Choice(u16);
impl Choice {
    const One:Choice=Choice(0);
    const Two:Choice=Choice(1);
    const Three:Choice=Choice(2);
    const Four:Choice=Choice(3);
}

and add a catch-all arm to every match expression:

        match *c {
            Choice::One => println!("One"),
            Choice::Two => println!("Two"),
            Choice::Three => println!("Three"),
            Choice::Four => println!("Four"),
            _ => panic!("five"),
        }

Here's another modification I threw together -- not because I think it's great, but to illustrate that with the help of some derive macros, you can create a completely safe version where the user doesn't have to uphold any additional constraints (ones unchecked by the compiler or the macro). Your version could feature a pre-computed or otherwise optimized conversion, and you could probably do away with things like needing Hash. Not that it hurts for a fieldless enum. If you're really set on doing integer conversion specifically, you could check the constraints of that too (enforce a contiguous range starting from 0 if you want, etc).

There are only four reprs you're interested in, so if you need to distinguish on that, I'd suggest just handling the four cases explicitly instead of as a generic.


Thank you for taking the time to go more in depth in what you were talking about. I ran it and although it does work, I'm still confused about a few things.

  1. You have use crate::E; in the library module, but I think your code is assuming the library module knows about the end user's enum (i.e. Choice), which it would not. Also, you're using E as an actual identifier for something, but E was a generic placeholder. I almost had an "Ahaa!" moment with that, but realized it wouldn't work because there's no way to know what the end user's enum would be called.
  2. There's no way to make the u{8,16,32} generic? I forgot to add the type on one of the lines, so when your code used let random_choices = vec![0u16, 3, 2]; I meant for my version to be let random_choices: Vec<U> = vec![0, 3, 2];. It looks like that might not be possible, is that why you hardcoded the 0u16?
  3. You're suggesting the use of struct because it can be passed around and an enum cannot?
  4. What's the cost of using a struct with a bunch of const variables vs the enum?

Thank you @quinedot, I will need to look at your version more tomorrow, but a couple things that might be a misunderstanding:

  1. In the pretend end user's fn main() you have const CHOICE_INDICES: [Choice; 4] = [Choice::One, Choice::Two, Choice::Three, Choice::Four];, but that's precisely what I'm trying to end up with (a subset of the enum's variants), not start with. When the enum is defined, a const array is generated (hopefully in a hidden way) that's the exact size of the enum containing every index. That would be the input and the output would be a subset of variants instead of indices
  2. Are you making it an iterator because in my pretend end user's do_something() function I looped over them? I'm just curious if that was part of the solution or more of a preference.

I used [E; N] instead of [u64; N] because

  • I didn't see why the user of the library, who defines the enum, would need (or want) to refer to an array of integers as opposed to their typed variants
  • If you take enum values as the input, you know they're valid. If you take integers, they could be invalid
    • This is the sort of thing I was referring to with avoiding putting additional constraints on the user of the library
    • I could have taken a list of indices and checked against the number of variants, but then the constructor would have to return a Result

Instead of generating things in a hidden way, you could implement some trait.

// In your macro, check its repr
impl EnumData for Choice {
    const SIZE: usize = 4;

    // You don't need this if you enforce no custom indices
    // as that's just a 0..Self::SIZE
    //
    // This could be [Self; Self::SIZE] too, because...
    const VALUES: [u64; Self::SIZE] = [0, 1, 2, 3];
    // ...since you checked the repr, you know this will work...
    fn discrim(self) -> u64 { self as u64 }
    // ...but presumably you want integers specifically for some reason.
}

Then because it's checked, no one has to get the list or counting right manually.
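The impl above doesn't show the trait itself or how the library would consume it. Here is one possible shape, as a sketch only: the trait definition, the slice-typed `VALUES` (used instead of `[u64; Self::SIZE]`, which a trait definition cannot express on stable Rust) and the `propose_choices` function are assumptions for illustration, and in practice the impl would be emitted by the macro rather than written by hand.

```rust
/// Hypothetical trait a derive or declarative macro would implement.
pub trait EnumData: Copy + 'static {
    /// Every variant, in declaration order.
    const VALUES: &'static [Self];

    /// The variant's discriminant widened to u64.
    fn discrim(self) -> u64;
}

/// Library side: turn indices into variants without knowing the enum's
/// name or repr. Out-of-range indices are skipped, not transmuted.
pub fn propose_choices<E: EnumData>(indices: &[usize]) -> Vec<E> {
    indices
        .iter()
        .filter_map(|&i| E::VALUES.get(i).copied())
        .collect()
}

// User side (in practice generated by the macro).
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u16)]
pub enum Choice {
    One,
    Two,
    Three,
    Four,
}

impl EnumData for Choice {
    const VALUES: &'static [Self] =
        &[Choice::One, Choice::Two, Choice::Three, Choice::Four];

    fn discrim(self) -> u64 {
        self as u64
    }
}
```

Calling `propose_choices::<Choice>(&[0, 3, 2])` returns the corresponding variants, and the library never has to mention `u16` or `Choice` in its own code.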


And I used the iterator just because I knew strum could give it to me and it was a way to map between integers and variants.

Again, the idea isn't "you should use this." It's "you should use a macro to generate everything you need with certainty." strum macros were able to indirectly provide enough for me to produce a safe version off the cuff, and that crate isn't even aimed at your particular use case.


You might be interested in any one of


First, sorry for the typo: I fixed things in the Playground but forgot to sync them to the forum.


That's what I did to bypass the "unknown size" error. (Actually, marking E: Sized would be better, but I forgot that.)

Actually, your code doesn't compile for many reasons, so I couldn't give a perfect fix.

It is not difficult to change my version into one that accepts generics.

The reason is the same: you must know the size of U.

Maybe what you want is vec![0, 3, 2].into_iter().map(|x| U::new(x.into())).collect().

Yes.

The only cost might be that you must write an extra arm and mark it as either unreachable!() (if you wrote an enum rather than a bitflag) or unsafe { std::hint::unreachable_unchecked() } (if you can ensure that you only ever create values that belong to the enum).


They wouldn't.

The comment I had in the Playground's main function showed what the user would see and how they'd use it. The

const ChoiceIndices: [u16; 4] = [0, 1, 2, 3];

line is only to demonstrate what the macro would generate.


Using my Playground example,

The user defines an enum. They would later receive a subset of variants and want to ask, "do I have this variant? what about this one?" Everything between defining the enum and ending up with a vec of variants should be hidden from the user.

The subset is determined by library code the user has no control over, and the library code has no control over the user's code.

For the library to know the possible variants (and for the library to store more metadata for them), an array is generated at compile time and passed to it. In my example, I simplified it to just hold integers corresponding to the index. The important part is that it's the same size as the enum.

The library code also needs to know what enum to convert integers (the indices) to, which I was trying to accomplish by passing the generic E. This is so the library can return actual variants instead of indices to them.

No validity checks should be necessary in the library because the array is generated alongside the enum, and the library function taking the array is called by the macro that generates the array. After the macro is done, hopefully all the state is already passed to the library.

I just realized that enum-map may do what's wanted here. EnumMap<E, bool> is essentially [bool; N] indexable with the enum, and the library can populate it via &mut [bool].

If you don't use custom discriminants, then indexing is just a cast, but the crate also supports enums with custom discriminants (by mapping to contiguous indices) at a very slight performance penalty (but this also doesn't have an impact when the index is constant propagated).
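To illustrate the "indexing is just a cast" idea without depending on the crate, here is a hand-rolled sketch. It is not the enum-map API: the `Available` type and its methods are hypothetical, and it assumes an enum with no custom discriminants so that `c as usize` is a valid index.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum Choice {
    One,
    Two,
    Three,
    Four,
}

// Number of variants; a macro would normally generate this.
const N: usize = 4;

/// One flag per variant, indexed by the variant's discriminant.
struct Available([bool; N]);

impl Available {
    /// User side: ask "do I have this variant?" with a plain cast.
    fn contains(&self, c: Choice) -> bool {
        self.0[c as usize]
    }

    /// Library side: fill in the flags without knowing the enum type.
    fn flags_mut(&mut self) -> &mut [bool] {
        &mut self.0
    }
}
```

The library only ever sees a `&mut [bool]`, and the user only ever sees variants, so no integer-to-enum conversion is needed at all.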
