Enum crimes - getting the discriminant

I want to use an enum to represent frames in a simple protocol, where the discriminant is the frame identifier and the fields carry the frame payload. Since Rust 1.66 we can set explicit discriminants on enums with fields. The announcement mentions that it's not possible to extract the discriminant without unsafe, but doesn't specify the exact method to do this.

Treating the initial memory as the integer type used to represent the values seems to yield promising results:

#[repr(u16)]
enum Zwoop {
  A = 1,
  B(String) = 2
}

fn main() {
  let v = Zwoop::A;
  let d = unsafe { <*const Zwoop>::from(&v).cast::<u16>().read() };
  println!("{}", d);

  let v = Zwoop::B(String::from("hello"));
  let d = unsafe { <*const Zwoop>::from(&v).cast::<u16>().read() };
  println!("{}", d);
}

How dangerous is this? What guarantees does Rust provide wrt the layout of enums? Is there something similar to struct's #[repr(C)] but for enum? I assume that for enums that carry extra data Rust could theoretically put the discriminant anywhere it wants.

(This is mostly out of curiosity. I originally went down this path due to DRY, but I'm finding that perhaps a few repetitions have their advantages over worrying about the layout stability of enums.)

See here and here.

That’s not true for #[repr(u16)] enums though. See the second link above.


As a corollary, this isn’t true either, AFAICT. The code in the original post actually looks sound to me :)

I'm intrigued, when would this occur?

Never on a #[repr(u…)] enum. On ordinary enums (without any #[repr(…)] attribute), it isn’t specified whether this happens, except for certain enums that look like Option, in which case the discriminant is guaranteed not to exist as a separate value at all.
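As a small illustration of the Option-like case, here is a sketch that compares sizes. The helper name `option_is_free` is made up for this example; the size guarantees for `Option<&T>`, `Option<Box<T>>` and `Option<NonZeroU16>` come from the standard library documentation.

```rust
use std::mem::size_of;
use std::num::NonZeroU16;

/// Hypothetical helper: true when wrapping `T` in `Option` costs no extra
/// space, i.e. `None` is stored in a bit pattern that is invalid for `T`.
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // These payload types have a forbidden value (null or zero) that
    // `None` can occupy, so no separate discriminant is stored.
    assert!(option_is_free::<&u8>());
    assert!(option_is_free::<Box<u8>>());
    assert!(option_is_free::<NonZeroU16>());

    // Every bit pattern of a plain u16 is valid, so a tag is needed.
    assert!(!option_is_free::<u16>());
}
```

In the first three cases there is no tag in memory to read at all, which is why the pointer-cast trick only makes sense on enums with a primitive representation.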

Oh, that's nice. I'm wondering how reliable this actually is, though.

Nonetheless, I'd definitely still recommend against going the unsafe route, since mem::discriminant() is available and safe.

This is the mentioned section of the release notes:

Note: whereas for field-less enums it is possible to inspect a discriminant via as casting (e.g. Bar::C as u8), Rust provides no language-level way to access the raw discriminant of an enum with fields. Instead, currently unsafe code must be used to inspect the discriminant of an enum with fields. Since this feature is intended for use with cross-language FFI where unsafe code is already necessary, this should hopefully not be too much of an extra burden. In the meantime, if all you need is an opaque handle to the discriminant, please see the std::mem::discriminant function.

I don't understand why it says you need unsafe and then goes on to say that you can use std::mem::discriminant. Where would the difference in outcome be?

Is it that std::mem::discriminant doesn't necessarily provide the specified discriminant but just a unique and arbitrary yet fixed one?

I believe this is the point, yes. You cannot actually get the u16 value through mem::discriminant.

It's even easier: mem::Discriminant provides no way to get the actual underlying value. It's an opaque wrapper.
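To make the "opaque" part concrete, here is a minimal sketch of what mem::discriminant can do on stable. The `Frame` enum and the `same_variant` helper are invented for the example; only `std::mem::discriminant` itself is the real API.

```rust
use std::mem;

// Hypothetical frame type, modelled on the enum in the original post.
#[allow(dead_code)]
#[repr(u16)]
enum Frame {
    Ping = 1,
    Data(String) = 2,
}

/// Compares two values by variant only, ignoring their payloads.
fn same_variant(a: &Frame, b: &Frame) -> bool {
    mem::discriminant(a) == mem::discriminant(b)
}

fn main() {
    let a = Frame::Data(String::from("x"));
    let b = Frame::Data(String::from("y"));

    // Same variant, different payloads: the discriminants compare equal.
    assert!(same_variant(&a, &b));
    // Different variants compare unequal.
    assert!(!same_variant(&a, &Frame::Ping));

    // Discriminant<Frame> can be compared, hashed and debug-printed, but
    // it has no stable method that hands back the 2u16 written above.
}
```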

That’s not true. Internally, core::mem::discriminant just wraps the result of core::intrinsics::discriminant_value, which returns a <T as DiscriminantKind>::Discriminant, in a one-field tuple struct Discriminant, so you can either unwrap this struct somehow or use the intrinsic directly:

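// Nightly only: core_intrinsics is an internal, unstable feature.
#![feature(core_intrinsics)]

#[repr(u16)]
enum Foo {
    Bar = 15,
    Baz(&'static str) = 21,
}

fn main() {
    let foo = Foo::Bar;
    // Returns <Foo as DiscriminantKind>::Discriminant, which is u16 here.
    let bar = core::intrinsics::discriminant_value(&foo);
    assert_eq!(bar, 15u16);
}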

You can't "unwrap" it, because its field is private. And #![feature(core_intrinsics)] is internal to the compiler and standard library, so you are not allowed to use it either.

Why am I not allowed to use it? I just ran the same code on the playground.

As for “unwrapping”: transmute it or use a union. Using the intrinsic is much better, as it doesn’t involve any unsafe, but if you can only use stable, just transmute it.

Could you clarify what you are suggesting here? Transmuting the std::mem::Discriminant value? That’s unsafe code again.

And it’s much worse than directly inspecting the #[repr(u16)] enum. Using unsafe code to read the u16 directly from a pointer to the enum is well specified behavior, guaranteed safe according to the reference, as indicated in previous posts of mine.

On the other hand, transmuting mem::Discriminant is not allowed, according to the standard library documentation.

This trait is automatically implemented for every type and does not add any guarantees to mem::Discriminant. It is undefined behavior to transmute between DiscriminantKind::Discriminant and mem::Discriminant.


Running code with such a transmute in Miri unfortunately does not report the UB here… as far as I understand the situation, it’s library UB though, i.e. future versions of Rust are allowed to turn such code into actual UB without warning.
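For contrast, the direct read from a pointer to the enum can be wrapped in a method so the unsafe block lives in one place. This is only a sketch: `Frame` and `id` are made-up names, and it relies on the documented layout of enums with a primitive representation, where the tag is the first thing stored.

```rust
// Hypothetical frame type with a primitive representation.
#[allow(dead_code)]
#[repr(u16)]
enum Frame {
    Ping = 1,
    Data(String) = 2,
}

impl Frame {
    /// Returns the discriminant as declared on the enum.
    fn id(&self) -> u16 {
        // SAFETY: `Frame` is `#[repr(u16)]`, so every variant is laid out
        // with the u16 tag as its first field, and a valid `&Frame`
        // therefore points at an initialized, properly aligned u16.
        unsafe { *(self as *const Self).cast::<u16>() }
    }
}

fn main() {
    assert_eq!(Frame::Ping.id(), 1);
    assert_eq!(Frame::Data(String::from("hello")).id(), 2);
}
```

Dropping the `#[repr(u16)]` attribute would make this read unsound, so the attribute and the cast type have to be kept in sync by hand.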

Even though I don’t understand the reason for this, thanks, I will not suggest transmuting as another option anymore. Now, the only correct way is to use the intrinsic, i.e. the discriminant_value function.

If the goal is safe code, there’s always the alternative of writing a match and relying on compiler optimizations to make it efficient.
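A minimal sketch of that match-based alternative, using an invented `Frame` enum and `id` method. It needs no #[repr] attribute at all, at the cost of writing each identifier next to its variant.

```rust
// Hypothetical frame type; no #[repr] attribute is needed for this approach.
#[allow(dead_code)]
enum Frame {
    Ping,
    Data(String),
}

impl Frame {
    /// Maps each variant to its frame identifier without any unsafe code.
    fn id(&self) -> u16 {
        match self {
            Frame::Ping => 1,
            Frame::Data(_) => 2,
        }
    }
}

fn main() {
    assert_eq!(Frame::Ping.id(), 1);
    assert_eq!(Frame::Data(String::from("hello")).id(), 2);
}
```

The match is exhaustive, so adding a variant without assigning it an identifier is a compile error rather than a silent layout assumption.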

discriminant_value is safe and sound. This passes all checks, and Miri doesn’t report any UB.

"Not allowed" is not the same as "won't compile". UB is not allowed, either, yet you can make code with UB compile.

You aren't allowed to rely on transmuting, either, because the layout of Discriminant is unspecified (as it's not #[repr(C)]).

That’s true, but doesn’t transmute check both types to have the same layout?

I, by the way, have already abandoned the transmuting idea, because it turns out that the docs strongly advise against doing this on Discriminant and state that it is UB.

No, it doesn't. It only checks sizes. Anyway, "same layout" doesn't guarantee that transmuting is sound. You need a bunch of additional guarantees (e.g. that all possible bit patterns of the source type are valid for an initialized value of the target type).
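A classic example of "same size, still not sound" is u8 and bool. The sketch below only transmutes a value that is known to be valid; the `checked_bool` helper is a made-up name for the safe alternative.

```rust
use std::mem;

/// Hypothetical safe alternative: rejects bytes that are not a valid bool.
fn checked_bool(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

fn main() {
    // The only thing transmute verifies at compile time: equal sizes.
    assert_eq!(mem::size_of::<u8>(), mem::size_of::<bool>());

    // SAFETY: 1 is one of the two bit patterns that are valid for bool.
    let t: bool = unsafe { mem::transmute::<u8, bool>(1) };
    assert!(t);

    // `mem::transmute::<u8, bool>(2)` would compile just as happily, but
    // it is undefined behavior because 2 is not a valid bool.
    assert_eq!(checked_bool(2), None);
}
```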
