Inconsistencies in enum discriminants

I've just read about the update to Rust 1.66 and, to be honest, I find that explicit discriminants on enums behave in a way that is inconsistent with the rest of the language. A couple of examples:

#[repr(u8)]
enum Foo {
    A(u8) = 0,
    B(i8) = 1,
    // Compiles, even though it doesn't seem to make sense:
    // `let b: bool = 42;` won't compile.
    C(bool) = 42,
}

#[repr(u8)]
enum Colour {
    // Compiles, but `BLACK(u128, u128, u128) = (0, 0, 0)` doesn't, even though
    // that would make sense and this doesn't. The `let` equivalent is rejected too:
    // `let mut tp = (1, 1, 1); tp = 1;` doesn't compile, which makes sense.
    BLACK(u128, u128, u128) = 0,
}

You're mixing up the discriminant with the contained value. In your case, enum Foo might have roughly this layout (note that the layout is technically unspecified, so you must not rely on it, but the semantics should be the same):

  • one byte for the explicit discriminant, since that's what you've requested with repr(u8);
  • one byte for the contained value, since each of them (u8, i8 and bool) can fit in it.

Note that 42 is assigned to the first part, not to the second one.
To see this even more clearly, consider the following code:

#[repr(u8)]
enum Foo {
    A(u8) = 0,
    B(i8) = 1,
    C(bool) = 42,
}

fn main() {
    println!("{}", std::mem::size_of::<Foo>());
    // prints 2, as described above
    
    // Note that this `transmute` is in general unsound - 
    // here it's _probably_ fine, since we don't have any padding bytes, but I wouldn't bet on it
    println!("{:?}", unsafe { std::mem::transmute::<Foo, (u8, u8)>(Foo::C(true))});
    // prints (42, 1), i.e. (discriminant, bool as u8)
}

Playground

I get it but it is inconsistent with the rest of the language. You could also argue that let b:bool = 42; should compile.

Why so? When creating an enum definition, you're not specifying the contained values. Or do you think that struct Foo { b: bool = false } should compile?

But you do:

As for the question you asked about the struct: yes, I believe it should compile. It would mean default values when the struct is created with, let's say, new().

What syntax would you propose to set the discriminant, then, if the current syntax would be reserved for setting a default value (not that I think implicit defaults are a good idea in itself, but still)? What would be consistent?

Obviously I don't understand what discriminant means in this context - seriously. Could you please explain that to me?

As for implicit default values: where or how are they implicit? One is very explicit about it; it is the same as:

fn new() -> Self {
    Self { a: 1, b: false }
}

See the playground in my first post. In short, your enum Foo is two bytes: one for "which variant it is?", another for "what is the data inside this variant?"

At the creation site.

Thanks, that makes it a bit clearer. In what situation would one use it (an enum with explicit discriminants)?

Why would that be bad compared to having it written in new()?
So instead of:

fn new() -> Self {
    Self { a: 1, b: true }
}

we could write:

fn default() -> Self {
    Self {}
}

Because new is not the only way for struct creation, it's only a convention, which is not always applicable.
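To make that point concrete, here is a minimal sketch in current Rust of several ways the same struct can be created. The `Config` type and its field values are hypothetical, chosen only for illustration; only `new()` and `Default::default()` go through the code that picks the defaults.

```rust
// Hypothetical struct used only for illustration.
#[derive(Debug, PartialEq)]
struct Config {
    a: u8,
    b: bool,
}

impl Config {
    // `new` is only a naming convention; nothing forces callers to use it.
    fn new() -> Self {
        Self { a: 1, b: true }
    }
}

impl Default for Config {
    // The defaults are spelled out explicitly here.
    fn default() -> Self {
        Self::new()
    }
}

fn main() {
    let via_new = Config::new();
    let via_default = Config::default();
    // A struct literal bypasses `new` entirely (when the fields are visible).
    let via_literal = Config { a: 9, b: false };
    // Struct update syntax fills the remaining fields from `Default`.
    let via_update = Config { a: 9, ..Default::default() };

    println!("{via_new:?} {via_default:?} {via_literal:?} {via_update:?}");
}
```

With the struct literal, every field has to be given at the creation site, so any defaults chosen inside `new()` simply do not apply there.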

I've edited my previous post.

Or even simpler

#[derive(Default)]
struct Foo {
    a: u8 = 1,
    b: bool = true,
}

The second option especially would be an improvement on what is currently in Rust, where the default is implicit.

I believe there's at least interest from the language team in some sort of feature similar to default field values, but that's really not relevant to this syntax. If enum variants got default field values as a first-class language feature, it would probably look more like:

enum Fake {
    Variant(u8 = 12)
}

I think the syntax of explicit discriminants with variants that contain fields is a little bit confusing, but it makes sense in the context of the feature that existed before this change.

You've been able to set the discriminant for "plain" enums for quite some time with this syntax:

#[repr(u8)]
enum Color {
    Red = 1,
    Blue = 2,
    Green = 3,
}

When you don't have fields stored inside the variants, there's much less room for confusion.
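As a small sketch of why explicit discriminants on fieldless enums are handy: such an enum can be cast to its integer value with `as`. Going the other way needs a hand-written mapping; the `color_from_u8` helper below is a hypothetical name introduced for this example, not something the language provides.

```rust
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Color {
    Red = 1,
    Blue = 2,
    Green = 3,
}

// Hypothetical helper: map a raw byte back to a variant.
// Unknown values are rejected instead of being transmuted blindly.
fn color_from_u8(raw: u8) -> Option<Color> {
    match raw {
        1 => Some(Color::Red),
        2 => Some(Color::Blue),
        3 => Some(Color::Green),
        _ => None,
    }
}

fn main() {
    // A fieldless enum can be cast to its discriminant with `as`.
    println!("{}", Color::Red as u8); // prints 1
    println!("{:?}", color_from_u8(3));
    println!("{:?}", color_from_u8(0));
}
```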

Rust enums are tagged unions; as well as the visible fields (if any - in your case, each variant in enum Foo carries some data), the variants have a hidden tag value, called the discriminant, which allows Rust to identify which variant you have.

So, given the following:

#[repr(u16)]
enum Bar {
    A = 0,
    B(i8) = 256,
    C(u16, bool) = 1024,
}

Rust first uses the repr(u16) to tell it that the discriminant (the hidden field that marks out which variant you're holding) is a u16. Then, it determines what it needs to store for each variant:

  • Bar::A is a u16 for the discriminant, which will always be 0, nothing else
  • Bar::B(_) is a (u16, i8) - the u16 for the discriminant (which will be 256, because I've set it explicitly), the i8 for the data.
  • Bar::C(_, _) is a (u16, u16, bool) - the first u16 is the discriminant (and will always be 1024), the next one and the bool are the contained data.

It then lays out the variants so that the discriminant is in a fixed location relative to the "start" of the enum. When you do:

fn do_the_thing(b: Bar) {
    match b {
        Bar::A => { /* ... */ }
        Bar::B(data) => { /* ... */ }
        Bar::C(data, valid) => { /* ... */ }
    }
}

Rust accesses the discriminant field (normally hidden), and compares the value of the discriminant to the value it should have for each of the match arms - so Bar::A is a check to see if the discriminant is 0, Bar::B(data) checks to see if the discriminant is 256 and so on.
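For anyone who wants to see that hidden value, here is a rough sketch that reads it back. The pointer cast follows the pattern shown in the standard library documentation for `std::mem::discriminant`, which relies on the enum having a primitive representation such as `repr(u16)`. The `discriminant` and `describe` functions are names made up for this example.

```rust
#[repr(u16)]
enum Bar {
    A = 0,
    B(i8) = 256,
    C(u16, bool) = 1024,
}

impl Bar {
    // Hypothetical helper that reads the normally hidden tag.
    fn discriminant(&self) -> u16 {
        // SAFETY: `Bar` is `repr(u16)`, so every variant is laid out with the
        // `u16` discriminant as its first field, and it can be read through
        // a pointer to the enum without any offset.
        unsafe { *(self as *const Self).cast::<u16>() }
    }
}

// Ordinary matching: the compiler performs the discriminant check for us.
fn describe(b: &Bar) -> String {
    match b {
        Bar::A => "A".to_string(),
        Bar::B(data) => format!("B({data})"),
        Bar::C(data, valid) => format!("C({data}, {valid})"),
    }
}

fn main() {
    for b in [Bar::A, Bar::B(-5), Bar::C(7, true)] {
        println!("{} -> {}", b.discriminant(), describe(&b));
    }
}
```

In normal code the `match` is all you need; reading the tag by hand like this is only worth it when the numeric value itself matters.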

Are you asking in what case one would want to specify explicit discriminant values for an enum? The first case that comes to my mind is when an enum represents some protocol: the discriminant is the packet type, and the data carried by the variant is the payload.
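A minimal sketch of that protocol idea, assuming an invented wire format: the `Packet` variants, their tag values, and the `tag`/`encode` methods are all hypothetical and not taken from any real protocol.

```rust
// Hypothetical packet types; the tag values are made up for illustration.
#[repr(u8)]
#[derive(Debug, PartialEq)]
enum Packet {
    Ping = 0x01,
    Data(u16) = 0x10,
    Close(bool) = 0xFF,
}

impl Packet {
    // Reads the explicit discriminant, which doubles as the packet type byte.
    fn tag(&self) -> u8 {
        // SAFETY: `Packet` is `repr(u8)`, so the `u8` discriminant is the
        // first field of every variant's layout.
        unsafe { *(self as *const Self).cast::<u8>() }
    }

    // Encodes the packet as: one tag byte followed by the payload bytes.
    fn encode(&self) -> Vec<u8> {
        let mut out = vec![self.tag()];
        match self {
            Packet::Ping => {}
            Packet::Data(len) => out.extend_from_slice(&len.to_be_bytes()),
            Packet::Close(graceful) => out.push(*graceful as u8),
        }
        out
    }
}

fn main() {
    println!("{:?}", Packet::Ping.encode());
    println!("{:?}", Packet::Data(0x0203).encode());
    println!("{:?}", Packet::Close(true).encode());
}
```

Because the tag values are fixed in the enum definition, the numbers on the wire stay stable even if variants are reordered or new ones are added later.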

Thank you. That is really well explained, and only after your explanation did I understand it fully.

Thanks, makes sense!

Well explained for me!