Rust enum: extract 'tag', offset of fields?

I know about offset_of in the field_offset crate, which lets me get the offset of arbitrary fields in a struct.

Is there something similar for enum? I want to be able to extract:

  1. the 'tag' bits, representing which 'arm' of the enum we go with

  2. for each 'arm' of the enum, somehow extract the offset of the tuple elements

Is there some way to pull this data? (I'm okay with crates, macros, procedural macros, etc.)

[ imagine you want C to be able to read a Rust enum ]

Due to niche optimization, these don’t necessarily exist as a separate field. Option<&T>, for example, uses a null pointer to represent None instead of a separate tag.
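The niche optimization can be seen from size comparisons alone. A minimal sketch, using only `std::mem::size_of`: the first assertion is a layout guarantee the language makes for `Option<&T>`; the second simply reflects that `u32` has no spare bit pattern to reuse.

```rust
use std::mem::size_of;

fn main() {
    // `Option<&T>` is guaranteed to have the same size as `&T`:
    // `None` is encoded as the null pointer, which `&T` can never hold.
    assert_eq!(size_of::<Option<&u32>>(), size_of::<&u32>());

    // Every bit pattern of `u32` is a valid value, so there is no niche and
    // the `Option` needs extra room to record which variant it is.
    assert!(size_of::<Option<u32>>() > size_of::<u32>());
}
```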

I'm speechless. Besides rustc, is there anything else in the world that understands how to map bits <-> enums?

It is documented in the reference here.

You gotta use the appropriate #[repr(...)] annotations as described in the link above. If you don't do this, the layout of the enum is unspecified and may change at any time.
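To illustrate the first part of the question (extracting the tag), here is a minimal sketch assuming a primitive representation. With `#[repr(u8)]`, the Reference describes the enum as a union of `#[repr(C)]` structs that each start with the `u8` tag, so the tag can be read through a pointer cast. The `Shape` enum and the `tag_of` helper are hypothetical names made up for this example.

```rust
// Hypothetical enum for illustration. `#[repr(u8)]` fixes the tag to a `u8`
// stored at offset 0, with implicit discriminants 0, 1, ...
#[repr(u8)]
#[allow(dead_code)]
enum Shape {
    Circle { radius: f32 },
    Rect { w: f32, h: f32 },
}

/// Reads the tag byte of a `#[repr(u8)]` enum (hypothetical helper).
fn tag_of(shape: &Shape) -> u8 {
    // SAFETY: with a primitive representation, every variant is laid out as a
    // `#[repr(C)]` struct whose first field is the `u8` tag.
    unsafe { *(shape as *const Shape as *const u8) }
}

fn main() {
    assert_eq!(tag_of(&Shape::Circle { radius: 1.0 }), 0);
    assert_eq!(tag_of(&Shape::Rect { w: 2.0, h: 3.0 }), 1);
}
```

This trick is only sound with an explicit `#[repr(...)]`. On a default-layout enum the cast compiles but reads unspecified bytes.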

Layout is not defined for #[repr(Rust)] enums (the default). But you can use #[repr(C)] or #[repr(<int>)], which have a well-defined layout; see The Reference (#[repr(C)], #[repr(<int>)]).

For example:

#[repr(C)]
enum MyCoolEnum {
    A { data: i32 },
    B { something: Box<i32> },
}

#[no_mangle]
pub extern "C" fn get_enum() -> *const MyCoolEnum {
    // Leak the box so the pointer stays valid for the C caller.
    Box::leak(Box::new(MyCoolEnum::B { something: Box::new(1) })) as *const _
}

/* C side */
#include <stdint.h>

typedef enum { MyCoolEnum_A, MyCoolEnum_B } MyCoolEnumTag;
typedef struct { int32_t data; } MyCoolEnumAPayload;
typedef struct { int32_t *something; } MyCoolEnumBPayload;
typedef union {
    MyCoolEnumAPayload A;
    MyCoolEnumBPayload B;
} MyCoolEnumPayload;
typedef struct {
    MyCoolEnumTag tag;
    MyCoolEnumPayload payload;
} MyCoolEnum;

extern const MyCoolEnum *get_enum(void);

int main(void) {
    const MyCoolEnum *v = get_enum();
    if (v->tag == MyCoolEnum_B) { /* ... */ }
    return 0;
}
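For the second part of the question (field offsets per arm), one option is to write the same tag-plus-union mirror on the Rust side and apply `std::mem::offset_of!` (stable since Rust 1.77) to it. This is a sketch under assumptions: the `Tag`, `PayloadA`, `PayloadB`, `Payload` and `Mirror` types and the helper functions are hypothetical names, and `*mut i32` stands in for `Box<i32>`, which has the same size and alignment.

```rust
use std::mem::{align_of, offset_of, size_of};

#[repr(C)]
#[allow(dead_code)]
enum MyCoolEnum {
    A { data: i32 },
    B { something: Box<i32> },
}

// Hand-written mirror of the documented `#[repr(C)]` enum layout:
// a fieldless tag enum followed by a union of per-variant payload structs.
#[repr(C)]
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum Tag { A, B }

#[repr(C)]
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct PayloadA { data: i32 }

#[repr(C)]
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct PayloadB { something: *mut i32 }

#[repr(C)]
#[allow(dead_code)]
union Payload { a: PayloadA, b: PayloadB }

#[repr(C)]
#[allow(dead_code)]
struct Mirror { tag: Tag, payload: Payload }

/// Offset of the tag within the enum.
fn tag_offset() -> usize { offset_of!(Mirror, tag) }

/// Offset of `A::data`: payload offset plus the field offset inside the payload.
fn a_data_offset() -> usize { offset_of!(Mirror, payload) + offset_of!(PayloadA, data) }

/// Reads `A::data` through the computed offset; returns `None` for other variants.
fn a_data_via_offset(v: &MyCoolEnum) -> Option<i32> {
    if !matches!(v, MyCoolEnum::A { .. }) {
        return None;
    }
    let base = v as *const MyCoolEnum as *const u8;
    // SAFETY: `v` is variant `A`, and the offset stays inside `*v` and is
    // aligned for `i32` because it comes from the mirrored `#[repr(C)]` layout.
    Some(unsafe { base.add(a_data_offset()).cast::<i32>().read() })
}

fn main() {
    // The mirror must agree with the real enum in size and alignment.
    assert_eq!(size_of::<Mirror>(), size_of::<MyCoolEnum>());
    assert_eq!(align_of::<Mirror>(), align_of::<MyCoolEnum>());
    assert_eq!(tag_offset(), 0);
    assert_eq!(a_data_via_offset(&MyCoolEnum::A { data: 42 }), Some(42));
}
```

The concrete payload offset is platform dependent (it follows the union's alignment), which is why the sketch computes it rather than hard-coding a number.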

I'm a bit slow on this. If I want a Rust struct/enum that implements Copy to be readable outside of Rust, I should probably, as a rule of thumb, just default to #[repr(C)] for every type "seen" outside of Rust, right?

Yeah. If you don't have #[repr(C)] on your struct/enum, then C must treat it as an opaque type that it can manipulate only by passing it back to Rust code.
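The opaque-type approach can be sketched as follows. The `Counter` type and the `counter_*` functions are hypothetical names for illustration: C only ever holds a pointer and calls back into Rust for every operation, so the default layout never leaks across the boundary. To actually export these symbols you would also add `#[no_mangle]` (spelled `#[unsafe(no_mangle)]` in the 2024 edition); it is left out here so the sketch compiles unchanged on any edition.

```rust
// Hypothetical type with the default (unspecified) layout.
// On the C side this would be declared as `typedef struct Counter Counter;`.
pub struct Counter {
    hits: u64,
}

/// Allocates a counter and hands ownership to the caller as a raw pointer.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { hits: 0 }))
}

/// Increments the counter and returns the new value.
///
/// # Safety
/// `c` must come from `counter_new` and must not have been freed.
pub unsafe extern "C" fn counter_hit(c: *mut Counter) -> u64 {
    let counter = unsafe { &mut *c };
    counter.hits += 1;
    counter.hits
}

/// Frees a counter previously returned by `counter_new`.
///
/// # Safety
/// `c` must be null or come from `counter_new`, and must not be used afterwards.
pub unsafe extern "C" fn counter_free(c: *mut Counter) {
    if !c.is_null() {
        drop(unsafe { Box::from_raw(c) });
    }
}

fn main() {
    let c = counter_new();
    unsafe {
        assert_eq!(counter_hit(c), 1);
        assert_eq!(counter_hit(c), 2);
        counter_free(c);
    }
}
```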

@chrefr: Is this (1) an example copy/pasted from somewhere, (2) the output of some tool you ran the Rust code through, or (3) something you manually simulated and wrote out yourself?

[I'm hoping for (2) because I don't have faith in my own ability to do (3)].

I'm pretty sure that cbindgen can do it.

I did it manually, but like @alice said, I'm pretty sure cbindgen can do it too. It was just faster for me :)

For repr(Rust) (the default), by definition it's only rustc that knows how to do it (outside of a few defined exceptions like the Option ones). It's intentionally unspecified so that future versions of rustc can do more optimizations than they do today.

With a non-default repr you can get more guarantees, as alice mentioned.
