Safety question about unions of integers

If you have a union such as:

union Foo {
    small: u32,
    big: u64,
}

is something like

let x = unsafe { Foo { big: u64::MAX }.small }; 

safe?

The offset of union fields is - in general - unspecified:

Unions - Unsafe Code Guidelines Reference

so the code above is probably not safe.

However, if you add #[repr(C)] to the union, then the code is safe. You'll read the first four bytes of the u64 as a u32, so the exact result depends on your platform's byte order (big-endian vs. little-endian), but, unlike in C++ for example, it's okay in Rust to read a union through a different field than the one you wrote to.
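A minimal sketch of the #[repr(C)] version, to make the byte-order point concrete. The helper name first_four_bytes and the second test value are made up for illustration; only the union itself comes from the question.

```rust
// The union from the question, with #[repr(C)] added so that both
// fields are guaranteed to start at offset 0.
#[repr(C)]
union Foo {
    small: u32,
    big: u64,
}

// Hypothetical helper: write `big`, then read the first four bytes as `small`.
fn first_four_bytes(value: u64) -> u32 {
    let foo = Foo { big: value };
    // SAFETY: with #[repr(C)], `small` overlaps the first 4 bytes of `big`,
    // those bytes are initialized, and every bit pattern is a valid u32.
    unsafe { foo.small }
}

fn main() {
    // Every byte is 0xFF, so byte order does not matter for this value.
    assert_eq!(first_four_bytes(u64::MAX), u32::MAX);

    // For a value with distinct halves, the result depends on endianness.
    let x = first_four_bytes(0x1111_2222_3333_4444);
    if cfg!(target_endian = "little") {
        assert_eq!(x, 0x3333_4444); // low-order half comes first in memory
    } else {
        assert_eq!(x, 0x1111_2222); // high-order half comes first in memory
    }
    println!("{x:#x}");
}
```

With u64::MAX, as in the question, you get u32::MAX on any platform, because both halves are identical.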

#[repr(C)] works for me, thanks!

MIRI actually doesn't have a problem with this. Keep in mind that unspecified just means the output isn't guaranteed — it can still be safe.

I thought "unspecified" meant "it isn't valid to depend on this"

I think MIRI will in general not complain about layout issues. The thing is: you can check layout "dynamically", i.e. by looking at the actual field offsets. If you first check that small lies within the bytes of big, then the code will be sound; just panic when that's not the case, and (I suppose) your code will also be very unlikely to ever fail (why should rustc ever make the union bigger than its largest field?). Since MIRI cannot know whether your program made sure, with appropriate checks, that the layout has the minimal properties necessary to avoid UB, it cannot point out potential UB here even if it does exist. Layout randomization might help, but I don't remember whether MIRI offers this, and even if it did, I'm not sure whether it would go as far as making unions bigger than their largest field.

So unspecified layout/offset here means that, technically, the union could be bigger than a u64, with the small: u32 lying outside of the big: u64 field.
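Here is roughly what such a "dynamic" check could look like. This is only a sketch: field_offsets and read_small are hypothetical names, and measuring the offsets via raw pointers to the fields is just one way to do it.

```rust
use std::mem::size_of;

// The union from the question, deliberately WITHOUT #[repr(C)].
union Foo {
    small: u32,
    big: u64,
}

// Hypothetical helper: returns (offset of `small`, offset of `big`) in bytes,
// measured on an actual value. Only addresses are computed; no field is read.
// Newer compilers may not need the `unsafe` blocks here, hence the `allow`.
#[allow(unused_unsafe)]
fn field_offsets() -> (usize, usize) {
    let foo = Foo { big: 0 };
    let base = std::ptr::addr_of!(foo) as usize;
    let small = unsafe { std::ptr::addr_of!(foo.small) } as usize;
    let big = unsafe { std::ptr::addr_of!(foo.big) } as usize;
    (small - base, big - base)
}

// Hypothetical helper: panics instead of reading if the layout is not
// what the read relies on.
fn read_small(value: u64) -> u32 {
    let (small, big) = field_offsets();
    assert!(
        small >= big && small + size_of::<u32>() <= big + size_of::<u64>(),
        "`small` does not lie within the bytes of `big`"
    );
    let foo = Foo { big: value };
    // SAFETY: the check above established that all bytes of `small` are
    // bytes of `big`, which was just initialized; any bit pattern is a
    // valid u32.
    unsafe { foo.small }
}

fn main() {
    println!("{:#x}", read_small(u64::MAX));
}
```

I'd expect the assertion to never fire with current rustc, but that expectation is exactly the part that isn't guaranteed.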

A perhaps more straightforward example is something like

struct Foo {
    field: u8,
}

fn main() {
    let x: Foo = unsafe {
        std::mem::transmute(1_u8)
    };
}

MIRI doesn't complain, yet without transparent or C representation, this code should be unsound. It might be sound if you first "dynamically" make sure that Foo really happens to have a size of 1 byte. Edit: Actually, this code might be sound because, I think, transmute only works if both types have the same size. (However, I'm not sure if it's guaranteed for transmute to cause a compilation failure if this isn't the case.)
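For comparison, a sketch of the variant that doesn't rely on unspecified layout at all, using #[repr(transparent)]. The function name foo_from_byte is made up. As far as I can tell, the std::mem::transmute documentation does say that both types must have the same size and that compilation fails if this is not guaranteed, so the size part is checked at compile time either way.

```rust
use std::mem::size_of;

// #[repr(transparent)] guarantees that Foo has the same layout as its
// single field, i.e. the same layout as u8.
#[repr(transparent)]
struct Foo {
    field: u8,
}

// Hypothetical helper wrapping the transmute from the example above.
fn foo_from_byte(byte: u8) -> Foo {
    // SAFETY: Foo is #[repr(transparent)] over u8, so every u8 is a
    // valid Foo with the same size and alignment.
    unsafe { std::mem::transmute::<u8, Foo>(byte) }
}

fn main() {
    // transmute would refuse to compile if these sizes differed.
    assert_eq!(size_of::<Foo>(), size_of::<u8>());
    let x = foo_from_byte(1);
    assert_eq!(x.field, 1);
}
```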


Or something like

struct Foo {
    field: u8,
    field2: bool,
}
struct Foo2 {
    field: u8,
    field2: bool,
}

fn main() {
    let x: Foo = unsafe {
        std::mem::transmute(Foo2 {
            field: 42,
            field2: false,
        })
    };
}

where (among other layout things) the order of the fields isn't specified: definitely unsound, but MIRI won't complain.
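If you put #[repr(C)] on both structs, the field order and offsets are pinned down and the same transmute becomes fine. A sketch, with convert being a made-up helper name:

```rust
// With #[repr(C)], fields are laid out in declaration order, so Foo and
// Foo2 are guaranteed to have identical layouts.
#[repr(C)]
struct Foo {
    field: u8,
    field2: bool,
}

#[repr(C)]
#[allow(dead_code)] // Foo2's fields are only ever read through Foo
struct Foo2 {
    field: u8,
    field2: bool,
}

// Hypothetical helper wrapping the transmute from the example above.
fn convert(value: Foo2) -> Foo {
    // SAFETY: both types are #[repr(C)] with the same field types in the
    // same order, so each field of Foo2 lands on the matching field of Foo.
    unsafe { std::mem::transmute::<Foo2, Foo>(value) }
}

fn main() {
    let x = convert(Foo2 {
        field: 42,
        field2: false,
    });
    assert_eq!(x.field, 42);
    assert!(!x.field2);
}
```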

I just checked the reference, and I'm surprised that unions aren't guaranteed to have a specific size (namely that of their largest field). I don't know the limitations of MIRI, so I'll take your word that it largely doesn't check layout.

Honestly with unions there's no downside to #[repr(C)] that I can think of. As such I would recommend that this be added for certainty. The rest of the discussion doesn't much matter to OP I would think.

Indeed I have done so :)

I do find it interesting.
