Am I triggering undefined behavior here?

Ah, so to be clear, I'm well aware of the invariant on ManuallyDrop. But I think you helped rephrase my question: is the reference to the invalid representation alone enough to trigger UB, or do you need an undefined representation?

My understanding is that the danger with undefined memory is that LLVM specifically has many optimizations based on, for instance, removing code paths that reference undefined values. Here, however, I'm not triggering that behavior: I only have invalid representations, which I'm careful never to access.

However, any use of the reference certainly does use the validity invariant to describe how to interact with the backing value

Exactly: in this case I don't have any code that would actually interact with the backing value, other than perhaps the drop impl for ManuallyDrop, which of course does nothing.

Hopefully const generics land sometime by the end of 2020 so you can just do literally [u8; {mem::size_of::<T>()}].

Actually if I only wanted this for sized types I could implement it today with MaybeUninit:

use std::mem::MaybeUninit;

// Packed, so the wrapper has alignment 1 regardless of T.
#[repr(packed)]
struct Bytes<T> {
    inner: MaybeUninit<T>,
}

Essentially, while MaybeUninit guarantees neither that the bit representation is valid nor even that it is defined, Bytes would guarantee that the bits are at least defined and thus safely accessible as a [u8] slice.
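To make that guarantee concrete, here is a minimal sketch of how the sized version could uphold it. The `from_slice` constructor is hypothetical (it is not in my codebase as written), and I've used `repr(C, packed)` rather than plain `repr(packed)` here only to pin the layout down for the example. The idea is that every `Bytes<T>` is borrowed from an existing `&[u8]`, so its bytes are always defined even if they are not a valid `T`.

```rust
use std::mem::{self, MaybeUninit};
use std::slice;

// Alignment 1 and the same size as T, so any sufficiently long
// byte slice can be reinterpreted as a Bytes<T>.
#[repr(C, packed)]
pub struct Bytes<T> {
    inner: MaybeUninit<T>,
}

impl<T> Bytes<T> {
    /// Hypothetical constructor: view the first `size_of::<T>()` bytes
    /// of `src` as a `Bytes<T>`. Returns `None` if `src` is too short.
    pub fn from_slice(src: &[u8]) -> Option<&Self> {
        if src.len() < mem::size_of::<T>() {
            return None;
        }
        // SAFETY: Bytes<T> has alignment 1 and the size of T, the bytes
        // behind `src` are initialized, and MaybeUninit<T> places no
        // validity requirement on them.
        Some(unsafe { &*(src.as_ptr() as *const Self) })
    }
}

impl<T> AsRef<[u8]> for Bytes<T> {
    fn as_ref(&self) -> &[u8] {
        // SAFETY: relies on the invariant that a Bytes<T> is only ever
        // obtained through `from_slice`, so all bytes are defined.
        unsafe { slice::from_raw_parts(self as *const Self as *const u8, mem::size_of::<T>()) }
    }
}
```

The soundness of `as_ref` rests entirely on the private field plus the constructor: nothing outside the module can produce a `Bytes<T>` with undefined bytes.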

But MaybeUninit requires T: Sized right now, so that solution doesn't work for unsized types. In my actual codebase I defined a Pointee trait for types whose layout can be calculated from pointer metadata. Thus my real AsRef impl looks like:

impl<T: ?Sized + Pointee> AsRef<[u8]> for Bytes<T> {
    #[inline(always)]
    fn as_ref(&self) -> &[u8] {
        unsafe {
            let ptr_t = self as *const Self as *const T;
            let layout = T::layout(T::metadata(ptr_t));
            slice::from_raw_parts(self as *const Self as *const u8, layout.size())
        }
    }
}

...with the Pointee trait used to determine the size of the type. Pointee meanwhile is:

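pub trait Pointee {
    // Pointer metadata: () for sized types, e.g. a length for slices.
    type Metadata: Copy;

    // Computes the layout of the value from its metadata alone.
    fn layout(metadata: Self::Metadata) -> Layout;

    // Extracts the metadata from a (possibly fat) pointer.
    fn metadata(ptr: *const Self) -> Self::Metadata;
}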
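For illustration, here is a sketch of what implementations of the trait might look like for one sized type and for slices. These impls are examples I'm assuming for this post, not copies from my codebase, and the slice impl uses the raw-pointer `len` method, which needs Rust 1.79 or later.

```rust
use std::alloc::Layout;

pub trait Pointee {
    type Metadata: Copy;

    fn layout(metadata: Self::Metadata) -> Layout;

    fn metadata(ptr: *const Self) -> Self::Metadata;
}

// Sized example: thin pointers carry no metadata, so the layout is static.
impl Pointee for u64 {
    type Metadata = ();

    fn layout(_: ()) -> Layout {
        Layout::new::<u64>()
    }

    fn metadata(_: *const Self) -> () {}
}

// Unsized example: a slice pointer carries its element count.
impl<T> Pointee for [T] {
    type Metadata = usize;

    fn layout(len: usize) -> Layout {
        // Layout::array fails only if the total size overflows.
        Layout::array::<T>(len).expect("slice layout overflows")
    }

    fn metadata(ptr: *const Self) -> usize {
        // Reads the length out of the fat pointer without dereferencing it.
        ptr.len()
    }
}
```

With that, `<[u16] as Pointee>::layout(3).size()` works out to 6 bytes, which is the length the AsRef impl would hand to `slice::from_raw_parts`.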

As for the motivation for all this: in-place "deserialization" of types with bit-representations carefully constrained to be compatible with mem-mapping.

Note that my other option is to make something like:

struct Bytes<'a, T: ?Sized> {
    marker: PhantomData<&'a ()>,
    ptr: *const T,
}

...but then I need a separate BytesMut and so on. It's much more ergonomic if Bytes is an unsized type.
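To show the duplication I mean, here is a rough sketch of the pointer-based pair, restricted to sized T for brevity. The `new`, `as_bytes`, and `as_bytes_mut` methods are hypothetical names made up for this example. The stored pointer may be unaligned for T, which is fine here because it is only ever read as bytes.

```rust
use std::marker::PhantomData;
use std::{mem, slice};

pub struct Bytes<'a, T: ?Sized> {
    marker: PhantomData<&'a ()>,
    ptr: *const T,
}

pub struct BytesMut<'a, T: ?Sized> {
    marker: PhantomData<&'a mut ()>,
    ptr: *mut T,
}

impl<'a, T> Bytes<'a, T> {
    // Hypothetical constructor: borrows the first size_of::<T>() bytes.
    pub fn new(src: &'a [u8]) -> Option<Self> {
        if src.len() < mem::size_of::<T>() {
            return None;
        }
        Some(Bytes { marker: PhantomData, ptr: src.as_ptr() as *const T })
    }

    pub fn as_bytes(&self) -> &'a [u8] {
        // SAFETY: ptr came from a &'a [u8] of at least size_of::<T>() bytes.
        unsafe { slice::from_raw_parts(self.ptr as *const u8, mem::size_of::<T>()) }
    }
}

// Every method above has to be written again for the mutable flavor.
impl<'a, T> BytesMut<'a, T> {
    pub fn new(src: &'a mut [u8]) -> Option<Self> {
        if src.len() < mem::size_of::<T>() {
            return None;
        }
        Some(BytesMut { marker: PhantomData, ptr: src.as_mut_ptr() as *mut T })
    }

    pub fn as_bytes(&self) -> &[u8] {
        // SAFETY: as above; the shared borrow of self prevents mutation.
        unsafe { slice::from_raw_parts(self.ptr as *const u8, mem::size_of::<T>()) }
    }

    pub fn as_bytes_mut(&mut self) -> &mut [u8] {
        // SAFETY: ptr came from a unique &'a mut [u8] borrow.
        unsafe { slice::from_raw_parts_mut(self.ptr as *mut u8, mem::size_of::<T>()) }
    }
}
```

With the unsized design, `&Bytes<T>` and `&mut Bytes<T>` cover both cases through ordinary references, so none of this needs to be written twice.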
