Soundness of a DST with a size that is not a multiple of its alignment

I have code like this:

use core::ptr::NonNull;

#[repr(C, align(8))]
pub struct AlignedBytes(pub [u8]);

impl AlignedBytes {
    /// # Safety
    /// Same as [`slice::from_raw_parts`], but additionally
    /// the pointer MUST have alignment equal to 8.
    pub unsafe fn from_raw_parts<'a>(p: *const u8, len: usize) -> &'a Self {
        // note that casts between unsized pointers are allowed:
        // https://doc.rust-lang.org/reference/expressions/operator-expr.html#pointer-to-pointer-cast
        unsafe {
            let p = NonNull::new_unchecked(p as *mut u8);
            let p = NonNull::slice_from_raw_parts(p, len);
            &*(p.as_ptr() as *const Self)
        }
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.0
    }
}

Technically, this code allows creating an &AlignedBytes whose length is not a multiple of the type's alignment, and Miri even accepts such code (playground). But the reference states the following:

The size of a value is always a multiple of its alignment.

Does it mean that the code in the playground is technically unsound?

Note the definition of "size" from the same paragraph:

The size of a value is the offset in bytes between successive elements in an array with that item type including alignment padding.

I don't think this paragraph applies to DSTs, since you cannot have an array of DSTs in Rust (not even an array of length 0 or 1).

In other words, the statement "the size of a value is always a multiple of its alignment" is about types with a statically known size (for a DST the static size is "undefined"; in particular, it is not 0). It is a memory-layout property of Rust's types, not a soundness requirement on (unsafe) implementations.


That “definition” of size is, I think, merely overly informal and neglecting to think about DSTs, not intending to avoid DSTs. Every value in Rust (for now) has a defined size which you can observe via std::mem::size_of_val() — as mentioned in that same paragraph.
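To make that concrete, here is a small sketch of size_of_val on a few ordinary DSTs. The function names are made up for illustration; only std::mem::size_of_val itself is a real API.

```rust
use std::mem::size_of_val;

/// Size in bytes of a 3-element `[u8]` (a DST behind a reference).
pub fn byte_slice_size() -> usize {
    let bytes: &[u8] = &[1, 2, 3];
    size_of_val(bytes)
}

/// Size in bytes of a 3-element `[u32]`: 3 elements of 4 bytes each.
pub fn word_slice_size() -> usize {
    let words: &[u32] = &[1, 2, 3];
    size_of_val(words)
}

/// Size in bytes of a 4-byte `str`.
pub fn str_size() -> usize {
    size_of_val("abcd")
}
```

In each case the size is computed at run time from the pointer metadata, and it comes out as a multiple of the type's alignment (1 for [u8] and str, 4 for [u32]).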

It's unsound because you're missing the safety condition that p is readable for len rounded up to a multiple of 8 bytes, i.e. len.next_multiple_of(8).

I need to be able to convert your &AlignedBytes to &[MaybeUninit<u8>] and read all size_of_val of those possibly-uninitialized bytes, but your safety conditions are insufficient to support that.
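As a sketch of what the strengthened contract could look like (the wording of the safety section and the demo_size helper are my own assumptions, not anything taken from the reference):

```rust
use core::mem::size_of_val;

#[repr(C, align(8))]
pub struct AlignedBytes(pub [u8]);

impl AlignedBytes {
    /// # Safety
    /// - `p` must be non-null and aligned to 8.
    /// - `p` must be valid for reads of `len.next_multiple_of(8)` bytes
    ///   (not just `len`), all within one allocation, for the whole of `'a`.
    /// - The first `len` bytes must be initialized.
    /// - The memory must not be mutated during `'a`.
    pub unsafe fn from_raw_parts<'a>(p: *const u8, len: usize) -> &'a Self {
        // Build a `*const [u8]` carrying `len` as metadata, then cast it to
        // the wrapper type; the cast keeps the metadata unchanged.
        let p = core::ptr::slice_from_raw_parts(p, len);
        unsafe { &*(p as *const Self) }
    }
}

/// Builds a length-2 `&AlignedBytes` over an 8-byte, 8-aligned buffer
/// and reports the size the reference claims to cover.
pub fn demo_size() -> usize {
    let buf = [0u64; 1];
    // SAFETY: `buf` is 8-aligned and readable for 8 = 2.next_multiple_of(8) bytes.
    let r = unsafe { AlignedBytes::from_raw_parts(buf.as_ptr().cast(), 2) };
    size_of_val(r)
}
```

The reference reports a size of 8 even though len is 2, which is why the caller has to guarantee that the whole rounded-up range is readable.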


A good way to test out if your DST handling is valid is to replace that AlignedBytes with

#[repr(C, align(8))]
pub struct AlignedBytes<T: ?Sized = [u8]>(pub T);

Then if you make a &AlignedBytes of length 2 via your unsafe constructor, it should be indistinguishable from making an AlignedBytes<[u8; 2]> directly, taking a reference to it, and unsizing that to &AlignedBytes.

use core::ptr::NonNull;

#[repr(C, align(8))]
pub struct AlignedBytes<T: ?Sized = [u8]>(pub T);

fn main() {
    let x = AlignedBytes([1_u8, 2]);
    let x = &x;
    let x: &AlignedBytes = x;
    dbg!(std::mem::size_of_val::<AlignedBytes>(x));
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=be5246daa46619ee8ec2bc62b444471f


It's legal to make an allocation that doesn't follow the size-multiple-of-alignment rule, but once you leave pointer-land and make a reference to a single Rust type, that reference must follow the rules to be sound.
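The allocation half of that can be seen with std::alloc::Layout, which accepts a size that is not a multiple of the alignment. A minimal sketch (the two function names are hypothetical, the Layout methods are real):

```rust
use std::alloc::Layout;

/// A layout of 2 bytes with alignment 8: valid for allocation purposes,
/// even though the size is not a multiple of the alignment.
pub fn unpadded_layout() -> Layout {
    Layout::from_size_align(2, 8).expect("2 bytes at alignment 8 is a valid layout")
}

/// The same layout with the size rounded up to a multiple of the alignment,
/// which matches what a Rust type with `align(8)` would occupy.
pub fn padded_layout() -> Layout {
    unpadded_layout().pad_to_align()
}
```

The unpadded layout is fine to hand to an allocator; the padded one describes the region that a reference to an align(8) type holding 2 bytes would actually cover.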


In my case it's not a problem to add such a requirement, but which exact part of the spec does it come from?

Is it though? With AlignedBytes<[u8; 2]> the compiler adds padding bytes and they become part of the struct. Meanwhile, with AlignedBytes<[u8]> it looks at first glance like it should be sound to create a &mut [u8] to the "padding" bytes not covered by the struct.

In other words, I wonder if the following code is sound:

#[repr(C, align(8))]
pub struct AlignedBytes(pub [u8]);

let mut buf = [0u64; 100];
let p: *mut u8 = buf.as_mut_ptr().cast();
// let's assume that lifetimes are properly tied to `buf`
let data: &mut AlignedBytes = unsafe {
    AlignedBytes::from_raw_parts_mut(p, 2)
};
let pad: &mut [u8] = unsafe {
    core::slice::from_raw_parts_mut(p.add(2), 6)
};

Doing a similar thing with AlignedBytes<[u8; 2]> would certainly be unsound.