I have the following function (major simplification from real code):
    /// Copy `val` to `buf` and return reference to the copied value.
    ///
    /// # Safety
    /// `T` must NOT implement `Drop`.
    unsafe fn foo<'a, T: ?Sized>(val: &T, buf: &'a mut [u64; 128]) -> &'a T {
        assert!(align_of_val(val) <= align_of_val(buf));
        assert!(size_of_val(val) <= size_of_val(buf));
        let size = size_of_val(val);
        let (src_ptr, metadata) = (val as *const T).to_raw_parts();
        let dst_ptr: *mut u8 = buf.as_mut_ptr().cast();
        core::ptr::copy_nonoverlapping(src_ptr.cast(), dst_ptr, size);
        let p: *const T = core::ptr::from_raw_parts(dst_ptr, metadata);
        p.as_ref_unchecked()
    }
I want this function to work with T: Copy + Sized, str, [T] where T: Copy + Sized, and potentially with other compatible types.
Is there a way to implement it on stable Rust without making Miri really angry? Currently, I have to use 3 different functions for this, which is pretty inconvenient.
I believe that without ptr_metadata, you do need 3 functions, but those 3 functions can be in 3 impls of the same trait so the caller doesn’t have to pick one explicitly.
Also, your function is missing a piece of safety documentation: T must not contain any uninitialized bytes.
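To make the "3 impls of the same trait" idea concrete, here is a minimal sketch on stable Rust. The trait name `CopyInto`, the method `copy_into`, and the constant `BUF_BYTES` are made up for illustration and are not from the original code. The method stays `unsafe` because of the uninitialized-bytes requirement mentioned above; I have not run this exact code under Miri, so treat it as a starting point rather than a verified implementation.

```rust
use core::mem::{align_of, size_of, size_of_val};

/// Size of the scratch buffer in bytes.
const BUF_BYTES: usize = size_of::<[u64; 128]>();

/// Hypothetical trait (not from the original code): one entry point, three impls.
trait CopyInto {
    /// Copy `self` into `buf` and return a reference to the copy.
    ///
    /// # Safety
    /// `self` must not contain uninitialized bytes (such as padding),
    /// because `buf` becomes readable as `[u64; 128]` again later.
    unsafe fn copy_into<'a>(&self, buf: &'a mut [u64; 128]) -> &'a Self;
}

// Impl 1: any sized `Copy` type (`Copy` also rules out `Drop`).
impl<T: Copy> CopyInto for T {
    unsafe fn copy_into<'a>(&self, buf: &'a mut [u64; 128]) -> &'a T {
        assert!(align_of::<T>() <= align_of::<u64>());
        assert!(size_of::<T>() <= BUF_BYTES);
        let dst = buf.as_mut_ptr().cast::<T>();
        // SAFETY: `dst` comes from a unique borrow of `buf`; the asserts
        // above check that `T` fits and that `dst` is aligned for `T`.
        unsafe {
            dst.write(*self);
            &*dst
        }
    }
}

// Impl 2: slices of `Copy` elements.
impl<T: Copy> CopyInto for [T] {
    unsafe fn copy_into<'a>(&self, buf: &'a mut [u64; 128]) -> &'a [T] {
        assert!(align_of::<T>() <= align_of::<u64>());
        assert!(size_of_val(self) <= BUF_BYTES);
        let dst = buf.as_mut_ptr().cast::<T>();
        // SAFETY: same reasoning as above; `self` and `buf` cannot overlap
        // because `buf` is a unique borrow.
        unsafe {
            core::ptr::copy_nonoverlapping(self.as_ptr(), dst, self.len());
            core::slice::from_raw_parts(dst, self.len())
        }
    }
}

// Impl 3: `str`, implemented by copying the underlying bytes.
impl CopyInto for str {
    unsafe fn copy_into<'a>(&self, buf: &'a mut [u64; 128]) -> &'a str {
        // SAFETY: `u8` bytes of a `str` are always initialized.
        let bytes = unsafe { <[u8] as CopyInto>::copy_into(self.as_bytes(), buf) };
        // SAFETY: `bytes` is an exact copy of valid UTF-8.
        unsafe { core::str::from_utf8_unchecked(bytes) }
    }
}
```

With the trait in scope, the caller writes the same call for all three cases, e.g. `unsafe { 42u32.copy_into(&mut buf) }` or `unsafe { <str as CopyInto>::copy_into("hello", &mut buf) }`, and the compiler picks the impl.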
You can read uninit data, it’s just that most reads[1] are typed reads, and uninit bytes are invalid in most places of most types (notably, all bytes of uN must be init). Notable places where uninit bytes are valid include the padding bytes of many structs and all bytes of MaybeUninit.
Though… I suppose you’re technically right that reads of uninitialized values are UB. It’s just that what it means for a value to be “uninitialized” is defined by that value’s type, and uninitialized/uninit bytes can appear even in properly-initialized values. (In particular, uninit bytes cannot appear in u8 “bytes”, but they can in the Abstract Machine’s AbstractByte “bytes”.) It’s perhaps slightly unfortunate for non-experts that we don’t have more distinct words for “abstract bytes”, “u8 bytes”, “uninit/uninitialized bytes”, “init/initialized bytes”, “uninitialized (w.r.t. some type) values”, “initialized (w.r.t. some type) values”, but you get used to it.
Note: Uninitialized memory is also implicitly invalid for any type that has a restricted set of valid values. In other words, the only cases in which reading uninitialized memory is permitted are inside unions and in “padding” (the gaps between the fields of a type).
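A small sketch of the padding case, in case it helps. The type `Padded` and the function `raw_bytes` are hypothetical names for illustration, and the layout comment assumes a target where u32 is 4-byte aligned. Viewing the object as MaybeUninit<u8> bytes is fine because that type accepts uninit bytes; viewing the same memory as a slice of u8 would be a typed read of padding and therefore UB.

```rust
use core::mem::{size_of, MaybeUninit};

/// Hypothetical type for illustration. With `repr(C)` and a 4-byte-aligned
/// `u32`, there are 3 padding bytes between `a` and `b`.
#[repr(C)]
#[derive(Clone, Copy)]
struct Padded {
    a: u8,
    b: u32,
}

/// View the object representation without claiming every byte is initialized.
fn raw_bytes(p: &Padded) -> &[MaybeUninit<u8>] {
    let ptr = (p as *const Padded).cast::<MaybeUninit<u8>>();
    // SAFETY: `ptr` is non-null, aligned (alignment 1), and valid for
    // `size_of::<Padded>()` bytes; `MaybeUninit<u8>` is valid for any byte,
    // including the uninitialized padding bytes.
    unsafe { core::slice::from_raw_parts(ptr, size_of::<Padded>()) }
}
```

Only the bytes that belong to a field (here, byte 0 for `a`) may be passed to `assume_init`; doing that for a padding byte would be UB again.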
It's documented. The first place I always go to when reading a raw pointer is https://doc.rust-lang.org/std/ptr/index.html#pointer-to-reference-conversion. (I am quite pedantic, and will literally go through those bullet points in a SAFETY comment, even when someone as experienced in unsafe as me could probably skip the boilerplate in many cases.)
The relevant requirement here is that "The pointer must point to a valid value of type T."
That "valid value" link explicitly documents invariants about (un)initialized memory.
Note that what the std::ptr module refers to as a "valid value" is what some parts of the MaybeUninit docs seem to call "initialized at the type level", which I above referred to as "an initialized value (w.r.t. some type)".
Additionally, there's a slight distinction between "valid/initialized at the language level for a certain type" and "valid/initialized at the library level for a certain type". You can create a Vec<T> with length 100 and capacity 10 and it might not be invalid at the language level -- that is, it might not be immediate UB to create such a Vec<T> -- but it would violate library-level invariants of Vec<T>, making it invalid / not properly initialized for type Vec<T> at the library level. This distinction is also brought up by the MaybeUninit docs.
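For what "going through those bullet points in a SAFETY comment" can look like, here is a deliberately trivial sketch. The function `first_via_raw` is a made-up example, and the bullet wording is my paraphrase of the pointer-to-reference conversion rules, not a quote.

```rust
/// Hypothetical helper: return a reference to the first element,
/// going through a raw pointer on purpose.
fn first_via_raw(values: &[u32]) -> Option<&u32> {
    if values.is_empty() {
        return None;
    }
    let p: *const u32 = values.as_ptr();
    // SAFETY (pointer-to-reference conversion):
    // - Aligned and non-null: `p` comes from a `&[u32]`.
    // - Dereferenceable: the slice is non-empty, so one `u32` is in bounds.
    // - Valid value: elements behind a `&[u32]` are initialized `u32`s.
    // - Aliasing: the result is a shared reference whose lifetime is tied
    //   to `values`, so nothing can mutate the element while it is alive.
    Some(unsafe { &*p })
}
```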
MaybeUninit<u8>, mostly. In addition to accepting any byte value from 0 to 255, an uninit byte, or a byte from a pointer (i.e. one carrying provenance), it can also be stored at any address because its alignment is 1.
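A minimal sketch of those two properties. The function name `demo_maybe_uninit_byte` is hypothetical; the `MaybeUninit` calls are the standard ones.

```rust
use core::mem::{align_of, size_of, MaybeUninit};

// Same size and alignment as `u8`, checked at compile time.
const _: () = assert!(size_of::<MaybeUninit<u8>>() == 1);
const _: () = assert!(align_of::<MaybeUninit<u8>>() == 1);

/// Hypothetical demo: start uninitialized, write a byte, then read it back.
fn demo_maybe_uninit_byte() -> u8 {
    // Holding an uninit byte is fine for `MaybeUninit<u8>`.
    let mut slot = MaybeUninit::<u8>::uninit();
    slot.write(7);
    // SAFETY: `slot` was initialized by the `write` call just above.
    unsafe { slot.assume_init() }
}
```

Calling `assume_init` before the `write` would be UB, since a plain u8 requires every byte to be initialized.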
Thank you! Would MaybeUninit<u8> then be described as having invalid values? Is that a compiler special case, or can I create types with invalid values, too?