UB on flexible array member implementation

What I'm trying to do: Allocate memory, write a header struct at the beginning of the allocation, then treat the remainder of the allocation as an array of bytes. In C, I would do this using a flexible array member at the end of the struct.

What is happening: When running my code with Miri, Miri complains:

error: Undefined Behavior: trying to retag from <3555> for Unique permission at alloc1808[0x8], but that tag does not exist in the borrow stack for this location

(full error in code below)

This seems to only happen if I make a reference for the header struct first. If I avoid making a reference, Miri does not complain.

What I expect: I don't think my code has UB. If it does, I want to be told what specific rule I am breaking.

More context:

rustc 1.68.0-nightly (9c07efe84 2022-12-16)

Reduced example program:

#[repr(C)]
struct ChunkHeader {
    len: usize, // Size of the data portion in bytes.
    // The data bytes follow the struct in the same allocation.
}

fn main() {
    unsafe {
        // Allocate a ChunkHeader followed by 4096 uninitialized bytes.
        let data_size: usize = 4096;
        let layout = std::alloc::Layout::from_size_align(
            std::mem::size_of::<ChunkHeader>() + data_size,
            std::mem::align_of::<ChunkHeader>(),
        )
        .unwrap();
        let chunk_ptr: *mut ChunkHeader = std::alloc::alloc(layout) as *mut ChunkHeader;
        std::ptr::write(chunk_ptr, ChunkHeader { len: data_size });
        let chunk: &mut ChunkHeader = &mut *chunk_ptr;

        // Get a pointer to the beginning of the data portion of the allocation.
        //
        // If we use technique A, Miri does not complain.
        //
        // If we use technique B, Miri complains:
        //
        // error: Undefined Behavior: trying to retag from <3555> for Unique permission at alloc1808[0x8], but that tag does not exist in the borrow stack for this location
        //    --> [snip]/rustlib/src/rust/library/core/src/slice/raw.rs:145:9
        //     |
        // 145 |         &mut *ptr::slice_from_raw_parts_mut(data, len)
        //     |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        //     |         |
        //     |         trying to retag from <3555> for Unique permission at alloc1808[0x8], but that tag does not exist in the borrow stack for this location
        //     |         this error occurs as part of retag at alloc1808[0x8..0x14]
        //     |
        //     = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
        //     = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
        // help: <3555> was created by a SharedReadWrite retag at offsets [0x0..0x8]
        //    --> src/main.rs:48:14
        //     |
        // 48  |             (chunk as *mut ChunkHeader as *mut std::mem::MaybeUninit<u8>)
        //     |              ^^^^^
        let data_begin = if false {
            // Technique A: WORKS
            (chunk_ptr as *mut std::mem::MaybeUninit<u8>).add(std::mem::size_of::<ChunkHeader>())
        } else {
            // Technique B: FAILS
            (chunk as *mut ChunkHeader as *mut std::mem::MaybeUninit<u8>)
                .add(std::mem::size_of::<ChunkHeader>())
        };

        // Make a slice for some of the uninitialized data bytes. This possibly triggers the Miri
        // error.
        let _data: &[std::mem::MaybeUninit<u8>] = std::slice::from_raw_parts_mut(data_begin, 12);

        // Intentionally leak chunk_ptr.
    }
}

This looks similar to this post from a while back.

As I understand it: when you construct a reference you're granting permission to exactly the memory of the type the reference points to, and no more. Even if the original pointer technically has permission to access the whole allocation, the process of creating an &mut reference narrows that permission down to just the part representing the type of the reference.

When you use a raw pointer it maintains the permission to the whole allocation.

This makes perfect sense to me. Thanks!

In my real program, the data_begin computation was done in a method of ChunkHeader:

impl ChunkHeader {
  fn data_begin(&mut self) -> *mut u8 { ... }
}

I guess I need to avoid this style and instead use a function which accepts a *mut ChunkHeader (rather than a &mut ChunkHeader).
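
A minimal sketch of that change, assuming the same ChunkHeader layout as in the reduced example. The associated function takes a raw pointer instead of &mut self, so no reference to the header is ever created and the pointer keeps its permission over the whole allocation. The name chunk_demo is hypothetical and exists only to exercise the function.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::mem::{align_of, size_of};

#[repr(C)]
struct ChunkHeader {
    len: usize, // Size of the data portion in bytes.
}

impl ChunkHeader {
    /// Returns a pointer to the first data byte after the header.
    ///
    /// # Safety
    ///
    /// `this` must point to the start of a live allocation that holds a
    /// `ChunkHeader` followed by the data bytes, and it must have been
    /// derived from the allocation pointer without going through a
    /// `&mut ChunkHeader`.
    unsafe fn data_begin(this: *mut ChunkHeader) -> *mut u8 {
        unsafe { this.cast::<u8>().add(size_of::<ChunkHeader>()) }
    }
}

// Hypothetical driver: allocates a chunk, writes one data byte, reads it back.
fn chunk_demo() -> usize {
    let data_size: usize = 4096;
    let layout = Layout::from_size_align(
        size_of::<ChunkHeader>() + data_size,
        align_of::<ChunkHeader>(),
    )
    .unwrap();
    unsafe {
        let chunk_ptr = alloc(layout) as *mut ChunkHeader;
        if chunk_ptr.is_null() {
            handle_alloc_error(layout);
        }
        chunk_ptr.write(ChunkHeader { len: data_size });

        // Read the header through the raw pointer; no reference is created.
        let len = (*chunk_ptr).len;

        // Derive the data pointer from the raw pointer, not from a reference.
        let data = ChunkHeader::data_begin(chunk_ptr);
        data.write(7);

        // Only the one byte written above is exposed as initialized data.
        let bytes: &[u8] = std::slice::from_raw_parts(data, 1);
        let result = len + bytes[0] as usize;

        dealloc(chunk_ptr.cast::<u8>(), layout);
        result
    }
}
```

Calling chunk_demo() should return the header length plus the byte that was written, and I would expect Miri to accept it because the slice is derived from the original allocation pointer.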

Makes sense. Still, a better error message would be nice :)

A better error message will come once this stuff stops being experimental and the actual rules have been settled.

After a hiatus of about 40 years, the industry is starting to think about safety again, but we are still far from being able to tell what the end result will be.

When the choice between safety and efficiency was made 40 years ago, it was a different world. We can't just go back, but new development is happening on both the hardware and the software side, and we are not yet at the stage where we know what the best compromise will be.

Both links go to the same page. Is this a typo?

One example of how you can do this kind of stuff correctly can be found in Tokio. We define the following type:

/// Raw task handle
struct RawTask {
    ptr: NonNull<Header>,
}

Then, instead of defining the methods on the header, all of the methods are defined on RawTask or other structs like it. This allows the RawTask methods to take &self or &mut self without issue, as they can simply use the raw pointer within the struct.

To access data after the header, methods such as add are used to offset the pointer, like here:

/// Gets a pointer to the `Trailer` of the task containing this `Header`.
///
/// # Safety
///
/// The provided raw pointer must point at the header of a task.
unsafe fn get_trailer(me: NonNull<Header>) -> NonNull<Trailer> {
    let offset = me.as_ref().vtable.trailer_offset;
    let trailer = me.as_ptr().cast::<u8>().add(offset).cast::<Trailer>();
    NonNull::new_unchecked(trailer)
}

or sometimes the header pointer is cast into a Cell<T, S> pointer and ordinary field accesses are used.

#[repr(C)]
struct Cell<T: Future, S> {
    /// Hot task state data
    header: Header,

    /// Either the future or output, depending on the execution stage.
    core: Core<T, S>,

    /// Cold data
    trailer: Trailer,
}

Note that Tokio uses header pointers because *mut Cell<T, S> requires that you know what the T and S types are, which *mut Header does not require.

Notice also that you do not need to cast to *mut MaybeUninit<u8> to offset the pointer. Using *mut u8 is sufficient.
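
Applied to the ChunkHeader from the original question, the same pattern could look roughly like the sketch below. This is not Tokio's code: the RawChunk type and its methods are hypothetical names, and the data bytes are zeroed at allocation time only so that they can be handed out as an ordinary &mut [u8].

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::mem::{align_of, size_of};
use std::ptr::NonNull;

#[repr(C)]
struct ChunkHeader {
    len: usize, // Size of the data portion in bytes.
}

/// Hypothetical handle that owns the allocation through a raw pointer.
struct RawChunk {
    ptr: NonNull<ChunkHeader>,
}

impl RawChunk {
    fn layout(data_size: usize) -> Layout {
        Layout::from_size_align(
            size_of::<ChunkHeader>() + data_size,
            align_of::<ChunkHeader>(),
        )
        .unwrap()
    }

    fn new(data_size: usize) -> RawChunk {
        let layout = Self::layout(data_size);
        unsafe {
            let raw = alloc(layout);
            let ptr = match NonNull::new(raw.cast::<ChunkHeader>()) {
                Some(ptr) => ptr,
                None => handle_alloc_error(layout),
            };
            ptr.as_ptr().write(ChunkHeader { len: data_size });
            // Zero the data portion so it can be exposed as initialized bytes.
            raw.add(size_of::<ChunkHeader>()).write_bytes(0, data_size);
            RawChunk { ptr }
        }
    }

    /// Reads the header field through the raw pointer.
    fn len(&self) -> usize {
        unsafe { (*self.ptr.as_ptr()).len }
    }

    /// Borrows the data portion. `&mut self` borrows only the handle, so the
    /// raw pointer inside keeps its permission over the whole allocation.
    fn data_mut(&mut self) -> &mut [u8] {
        unsafe {
            let data = self.ptr.as_ptr().cast::<u8>().add(size_of::<ChunkHeader>());
            std::slice::from_raw_parts_mut(data, self.len())
        }
    }
}

impl Drop for RawChunk {
    fn drop(&mut self) {
        let layout = Self::layout(self.len());
        unsafe { dealloc(self.ptr.as_ptr().cast::<u8>(), layout) }
    }
}
```

Methods on the handle can take &self or &mut self freely, because those references cover the handle struct and not the header or the data behind the pointer.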

Yes. Fixed. A flat address model with no protection of data, and then a managed runtime on top, were envisioned as the solutions.

So the world went not with Ada and protected objects but with UNIX, flat memory, and Java.

Now it's “take two” (or maybe three or four; I'm not a computer historian, so I don't know how many attempts there have been already).
