Confused about why this unsafe code works

I have this stack struct:

pub struct Stack {
  inner: Vec<u8>,
}

and I found this code online to convert a &[u8] into any piece of data:

    /// Safety: Probably not
    pub fn get<Ty: Copy + Sized>(&self, addr: Addr) -> Result<Ty, Error> {
        let size = std::mem::size_of::<Ty>();

        // Data = <ptr><len>
        let Some(data) = self.inner.get(addr .. addr + size) else {
            return Err(Error::StackIndex);
        };

        //data = <ptr>
        let data = data.as_ptr() as *const Ty;

        Ok(unsafe { std::ptr::read(data) })
    }

In order to test it I used this code:

fn main() {
    let stack = Stack {
        inner: vec![0,0,1,0,0,0],
    };

    println!("{:?}", stack.get::<u32>(2)); //1
    println!("{:?}", stack.get::<u32>(1)); //256
}

In my efforts to understand what's going on, I found this page in the Rustonomicon which says that the address of any u32 must be a multiple of 4 bytes. However, the way I'm calling this code above basically guarantees that this is violated. I feel like I should be using std::ptr::read_unaligned, but I'm not sure if I need that since it works already. Also, running cargo miri test reports no errors.

My two questions are:

  1. Is this safe?
  2. If so, why would I ever use std::ptr::read_unaligned()?

Miri reports UB on your code. You need to use cargo miri run to run the main function. cargo miri test runs unit tests, which your code sample doesn't show any of so it may not have run anything at all.

You can see the Miri failure on this playground by going to Tools > Miri

error: Undefined Behavior: accessing memory with alignment 1, but alignment 4 is required
  --> src/main.rs:18:18
   |
18 |         unsafe { std::ptr::read(data) }
   |                  ^^^^^^^^^^^^^^^^^^^^ accessing memory with alignment 1, but alignment 4 is required
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
   = note: BACKTRACE:
   = note: inside `Stack::get::<u32>` at src/main.rs:18:18: 18:38
note: inside `main`
  --> src/main.rs:27:22
   |
27 |     println!("{:?}", stack.get::<u32>(2)); //1
   |                      ^^^^^^^^^^^^^^^^^^^

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to previous error
5 Likes

It's wildly unsound for other reasons too, like

  • If you pulled out a String from something that wasn't a String, you'd try to free a non-allocation when you dropped it
  • If you pulled out a String from something that was a String twice, you'd have a UAF and double-free situation

Explore bytemuck for what a sound approach to things like this looks like.

4 Likes

Undefined Behavior may include working in the way you would think if it were defined. But since it's undefined, that can change at any time, even possibly between different instances at runtime.

3 Likes

So I just have to follow the same rules I would with std::mem::transmute, right? I'm really only considering using this for the primitive numbers like u8-u128, which should be safe since every combination of bits is valid for those types, right?

Ty has a Copy bound, which should mitigate the double free issue at least

2 Likes

"just" is understating things a bit I feel. The way it's implemented, you have to meet the requirements for aligned pointer reads.

But really, if you just need this for Pod types, use bytemuck. They've already done the work to make things sound.

6 Likes

Well, after all, what you are doing is basically a very roundabout transmute_copy().

But there's no reason to use unsafe for this purpose. This isn't big enough of a task to pull in bytemuck either, IMO. For primitives, bit-packing is trivial enough to roll your own (assumes little-endian but easy to modify to BE):

fn read_le<T>(bytes: &[u8]) -> T
where
    T: Copy + From<u8> + BitOr<Output = T> + Shl<usize, Output = T>
{
    bytes
        .iter()
        .take(core::mem::size_of::<T>())
        .enumerate()
        .map(|(i, &b)| T::from(b) << (8 * i))
        .fold(T::from(0), T::bitor)
}

Playground

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.