Soundness of aliasing mutable reference under stacked borrows

Hi folks,
I am trying to get my head around reference aliasing, stacked borrows and UB.

Consider this code which creates an immutable reference to buffer under Vec but does not touch the vec until drop.

struct MyStruct<'a> {
    vec: &'a mut Vec<u8>,
    // aliases vec's buffer but we promise not to touch vec until drop
    buf: &'a [u8]
}

impl<'a> MyStruct<'a> {
    fn new(vec: &'a mut Vec<u8>) -> Self {
        let buf = unsafe { std::slice::from_raw_parts(vec.as_ptr(), vec.len()) };

        Self {
            vec,
            buf
        }
    }

  // hand of ref, people can do something
  fn get_ref(&self) -> &[u8] {
    self.buf
   }
}

impl<'a> Drop for MyStruct<'a> {
    fn drop(&mut self) {
       // do something with vec content
        self.vec[0] = 47;
    }
}

fn main() {
  let v = vec![0];
  let s = MyStruct::new(&mut v);
  let myref = s.get_ref();
  let read = *myref;
  drop(s)
}

For now consider that 'a lifetime does not escape from the struct (e.g. by having fn get_ref(&self) -> &'a [u8] as then it is obvious UB as you can hand out 'a immutable lifetime and then mutate data under it when dropping MyStruct).

My first question is - Is creation of immutable alias immediate UB in MyStruct::new()? Rust says that this shouldn't be allowed. But then, we have stacked borrows and this is clearly a borrow from vec so the aliasing should be allowed.

If this isn't immediate UB, then I believe the stacked borrow says that if we don't access buf reference after drop, it should be sound to modify vec in drop (in stacked-borrow model accessing vec walks the stack-borrow stack and invalidates buf but because we never access buf afterwards everything is shiny).

Now, the real question - if these two above are not UB, then what will prevent optimizer from assuming that there are two independent borrows and moving potential read of &s.buf after write to .vec in drop. Reading a (valid) reference should be non-sideeffect which in terms could be re-ordered with side-effect of writing to "another" memory location. Now this is obviously unsound so at least one of the three things must be unsound.

1 Like

Miri says it’s UB on the drop.

     Running `/playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/cargo-miri runner target/miri/x86_64-unknown-linux-gnu/debug/playground`
error: Undefined Behavior: not granting access to tag <2634> because that would remove [SharedReadOnly for <2758>] which is strongly protected
   --> /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/raw.rs:192:9
    |
192 |         &mut *ptr::slice_from_raw_parts_mut(data, len)
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ not granting access to tag <2634> because that would remove [SharedReadOnly for <2758>] which is strongly protected
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <2634> was created by a Unique retag at offsets [0x0..0x1]
   --> src/main.rs:28:17
    |
28  |     let mut v = vec![0];
    |                 ^^^^^^^
help: <2758> is this argument
   --> src/main.rs:32:5
    |
32  |     drop(s)
    |     ^^^^^^^
    = note: BACKTRACE (of the first span):
    = note: inside `std::slice::from_raw_parts_mut::<'_, u8>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/raw.rs:192:9: 192:55
    = note: inside `<std::vec::Vec<u8> as std::ops::DerefMut>::deref_mut` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/vec/mod.rs:2839:18: 2839:72
note: inside `<MyStruct<'_> as std::ops::Drop>::drop`
   --> src/main.rs:23:17
    |
23  |         self.vec[0] = 47;
    |                 ^^^
    = note: inside `std::ptr::drop_in_place::<MyStruct<'_>> - shim(Some(MyStruct<'_>))` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:536:1: 536:56
    = note: inside `std::mem::drop::<MyStruct<'_>>` at /playground/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/mem/mod.rs:942:24: 942:25
note: inside `main`
   --> src/main.rs:32:5
    |
32  |     drop(s)
    |     ^^^^^^^
    = note: this error originates in the macro `vec` (in Nightly builds, run with -Z macro-backtrace for more info)

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

However, this seems to be actually a different phenomenon; the reported UB goes away without the explicit drop(s) call (playground), so I believe the UB with “explicit” drop(s) is because AFAIK passing a reference to a function does assert the reference to be valid at that point; even if it’s part of a struct.

So the call to drop(s) passes s to a function and that’s what asserts both vec and buf to be considered aliasing.

Edit: Some more testing seems to show that passing MyStruct<'a> to or from a function is actually not an issue, at least as far as miri is concerned. It seems instead that there must be something of a rule that a reference passed to a function stays valid throughout the whole function call. And this applies to the parameter of the function fn drop<T>(_: T) {}. I’ll test how changing the immutable reference to a pointer might change the miri report… Edit2: Yes it seems like that was the issue.

Anyways… conservatively, I would however suggest turning both references into raw pointers for MyStruct, so that all possible additional aliasing issues are avoided. (Add a dummy parameter with PhantomData to let the struct still keep the lifetime.)

4 Likes

I think this is the issue for which Ralf Jung proposed MaybeDangling in RFC 3336:

There exist more complex versions of this problem, relating to a subtle aspect of the (currently poorly documented) aliasing requirements of Rust: when a reference is passed to a function as an argument (including nested in a struct), then that reference must remain live throughout the function. (In LLVM terms: we are annotating that reference with dereferenceable, which means “dereferenceable for the entire duration of this function call”)

Though I'm not sure whether liveness and what you call validity are exactly the same thing here.

5 Likes

No, it's not immediate UB. You can have several mutable references to something when one is derived from another. But all uses of a child reference must happen entirely between two uses of the parent reference.

Yes, as above, if the next use of vec after using it to create buf is the self.vec[0] = 47 call, then your code is okay assuming that you don't use buf after the self.vec[0] = 47 call.

A relevant consideration here is whether returning the MyStruct from new counts as a use of vec. Generally, it does in fact count, but the fact that the data is behind a Vec complicates matters. This touches on what the uniqueness guarantees of Box and Vec are. But the current implementation of miri is such that using the Vec<T> does not count as a use of the buffer, so returning MyStruct from new will not invalidate buf. (But self.vec[0] = 47 still does since it writes to the buffer.)

Now, as @steffahn points out, your code triggers a miri failure anyway ... what is the cause? It has to do with something called a protector. Basically, references in function arguments must remain valid for the duration of the function call.

The thing is, in the call to drop, the reference is a function argument so it gets a protector and must remain valid for the entire call to drop. When you call self.vec[0] = 47, this modifies not just the Vec, but also the buffer, which invalidates buf. Since buf has a protector, this is immediate UB.

To solve this, you can wrap buf in MaybeUninit which has the same effects as MaybeDangling in this case.

The compiler uses the existence of a protector as the argument for why inserting such a read is okay. In LLVM, having a protector is implemented by annotating the reference with dereferencable.

So for your code to not be UB you need to use MaybeUninit or MaybeDangling to prevent the reference from getting a protector / being marked with dereferencable.

7 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.