Confusion about stacked borrows

I am trying to write an iterator over mutable elements of a tree. The tree is stored in a Vec. Implementing such an iterator is not straight-forward: the lifetimes of the items returned in Iterator::next cannot be bound to the invocation of Iterator::next. However, if we can guarantee the graph is actually a tree (i.e., during traversal, each element is visited exactly once), then the iterator is allowed to yield items with a longer lifetime.

When I tried to implement this, Miri reported stacked borrow errors. I broke it down to the following minimal example, where the depth-first traversal is replaced by random (but unique) indices.

use rand::prelude::*;
use std::cell::UnsafeCell;

#[allow(clippy::mut_from_ref)]
unsafe fn read<T>(data: &UnsafeCell<Vec<T>>, idx: usize) -> &mut T {
    unsafe { &mut data.get().as_mut().unwrap()[idx] }
}

fn main() {
    const LEN: usize = 8;
    let data = UnsafeCell::new(vec![0; LEN]);

    let mut indices = (0..LEN).collect::<Vec<_>>();
    indices.shuffle(&mut thread_rng());

    let refs = indices
        .into_iter()
        .map(|idx| unsafe { read(&data, idx) })
        .collect::<Vec<_>>();

    for x in refs {
        *x = 1;
    }
}

When I run miri on this code, I get the following error:

error: Undefined Behavior: trying to retag from <71188> for Unique permission at alloc925[0x4], but that tag does not exist in the borrow stack for this location
   --> /home/tibor/.local/share/rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/vec/into_iter.rs:225:23
    |
225 |         Some(unsafe { ptr.read() })
    |                       ^^^^^^^^^^
    |                       |
    |                       trying to retag from <71188> for Unique permission at alloc925[0x4], but that tag does not exist in the borrow stack for this location
    |                       this error occurs as part of retag at alloc925[0x4..0x8]
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <71188> was created by a Unique retag at offsets [0x4..0x8]
   --> src/main.rs:16:16
    |
16  |       let refs = indices
    |  ________________^
17  | |         .into_iter()
18  | |         .map(|idx| read(&data, idx))
19  | |         .collect::<Vec<_>>();
    | |____________________________^
help: <71188> was later invalidated at offsets [0x0..0x20] by a Unique retag
   --> src/main.rs:6:47
    |
6   |     unsafe { &mut data.get().as_mut().unwrap()[idx] }
    |                                               ^^^^^
    = note: BACKTRACE (of the first span):
    = note: inside `<std::vec::IntoIter<&mut i32> as std::iter::Iterator>::next` at /home/tibor/.local/share/rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/vec/into_iter.rs:225:23: 225:33
note: inside `main`
   --> src/main.rs:21:14
    |
21  |     for x in refs {
    |              ^^^^

I don't understand why Miri complains here. To further add to my confusion, I rewrote the read function using pointer arithmetics, and Miri no longer complains with that implementation:

#[allow(clippy::mut_from_ref)]
unsafe fn read_ptr<T>(data: &UnsafeCell<Vec<T>>, idx: usize) -> &mut T {
    unsafe {
        let len = data.get().as_ref().unwrap().len();
        assert!(idx < len);
        let ptr_to_slice = data.get().as_ref().unwrap().as_ptr();
        let ptr_to_elem = ptr_to_slice.add(idx);
        (ptr_to_elem as *mut T).as_mut().unwrap()
    }
}

Can someone explain to me why read results in a stacked borrow, but read_ptr does not? Aren't they identical (up to a bounds check)?

each time you call as_mut() use the index operator, you are creating an exclusive reference to the entire Vec data buffer, which invalidates the previously references to individual items.

the solution, as you already find out, is to use raw pointers to avoid creating (even transient) reference to the whole range of data, and only turn the raw pointer into references for the individual items, and you ensure they are disjoint (there are no aliases).

2 Likes

Thanks for the answer. So this is a proper solution, and not just me trying to fight against Miri?

that's correct. if you want to split the single reference into individual smaller references, you must use raw pointers, and be very careful not to (accidentally) reclaim exclusivity (by creating an exclusive reference) to the whole data inside the Vec, until the end of the lifetime of the splitted references.

well, actually, I was a little off in my previous explanation. it's not the as_mut() that is problematic, but the index operation, because it dereferences to slice. the you cannot create a reference to the data buffer, but creating a reference to the Vec itself is ok, as long you make sure it's not aliased. so you can simplify the implementation a little bit:

    unsafe { 
        let data = &mut *data.get();
        assert!(idx < data.len());
        &mut *data.as_mut_ptr().add(idx)
    }

Interesting, thanks. This is how I understand it currently:

  • Getting the reference to Vec is fine, we're only accessing the triple (ptr, len, size), and never alias that one.
  • As soon as we (i.e., the implementation of Index) access the data (ptr), it first mutably borrows the entire data, and then reduces the borrow down to the single element.
  • In the next iteration, we access the data again, dereferencing it and thus removing some labels from it.

That somehow feels like Miri doesn't properly understand that we are only borrowing a single element in the slice. Is that a limitation, as Rust's MIR just looks like first borrowing everything? Is that something that Miri could detect?

note, although the language has a dedicated syntax for index operations, it's actually not a intrinsic operation, but goes through traits (core::ops::Index and IndexMut, though they are not "normal" traits, they are lang items).

because the Index implementation delegates to the slice type (note the double dereference of self):

no, miri is doing what it was told to do, it's the Vec's Index implementation that has this behavior.

could Vec implements Index differently? of course. will it? I think not. IMO, what you are doing is a niche use case, not a common one that everyday users would need.

2 Likes