Parallel Interior Mutability

Hi,
we have been trying to understand UB in relation with interior mutability.
We have the following code:

struct ParallelSlice {
    backend: Box<[AtomicU8]>,
    chunk_size: usize,
}

and we would like to access chunks of the slice with a mutable reference

impl ParallelSlice {
    pub unsafe fn get_mut(&self, index: usize) -> &mut [u8] {
        let offset = index * self.chunk_size;
        let ptr = self.backend[offset].as_ptr(); // *mut u8
        std::slice::from_raw_parts_mut(ptr, self.chunk_size)
    }
}

Now it's easy to see how this is instant UB

let chunk_1 = slice.get_mut(1);
let chunk_2 = slice.get_mut(1);
chunk_1.fill(42);

but if the caller was able to guarantee that this method will only be called with unique indexes, is this still UB no matter what or is it UB only when 2 mutable references to the same memory are actuallly created?
As an example use the following (knowing full well that a par_chunks here would work perfectly but it is just for demonstration)

(0..10).into_par_iter().for_each(|i| slice.get_mut(i).fill(42));

in this case we know that the &mut [u8] that are created are always non-overlapping and no index is used twice, but is this still UB or is this sound?

Miri flags utilizing your get_mut as UB even for a single call. I believe the problem is that you started with a reference that only had provenence over one byte. It's happier with this.

     pub unsafe fn get_mut(&self, index: usize) -> &mut [u8] {
         let offset = index * self.chunk_size;
-        let ptr = self.backend[offset].as_ptr(); // *mut u8
+        let slice = &self.backend[offset..offset+self.chunk_size];
+        let ptr = slice as *const [_] as *mut [u8] as *mut u8;
         std::slice::from_raw_parts_mut(ptr, self.chunk_size)
     }

Including with multiple calls.


However, it's still possible that this is UB that Miri doesn't detect. See this documentation which explicitly calls this out as UB:

unsafe fn not_allowed<T>(ptr: &UnsafeCell<T>) -> &mut T {
  let t = ptr as *const UnsafeCell<T> as *mut T;
  // This is undefined behavior, because the `*mut T` pointer
  // was not obtained through `.get()` nor `.raw_get()`:
  unsafe { &mut *t }
}

And note that Atomics are implement via UnsafeCell.

(Additional note, if you modify the code to just use &mut *ptr on *mut [_], the compiler lints it as UB.)

2 Likes
  • Why are you using AtomicU8 in the definition but nowhere else? You probably want a UnsafeCell instead of AtomicU8.
  • Clippy has a lint that warns against functions that take a & and return a &mut. Clippy Lints but it says that when you return a different object each time it is fine.
  • you currently assume that self.backend.len() is a multiple of self.chunk_size. if you can guarantee this it's fine

Otherwise yes i believe when the caller always drops the &mut to one item before a new one is created this should be fine.
Miri also says it is fine: Rust Playground

1 Like

Oh interesting. Miri is only unhappy if the chunk_size is more than one. Maybe because the indexing returns a ptr that is only valid for that field? That is probably also why your manual ptr math is not flagged.

This seems closer in spirit to the UnsafeCell docs. (It has no slice-specific functions.)

struct ParallelSlice {
    backend: Box<[UnsafeCell<u8>]>,
    chunk_size: usize,
}

impl ParallelSlice {
    pub unsafe fn get_mut(&self, index: usize) -> &mut [u8] {
        let offset = index * self.chunk_size;
        let slice = &self.backend[offset..offset+self.chunk_size];
        let ptr = UnsafeCell::raw_get(slice as *const [_] as *const _);
        unsafe { std::slice::from_raw_parts_mut(ptr, self.chunk_size) }
    }
}

That's what I meant by

2 Likes

However, it's still possible that this is UB that Miri doesn't detect. See this documentation which explicitly calls this out as UB:

Would it be better do have Box<UnsafeCell<[u8]>>?
Then you could do UnsafeCell.get() and then do pointer math on that pointer. Then you don't create references to UnsafeCells you didn't call get on.

That's a good point.

That seems the way to go, as the UnsafeCell is over the whole slice. To add a bit of further hindrance, we (I'm working with @MatteoH2O1999) have the slice as a memory-mapped region. So we have no way to use UnsafeCell as in the last example.

I understood however from a previous post that if you write using pointers rather than references you are still in the business of data races, but having two pointers pointing at the same location is not UB (as in the case of mut references). Of course this means that you have to write setters, rather than methods returning mut references, which is less ergonomic, but not so dramatic.

After several iterations, we came to realize that the simplest idea is a transparent SyncCell struct sporting an as_slice_of_cells method (playground). It can be used to write to a value from several threads when there is no data race, and the as_slice_of_cells method it makes it easy access to a slice (e.g., when only a thread writes at an index). The only unsafe act is creating the SyncCell, as it will be Sync as long as the content is. The only pointer equivalence used is between Cell and SyncCell, which is guaranteed by repr(transparent). There is no possibility, even theoretical, of creating &mut references to overlapping areas.

1 Like

The idea has been materialized in this crate.