Question regarding slice of `UnsafeCell`

EDIT: my original statement about Miri was completely wrong.

I want to share a vector between two threads. Each thread can get a mutable slice for a portion of the buffer, and the ranges are guaranteed not to overlap between the threads (synchronized with atomic indices).

so I have two questions:

  1. if I have a shared reference to a slice of UnsafeCells, is it safe to get mut slices from the shared reference? what I don't know is, for example, I can get a mut pointer from the UnsafeCell at index 3, is it safe to use that pointer to construct a mut slice (using slice::from_raw_parts_mut()) which also covers index 4, 5, etc.? in other words, can I derive pointers to adjacent UnsafeCells, or is the pointer limited to the same UnsafeCell it originated from?

  2. can I safely convert &mut [i32] to &[UnsafeCell<i32>]? basically I want something like Cell::from_mut() and Cell::as_slice_of_cells(), but for UnsafeCell instead of Cell, because I want to get mut slices of non-overlapping parts of the original slice concurrently. Cell only allows get() and set() on a single element; it will not give me mut references to the inner elements.


context: I just want to learn how to use `UnsafeCell` correctly

I already had an implementation using raw pointers, simplified code looks like this:

use std::ops::Range;
use std::slice;
use std::sync::atomic::AtomicUsize;

struct ControlBlock {
	// a bunch of atomic indices
	// and other housekeeping
	start: AtomicUsize,
	end: AtomicUsize,
	// ...
}
struct WorkArea<'b> {
	worker_id: usize,
	base_address: *mut i32,
	control_block: &'b ControlBlock,
}
unsafe impl Sync for ControlBlock { }
unsafe impl Send for WorkArea<'_> { }

fn split_work<'a>(
	memory: &'a mut [i32],
	control_block: &'a ControlBlock,
) -> (WorkArea<'a>, WorkArea<'a>) {
	// must not use `as_ptr().cast_mut()`: that pointer is derived from a
	// shared borrow and must not be written through
	let address = memory.as_mut_ptr();
	(
		WorkArea {
			worker_id: 1,
			base_address: address,
			control_block,
		},
		WorkArea {
			worker_id: 2,
			base_address: address,
			control_block,
		},
	)
}
impl ControlBlock {
	/// reserve maximum possible range that will not overlap with other worker
	/// SAFETY: each worker id can only reserve if previous reservation is committed
	unsafe fn reserve(&self, worker_id: usize) -> Range<usize> {
		//...
	}
	/// commit previous reserved buffer
	/// SAFETY: len must be less or equal than the reserved size
	unsafe fn commit(&self, worker_id: usize, len: usize) {
		//...
	}
}
impl WorkArea<'_> {
	/// SAFETY: return value of `f` must not exceed len of slice
	unsafe fn with<F>(&mut self, f: F)
	where
		F: FnOnce(&mut [i32]) -> usize,
	{
		let range = self.control_block.reserve(self.worker_id);
		let reserved = slice::from_raw_parts_mut(self.base_address.add(range.start), range.len());
		let finished = f(reserved);
		self.control_block.commit(self.worker_id, finished);
	}
}

Can't you just use split_at_mut()? Seems much safer to rely on the type system than rolling your own unsafe runtime synchronization.

You never transmute pointers, and especially not fat pointers, because their layout is not guaranteed. But you can convert via a regular as cast.

Don't do that either, though. Again, use std instead of rolling your own. You want Cell::<[i32]>::from_mut() followed by &Cell<[i32]>::as_slice_of_cells().

the split is dynamic and changes as the work progresses; the ranges are not pre-partitioned. I don't know how split_at_mut() can be used in such cases.

I already know about Cell and what it can do with Copy types. if I understand it correctly, &[Cell<i32>] only allows me to write one i32 element at a time, right? what I need is a contiguous slice of the buffer, so I was hoping &[UnsafeCell<i32>] would allow me to get a &mut [i32] slice. that's my first question.
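
To make that limitation concrete, here is a minimal sketch of the safe Cell route. The helper names `add_to_all` and `bump` are made up for illustration; only `Cell::from_mut()`, `as_slice_of_cells()`, `get()` and `set()` come from std.

```rust
use std::cell::Cell;

/// Adds `delta` to every element through a shared slice of cells.
fn add_to_all(cells: &[Cell<i32>], delta: i32) {
    for cell in cells {
        // `Cell` only offers whole-value `get`/`set` per element; safe code
        // cannot turn `&[Cell<i32>]` into a `&mut [i32]`.
        cell.set(cell.get() + delta);
    }
}

/// Views an exclusive slice as a slice of cells, then updates it element-wise.
fn bump(data: &mut [i32], delta: i32) {
    let cells: &[Cell<i32>] = Cell::from_mut(data).as_slice_of_cells();
    add_to_all(cells, delta);
}
```

This works for element-wise updates, but there is no point at which a contiguous `&mut [i32]` becomes available, which is the gap the question is about.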

can you give some hint please? direct cast from &mut [i32] to &[UnsafeCell<i32>] does not compile:

let mut data = vec![1,2,3,4];
let data = (&mut data[..]) as &[UnsafeCell<i32>];
// error[E0605]: non-primitive cast: `&mut [i32]` as `&[UnsafeCell<i32>]`

I think like this?

let mut data = vec![1, 2, 3, 4];
let data = (&mut data[..]) as *mut [i32] as *const [UnsafeCell<i32>];
// SAFETY: `T` and `UnsafeCell<T>` have the same memory layout
let data = unsafe { &*data };

(taken from the last snippet of this section of the UnsafeCell documentation)

It works just fine with runtime indices. You don't have to split on a compile-time const boundary.
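
A minimal sketch of what that could look like, assuming the split point is known before the workers start. The function name `fill_halves` is hypothetical, and the fill values are placeholders for real work.

```rust
use std::thread;

/// Splits `buffer` at a runtime index and fills each half on its own thread.
fn fill_halves(buffer: &mut [i32], mid: usize) {
    // `mid` is an ordinary runtime value; `split_at_mut` panics if it
    // exceeds `buffer.len()`.
    let (left, right) = buffer.split_at_mut(mid);
    thread::scope(|scope| {
        // Each thread gets exclusive access to its own half, checked by the
        // borrow checker rather than by hand-written synchronization.
        scope.spawn(|| left.fill(1));
        scope.spawn(|| right.fill(2));
    });
}
```

Note that the split here still happens in one place, before the threads run; it does not cover workers negotiating ranges among themselves while they are running.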

In your OP you wanted a slice of cells given a mut slice. Now which one is it you want?

Miri does report an error (Rust Playground):

fn main() {
    let data = [42];
    unsafe {
        data.as_ptr().cast_mut().write(666);
    }
    println!("{}", data[0]);
}

error: Undefined Behavior: attempting a write access using <1775> at alloc891[0x0], but that tag only grants SharedReadOnly permission for this location
 --> src/main.rs:4:9
  |
4 |         data.as_ptr().cast_mut().write(666);
  |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |         |
  |         attempting a write access using <1775> at alloc891[0x0], but that tag only grants SharedReadOnly permission for this location
  |         this error occurs as part of an access at alloc891[0x0..0x4]
  |
  = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
  = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1775> was created by a SharedReadOnly retag at offsets [0x0..0x4]
 --> src/main.rs:4:9
  |
4 |         data.as_ptr().cast_mut().write(666);
  |         ^^^^^^^^^^^^^
  = note: BACKTRACE (of the first span):
  = note: inside `main` at src/main.rs:4:9: 4:44

oh, cast pointers instead of references, I should have considered this.

my question is still unanswered: is the condition "T and UnsafeCell<T> have the same memory layout" enough to ensure the cast (of slices, not single elements) is sound in general, or does it only hold for integers?

sorry I wasn't clear. in my case a buffer is shared between threads, and each thread will try to borrow a portion of the buffer as its working area concurrently. split_at_mut() requires a mut borrow of the whole buffer, but I can't send the mut reference to the whole buffer to each thread when they are spawned. so to use split_at_mut(), the split operation must be serialized in a "control" thread and each worker thread must request working memory in lockstep, which is not what I want.

I want to get &[UnsafeCell<i32>] from &mut [i32]. in the OP I was complaining that there's an API for Cell, but no equivalent for UnsafeCell. I'm well aware of Cell::from_mut() and Cell::as_slice_of_cells(). my wording was confusing, sorry about that. I'll try to edit it.

ah, I must be hallucinating. I swear I tried the snippet on the playground earlier. thanks for testing out.

Cell is basically just UnsafeCell without any of the unsafe methods, so I don't see why you couldn't do the same thing.

use std::cell::UnsafeCell;

#[repr(transparent)]
struct UnsafeCellWrapper<T>(UnsafeCell<T>);

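// SAFETY: UnsafeCell itself is not Sync; this impl is only sound as long as
// threads never access the same element at the same time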
unsafe impl<T: Sync> Sync for UnsafeCellWrapper<T> {}

fn main() {
    let mut data = [1i32, 2];
    let data_mut = data.as_mut_slice();
    // SAFETY: UnsafeCellWrapper has the same layout as
    // UnsafeCell, which has the same layout as i32
    let unsafe_cells = unsafe { &*(data_mut as *mut [i32] as *const [UnsafeCellWrapper<i32>]) };
    std::thread::scope(|scope| {
        scope.spawn(|| {
            let item = unsafe { &mut *unsafe_cells[0].0.get() };
            *item += 10;
        });
        scope.spawn(|| {
            let item = unsafe { &mut *unsafe_cells[1].0.get() };
            *item += 10;
        });
    });
    println!("{:?}", data);
}

Miri is fine with this. I was also wondering why UnsafeCell isn't Sync (you can't access the values without unsafe or exclusive access anyway), and it looks like it's just for caution, according to the nightly SyncUnsafeCell:

UnsafeCell doesn’t implement Sync, to prevent accidental mis-use.

Of course, you should have some actual machinery in the wrapper type to make this at least more ergonomic, and safe where possible.
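
As a rough idea of such machinery, here is a sketch of a wrapper that does the cast once and exposes a single unsafe method for borrowing a range. The names `SharedBuffer`, `slice_mut` and `fill_concurrently` are invented for this example, and it assumes the caller (for instance via atomic indices) guarantees that ranges borrowed at the same time never overlap.

```rust
use std::cell::UnsafeCell;
use std::ops::Range;
use std::slice;
use std::thread;

/// Hypothetical wrapper: a shared view of a buffer that hands out
/// mutable sub-slices.
struct SharedBuffer<'a> {
    cells: &'a [UnsafeCell<i32>],
}

// SAFETY: sound only because `slice_mut` is `unsafe` and makes the caller
// responsible for keeping concurrently borrowed ranges disjoint.
unsafe impl Sync for SharedBuffer<'_> {}

impl<'a> SharedBuffer<'a> {
    fn new(data: &'a mut [i32]) -> Self {
        // SAFETY: `UnsafeCell<i32>` has the same in-memory representation as
        // `i32`, and the exclusive borrow of `data` is held for `'a`.
        let cells = unsafe { &*(data as *mut [i32] as *const [UnsafeCell<i32>]) };
        SharedBuffer { cells }
    }

    /// SAFETY: no other live reference may overlap `range`.
    unsafe fn slice_mut(&self, range: Range<usize>) -> &mut [i32] {
        // Bounds-checked sub-slice; the pointer below is derived from a
        // reference that covers the whole requested range.
        let part = &self.cells[range];
        let first = UnsafeCell::raw_get(part.as_ptr());
        unsafe { slice::from_raw_parts_mut(first, part.len()) }
    }
}

/// Example use: two threads fill disjoint ranges of the same buffer.
fn fill_concurrently(data: &mut [i32], mid: usize) {
    let len = data.len();
    let shared = SharedBuffer::new(data);
    thread::scope(|scope| {
        scope.spawn(|| {
            // SAFETY: `0..mid` and `mid..len` do not overlap.
            let left = unsafe { shared.slice_mut(0..mid) };
            left.fill(1);
        });
        scope.spawn(|| {
            // SAFETY: see above.
            let right = unsafe { shared.slice_mut(mid..len) };
            right.fill(2);
        });
    });
}
```

The unsafety is confined to two places, the one-time cast and the range borrow. The disjointness rule is still not enforced by the type system, so it remains a documented contract.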

yeah, that's my gut feeling too. I've seen the implementation of Cell::as_slice_of_cells() so I would assume the same can be done with UnsafeCell

sure thing.

my first question in OP is still unclear to me. is the pointer returned from UnsafeCell::get() restricted to only the same cell, or can I derive pointers to adjacent UnsafeCells from it? in other words, is the following function type signature soundly implementable?

// miri gives UB when called with `range.len() > 1`
unsafe fn borrow_mut_bad(buffer: &[SyncUnsafeCell<i32>], range: Range<usize>) -> &mut [i32] {
    slice::from_raw_parts_mut(buffer[range.start].get(), range.len())
}
// this seems accepted by miri, but I don't know if it's really correct
// or miri just can't detect
unsafe fn borrow_mut_ok(buffer: &[SyncUnsafeCell<i32>], range: Range<usize>) -> &mut [i32] {
    let part = &buffer[range] as *const [SyncUnsafeCell<i32>];
    let part = part as *const SyncUnsafeCell<[i32]>;
    SyncUnsafeCell::raw_get(part).as_mut().unwrap()
}

The Miri results reflect my understanding (the first is unsound because you need to start with a reference covering all the memory aliased in your return value).

Another approach is to not leave the realm of raw pointers until absolutely necessary. (On mobile and can't easily check the available APIs for this just now.)

You can't use pointer arithmetic to access something outside of the allocation you've been given (where "allocation" just means a piece of memory you have access to, not necessarily on the heap).

For example, slice::from_raw_parts_mut(buffer[range.start].get(), range.len()) is unsound because buffer[range.start] only gives you a reference to one element, yet you are using it to return a slice pointing to the entire range.

The sound version would look something like this:

unsafe fn borrow_mut(buffer: &[SyncUnsafeCell<i32>], range: Range<usize>) -> &mut [i32] {
    let buffer_ptr = buffer.as_ptr() as *mut i32;
    slice::from_raw_parts_mut(buffer_ptr.add(range.start), range.len())
}

This is sound because I'm starting with a larger allocation (the entire buffer) and the return value is a sub-slice of that allocation.

yes, that's what I'm currently doing, I'm just studying the semantics about UnsafeCell, which I'm not familiar with yet.

oh, that's brilliant, I hadn't thought of this approach. intuitively I think I understand it; the documentation of slice::as_ptr() says (emphasis added):

The caller must also ensure that the memory the pointer (non-transitively) points to is never written to (except inside an UnsafeCell) using this pointer or any pointer derived from it.

just to be pedantic, I'm not sure about the cast buffer.as_ptr() as *mut i32. I think it is better to use UnsafeCell::raw_get(). I found this comment in the source code of UnsafeCell::raw_get():

    pub const fn raw_get(this: *const Self) -> *mut T {
        // We can just cast the pointer from `UnsafeCell<T>` to `T` because of
        // #[repr(transparent)]. This exploits std's special status, there is
        // no guarantee for user code that this will work in future versions of the compiler!
        this as *const T as *mut T
    }
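
For comparison, a sketch of the same idea on stable Rust that goes through `UnsafeCell::raw_get()` instead of a direct `as *mut i32` cast. It uses plain `UnsafeCell` rather than the nightly `SyncUnsafeCell`, and the name `borrow_mut_via_raw_get` is made up here.

```rust
use std::cell::UnsafeCell;
use std::ops::Range;
use std::slice;

/// SAFETY: no other live reference may overlap `range`.
unsafe fn borrow_mut_via_raw_get(
    buffer: &[UnsafeCell<i32>],
    range: Range<usize>,
) -> &mut [i32] {
    // Keep the resulting slice inside the buffer.
    assert!(range.start <= range.end && range.end <= buffer.len());
    // Pointer to the first cell, derived from the reference to the whole
    // buffer, so later offsets stay within what that reference covers.
    let first_cell: *const UnsafeCell<i32> = buffer.as_ptr();
    // `raw_get` turns the cell pointer into a pointer to its contents.
    let base: *mut i32 = UnsafeCell::raw_get(first_cell);
    unsafe { slice::from_raw_parts_mut(base.add(range.start), range.len()) }
}
```

The important part is unchanged from the cast version: the base pointer comes from the whole buffer, not from `buffer[range.start]`.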

Turns out there's actually some documentation (more in-depth and clear than my little blurb) about this concept in the strict provenance experiment section.

Provenance is the permission to access an allocation’s sandbox and has both a spatial and temporal component:

  • Spatial: A range of bytes that the pointer is allowed to access.
  • Temporal: The lifetime (of the allocation) that access to these bytes is tied to.

Spatial provenance makes sure you don’t go beyond your sandbox, while temporal provenance makes sure that you can’t “get lucky” after your permission to access some memory has been revoked (either through deallocations or borrows expiring).

Provenance is implicitly shared with all pointers transitively derived from The Original Pointer through operations like offset, borrowing, and pointer casts. Some operations may shrink the derived provenance, limiting how much memory it can access or how long it’s valid for (i.e. borrowing a subfield and subslicing).

Shrinking provenance cannot be undone: even if you “know” there is a larger allocation, you can’t derive a pointer with a larger provenance. Similarly, you cannot “recombine” two contiguous provenances back into one (i.e. with a fn merge(&[T], &[T]) -> &[T]).

A reference to a value always has provenance over exactly the memory that field occupies. A reference to a slice always has provenance over exactly the range that slice describes.
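
To tie the quoted rules back to the slices in this thread, here is a small sketch with two raw pointers into the same array, one derived from the whole array and one from a single element. The function name `provenance_demo` is hypothetical.

```rust
/// Writes through two raw pointers whose provenance covers different ranges.
fn provenance_demo() -> [i32; 4] {
    let mut data = [10, 20, 30, 40];

    // Derived from the whole array: may access all four elements.
    let whole: *mut i32 = data.as_mut_ptr();
    // SAFETY: index 3 is inside the range `whole` was derived from.
    unsafe { whole.add(3).write(41) };

    // Derived from a reference to one element: may access only that element.
    let one: *mut i32 = &mut data[1];
    // SAFETY: writes exactly the element `one` was derived from.
    unsafe { one.write(21) };
    // By contrast, `one.add(1).write(..)` would stay inside the array but
    // leave the range `one` is allowed to access. That is the same shape of
    // access Miri rejected for `borrow_mut_bad` earlier in the thread.

    data
}
```

Both writes in the sketch stay within the range their pointer was derived from. The commented-out case is the one the "shrinking provenance cannot be undone" rule forbids.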

thank you so much, this is the exact information I'm looking for.

thanks to everyone, I really appreciate your help.
