Vec cell sound during allocation?

I am trying to use a Vec as part of a shared state struct that will only be used within a single thread (i.e. it doesn't have to be Sync).

Normally, I understand that the recommendation would be to use RefCell<Vec<T>>. However, I would strongly prefer to avoid this if possible, since the vector is accessed very frequently on a hot path, and RefCell does impose a performance penalty. I expect a lot of reads, each of which requires immutably borrowing the RefCell, which is expensive. (And I cannot hold on to a std::cell::Ref between operations in my use case.)

I did consider using Cell<Vec<T>>, but it doesn't seem to optimize perfectly. For example, even when disabling bounds checks using get_unchecked() (Godbolt):

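use std::cell::Cell;

/// # Safety
/// `i` must be less than the length of the vector stored in `v`.
pub unsafe fn cell_vec_get_unchecked(v: &Cell<Vec<i32>>, i: usize) -> i32 {
    // Move the Vec out, leaving an empty Vec in the cell.
    let vec = v.take();
    let res = unsafe { *vec.get_unchecked(i) };
    // Move the Vec back in.
    v.set(vec);
    res
}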

I am considering making my own UnsafeCell<Vec<T>> wrapper which enforces safety simply by being !Sync and preventing reentrancy; i.e., it behaves similarly to Cell<Vec<T>>, except we skip the steps of v.take() and v.set(vec). However, pushing a value can potentially reallocate, and I am wondering whether the wrapper is still sound in that case.

I believe the only way this could be unsound is if the allocator itself somehow triggers a recursive call on my shared state while it is in the middle of a push operation. How reasonable is it to assume that this will not happen? GlobalAlloc is unsafe to implement, and its documentation says that unwinding out of an allocator is undefined behavior, which gives me some confidence. However, it doesn't explicitly spell out that, for example, allocators can't run user-specified callback code during allocation; in my case, this could trigger UB if the user registers a callback which runs during a push operation and accesses the shared vector.

Yes, if the shared struct can in any way, including through allocation or panic, call code from your crate's dependents, then you have to consider reentrancy. So, your struct needs some state to signal "I'm in the middle of an operation"; that can be one of many things:

  • a RefCell,
  • a Cell temporarily containing None,
  • your own flag for the whole struct rather than a field, serving to guard your UnsafeCell usage,
  • or maybe even a thread-local variable shared among all of your structs,

but you can't do without having some kind of flag unless you can statically prevent the reentrancy by not allocating or panicking (which is how Cell works by itself: none of its operations allocate or panic while manipulating the value).
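
As a rough illustration of the "own flag guarding your UnsafeCell usage" option, here is a minimal sketch. The names GuardedVec, busy, items and with are hypothetical, not from any library, and get assumes T: Copy so that reading an element cannot run user code. It panics on reentrant use, much like RefCell would.

```rust
use std::cell::{Cell, UnsafeCell};

/// Hypothetical wrapper: a Vec behind UnsafeCell, guarded by a "busy" flag.
/// It is !Sync automatically because Cell and UnsafeCell are !Sync.
pub struct GuardedVec<T> {
    busy: Cell<bool>,
    items: UnsafeCell<Vec<T>>,
}

impl<T> GuardedVec<T> {
    pub fn new() -> Self {
        Self {
            busy: Cell::new(false),
            items: UnsafeCell::new(Vec::new()),
        }
    }

    /// Runs `f` with exclusive access to the Vec; panics on reentrant use.
    /// Kept private so that only the operations below can reach the Vec.
    fn with<R>(&self, f: impl FnOnce(&mut Vec<T>) -> R) -> R {
        // `replace` returns the old value: `true` means an operation is
        // already in progress further up the call stack.
        assert!(!self.busy.replace(true), "reentrant access to GuardedVec");

        // Clear the flag when we leave, even if `f` unwinds.
        struct Reset<'a>(&'a Cell<bool>);
        impl Drop for Reset<'_> {
            fn drop(&mut self) {
                self.0.set(false);
            }
        }
        let _reset = Reset(&self.busy);

        // SAFETY: the flag ensures no other reference into the Vec is live
        // on this thread, and the type cannot be shared across threads.
        f(unsafe { &mut *self.items.get() })
    }

    /// May reallocate; a reentrant call made during the allocation hits the assert.
    pub fn push(&self, value: T) {
        self.with(|v| v.push(value));
    }

    /// Returns a copy of element `i`, or None if out of bounds.
    pub fn get(&self, i: usize) -> Option<T>
    where
        T: Copy,
    {
        self.with(|v| v.get(i).copied())
    }
}
```

Note that this still pays for a flag check on every access, so whether it beats RefCell in practice is something to measure rather than assume.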

You can turn a &mut [i32] into a &[Cell<i32>] in what should be a cheap way. Maybe doing that between pushes would work.

use std::cell::Cell;

fn main() {
    let mut a = vec![1, 2, 3];

    let b = Cell::from_mut(a.as_mut_slice()).as_slice_of_cells();

    println!("{}", b[0].get());

    b[0].set(4);

    println!("{}", b[0].get());
}

Have you measured it? The main performance penalty RefCell has over Cell is that it has a runtime branch and requires additional space to store the counter. Since you want RefCell<Vec<T>>, not Vec<RefCell<T>>, that additional size is unlikely to matter. And the branch is an unsynchronized local condition, which is not that hard for compilers to optimize out when possible.
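
For a baseline to measure against, the RefCell version with short-lived borrows might look like the sketch below. The function names refcell_vec_get and refcell_vec_push are made up for illustration; each borrow is dropped at the end of its statement, so no Ref is held between operations.

```rust
use std::cell::RefCell;

/// Bounds-checked read through a temporary shared borrow.
pub fn refcell_vec_get(v: &RefCell<Vec<i32>>, i: usize) -> Option<i32> {
    // The Ref guard is dropped at the end of this expression.
    v.borrow().get(i).copied()
}

/// Push through a temporary exclusive borrow.
/// A reentrant borrow during the push panics instead of causing UB.
pub fn refcell_vec_push(v: &RefCell<Vec<i32>>, x: i32) {
    v.borrow_mut().push(x);
}
```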

Not that I recommend this, but since it looks like you are okay using unsafe anyways, you may want to look into Cell::as_ptr. It optimizes how you'd expect, and unlike UnsafeCell you still have access to a safe API for when you do need to mutate the shared state.
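
A minimal sketch of what that could look like, assuming the Vec holds i32 as in the question. The function names cell_vec_get_via_ptr and cell_vec_push_safe are hypothetical. The read goes through the raw pointer, while the push keeps using the safe take/set API.

```rust
use std::cell::Cell;

/// Reads element `i` without moving the Vec out of the Cell.
///
/// # Safety
/// `i` must be less than the length of the vector stored in `v`.
pub unsafe fn cell_vec_get_via_ptr(v: &Cell<Vec<i32>>, i: usize) -> i32 {
    // SAFETY: Cell is !Sync, and nothing that runs while this shared
    // reference is live can mutate the Vec (reading an i32 runs no user code).
    let vec: &Vec<i32> = unsafe { &*v.as_ptr() };
    // SAFETY: the caller guarantees that `i` is in bounds.
    unsafe { *vec.get_unchecked(i) }
}

/// Mutation through the safe Cell API: while `push` may be calling the
/// allocator, the Cell holds an empty Vec, so reentrant reads stay sound.
pub fn cell_vec_push_safe(v: &Cell<Vec<i32>>, x: i32) {
    let mut vec = v.take();
    vec.push(x);
    v.set(vec);
}
```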

You could always check whether push is going to reallocate and only do the swap dance if so. That way, you only pay the cost when you touch the allocator.
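
Something along these lines, as a sketch only. The name push_checked is made up, and the element type is fixed to i32 so that moving a value into the Vec runs no user code. It relies on the documented Vec guarantee that push never reallocates when the reported capacity is sufficient.

```rust
use std::cell::Cell;

/// Pushes `x`, doing the take/set swap only when the push has to reallocate.
pub fn push_checked(v: &Cell<Vec<i32>>, x: i32) {
    // SAFETY: only reads len and capacity through a short-lived shared
    // reference; no user code can run while it is live.
    let has_room = unsafe {
        let vec: &Vec<i32> = &*v.as_ptr();
        vec.len() < vec.capacity()
    };

    if has_room {
        // Fast path: there is spare capacity, so push will not touch the allocator.
        // SAFETY: no other reference to the Vec is live, and Cell is !Sync.
        let vec: &mut Vec<i32> = unsafe { &mut *v.as_ptr() };
        vec.push(x);
    } else {
        // Slow path: the Cell holds an empty Vec while the allocator runs.
        let mut vec = v.take();
        vec.push(x);
        v.set(vec);
    }
}
```

One caveat with the slow path: anything that reenters while the Vec is taken out sees an empty Vec, and whatever it pushes there is overwritten by the final set. That is memory-safe, but possibly surprising.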
