Avoiding Arc when it is logically guaranteed the code is sound

Hello.

I am writing a high-performance, parallel memory manager and I have hit a wall.
Imagine the following case:

I have a struct with a bunch of data:

pub struct BVec<T>{
    data: SendPtr<T>,
    tail: SendPtr<T>,
    len: usize,
    grow_pow: usize,
    push_mutex: std::sync::Mutex<()>,
    pop_mutex: std::sync::Mutex<()>,
    is_growing: Arc<bool>,
    capacity: usize,
    alloc_strategy: fn(vec: &Self, len: usize, capacity: usize) -> bool,
    grow_thread: Option<thread::JoinHandle<()>>,
}

Just a little bit of context:
I am using a specialized concurrent memory allocation strategy that copies all the "safe" data from the old memory region to the new one without stalling the control thread. The user can still push data in the meantime, and the untouched parts of the old memory are unaffected.

The data that cannot be copied because of synchronization is eventually locked by the mutex and copied in a synchronized manner, after which the pointers are swapped.

This essentially creates only a very small synchronization window, compared to other copy methods.

I have a grow function in which I perform these memory operations, but I need to pass the

    data: SendPtr<T>,
    tail: SendPtr<T>,

which come from self. **Since the thread gets detached** in the function, the compiler thinks that the borrowed self reference could outlive the instance of the BVec.

This could be generally problematic, but I made sure that this can't happen and can guarantee that self in the thread is going to be valid for as long as the thread exists.

I know I could use Arc, but using Arc for all the internal fields:

    data: SendPtr<T>,
    tail: SendPtr<T>,
    len: usize,
    grow_pow: usize,
    capacity: usize,

Seems like a massive overkill to me.

I know what I am doing is inherently unsafe, but I am trying to squeeze as much performance as possible out of this and will be covering all the cases with proper testing.

Are there any tips on how I can avoid these problems in an idiomatic way, without having to resort to hard-core hacks like casting the references to usize (or anything similar)?

Other methods I thought about:

  1. Splitting the shared data into a sub structure and making it an Arc in the BVec structure
   pub struct BVecShared<T> {
       data: SendPtr<T>,
       tail: SendPtr<T>,
       len: usize,
       capacity: usize,
   }

   pub struct BVec<T> {
       shared: Arc<BVecShared<T>>,
       grow_pow: usize,
       push_mutex: std::sync::Mutex<()>,
       pop_mutex: std::sync::Mutex<()>,
       is_growing: Arc<AtomicBool>,
       alloc_strategy: fn(vec: &Self, len: usize, capacity: usize) -> bool,
       grow_thread: Option<thread::JoinHandle<()>>,
   }

This is the best solution I have in mind right now, but if I could just somehow tell the compiler that I know what I am doing so I can avoid the overhead, that would be awesome!

This is the error:

error[E0521]: borrowed data escapes outside of method
   --> src\block_vec.rs:91:22
    |
82  |       fn grow(&mut self){
    |               ---------
    |               |
    |               `self` is a reference that is only valid in the method body
    |               let's call the lifetime of this reference `'1`
...
91  |           let handle = thread::spawn(||{
    |  ______________________^
92  | |             let new_data = self.alloc_new();
93  | |             if self.data.data.is_null(){
94  | |                 self.data.data = new_data;
...   |
123 | |
124 | |         });
    | |          ^
    | |          |
    | |__________`self` escapes the method body here
    |            argument requires that `'1` must outlive `'static`

EDIT:

My current solution is this:

pub struct SharedData<T>{
    data: AtomicPtr<T>,
    tail: AtomicPtr<T>,
    is_growing: AtomicBool,
    capacity: AtomicUsize,
    grow_pow: AtomicUsize,
    push_mutex: Mutex<()>,
    pop_mutex: Mutex<()>,
    len: AtomicUsize,
    layout: Option<RwLock<Layout>>,
}
pub struct BVec<T>{
    shared_data: Arc<SharedData<T>>,
    alloc_strategy: fn(vec: &Self) -> bool,
}

So far this seems like the best option. I will have to benchmark it, though.
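
For reference, here is a minimal sketch of how this `Arc` layout lets `grow` detach a thread without borrowing `self`. It is not the real `BVec`: `Shared`, `finish_grow`, and `demo` are hypothetical names, and a `Mutex<Vec<u64>>` stands in for the raw `AtomicPtr` buffer so the example stays in safe Rust.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

// Hypothetical, simplified stand-in for `SharedData<T>`.
struct Shared {
    buf: Mutex<Vec<u64>>, // assumption: replaces the raw data/tail pointers
    capacity: AtomicUsize,
    is_growing: AtomicBool,
}

struct BVec {
    shared: Arc<Shared>,
    grow_thread: Option<JoinHandle<()>>,
}

impl BVec {
    fn new(capacity: usize) -> Self {
        BVec {
            shared: Arc::new(Shared {
                buf: Mutex::new(Vec::with_capacity(capacity)),
                capacity: AtomicUsize::new(capacity),
                is_growing: AtomicBool::new(false),
            }),
            grow_thread: None,
        }
    }

    // Starts a background grow and returns immediately. The closure owns
    // its own `Arc` clone, so it satisfies the `'static` bound of
    // `thread::spawn` without capturing `self`.
    fn grow(&mut self) {
        // Allow only one grow in flight at a time.
        if self.shared.is_growing.swap(true, Ordering::AcqRel) {
            return;
        }
        let shared = Arc::clone(&self.shared);
        self.grow_thread = Some(thread::spawn(move || {
            let new_cap = shared.capacity.load(Ordering::Acquire) * 2;
            let mut buf = shared.buf.lock().unwrap();
            // Reserve enough room so the Vec can hold `new_cap` elements.
            let additional = new_cap.saturating_sub(buf.len());
            buf.reserve_exact(additional);
            shared.capacity.store(new_cap, Ordering::Release);
            shared.is_growing.store(false, Ordering::Release);
        }));
    }

    // Joins the worker "elsewhere", long after `grow` has returned.
    fn finish_grow(&mut self) {
        if let Some(handle) = self.grow_thread.take() {
            handle.join().expect("grow thread panicked");
        }
    }
}

fn demo() -> usize {
    let mut v = BVec::new(4);
    v.grow(); // returns right away
    // The caller keeps working; this push only waits on the mutex
    // if the worker happens to hold it.
    v.shared.buf.lock().unwrap().push(1);
    v.finish_grow();
    v.shared.capacity.load(Ordering::Acquire)
}
```

The only per-grow cost added by `Arc` here is one reference-count increment in `Arc::clone`; field accesses go through a plain pointer dereference, so benchmarking should mostly measure the atomics and the mutex rather than the `Arc` itself.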

There's no reference stored in the posted struct. Where and what is the error, exactly?

Added the error.

It is in a function called grow.

Since the thread is not in a scoped context, it causes this problem.

The issue is that I don't want it to be scoped, as the function is supposed to detach internally and be joined elsewhere.

Using the shared_data Arc is the safest approach, and if you've got that working, you should probably stick to that.

When you want to cheat on lifetimes, the solution is to use pointers instead of references. Don't capture self in the closure, but let this = self as *mut Self; instead…

…Except, depending on your exact other code, that could easily result in UB. fn grow takes &mut self, so for as long as (pointers derived from) that reference are still used for accesses, no other pointers/references (that aren't derived from this one) may be used for accesses, even if all of the accesses are shared non-mut accesses[1].

So just use the Arc. It's less of a headache for everyone.

(I'm a member of T-opsem but not speaking on behalf of the team nor with any authority.)


  1. As stated, this is true under the Stacked Borrows model, which invalidates &mut provenance when a conflicting read access occurs. Under Tree Borrows, however, a conflicting read access only downgrades &mut provenance, and it remains valid for reads. But an important further caveat: conflicting access within the scope of a function call with the reference as an argument is always immediate UB. That's not the risk in this case, but it's important to clarify this.

Can you use scoped threads?

You won't be able to create a scope inside of your grow function, but maybe you could receive a scope as an argument created by the caller.
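
A rough sketch of what that could look like, assuming Rust 1.63+ for `std::thread::scope`. The `BVec` here is a hypothetical stand-in with a plain `Vec<u64>` instead of the raw pointers, and the `additional` parameter and `demo` function are made up for illustration.

```rust
use std::thread::{self, Scope, ScopedJoinHandle};

// Hypothetical, simplified stand-in for the real `BVec<T>`.
struct BVec {
    data: Vec<u64>,
}

impl BVec {
    // The caller creates the scope and passes it in. Because the thread
    // cannot outlive `'scope`, the closure may borrow `self` directly and
    // no `'static` bound is involved.
    fn grow<'scope>(
        &'scope mut self,
        scope: &'scope Scope<'scope, '_>,
        additional: usize,
    ) -> ScopedJoinHandle<'scope, ()> {
        scope.spawn(move || {
            // Stand-in for the real allocate-and-copy work.
            self.data.reserve(additional);
        })
    }
}

fn demo() -> usize {
    let mut v = BVec { data: Vec::new() };
    thread::scope(|s| {
        // `grow` returns right after spawning; the handle can be joined
        // later, or left alone, since the scope joins all of its threads
        // before `thread::scope` returns.
        let handle = v.grow(s, 1024);
        handle.join().expect("grow thread panicked");
    });
    // The mutable borrow has ended, so `v` is usable again.
    v.data.capacity()
}
```

One trade-off to be aware of: with `&'scope mut self`, `v` stays mutably borrowed for the rest of the scope, so the caller cannot push to it from inside the scope while the grow runs. Keeping pushes concurrent would need `&self` plus interior mutability (atomics or a mutex), which ends up close to the shared-data design anyway.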
