Auto optimizing mutex?

Today it is possible to call mutex.lock()?.deref_mut() to get a mutable reference to the contents of the mutex while holding the lock.
It is also possible to call mutex.get_mut()? to get a mutable reference to the contents without aquiring the lock, if the borrow checker can prove no other code could be doing this at the same time.

My question is: Is it possible to create a method or macro that would allow a single syntax to do either of these depending on whether the caller can mutably borrow the mutex? This way the lock is optimized out when it is not needed without the caller thinking about it.

If this is not possible today, is there some feature that could enable this?

Calling Mutex::lock on a &mut Mutex should be very cheap because taking a lock that’s not held and not waited for should be cheap.

I don’t think there’s a way to let the compiler infer whether to call one of two methods that only differ in the mutability of a reference.

1 Like

In theory I think that is not possible since operations will be across threads. If all the operations are within one thread then you don’t need mutex anyways.

If you don’t want to wait then use try lock with short timeout for acquiring lock.

Oh but it is https://doc.rust-lang.org/std/sync/struct.Mutex.html#method.get_mut

1 Like

He means making the compiler pick between lock and get_mut depending on context isn’t possible.

1 Like

You can use Arc to have conditional exclusivity, like:

use std::sync::{Arc, Mutex};

fn with_mut<T, F, R>(x: &mut Arc<Mutex<T>>, f: F) -> Option<R>
where
    F: FnOnce(&mut T) -> R,
{
    match Arc::get_mut(x) {
        Some(mutex) => Some(f(mutex.get_mut().ok()?)),
        None => Some(f(&mut *x.lock().ok()?)),
    }
}

I used a callback to avoid borrowing issues with the guard, and I threw away the errors because the poison types are different. Maybe Either could be used to return the differences, but then the caller has to deal with that instead.

1 Like

I got close, even hiding the Either with impl DerefMut:

use either::Either::{Left, Right};
use std::ops::DerefMut;
use std::sync::{Arc, LockResult, Mutex, PoisonError};

fn get_mut<'a, T>(x: &'a mut Arc<Mutex<T>>) -> LockResult<impl DerefMut<Target = T> + 'a> {
    match Arc::get_mut(x) {
        Some(mutex) => match mutex.get_mut() {
            Ok(ref_mut) => Ok(Left(ref_mut)),
            Err(poison) => Err(PoisonError::new(Left(poison.into_inner()))),
        },
        None => match x.lock() {
            Ok(guard) => Ok(Right(guard)),
            Err(poison) => Err(PoisonError::new(Right(poison.into_inner()))),
        },
    }
}

But NLL isn’t yet strong enough to make the lifetimes work on the different branches:

error[E0502]: cannot borrow `*x` as immutable because it is also borrowed as mutable
  --> src/lib.rs:11:23
   |
5  | fn get_mut<'a, T>(x: &'a mut Arc<Mutex<T>>) -> LockResult<impl DerefMut<Target = T> + 'a> {
   |            -- lifetime `'a` defined here
6  |     match Arc::get_mut(x) {
   |                        - mutable borrow occurs here
7  |         Some(mutex) => match mutex.get_mut() {
8  |             Ok(ref_mut) => Ok(Left(ref_mut)),
   |                            ----------------- returning this value requires that `*x` is borrowed for `'a`
...
11 |         None => match x.lock() {
   |                       ^ immutable borrow occurs here

That said, there’s still the fact that an uncontended lock should be fast anyway.

I think we can hack around the lifetime errors here by calling get_mut twice - once to check if it’s possible, then again to get the value. Copying your example,

use either::Either::{Left, Right};
use std::ops::DerefMut;
use std::sync::{Arc, LockResult, Mutex, PoisonError};

fn get_mut<'a, T>(x: &'a mut Arc<Mutex<T>>) -> LockResult<impl DerefMut<Target = T> + 'a> {
    if Arc::get_mut(x).is_some() {
        match Arc::get_mut(x).unwrap().get_mut() {
            Ok(ref_mut) => Ok(Left(ref_mut)),
            Err(poison) => Err(PoisonError::new(Left(poison.into_inner()))),
        }
    } else {
        match x.lock() {
            Ok(guard) => Ok(Right(guard)),
            Err(poison) => Err(PoisonError::new(Right(poison.into_inner()))),
        }
    }
}

It’s not ideal, but it should work - if the arc is unique, then it’s not going to become un-unique while we hold a mutable borrow. The four atomic compares this involves probably still faster than locking a contention-free mutex?

Internally Arc uses relaxed consistency so it can cause a race condition if in another thread content’s state has changed.

There is no(longer*) race condition in safe use of Arc. (* a bug was fixed.)
I’m not sure that the above functions would give any optimization, but too lazy to profile.

As a minor suggestion, always do pattern match instead of if followed by unwrap.

match Arc::get_mut(x) {
  Some(mutex) => match mutex.get_mut() { .. },
  None => match x.lock() { .. }
}

There is no bug in Arc, but using Arc as performance hack for data synchronization will have race condition/bug.

Either there is a very serious vulnerability with Arc::get_mut, or there is no issue at all:

  1. If Arc::get_mut succeeds, it gives back a &mut _ which means that it guarantees a unique access to the pointee, no matter the situation

    • if there is any race condition whatsoever, then Arc::get_mut is unsound and needs to be fixed.
      I am skeptical that it is the case (Relaxed-ordered atomic operations in the ::std are only used after careful code auditing has “proven” that it is sound to do so), but it is not impossible. So if you have spotted a problem, please submit an issue.
  2. Given a guaranteed unique access on a RwLock / Mutex, etc., no interior mutability is required, and thus no synchronization mechanic is required either (in other words, no locks).


Back to the topic, get_mut does require a check on an AtomicUsize, which is kind of what an uncontested lock does anyway.

And the case where the lock is uncontested but accessible from two places gets slower in this case.

That’s why I don’t expect the suggested solution to improve performance on average.

3 Likes

I was talking about how Arc increments the count. But Arc::get_mut locks the access so it will be unique access but as said it doesn’t give any performance improvement. Plain mutex is simpler and perhaps faster but that needs testing to know.

Ah, you’re right, that does compile. Feels wrong to need that though – here’s hoping for Polonius.

The fast path on glibc’s futex implementation is just two atomic operations – a compare-exchange to lock and an exchange to unlock. It’s similar in parking_lot, and probably a little faster just for having less memory indirection.

In general, I agree with you. But in this particular case the code you quoted doesn’t compile - the compiler thinks that x is still mutably borrowed in the None branch. @cuviper’s original code was like yours, and I modified it to have the is_some() followed by unwrap() instead.

I guess this isn’t worth it, then. At least with a parking_lot::Mutex - maybe std::sync::Mutex would still be slower?

This is the check I was looking at Arc::get_mut doing: https://doc.rust-lang.org/src/alloc/sync.rs.html#917-940