tokio::sync::Mutex lock().await vs try_lock()

Hi community!

I have some code that uses tokio::sync::Mutex, and in the first iteration I used lock().await everywhere. But as the codebase grew, I began to get deadlocks, so I decided to switch to try_lock() and rework the logic to avoid them. I also use the same pattern with tokio::sync::RwLock.

Could you please advise me on, or explain, what potential problems I may run into with this approach?
My first thought is that I need a robust retry mechanism, but I don't have a good solution for that yet either.

Thanks in advance :hugs:

Generally, the way to avoid deadlocks with mutexes is to utilize lock ordering. If you have two mutexes A and B, and you need to lock both at the same time, then you should decide on an ordering and make sure that you always call A.lock() first before you call B.lock(). If you always lock multiple locks in the same order, there will be no deadlocks.

Another approach is to simply not lock multiple things at the same time. For example, replace this code:

let guard_a = a.lock().await;
guard_a.use_a().await;
let guard_b = b.lock().await;
guard_b.use_b().await;

with this:

let guard_a = a.lock().await;
guard_a.use_a().await;
drop(guard_a);
let guard_b = b.lock().await;
guard_b.use_b().await;

By not locking both at the same time, you don't have to worry about the order.


@alice thank you for quick response!

I'd like to ask: is using scopes instead of an explicit drop effectively the same approach? For example, if I split these small operations into small functions:

Your example:

let guard_a = a.lock().await;
guard_a.use_a().await;
drop(guard_a);
let guard_b = b.lock().await;
guard_b.use_b().await;

With scopes of functions:

async fn change_a(a: Arc<Mutex<A>>) -> Result<()> {
    let guard_a = a.lock().await;
    guard_a.use_a().await;

    Ok(())
}

async fn change_b(b: Arc<Mutex<B>>) -> Result<()> {
    let guard_b = b.lock().await;
    guard_b.use_b().await;

    Ok(())
}

async fn together(a: Arc<Mutex<A>>, b: Arc<Mutex<B>>) -> Result<()> {
    change_a(a).await?;
    change_b(b).await?;

    Ok(())
}

Or even better example with shared state:

struct UpdatePayload {
    id: u64,
    new_field: String,
}

struct ModifyPayload {
    id: u64,
    new_other_field: String,
}

enum ChangeEvent {
    Update(UpdatePayload),
    Modify(ModifyPayload),
}

struct StructWithState {
    state: Arc<Mutex<State>>
}

impl StructWithState {
    pub async fn on_change(&self, change: ChangeEvent) -> Result<()> {
        match change {
            ChangeEvent::Update(update) => self.on_update_event(update).await?,
            ChangeEvent::Modify(modify) => self.on_modify_event(modify).await?,
        }

        Ok(())
    }

    async fn on_update_event(&self, update: UpdatePayload) -> Result<()> {
        let mut state = self.state.lock().await;
        state.update(update)?;

        Ok(())
    }

    async fn on_modify_event(&self, modify: ModifyPayload) -> Result<()> {
        let mut state = self.state.lock().await;
        state.modify(modify)?;

        Ok(())
    }
}

And somewhere outside we call this public on_change().await function for a stream of events. When I use lock().await there, I get some deadlocks.

Using scopes is good too, of course!

With regards to on_change(), you should make sure that you never hold the lock while calling on_change(). Holding it across that call is certainly not going to work. In fact, you probably should not make the mutex public in the first place.

Hi, interesting topic. Can you please explain your last response in more detail?


As I understand how tokio works in general, it creates separate "tasks" (state machines) for each await call. In our case, on each call of on_change().await, tokio creates such a task. If one more change arrives before the previous operation finishes, and on_change().await runs on another thread, can we get a deadlock?

As @alice said, the key to avoiding deadlocks is lock ordering.

Unfortunately, the Mutex API in the standard library is not capable of any sort of ordering guarantee. However, there do exist different ways that can help define a locking order.

I remember one powerful technique presented in the RustConf 2024 talk "Safety in an Unsafe World", which can prevent deadlocks at compile time without any runtime overhead, but it requires a statically defined total order among all the possible locks, which might not be possible in all use cases.

I have also stumbled upon the happylock crate, which, if I understand it correctly, combines multiple mutexes into groups and locks them atomically at runtime in an (unspecified but consistent) sorted order (I believe the order is based on their memory addresses, pretty much the same as C++'s deadlock-avoiding locking API).

The described techniques are based on blocking mutexes, but I believe the same principle also applies to async mutexes; I just don't know of any existing libraries in the async ecosystem.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.