I'm using tokio, and I ran into a deadlock when trying to lock a synchronous (blocking) Mutex in async code.
I think I have figured out how the deadlock occurs, but I'm not entirely sure my analysis is correct. If there are any mistakes in it, please point them out.
Here's a simplified version of the code I'm working with:
use parking_lot::{Mutex, RwLock};
use std::path::Path;
use std::sync::Arc;
use tokio::task::JoinSet;

struct NumberPool {
    numbers: Vec<Arc<RwLock<i32>>>,
}

impl NumberPool {
    fn new() -> Self {
        NumberPool {
            numbers: Vec::new(),
        }
    }

    async fn get_num(&mut self, i: i32) -> Arc<RwLock<i32>> {
        let num = self.numbers.iter().find(|n| *n.read() == i);
        // if the number exists, return it
        if let Some(num) = num {
            return num.clone();
        }
        // if the number does not exist, create a new one
        // simulate a slow async operation to create a new number
        tokio::time::sleep(std::time::Duration::from_secs(1)).await;
        println!("new number: {i}");
        let num = Arc::new(RwLock::new(i));
        self.numbers.push(num.clone());
        num
    }

    pub fn save(&mut self) -> anyhow::Result<()> {
        let numbers_path = Path::new("numbers.json");
        let numbers: Vec<i32> = self.numbers.iter().map(|n| *n.read()).collect();
        let numbers_json = serde_json::to_string_pretty(&numbers)?;
        std::fs::write(numbers_path, numbers_json)?;
        Ok(())
    }
}

async fn async_task(number_pool: Arc<Mutex<NumberPool>>, i: i32) {
    let num = get_number_from_pool(number_pool.clone(), i % 3).await;
    if i == 0 {
        *num.write() = 100;
        let mut num_pool = number_pool.lock();
        num_pool.save().unwrap();
    }
}

async fn get_number_from_pool(num_pool: Arc<Mutex<NumberPool>>, i: i32) -> Arc<RwLock<i32>> {
    // the synchronous guard is held across the .await below
    let mut num_pool = num_pool.lock();
    num_pool.get_num(i).await
}

#[tokio::main(worker_threads = 8)]
async fn main() {
    let number_pool = Arc::new(Mutex::new(NumberPool::new()));
    let mut join_set = JoinSet::new();
    for i in 0..10 {
        join_set.spawn(async_task(number_pool.clone(), i));
    }
    join_set.join_all().await;
}
There are only 8 worker threads available, but 10 tasks are spawned. This leads to 8 tasks starting execution, while 2 tasks remain waiting for an available worker thread.
One of the running tasks locks the mutex and calls get_num, which calls tokio::time::sleep (simulating a slow async operation) and yields its worker thread while still holding the lock. One of the 2 waiting tasks then gets a chance to run on that thread and attempts to lock the mutex.
Now all 8 worker threads are blocked waiting for the mutex to be released by the sleeping task. Of the other 2 tasks, 1 is sleeping and 1 has not started yet.
When the sleep ends, the sleeping task needs to resume, but it cannot, because all worker threads are blocked waiting for the mutex.
Since the sleeping task cannot resume and release the lock, the 8 tasks waiting for the Mutex will be blocked forever: deadlock!
So the simplest solution I can think of is to use tokio::sync::Mutex. Is this the best solution?