Awaiting future does not return

I have a unit test, run via #[tokio::test(flavor = "multi_thread", worker_threads = 2)], which deadlocks. I have tried to strip it down to a minimal reproducer, but have been unable to. The Mutex is from Tokio, and I'm on the latest version of the library. I have tracked the "offending" code down to the following function:

    pub async fn is_freed(&self) -> bool {
        let state_clone = self.state.clone();

        trace!("is_free: {:?} {:?}", Arc::as_ptr(&self.state), self);

        let ret = tokio::spawn(async move {
            trace!("BEFORE");
            let ret = {
                let lock = state_clone.flags.lock().await;
                lock.is_freed()
            };
            trace!("AFTER: {}", ret);
            ret
        })
        .await
        .unwrap();

        trace!("is_free returning: {:?}", ret);
        ret
    }


The code never gets past the .await on the JoinHandle. The first question someone will have is why I am spawning a task at all; the answer is to demonstrate more clearly that my .await never finishes. The same thing happens if I remove the spawn call and simply await inside is_freed. When calling the above, I only see the following in the console (i.e., I never see the is_free returning message):

    Apr 03 20:28:39.564 TRCE[dal/src/] is_free: 0x7faa900116b0 Page { state: PageState { flags: Mutex { data: 2 }, in_use_count: 0 }, memory: 0x7faa94c6a000 }
    Apr 03 20:28:39.564 TRCE[dal/src/] BEFORE
    Apr 03 20:28:39.564 TRCE[dal/src/] AFTER: false

What could cause an .await to never return? Nowhere else in my code do I cancel tasks, and the unit test never finishes. I do have another task in a fairly tight loop trying to grab the same lock; however, it uses try_lock() and keeps looping if it cannot acquire the lock. This looks like a deadlock, and I know that if I call .lock() on the same Mutex twice from the same task without dropping the first guard, Tokio will deadlock; however, by the time AFTER: false is printed, the MutexGuard should already have been dropped, so I don't think that is what is happening.

It sounds like you are blocking the thread. This can prevent other tasks from being executed. You should fix the code that is spinning on the lock and have it use an .await instead.

Changing the try_lock() to lock().await in the "spinning task" still results in a deadlock/stall. The "spinning task" looks like this:

  for idx in 0..len {
      let page = {
          let buffers = buffers_clone.read().await;
          buffers[idx].clone()
      };
      let mut flags = page.state.flags.lock().await;

      // let mut flags = match page.state.flags.try_lock() {
      //     Err(e) => continue,
      //     Ok(flags) => flags
      // };

      if flags.is_freed() || !flags.is_dirty() {
          continue;
      }

      trace!("Writing page to backing store {}: {:?} {:?}", idx, Arc::as_ptr(&page.state), page);

      if let Err(e) = writer.write_page(idx, slice).await {
          error!("Error writing page: {}", e);
          continue;
      }

      // mark the page as clean
      *flags = flags.unset_dirty();

      trace!("Done writing to backing store: {}: {:?} {:?}", idx, Arc::as_ptr(&page.state), page);
  }

Here buffers_clone is a Tokio RwLock around a Vec.

You might want to add a yield_now().await call above the flags.lock() call to be completely sure.

Anyway, I recommend you give tokio-console a try to debug this. Are there any tasks whose busy duration keeps going up while the poll count is constant?
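In case it helps anyone following along, a typical tokio-console setup looks roughly like the following (a sketch; the test name is illustrative). The program under test needs to call console_subscriber::init() at startup and be built with the tokio_unstable cfg:

```shell
# Install the console UI once:
cargo install --locked tokio-console

# Run the instrumented test with the unstable tokio cfg enabled:
RUSTFLAGS="--cfg tokio_unstable" cargo test my_deadlocking_test &

# Then attach the console UI from another terminal:
tokio-console
```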

There are only 2 tasks shown. For the top one, both the busy time and the poll count continue to increase, and its Location corresponds with the "spinning task". The second task appears to have only been polled once:

    Warn  ID  State  Name  Total▿     Busy        Idle        Polls  Target       Location  Fields
    │      1  ▶            11.9994s   11.9590s    40.4934ms   224    tokio::task            kind=task
    │      2  ⏹             1.1159ms  755.1520µs  360.7140µs  1      tokio::task            kind=task

If I did call lock().await on the same Mutex from the same task, causing a deadlock, how could I go about detecting that? How can I go about preventing it?

I continued to trace through the code and found that the "main task" was creating a stream whose poll_next implementation was not properly polling the inner future. I eliminated that Stream implementation by using stream::unfold to create the stream instead, and everything now works as expected. Thanks @alice for the pointer towards tokio-console. While it is clearly a powerful tool, without your hint of "Are there any tasks whose busy duration keeps going up while the poll count is constant", I wouldn't really have known what I was looking at.

