Using sync RwLocks in async code with yield_now()

At some point in the past few years I wrote something that uses an std::sync::RwLock in async code. In the pursuit of premature optimization and in fear of potential deadlocks, I wrote the following code:

use std::sync::{PoisonError, TryLockError, TryLockResult};

pub async fn yield_if_would_block<T>(
    r: impl Fn() -> TryLockResult<T>,
) -> Result<T, PoisonError<T>> {
    loop {
        match r() {
            Ok(val) => break Ok(val),
            Err(e) => match e {
                TryLockError::Poisoned(e) => break Err(e),
                TryLockError::WouldBlock => tokio::task::yield_now().await,
            },
        }
    }
}

And then I read and write from the lock like this:

    let read_lock = yield_if_would_block(|| lock.try_read()).await?;
    let mut write_lock = yield_if_would_block(|| lock.try_write()).await?;

The intent was that this would help prevent deadlocks: if another task is currently holding a write lock and this code tries to read, it'll yield_now, giving that other task a chance to finish and release the lock.

My understanding is that this is basically the same as what read does, except that where I yield, read just makes the thread sit and wait, and blocking in an async task is of course forbidden.

I'm still not holding these locks across awaits. If I understand correctly, with a single-threaded executor, that should be enough to ensure that it wouldn't ever block, since only one task will execute at a time and it won't start working on another task until it hits an await. However, in production this code uses a multi-threaded executor.

I'm hoping to get a better understanding of how these locks work in async code in general, but I'm curious about these specific questions:

  1. Is my code useful? Like is it worth worrying about this at all if I'm not taking it across an await?
  2. Is this achieving the desired goal of preventing deadlocks? Either in general or in the case that the lock was held over an await?
  3. Is this basically what the async versions of RwLock are doing?
    a. I'm told that the async locks come with a performance hit, but it doesn't seem like my code would be less performant than the sync try_read since I'm not doing any kind of additional checks before trying to read or write. So I assume the answer to 2. must be no and the async locks are doing something else to be actually safe - what is that?

Both the blocking methods and the async locks do this more efficiently than your implementation. In particular, when the lock is already taken they both suspend the computation (the thread, for the blocking locks, or the async task, for the async ones) until the current owner of the lock releases it. In comparison your implementation will try again and again until it succeeds, wasting CPU cycles in the process.
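For concreteness, here's a minimal sketch of that behavior (assuming tokio with the macros, rt-multi-thread, sync, and time features enabled): if a writer currently holds the lock, read().await parks the reading task rather than retrying in a loop.

use std::sync::Arc;
use std::time::Duration;
use tokio::sync::RwLock;

#[tokio::main]
async fn main() {
    let lock = Arc::new(RwLock::new(0u32));

    let writer = {
        let lock = Arc::clone(&lock);
        tokio::spawn(async move {
            let mut guard = lock.write().await;
            *guard += 1;
            // Pretend the write takes a while; the lock stays held here.
            tokio::time::sleep(Duration::from_millis(50)).await;
            // Dropping the guard wakes the waiters that can now acquire.
        })
    };

    // If the write lock is currently held, this does not spin: the task is
    // parked and only polled again once the write guard has been dropped.
    let value = *lock.read().await;
    println!("read {value}");

    writer.await.unwrap();
}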


This code doesn't prevent other async tasks from making progress, but it is an inefficient spin lock that blocks the current task. To fix the inefficiency, it needs to park the current task until the lock is released. An async RwLock is the obvious solution.

An alternative, if you need to keep the sync lock, is parking on a separate async primitive like a oneshot channel or Notify.
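As a rough sketch of that idea (not a vetted implementation): keep the std RwLock, but instead of yield_now, park the waiting task on a Notify that the writer signals when it releases. ParkingRwLock and write_with are made-up names, and this assumes guards are only held briefly and never across an await.

use std::sync::{RwLock, RwLockReadGuard, TryLockError};
use tokio::sync::Notify;

pub struct ParkingRwLock<T> {
    inner: RwLock<T>,
    released: Notify,
}

impl<T> ParkingRwLock<T> {
    pub async fn read(&self) -> RwLockReadGuard<'_, T> {
        loop {
            // Register for the "released" signal *before* trying the lock,
            // so a release that lands in between is not missed.
            let notified = self.released.notified();
            tokio::pin!(notified);
            notified.as_mut().enable();

            match self.inner.try_read() {
                Ok(guard) => return guard,
                Err(TryLockError::Poisoned(e)) => panic!("lock poisoned: {e}"),
                // Park this task until a writer signals a release.
                Err(TryLockError::WouldBlock) => notified.await,
            }
        }
    }

    pub fn write_with(&self, f: impl FnOnce(&mut T)) {
        {
            // Blocking acquire; only acceptable because readers hold their
            // guards briefly and never across an await.
            let mut guard = self.inner.write().expect("lock poisoned");
            f(&mut guard);
        }
        // Wake every task parked in `read` now that the write guard is gone.
        self.released.notify_waiters();
    }
}

notify_waiters wakes all parked readers at once, which is the right shape for a read-mostly lock; with a single waiter you could use notify_one instead.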

As for whether the code is useful: it might have its uses, but the nature of spin locks is a concern. The CPU time spent on that loop might be better spent doing actually useful work.

Deadlock occurs when there are cycles of mutual waits. In the simplest form, task A is waiting on task B, while task B is waiting on task A. The code as written does not avoid deadlock in the general case, because the cycle could be caused by reentrancy: task A takes the lock, then calls a function which ends up in this spin loop waiting for A itself to release the lock, which of course can never happen while A is stuck spinning.
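A minimal sketch of that reentrancy cycle, reusing the yield_if_would_block function from your original post (refresh and read_current are hypothetical names):

use std::sync::RwLock;

// `refresh` takes the write lock and then calls a helper that tries to read
// the same lock. The helper spins forever: the WouldBlock it keeps getting
// is caused by its own caller, and yielding can never make the caller drop
// the write guard.
async fn refresh(lock: &RwLock<u32>) {
    let mut guard = lock.write().unwrap();
    let current = read_current(lock).await; // never returns
    *guard = current + 1;
}

async fn read_current(lock: &RwLock<u32>) -> u32 {
    *yield_if_would_block(|| lock.try_read()).await.unwrap()
}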

There aren't any generalized solutions to guarantee deadlock does not occur [1]. Consider a channel where the sender is held but a message is never sent on it: the receiver will wait forever. Even more naively, a lock is acquired and a function is called that never returns. Any code that waits to acquire will deadlock. These are contrived examples, but it's enough to make the point. Deadlock is a runtime phenomenon, and I know of no static analysis that can detect it.
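As a contrived sketch of the channel case (using a tokio oneshot channel): the receiver below is parked forever, because the sender stays alive but nothing is ever sent on it.

use tokio::sync::oneshot;

#[tokio::main]
async fn main() {
    let (tx, rx) = oneshot::channel::<u32>();

    // Keep the sender alive forever without sending. (Merely dropping it
    // would at least wake the receiver with an error; forgetting it won't.)
    std::mem::forget(tx);

    // Waits for a message that will never arrive.
    let _ = rx.await;
    println!("unreachable");
}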

The short answer to your question is that no, it does not prevent deadlocks. But there is a lot of nuance to this question, and the application may not deadlock due to the way it is written, not because of this spin lock code.

As for whether this is basically what the async RwLocks are doing: not at all, unfortunately. Async locks keep a list of waiters and coordinate the release by scheduling the next waiter to acquire. There is no spinning in this model; the waiters are all parked (aka suspended, idle).

Yeah, there's a performance hit to everything. Storing waiters isn't free. But it's also not that expensive. It's certainly less expensive than loop { yield_now().await }! At least in terms of CPU resources. The async locks are more expensive in terms of memory -- probably.

I hope I've been able to help explain what async locks are doing differently. Others might have different insights.


  1. To my knowledge, at least. I'm unstudied on recent research in this area. ↩


Thanks so much for your responses! I dug into the source of the tokio RwLock and it's starting to make a lot more sense -- I see the list of Waiters, each with the task's Waker that'll be used to wake/unsuspend/unpark it.

What I can't find is how the tasks get parked in the first place. Is it just as soon as they return Poll::Pending? In that case, is it true that the executor could wake the task waiting for the lock if it got to the front of the executor's task queue even if the lock wasn't available?

In comparison your implementation will try again and again until it succeeds, wasting CPU cycles in the process.

From the docs for yield_now:

The current task will be re-added as a pending task at the back of the pending queue. Any other pending tasks will be scheduled.

So it should only spin if there are no other tasks in the queue, right? Those docs do say it's not actually guaranteed to start a different task, unfortunately, but assume there are plenty of other tasks that can make progress. Still not ideal, but if the executor did have something better to do, the hope is it would do that instead.

My hope with this implementation was that the happy path, where the lock is available, has no performance difference compared to just using the sync lock directly, and that if it did hit the WouldBlock case it'd eventually work, which seems better than just erroring. That's as opposed to using the async lock, which has a (small) performance overhead for successful reads.

Since the code isn't holding the lock over an await, I don't think I need the async one, and that's the case where the tokio docs recommend using the sync one. So basically I don't want to add a performance hit if this is never going to come up, but I also don't want to leave it with no recourse if it does come up because I've failed to reason correctly about what can cause a deadlock.

Then the other wrinkle is single-threaded vs multi-threaded executors. Is it still true that you should use the sync locks with a multi-threaded executor since there can be multiple tasks that all want the lock? If so, is there something better I could do to handle the WouldBlock case that doesn't hurt me in the (muuuuch more common) case that the lock is immediately available?

Here's a playground that has a simplified version of one of the ways I'm using this pattern. Basically, it only needs to write once the first time a request is made, when it establishes the connection, then all subsequent calls can read from that. (Effectively a OnceCell type of deal -- this one might be an XY problem but I'm using this code other places where I don't think I could replace the RwLock, and this is what I'm working on right now so it's front of mind)
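For reference, here's roughly what I mean by the OnceCell type of deal, sketched with tokio's sync OnceCell; Connection, connect, and Client are stand-ins for the real types in the playground.

use tokio::sync::OnceCell;

struct Connection;

async fn connect() -> Result<Connection, std::io::Error> {
    // Stand-in for actually establishing the connection.
    Ok(Connection)
}

struct Client {
    conn: OnceCell<Connection>,
}

impl Client {
    async fn connection(&self) -> Result<&Connection, std::io::Error> {
        // The first caller runs `connect`; concurrent callers are parked
        // until it finishes, and every later call is a cheap read.
        self.conn.get_or_try_init(connect).await
    }
}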

Yes, returning Pending tells the executor that the task is unable to make progress right now. When you do that, Tokio will wait for the waker to be notified, and then poll the task again.

The task is not in the task queue at all, so it won't get woken. When something calls wake on the task's waker, then it gets added to the queue.
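A toy sketch of that mechanism, showing the general shape of an acquire future rather than tokio's actual implementation (Gate and Wait are made-up names): awaiting wait() stores the task's Waker and returns Pending, and open() is the release path that calls wake() so the executor puts the task back on its queue.

use std::future::Future;
use std::pin::Pin;
use std::sync::Mutex;
use std::task::{Context, Poll, Waker};

// A toy one-shot "gate": awaiting `wait()` parks the task until `open()` is
// called.
#[derive(Default)]
pub struct Gate {
    state: Mutex<GateState>,
}

#[derive(Default)]
struct GateState {
    open: bool,
    waker: Option<Waker>,
}

impl Gate {
    pub fn open(&self) {
        let mut state = self.state.lock().unwrap();
        state.open = true;
        // Only at this point does the parked task go back on the run queue.
        if let Some(waker) = state.waker.take() {
            waker.wake();
        }
    }

    pub fn wait(&self) -> Wait<'_> {
        Wait { gate: self }
    }
}

pub struct Wait<'a> {
    gate: &'a Gate,
}

impl Future for Wait<'_> {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let mut state = self.gate.state.lock().unwrap();
        if state.open {
            Poll::Ready(())
        } else {
            // "Parking" is just this: remember who to wake, report Pending.
            // The executor won't poll this task again until wake() is called.
            state.waker = Some(cx.waker().clone());
            Poll::Pending
        }
    }
}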

It's still considered spinning. The task is going to get woken unnecessarily thousands or millions of times (or more) depending on how long the lock is held for.

The async lock does pretty much exactly the same in the successful case. There's no real perf cost to the successful case.

You can't always use the sync one. If you need to do the kind of yield_now loop you're doing, then you can't use the sync one and need the async one.


Thanks everyone for all the clarifications! I feel like I definitely have a better understanding of the internals of async code in general. I've been monitoring it and determined that this case never actually gets hit, but if it does in the future we'll likely swap to the async lock.