Whenever a program must acquire more than one lock at a time, it must acquire them in the same order every time. Otherwise the program is vulnerable to deadlocks like this:
Thread A acquires mutex X
Thread B acquires mutex Y
Thread A blocks trying to acquire mutex Y
Thread B blocks trying to acquire mutex X
This is known as the "lock-order reversal problem". What is the best way to avoid it in Rust?
The FreeBSD kernel has a handy facility known as witness(4) to help detect these LORs. Basically, when the kernel is built in debug mode, witness records the order of every lock acquisition. If it ever detects that a pair of locks were acquired in both orders, it prints a warning, along with the stack trace of the offending thread.
But I can't find anything similar for Rust. It seems like such a facility would be invaluable. It could be a third-party library that serves as a compile-time drop-in replacement for std::sync::Mutex, for example. But there is no such library that I can find. Is this an unsolved problem, or does Rust have some other solution I'm not aware of?
Hm, those options aren't very good. rust-lock-bug-detector might have promise, but right now it can't even build with a current compiler. ThreadSanitizer, OTOH, might be too good: it crashes whenever it tries to build certain proc-macro crates, and for the only crate where I can get it to compile, it reports data races all over the standard library, chiefly in std::thread::spawn.
An obvious sort of solution is to put Y inside X. Rust's Mutex is a container and by creating a Mutex<(X, Mutex<Y>)> (for example) you guarantee that the inner mutex cannot be locked first and cannot be unlocked last.
This has the potential drawback that you cannot lock Y by itself (without locking X), even though doing so does not lead to deadlock if you don't lock X. There might be something clever you can do with RwLock here but that's just a stray thought. In general I think it will depend on why you need two mutexes in the first place.
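A minimal sketch of the nesting idea (the payload types `X` and `Y` here are placeholders):

```rust
use std::sync::Mutex;

// Placeholder payload types for illustration.
struct X { count: u64 }
struct Y { total: u64 }

// The only way to reach the inner Mutex<Y> is through the outer guard,
// so Y can never be locked before X.
fn bump(pair: &Mutex<(X, Mutex<Y>)>) -> u64 {
    let mut outer = pair.lock().unwrap();    // lock X (and gain access to Y's mutex)
    outer.0.count += 1;
    let mut inner = outer.1.lock().unwrap(); // now lock Y
    inner.total += 1;
    inner.total
    // `inner` drops before `outer`: Y is always unlocked first, X last.
}
```

Since the inner guard borrows the outer one, the borrow checker also forces the unlock order.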
(There are various ways to avoid the duplicated lock method)
With these tokens, you are able to lock either one or both mutexes, but you cannot lock them in a way that would deadlock. (This assumes that you can somehow prevent a thread from obtaining more than one token.)
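One way to sketch the token idea (all names here are invented, not from any existing library): each lock method borrows the token mutably for as long as the returned guard lives, so a thread holding any guard cannot take a second lock except through the one method that fixes the order.

```rust
use std::sync::{Mutex, MutexGuard};

struct Locks {
    x: Mutex<i32>,
    y: Mutex<i32>,
}

// Exactly one Token per thread (enforcing that is the hard part, as
// noted above). Locking borrows it mutably for the guard's lifetime.
struct Token(());

impl Locks {
    fn lock_x<'a>(&'a self, _t: &'a mut Token) -> MutexGuard<'a, i32> {
        self.x.lock().unwrap()
    }

    fn lock_y<'a>(&'a self, _t: &'a mut Token) -> MutexGuard<'a, i32> {
        self.y.lock().unwrap()
    }

    // The only way to hold both guards at once, and it fixes the
    // order: X first, then Y.
    fn lock_both<'a>(
        &'a self,
        _t: &'a mut Token,
    ) -> (MutexGuard<'a, i32>, MutexGuard<'a, i32>) {
        let gx = self.x.lock().unwrap();
        let gy = self.y.lock().unwrap();
        (gx, gy)
    }
}
```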
I think thread_local storage is key. But how would you assign the ID? Statically? That requires extra effort on the programmer's part. An easier tool would build the relationships automatically. I think it could build a DAG as the program runs, and every time a lock is acquired it could check whether this would create a cycle in the DAG.
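A runtime sketch of what such a checker could look like (everything here is hypothetical; locks are identified by an arbitrary `usize`, e.g. their address, and `on_acquire` would be called by a wrapper around `Mutex::lock`):

```rust
use std::cell::RefCell;
use std::collections::{HashMap, HashSet};
use std::sync::{LazyLock, Mutex};

// Global "acquired-before" graph: edge a -> b means lock `a` was held
// while lock `b` was acquired.
static ORDER: LazyLock<Mutex<HashMap<usize, HashSet<usize>>>> =
    LazyLock::new(Default::default);

thread_local! {
    // Locks currently held by this thread, in acquisition order.
    static HELD: RefCell<Vec<usize>> = RefCell::new(Vec::new());
}

// Does `from` reach `to` in the recorded graph? (Depth-first search.)
fn reaches(g: &HashMap<usize, HashSet<usize>>, from: usize, to: usize) -> bool {
    let mut stack = vec![from];
    let mut seen = HashSet::new();
    while let Some(n) = stack.pop() {
        if n == to {
            return true;
        }
        if seen.insert(n) {
            if let Some(next) = g.get(&n) {
                stack.extend(next.iter().copied());
            }
        }
    }
    false
}

/// Record that this thread is acquiring `lock`. Returns true if the
/// acquisition reverses an order seen before (a potential deadlock).
fn on_acquire(lock: usize) -> bool {
    HELD.with(|held| {
        let mut held = held.borrow_mut();
        let mut g = ORDER.lock().unwrap();
        let mut reversal = false;
        for &h in held.iter() {
            // Adding edge h -> lock closes a cycle iff lock already reaches h.
            if reaches(&g, lock, h) {
                reversal = true; // a real tool would print both stack traces here
            }
            g.entry(h).or_default().insert(lock);
        }
        held.push(lock);
        reversal
    })
}

fn on_release(lock: usize) {
    HELD.with(|held| held.borrow_mut().retain(|&l| l != lock));
}
```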
You may be interested in Compile time lock ordering approach to deadlocks where I did some work developing a compile-time solution to this. I subsequently changed jobs, so never got a chance to try fully implementing it, but I still believe it has some potential.
That said, the consensus of the rust ecosystem seems to be mostly to rely on other sorts of concurrency primitives rather than multiple mutexes, so if that's a possibility in your domain, you may find more existing support with those approaches.
Sometimes you can explicitly sort the locks. This typically comes up in database updating, where you need to lock multiple rows, perform a transaction that affects those rows, and release the locks. Other transactions may be going on at the same time. If all updating threads sort the rows to be locked by some immutable ID, you can avoid a lock order conflict.
This is the totally run-time case, as opposed to the class of locking problems for which compile-time analysis would help.
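A sketch of the sort-then-lock pattern (the `Row` type and its fields are made up for illustration):

```rust
use std::sync::{Mutex, MutexGuard};

// A "row" with an immutable id and some lockable state.
struct Row {
    id: u64,
    balance: Mutex<i64>,
}

// Lock any set of rows deadlock-free by always locking in ascending
// id order; the guards are returned so the caller can run the
// transaction and release them all at once.
fn lock_rows<'a>(rows: &mut Vec<&'a Row>) -> Vec<MutexGuard<'a, i64>> {
    rows.sort_by_key(|r| r.id); // canonical order prevents reversal
    rows.iter().map(|r| r.balance.lock().unwrap()).collect()
}
```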
A simple approach I just thought of (so it may be nonsensical or may not work):
(1) Assume that each thread has a priority, and no two threads have the same priority.
(2) Whenever a thread acquires a lock, the priority of the thread that has the lock is recorded in the lock.
(3) When a thread wants to acquire a lock that is already locked, it examines the priority of the current holder, LP.
(4) If LP is lower than its own priority, it simply waits for the lock to be released.
(5) If LP is higher than its own priority, it releases all the locks it holds, waits for the blocking lock to be released, and then retries. It is the "victim" of a possible deadlock.
One way of assigning a priority is to use the negative of the time at which the thread started its transaction; this should ensure that every thread eventually makes progress, since the more "senior" (older) threads have priority.
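This resembles the classic wait-die scheme from database lock managers. A sketch of the decision logic in steps (1)-(5) (all names hypothetical; the release-everything-and-retry machinery is left to the caller):

```rust
use std::sync::Mutex;

// A lock that records its holder's priority so a would-be acquirer
// can decide what to do.
struct PriorityLock {
    holder: Mutex<Option<u64>>, // holder's priority, or None if free
}

#[derive(Debug, PartialEq)]
enum Decision {
    Acquired,
    Wait,    // holder has lower priority: just wait (step 4)
    BackOff, // holder has higher priority: release all and retry (step 5)
}

impl PriorityLock {
    fn new() -> Self {
        PriorityLock { holder: Mutex::new(None) }
    }

    fn try_acquire(&self, my_priority: u64) -> Decision {
        let mut h = self.holder.lock().unwrap();
        match *h {
            None => {
                *h = Some(my_priority); // step 2: record the holder's priority
                Decision::Acquired
            }
            Some(lp) if lp < my_priority => Decision::Wait,
            Some(_) => Decision::BackOff,
        }
    }

    fn release(&self) {
        *self.holder.lock().unwrap() = None;
    }
}
```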
Static lock ordering would be useful to have in Rust. One of my comments reads:
```rust
/// # Locking protocol
/// Locks must be locked in this order, to avoid deadlocks.
/// * LockByName
/// * UndupDatabasesLink
/// * SceneObjectLink
/// * Statistics
/// No more than one lock of each type can be locked by a thread.
/// All locks must be held only by a scoped construct.
```
That restriction could be statically enforced by something that can follow the call chain at compile time. It doesn't cover all cases, but when it applies, it prevents deadlocks.
One possibility is to represent the current lock chain as an HList of zero-sized types, one per lock. Then, use generics to only allow locking when the current chain is compatible with the prescribed order.
Edit: I don't have time to test this properly, but I think something along these lines should work:
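Here is an untested sketch of the kind of thing I have in mind, using type-level lock "levels" plus an `After` relation standing in for the full HList (all names invented; it still assumes each thread calls `start()` only once):

```rust
use std::marker::PhantomData;
use std::sync::{Mutex, MutexGuard};

// Zero-sized lock levels; L0 is the empty chain.
struct L0;
struct L1;
struct L2;

// `After<Prev>` encodes the prescribed order: a lock at this level may
// be acquired while the innermost held level is `Prev`.
trait After<Prev> {}
impl After<L0> for L1 {}
impl After<L0> for L2 {} // L1 may be skipped
impl After<L1> for L2 {}

// A mutex tagged with its level.
struct Ordered<Lvl, T> {
    inner: Mutex<T>,
    _lvl: PhantomData<Lvl>,
}

// The current lock chain; the lifetime keeps the previous link
// borrowed (and thus unusable) while a deeper link exists.
struct Chain<'a, Lvl>(PhantomData<(&'a mut (), Lvl)>);

fn start() -> Chain<'static, L0> {
    Chain(PhantomData)
}

impl<Lvl, T> Ordered<Lvl, T> {
    fn new(v: T) -> Self {
        Ordered { inner: Mutex::new(v), _lvl: PhantomData }
    }

    // Compiles only when `Lvl` is allowed after the chain's head, so
    // out-of-order locking is a type error rather than a deadlock.
    fn lock<'a, Prev>(
        &'a self,
        _chain: &'a mut Chain<'_, Prev>,
    ) -> (MutexGuard<'a, T>, Chain<'a, Lvl>)
    where
        Lvl: After<Prev>,
    {
        (self.inner.lock().unwrap(), Chain(PhantomData))
    }
}
```

Locking in the reverse order fails to compile, because there is no `impl After<L2> for L1`.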