Whenever a program must acquire more than one lock at a time, it must acquire them in the same order every time. Otherwise the program is vulnerable to deadlocks like this:
Thread A acquires mutex X
Thread B acquires mutex Y
Thread A blocks trying to acquire mutex Y
Thread B blocks trying to acquire mutex X
This is known as the "lock-order reversal problem". What is the best way to avoid it in Rust?
The FreeBSD kernel has a handy facility known as witness(4) to help detect these LORs. Basically, when the kernel is built in debug mode, witness records the order of every lock acquisition. If it ever detects that a pair of locks were acquired in both orders, it prints a warning, along with the stack trace of the offending thread.
But I can't find anything similar for Rust. It seems like such a facility would be invaluable. It could be a third-party library that serves as a compile-time drop-in replacement for std::sync::Mutex, for example. But there is no such library that I can find. Is this an unsolved problem, or does Rust have some other solution I'm not aware of?
Hm, those options aren't very good. rust-lock-bug-detector might have promise, but right now it can't even build with a current compiler. ThreadSanitizer, OTOH, might be too good: it crashes whenever it tries to build certain proc-macro crates, and for the only crate where I can get it to compile, it reports data races all over the standard library, chiefly in std::thread::spawn.
An obvious sort of solution is to put Y inside X. Rust's Mutex is a container and by creating a Mutex<(X, Mutex<Y>)> (for example) you guarantee that the inner mutex cannot be locked first and cannot be unlocked last.
This has the potential drawback that you cannot lock Y by itself (without locking X), even though doing so does not lead to deadlock if you don't lock X. There might be something clever you can do with RwLock here but that's just a stray thought. In general I think it will depend on why you need two mutexes in the first place.
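A minimal sketch of the nesting idea (the payload types `X` and `Y` here are placeholders):

```rust
use std::sync::Mutex;

// Placeholder payload types for illustration.
struct X { count: u64 }
struct Y { total: u64 }

// The only way to reach the inner Mutex<Y> is through the outer guard,
// so Y can never be locked before X.
fn bump(pair: &Mutex<(X, Mutex<Y>)>) -> u64 {
    let mut outer = pair.lock().unwrap();    // lock X (and gain access to Y's mutex)
    outer.0.count += 1;
    let mut inner = outer.1.lock().unwrap(); // now lock Y
    inner.total += 1;
    inner.total
    // `inner` drops before `outer`: Y is always unlocked first, X last.
}
```

Since the inner guard borrows the outer one, the borrow checker also forces the unlock order.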
(There are various ways to avoid the duplicated lock method)
With these tokens, you are able to lock either one or both mutexes, but you cannot lock them in a way that would deadlock. (This assumes that you can somehow prevent a thread from obtaining more than one token.)
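One way to sketch the token idea (all names here are invented, not from any existing library): each lock method borrows the token mutably for as long as the returned guard lives, so a thread holding any guard cannot take a second lock except through the one method that fixes the order.

```rust
use std::sync::{Mutex, MutexGuard};

struct Locks {
    x: Mutex<i32>,
    y: Mutex<i32>,
}

// Exactly one Token per thread (enforcing that is the hard part, as
// noted above). Locking borrows it mutably for the guard's lifetime.
struct Token(());

impl Locks {
    fn lock_x<'a>(&'a self, _t: &'a mut Token) -> MutexGuard<'a, i32> {
        self.x.lock().unwrap()
    }

    fn lock_y<'a>(&'a self, _t: &'a mut Token) -> MutexGuard<'a, i32> {
        self.y.lock().unwrap()
    }

    // The only way to hold both guards at once, and it fixes the
    // order: X first, then Y.
    fn lock_both<'a>(
        &'a self,
        _t: &'a mut Token,
    ) -> (MutexGuard<'a, i32>, MutexGuard<'a, i32>) {
        let gx = self.x.lock().unwrap();
        let gy = self.y.lock().unwrap();
        (gx, gy)
    }
}
```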
I think thread_local storage is key. But how would you assign the ID? Statically? That requires extra effort on the programmer's part. An easier tool would build the relationships automatically. I think it could build a DAG as the program runs, and every time a lock is acquired it could check whether this would create a cycle in the DAG.
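A runtime sketch of what such a checker could look like (everything here is hypothetical; locks are identified by an arbitrary `usize`, e.g. their address, and `on_acquire` would be called by a wrapper around `Mutex::lock`):

```rust
use std::cell::RefCell;
use std::collections::{HashMap, HashSet};
use std::sync::{LazyLock, Mutex};

// Global "acquired-before" graph: edge a -> b means lock `a` was held
// while lock `b` was acquired.
static ORDER: LazyLock<Mutex<HashMap<usize, HashSet<usize>>>> =
    LazyLock::new(Default::default);

thread_local! {
    // Locks currently held by this thread, in acquisition order.
    static HELD: RefCell<Vec<usize>> = RefCell::new(Vec::new());
}

// Does `from` reach `to` in the recorded graph? (Depth-first search.)
fn reaches(g: &HashMap<usize, HashSet<usize>>, from: usize, to: usize) -> bool {
    let mut stack = vec![from];
    let mut seen = HashSet::new();
    while let Some(n) = stack.pop() {
        if n == to {
            return true;
        }
        if seen.insert(n) {
            if let Some(next) = g.get(&n) {
                stack.extend(next.iter().copied());
            }
        }
    }
    false
}

/// Record that this thread is acquiring `lock`. Returns true if the
/// acquisition reverses an order seen before (a potential deadlock).
fn on_acquire(lock: usize) -> bool {
    HELD.with(|held| {
        let mut held = held.borrow_mut();
        let mut g = ORDER.lock().unwrap();
        let mut reversal = false;
        for &h in held.iter() {
            // Adding edge h -> lock closes a cycle iff lock already reaches h.
            if reaches(&g, lock, h) {
                reversal = true; // a real tool would print both stack traces here
            }
            g.entry(h).or_default().insert(lock);
        }
        held.push(lock);
        reversal
    })
}

fn on_release(lock: usize) {
    HELD.with(|held| held.borrow_mut().retain(|&l| l != lock));
}
```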
You may be interested in Compile time lock ordering approach to deadlocks where I did some work developing a compile-time solution to this. I subsequently changed jobs, so never got a chance to try fully implementing it, but I still believe it has some potential.
That said, the consensus of the rust ecosystem seems to be mostly to rely on other sorts of concurrency primitives rather than multiple mutexes, so if that's a possibility in your domain, you may find more existing support with those approaches.
Sometimes you can explicitly sort the locks. This typically comes up in database updating, where you need to lock multiple rows, perform a transaction that affects those rows, and release the locks. Other transactions may be going on at the same time. If all updating threads sort the rows to be locked by some immutable ID, you can avoid a lock order conflict.
This is the totally run-time case, as opposed to the class of locking problems for which compile-time analysis would help.
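A sketch of the sort-then-lock pattern (the `Row` type and its fields are made up for illustration):

```rust
use std::sync::{Mutex, MutexGuard};

// A "row" with an immutable id and some lockable state.
struct Row {
    id: u64,
    balance: Mutex<i64>,
}

// Lock any set of rows deadlock-free by always locking in ascending
// id order; the guards are returned so the caller can run the
// transaction and release them all at once.
fn lock_rows<'a>(rows: &mut Vec<&'a Row>) -> Vec<MutexGuard<'a, i64>> {
    rows.sort_by_key(|r| r.id); // canonical order prevents reversal
    rows.iter().map(|r| r.balance.lock().unwrap()).collect()
}
```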
A simple approach I just thought of (so it may be nonsensical or may not work):
(1) Assume that each thread has a priority, and no two threads have the same priority.
(2) Whenever a thread acquires a lock, the priority of the thread that has the lock is recorded in the lock.
(3) When a thread wants to acquire a lock that is already locked, it examines the priority of the current holder, LP.
(4) If LP is lower than its own priority, it simply waits for the lock to be released.
(5) If LP is higher than its own priority, it releases all the locks it holds, waits for the blocking lock to be released, and then retries. It is the "victim" of a possible deadlock.
One way of assigning a priority is to use the negative of the time at which the thread started its transaction; this should ensure that every thread eventually makes progress, since the more "senior" (older) threads have priority.
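This resembles the classic wait-die scheme from database lock managers. A sketch of the decision logic in steps (1)-(5) (all names hypothetical; the release-everything-and-retry machinery is left to the caller):

```rust
use std::sync::Mutex;

// A lock that records its holder's priority so a would-be acquirer
// can decide what to do.
struct PriorityLock {
    holder: Mutex<Option<u64>>, // holder's priority, or None if free
}

#[derive(Debug, PartialEq)]
enum Decision {
    Acquired,
    Wait,    // holder has lower priority: just wait (step 4)
    BackOff, // holder has higher priority: release all and retry (step 5)
}

impl PriorityLock {
    fn new() -> Self {
        PriorityLock { holder: Mutex::new(None) }
    }

    fn try_acquire(&self, my_priority: u64) -> Decision {
        let mut h = self.holder.lock().unwrap();
        match *h {
            None => {
                *h = Some(my_priority); // step 2: record the holder's priority
                Decision::Acquired
            }
            Some(lp) if lp < my_priority => Decision::Wait,
            Some(_) => Decision::BackOff,
        }
    }

    fn release(&self) {
        *self.holder.lock().unwrap() = None;
    }
}
```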
Static lock ordering would be useful to have in Rust. One of my comments reads:
```rust
/// # Locking protocol
/// Locks must be locked in this order, to avoid deadlocks.
/// * LockByName
/// * UndupDatabasesLink
/// * SceneObjectLink
/// * Statistics
/// No more than one lock of each type can be locked by a thread.
/// All locks must be held only by a scoped construct.
```
That restriction could be statically enforced by something that can follow the call chain at compile time. It doesn't cover all cases, but when it applies, it prevents deadlocks.
One possibility is to represent the current lock chain as an HList of zero-sized types, one per lock. Then, use generics to only allow locking when the current chain is compatible with the prescribed order.
Edit: I don't have time to test this properly, but I think something along these lines should work:
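Here is an untested sketch of the kind of thing I have in mind, using type-level lock "levels" plus an `After` relation standing in for the full HList (all names invented; it still assumes each thread calls `start()` only once):

```rust
use std::marker::PhantomData;
use std::sync::{Mutex, MutexGuard};

// Zero-sized lock levels; L0 is the empty chain.
struct L0;
struct L1;
struct L2;

// `After<Prev>` encodes the prescribed order: a lock at this level may
// be acquired while the innermost held level is `Prev`.
trait After<Prev> {}
impl After<L0> for L1 {}
impl After<L0> for L2 {} // L1 may be skipped
impl After<L1> for L2 {}

// A mutex tagged with its level.
struct Ordered<Lvl, T> {
    inner: Mutex<T>,
    _lvl: PhantomData<Lvl>,
}

// The current lock chain; the lifetime keeps the previous link
// borrowed (and thus unusable) while a deeper link exists.
struct Chain<'a, Lvl>(PhantomData<(&'a mut (), Lvl)>);

fn start() -> Chain<'static, L0> {
    Chain(PhantomData)
}

impl<Lvl, T> Ordered<Lvl, T> {
    fn new(v: T) -> Self {
        Ordered { inner: Mutex::new(v), _lvl: PhantomData }
    }

    // Compiles only when `Lvl` is allowed after the chain's head, so
    // out-of-order locking is a type error rather than a deadlock.
    fn lock<'a, Prev>(
        &'a self,
        _chain: &'a mut Chain<'_, Prev>,
    ) -> (MutexGuard<'a, T>, Chain<'a, Lvl>)
    where
        Lvl: After<Prev>,
    {
        (self.inner.lock().unwrap(), Chain(PhantomData))
    }
}
```

Locking in the reverse order fails to compile, because there is no `impl After<L2> for L1`.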