Any examples of recovering from a poisoned lock?

Does anyone have some non-trivial uses of actually recovering from a poisoned lock? That is, Mutex::lock returns an error, and you use the MutexGuard within the error to get back into a known state?

I can imagine that with great care in how your locked updates are structured it might be possible to recover the state, but I’m wondering if anyone has bothered in practice.

Context: I’m proposing a Mutex::with API, and I’m wondering if it’s worthwhile making it return an error, or just panic on poisoned Mutexes (the rationale being that it’s meant to be a quick and simple helper, and if you want to do complex recovery you can use the normal API).
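
For concreteness, here is a rough sketch of what the panicking variant of such a helper could look like. `MutexExt` and its `with` method are hypothetical names used only for illustration (std has no `Mutex::with`), and the exact signature is an assumption:

```rust
use std::sync::Mutex;

/// Hypothetical extension trait; `with` is not part of std.
trait MutexExt<T> {
    /// Runs `f` with exclusive access to the data and returns its result.
    fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R;
}

impl<T> MutexExt<T> for Mutex<T> {
    fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        // Panicking variant: a poisoned lock propagates as a panic in the caller.
        let mut guard = self.lock().expect("Mutex was poisoned");
        f(&mut *guard)
    }
}

fn main() {
    let counter = Mutex::new(0u32);
    counter.with(|n| *n += 1);
    let value = counter.with(|n| *n);
    println!("counter = {value}");
}
```

An error-returning variant would instead give `with` a `Result` return type and forward the `PoisonError` from `lock()`, leaving the decision to the caller.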

To my knowledge, lock in stdlib returning a Result instead of panicking is widely considered a mistake. Every use I have seen to date calls m.lock().unwrap() immediately.


If a lock is poisoned, it means some thread having acquired it has panic!ked. Now, the important thing to know, here, is what that thread was doing with the acquired lock:

  • either it had acquired the lock for read-access only,
    in which case the data guarded by the Mutex is still in the state it was when that thread acquired the lock (and succeeded), that is, a safe state.
    In that case, instead of using .lock().unwrap(), it is safe to use .lock().unwrap_or_else(|poisoned| poisoned.into_inner());

  • or the thread may have been mutating the data guarded by the Mutex,
    and the panic! may then have occurred in the middle of a critical section, before reaching the “re-establish the invariant” clean-up phase at the end of that section. The guarded data may therefore no longer uphold the invariant it is supposed to have, which may lead to logic bugs; if unsafe code somewhere relies on that invariant, these logic bugs may well compromise memory safety.
    Thus the only sane thing to do here is to either panic! (e.g., with .lock().unwrap()), or have a way to start from scratch: wrap a new value in a new Mutex, and ensure other threads use that new Mutex instead…
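
The two options above can be sketched roughly as follows. The helper names (`poison`, `len_ignoring_poison`, `fresh_if_poisoned`) and the `Vec<u32>` payload are made up for illustration; only `lock`, `into_inner` and `is_poisoned` are the actual std API:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons `m` by panicking in another thread while it holds the guard.
fn poison(m: &Arc<Mutex<Vec<u32>>>) {
    let m2 = Arc::clone(m);
    let _ = thread::spawn(move || {
        let _guard = m2.lock().unwrap();
        panic!("simulated failure while holding the lock");
    })
    .join();
}

/// Option 1: take the guard out of the error and keep using the data.
/// Only sound if the panicking thread cannot have broken an invariant.
fn len_ignoring_poison(m: &Mutex<Vec<u32>>) -> usize {
    let guard = m.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    guard.len()
}

/// Option 2: discard the suspect state and start from scratch in a new Mutex.
fn fresh_if_poisoned(m: Arc<Mutex<Vec<u32>>>) -> Arc<Mutex<Vec<u32>>> {
    if m.is_poisoned() {
        Arc::new(Mutex::new(Vec::new()))
    } else {
        m
    }
}

fn main() {
    let shared = Arc::new(Mutex::new(vec![1, 2, 3]));
    poison(&shared);
    assert!(shared.is_poisoned());
    println!("len = {}", len_ignoring_poison(&shared));

    let shared = fresh_if_poisoned(shared);
    assert!(!shared.is_poisoned());
}
```

Note that option 2 only helps if every other thread also switches to the new Mutex; the sketch does not show that hand-over.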

Long story short, if there were to be such a clear distinction between read-access locks, and write-access locks, then RwLock should be used instead (of Mutex); and in that case, a poisoned lock can only come from the latter “almost-unrecoverable” case where panic!king with .unwrap() is not only acceptable, but advised.
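
A small sketch of that RwLock behaviour, using only std: a panic while holding a read guard leaves the lock usable, while a panic while holding a write guard poisons it. The helper `poisoned_after_panic` is a hypothetical name for this demonstration:

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Panics in another thread while it holds either a write or a read guard,
/// then reports whether the RwLock ended up poisoned.
fn poisoned_after_panic(as_writer: bool) -> bool {
    let lock = Arc::new(RwLock::new(0u32));
    let l = Arc::clone(&lock);
    let _ = thread::spawn(move || {
        if as_writer {
            let _w = l.write().unwrap();
            panic!("writer panicked");
        } else {
            let _r = l.read().unwrap();
            panic!("reader panicked");
        }
    })
    .join();
    lock.is_poisoned()
}

fn main() {
    // A panicking reader does not poison the lock...
    assert!(!poisoned_after_panic(false));
    // ...but a panicking writer does.
    assert!(poisoned_after_panic(true));
}
```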

So you might as well panic!() on poisoned Mutexes, using something like .lock().expect("Mutex was poisoned") so that users of the library can get a more descriptive panic! message in that case 🙂


Poisoning feels like the right thing to do, but I’ve never found it actually useful. Every time I ran into a poisoned lock, the right solution was to prevent the panic from happening in the first place, and/or use Mutex<Result>.
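
One way to read the Mutex<Result> suggestion, as a rough sketch: do the fallible work without panicking and store the outcome, error included, in the shared slot. The function `store_parsed` and the `Result<u32, String>` payload are hypothetical choices for illustration:

```rust
use std::sync::Mutex;

/// Stores the outcome of a fallible parse in the shared slot instead of panicking.
fn store_parsed(slot: &Mutex<Result<u32, String>>, input: &str) {
    // Do the fallible work before touching the shared state, so a failure
    // becomes an Err value rather than a panic inside the critical section.
    let outcome = input.trim().parse::<u32>().map_err(|e| e.to_string());
    *slot.lock().unwrap() = outcome;
}

fn main() {
    let slot = Mutex::new(Ok(0));

    store_parsed(&slot, "42");
    assert_eq!(*slot.lock().unwrap(), Ok(42));

    // Bad input is recorded as an error; the lock itself stays healthy.
    store_parsed(&slot, "not a number");
    assert!(slot.lock().unwrap().is_err());
    assert!(!slot.is_poisoned());
}
```

Readers of the slot then handle the Err case explicitly, which is the information poisoning tries to convey, but carried in the data itself.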

I’ve used parking_lot, which doesn’t have poisoning, and it worked just fine.
