I have some simple multithreaded code in which newly spawned threads are parked using thread::park_timeout (simplified version below):
while !self.shared_state.lock().unwrap().ready {
    thread::park_timeout(Duration::from_millis(100));
}
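For anyone who wants to try this locally, here is a self-contained sketch of the same loop. The `SharedState` struct, the `run_workers` helper, and the worker count are stand-ins I made up for illustration; the real application's types are not shown here. On a standard library without the problem described in this report, all workers should leave the loop once `ready` is set.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for the application's real shared state.
struct SharedState {
    ready: bool,
}

// Spawns `n` workers that park until `ready` is set, then returns how many
// of them left the loop and were joined successfully.
fn run_workers(n: usize) -> usize {
    let shared = Arc::new(Mutex::new(SharedState { ready: false }));

    let handles: Vec<_> = (0..n)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // The mutex guard is a temporary that is dropped at the end of
                // the `while` condition, so the lock is not held while parked.
                while !shared.lock().unwrap().ready {
                    thread::park_timeout(Duration::from_millis(100));
                }
            })
        })
        .collect();

    // Flip the flag, then wake every worker so it re-checks the condition.
    shared.lock().unwrap().ready = true;
    for handle in &handles {
        handle.thread().unpark();
    }

    handles
        .into_iter()
        .map(|handle| handle.join())
        .filter(|result| result.is_ok())
        .count()
}

fn main() {
    println!("{} workers left the park loop", run_workers(4));
}
```

Even without the explicit `unpark`, the 100 ms timeout should let each worker notice the flag on its next pass through the loop.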
The application will typically spawn several threads, all of which will end up parked in this loop. More often than not, these threads will never leave thread::park_timeout.
When looking at the threads' stack traces, it looks like the threads are in a deadlock caused by thread::park_timeout's call to Condvar::wait_timeout, which in turn calls sync::Once when fetching duration data. My understanding is as follows, for two threads A & B:
- A & B's wait_timeout calls run in parallel and eventually reach Once::call_once.
- The "selected" thread, let's say A, gets to run its closure, while B gets parked. This latter call to thread::park is blocked on B's inner mutex lock since this is a reentrant call.
- Once A finishes running its code, the cleanup process tries to unpark waiting threads, thereby trying to access B's inner mutex lock, which is still blocked, thus resulting in a deadlock.
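To make the first two steps concrete, here is a small sketch of several threads racing on a single `Once`. It does not reproduce the deadlock; it only shows the documented behaviour the steps rely on: exactly one caller runs the closure, and the others block inside `call_once` until it finishes. The `race_once` helper and the `Barrier` used to line the threads up are my own additions. How the waiting threads block is an implementation detail of the standard library; the traces below suggest that this version uses thread::park for it.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Barrier, Once};
use std::thread;

// Races `n` threads on one `Once` and returns how many times the
// initialization closure actually ran.
fn race_once(n: usize) -> usize {
    let once = Arc::new(Once::new());
    let runs = Arc::new(AtomicUsize::new(0));
    // The barrier releases all threads together so that their `call_once`
    // calls are likely to overlap.
    let barrier = Arc::new(Barrier::new(n));

    let handles: Vec<_> = (0..n)
        .map(|_| {
            let once = Arc::clone(&once);
            let runs = Arc::clone(&runs);
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                barrier.wait();
                // One thread is selected to run the closure; the rest wait
                // inside `call_once` until it has completed.
                once.call_once(|| {
                    runs.fetch_add(1, Ordering::SeqCst);
                });
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    runs.load(Ordering::SeqCst)
}

fn main() {
    println!("closure ran {} time(s)", race_once(8));
}
```

If the analysis above is right, the problem is not in `Once` on its own but in reaching it from inside park_timeout, where the thread's inner lock is already held.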
Unless I misunderstood something, this looks like a standard lib bug. What do you guys think?
You'll find below the typical stack traces I'm getting:
Thread locked in unpark
#0 0x000000011189bc22 in __psynch_mutexwait ()
#1 0x00000001118d0dfa in _pthread_mutex_lock_wait ()
#2 0x000000010f7f9d59 in std::sys::imp::mutex::{{impl}}::lock [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sys/unix/mutex.rs:67
#3 0x000000010f7f9d54 in std::sys_common::mutex::{{impl}}::lock [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sys_common/mutex.rs:40
#4 0x000000010f7f9d54 in std::sync::mutex::{{impl}}::lock<bool> [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sync/mutex.rs:222
#5 0x000000010f7f9d50 in std::thread::{{impl}}::unpark at /Users/travis/build/rust-lang/rust/src/libstd/thread/mod.rs:992
#6 0x000000010f816528 in std::sync::once::{{impl}}::drop at /Users/travis/build/rust-lang/rust/src/libstd/sync/once.rs:380
#7 0x000000010f8163fa in core::ptr::drop_in_place<std::sync::once::Finish> [inlined] at /Users/travis/build/rust-lang/rust/src/libcore/ptr.rs:61
#8 0x000000010f8163f5 in std::sync::once::{{impl}}::call_inner at /Users/travis/build/rust-lang/rust/src/libstd/sync/once.rs:310
#9 0x000000010f816aa3 in std::sync::once::{{impl}}::call_once<closure> [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sync/once.rs:227
#10 0x000000010f816a7e in std::sys::imp::time::inner::info [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sys/unix/time.rs:233
#11 0x000000010f816a7e in std::sys::imp::time::inner::{{impl}}::sub_instant [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sys/unix/time.rs:143
#12 0x000000010f816a7e in std::time::{{impl}}::duration_since [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/time/mod.rs:181
#13 0x000000010f816a7e in std::time::{{impl}}::sub [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/time/mod.rs:246
#14 0x000000010f816a7e in std::time::{{impl}}::elapsed at /Users/travis/build/rust-lang/rust/src/libstd/time/mod.rs:205
#15 0x000000010f81a025 in std::sys::imp::condvar::{{impl}}::wait_timeout at /Users/travis/build/rust-lang/rust/src/libstd/sys/unix/condvar.rs:160
#16 0x000000010f7f980b in std::sys_common::condvar::{{impl}}::wait_timeout [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sys_common/condvar.rs:61
#17 0x000000010f7f97fa in std::sync::condvar::{{impl}}::wait_timeout<bool> [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sync/condvar.rs:346
#18 0x000000010f7f97ca in std::thread::park_timeout at /Users/travis/build/rust-lang/rust/src/libstd/thread/mod.rs:839
Thread locked in park
#0 0x000000011189bc22 in __psynch_mutexwait ()
#1 0x00000001118d0dfa in _pthread_mutex_lock_wait ()
#2 0x000000010f7f948d in std::sys::imp::mutex::{{impl}}::lock [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sys/unix/mutex.rs:67
#3 0x000000010f7f9488 in std::sys_common::mutex::{{impl}}::lock [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sys_common/mutex.rs:40
#4 0x000000010f7f9488 in std::sync::mutex::{{impl}}::lock<bool> [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sync/mutex.rs:222
#5 0x000000010f7f9484 in std::thread::park at /Users/travis/build/rust-lang/rust/src/libstd/thread/mod.rs:766
#6 0x000000010f8162e5 in std::sync::once::{{impl}}::call_inner at /Users/travis/build/rust-lang/rust/src/libstd/sync/once.rs:341
#7 0x000000010f816aa3 in std::sync::once::{{impl}}::call_once<closure> [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sync/once.rs:227
#8 0x000000010f816a7e in std::sys::imp::time::inner::info [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sys/unix/time.rs:233
#9 0x000000010f816a7e in std::sys::imp::time::inner::{{impl}}::sub_instant [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sys/unix/time.rs:143
#10 0x000000010f816a7e in std::time::{{impl}}::duration_since [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/time/mod.rs:181
#11 0x000000010f816a7e in std::time::{{impl}}::sub [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/time/mod.rs:246
#12 0x000000010f816a7e in std::time::{{impl}}::elapsed at /Users/travis/build/rust-lang/rust/src/libstd/time/mod.rs:205
#13 0x000000010f81a025 in std::sys::imp::condvar::{{impl}}::wait_timeout at /Users/travis/build/rust-lang/rust/src/libstd/sys/unix/condvar.rs:160
#14 0x000000010f7f980b in std::sys_common::condvar::{{impl}}::wait_timeout [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sys_common/condvar.rs:61
#15 0x000000010f7f97fa in std::sync::condvar::{{impl}}::wait_timeout<bool> [inlined] at /Users/travis/build/rust-lang/rust/src/libstd/sync/condvar.rs:346
#16 0x000000010f7f97ca in std::thread::park_timeout at /Users/travis/build/rust-lang/rust/src/libstd/thread/mod.rs:839