Hi,
We are currently testing how to recover from outages that occur during operation.
The artificial panic!() is triggered after we have acquired a RwLock for writing.
The RwLock lives in an mmapped file.
If a separate program clears the poison right after the main program has panicked, without itself trying to read or write through the lock, everything works: once that program exits and I relaunch the main program, it runs properly.
But if the script relaunches the main executable directly (without first running the program that only clears the poison), I get a deadlock that no amount of poison clearing can recover from.
Simplified section:
match take_my_lock {
    Some(out) => {
        // ------------------------
        // Check for healthy lock
        // ------------------------
        let j = &mut *out;
        match j.lock_admin.write() {
            Ok(e) => {
                // -----------------
                // Checked healthy
                // -----------------
                e
            }
            Err(f) => {
                println!("Write Poison error {:?}", f);
                // -----------------------
                // Poisoned. Retry once
                // -----------------------
                j.lock_admin.clear_poison();
                // Deadlocks on the next line if and only if this runs after a panic!().
                // Is it happening too soon after clearing the poison?
                match j.lock_admin.write() {
                    Ok(out) => {
                        // -----------------
                        // Checked healthy
                        // -----------------
                        println!("Write OK for second write");
                        out
                    }
                    Err(f) => {
                        // -------------------
                        // Definitely f***d.
                        // -------------------
                        println!("Write Poison error 2 {:?}", f);
                        return Err("Poisoned lock".into());
                    }
                }
            }
        }
    }
    None => return Err("No admin header".into())
}
};
So in that specific case, clearing the poison is no longer enough.
Again, the panic!() is only there to simulate an outage or catastrophe; there are no other panics or unwrap() calls in the code.
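One thing I have not ruled out is whether the retry itself is the problem. As far as I understand the std API, the PoisonError returned by write() still owns the write guard, so calling write() a second time while that error value is alive would block on a lock this thread already holds. Below is a minimal single-process sketch (no mmap involved; the names poison and write_with_recovery are made up for illustration) that reuses the guard through into_inner() instead of locking again. I do not know yet whether this carries over to the mmapped case.

```rust
use std::sync::{Arc, RwLock, RwLockWriteGuard};
use std::thread;

/// Simulates the outage: a helper thread panics while holding the write guard.
/// Unwinding drops the guard, so the lock ends up poisoned but not held.
fn poison(lock: &Arc<RwLock<u32>>) {
    let shared = Arc::clone(lock);
    let result = thread::spawn(move || -> () {
        let _guard = shared.write().unwrap();
        panic!("simulated outage");
    })
    .join();
    assert!(result.is_err());
}

/// Returns a write guard, clearing the poison flag if the lock was poisoned.
fn write_with_recovery(lock: &RwLock<u32>) -> RwLockWriteGuard<'_, u32> {
    match lock.write() {
        Ok(guard) => guard,
        Err(poisoned) => {
            // `poisoned` still owns the write guard, so the lock is held here.
            // Calling lock.write() again at this point would never return.
            lock.clear_poison();
            poisoned.into_inner()
        }
    }
}
```

With this shape there is only ever one write() call per attempt, so there is nothing for a sleep to wait for.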
What can I do? I need to "force reset" the lock in my mmapped memory.
Just in case: adding a sleep doesn't change anything.
    j.lock_admin.clear_poison();
    let th_millis = time::Duration::from_millis(1000);
    thread::sleep(th_millis);
    match j.lock_admin.write() {
What I need is something like a my_lock.release() that works no matter which thread calls it.
The platform is Linux x86_64 only.
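To make concrete what I mean by a lock that can be force reset, here is a rough sketch of the kind of thing I would otherwise have to write myself. It is not a replacement for RwLock: it is a non-blocking exclusive lock whose entire state is one 32-bit word holding the owner's PID. Everything here is hypothetical (PidLock, try_lock, unlock, force_release, recover_if_owner_dead), and it assumes that an AtomicU32 placed in the mmapped region behaves correctly across processes on Linux x86_64, and that checking for /proc/<pid> is an acceptable liveness test even though PID reuse can fool it.

```rust
use std::path::Path;
use std::process;
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical exclusive lock whose whole state is one 32-bit word:
/// 0 means unlocked, any other value is the PID of the owning process.
#[repr(C)]
pub struct PidLock {
    pub owner: AtomicU32,
}

impl PidLock {
    pub const fn new() -> Self {
        PidLock { owner: AtomicU32::new(0) }
    }

    /// Tries to take the lock for the current process without blocking.
    pub fn try_lock(&self) -> bool {
        self.owner
            .compare_exchange(0, process::id(), Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    /// Releases the lock, but only if the current process owns it.
    pub fn unlock(&self) -> bool {
        self.owner
            .compare_exchange(process::id(), 0, Ordering::Release, Ordering::Relaxed)
            .is_ok()
    }

    /// Unconditionally resets the lock, whoever holds it.
    pub fn force_release(&self) {
        self.owner.store(0, Ordering::Release);
    }

    /// Resets the lock if the recorded owner no longer exists (Linux only:
    /// assumes a mounted /proc). Returns true if a stale owner was cleared.
    pub fn recover_if_owner_dead(&self) -> bool {
        let pid = self.owner.load(Ordering::Acquire);
        if pid == 0 || Path::new(&format!("/proc/{pid}")).exists() {
            return false;
        }
        self.owner
            .compare_exchange(pid, 0, Ordering::AcqRel, Ordering::Relaxed)
            .is_ok()
    }
}
```

In my setup the struct would sit at a fixed offset in the mmapped file, and the restart script could call recover_if_owner_dead() (or force_release()) before launching the main program. It has no waiting and no reader/writer distinction, which is exactly why I would prefer an existing, tested solution.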
Alternatively, please suggest a lock that can live in persistent memory and is easy to recover, without me having to reinvent the wheel.
Thanks!