Statically check no panic

Hi everyone, in my code I use a Mutex wrapper that do not expose lock but only safe_lock:

    pub fn safe_lock<F, Ret>(&self, thunk: F) -> Ret
    where
        F: FnOnce(&mut T) -> Ret,

This accept a closure that do something with the value in the Mutex I used it for to main resons:

  • Make deadlock less likley
  • Ensure no async execution while locked

When I saw GitHub - dtolnay/no-panic: Attribute macro to require that the compiler prove a function can't ever panic I thought that I could have applied the same trick to my safe_lock in order to completely remove the possibility of PoisonLock.

The trick consist in:

  • instantiate a null struct that impl Drop in a way that make the linker fail when builded
  • write the code that can not panic
  • mem forget the struct instantiated at step 1

If the compiler can prove that no panic are in the code that can no panic will never call drop on the null struct and we do not have link errors. If it can not the build fail and you know that you can have panic so you modify the code.

My issue is that I get errors when the compiler should be able to remove all the panic. Passin a Vec that contain the below struct to a function when we are inside the safe_lock make the compiler panic:

struct PanicableStruct<'a> {
    _a: Vec<Inner<'a>>, // Comment this line in order to be able to build the crate
    _b: Vec<Inner<'a>>,
}

But if I comment out the _a field everyting works. :exploding_head:
Also I get the above issue on rustc 1.66.0 but not on rustc 1.77.0

ATTENTION
You have to build the code for release and with lto otherwise it will always fail

[profile.release]
opt-level = 3
lto = true
debug = true

GODBOLT
From the below link you can see that commenting out the field or changing the version make the build successful.

GITHUB
Here the same code on github with also the Cargo.toml if someone want to clone it.

CODE
Below the code just for reference.

use std::sync::{Arc, Mutex as Mutex_, MutexGuard};

pub enum Inner<'a> {
    Ref(&'a u8),
    Owned(Vec<u8>),
}

struct PanicableStruct<'a> {
    _a: Vec<Inner<'a>>, // Comment this line in order to be able to build the crate
    _b: Vec<Inner<'a>>,
}

#[inline(never)]
fn new_fs(_uno: Vec<PanicableStruct<'static>>) {}

#[inline(never)]
fn apply(x: Arc<Mutex<()>>) {
    x.safe_lock(|_s| {
        let uno: Vec<PanicableStruct<'static>> = vec_new();
        new_fs(uno);
    });
}

fn main() {
    let x = Arc::new(Mutex::new(()));
    apply(x);
}

/// This allow to initialize a vector inside a safe_lock. It just abort the process if the vector
/// initialization fail, for that can no panic.
pub fn vec_new<T>() -> Vec<T> {
    match std::panic::catch_unwind(|| Vec::new()).map_err(|_| std::process::abort()) {
        Ok(r) => r,
        Err(_) => std::process::abort(),
    }
}

/// Safer Mutex wrapper
pub struct Mutex<T: ?Sized>(Mutex_<T>);

impl<T> Mutex<T> {
    /// `safe_lock` takes a closure that takes a mutable reference to the inner value, and returns the
    ///  value of the closure.
    /// This is used to:
    ///     * ensure no async executions while locked.
    ///     * make dead lock less likley
    ///     * prevent `PoisonLock`
    ///
    /// To prevent `PoisonLock` errors, the closure can not panic (statically checked).
    ///
    /// Arguments:
    ///
    /// * `thunk`: A closure that takes a mutable reference to the value inside the Mutex and returns a
    /// value of type Ret.
    ///
    /// This function in splitted in 3 part safe_lock lock and execute just to better analyze the
    /// builded assembly code
    #[inline(never)]
    pub fn safe_lock<F, Ret>(&self, thunk: F) -> Ret
    where
        F: FnOnce(&mut T) -> Ret,
    {
        {
            let mut lock = self.lock();
            let mut __guard = __NoPanic;
            let return_value = self.execute(thunk, &mut lock);
            drop(lock);
            core::mem::forget(__guard);
            return_value
        }
    }

    #[inline(never)]
    pub fn lock(&self) -> MutexGuard<T> {
        match self.0.lock() {
            Ok(r) => r,
            Err(_) => std::process::abort(),
        }
    }

    #[inline(never)]
    pub fn execute<F, Ret>(&self, thunk: F, lock: &mut MutexGuard<T>) -> Ret
    where
        F: FnOnce(&mut T) -> Ret,
    {
        thunk(&mut *lock)
    }

    pub fn new(v: T) -> Self {
        Mutex(Mutex_::new(v))
    }
}

// based on https://github.com/dtolnay/no-panic
pub struct __NoPanic;
extern "C" {
    #[link_name = "safe_lock called on a function that may panic"]
    fn trigger() -> !;
}

impl core::ops::Drop for __NoPanic {
    #[inline(always)]
    fn drop(&mut self) {
        unsafe {
            trigger();
        }
    }
}
1 Like

Why are you trying to avoid panics / PoisonError?

I ask because, in my mind, that's something the user of your safe_lock would want to control; they may actual want panics.

Good point. The fact is that is for internal use so is fine. I want to avoid having PoisonError for 2 main reason:

  1. it force the user to have closure that can not panic, that means that the overall quality of the error handling in the project is better.
  2. we have a lot of boiler plate code that just transform the PoisonError and send it to the main thread and terminate the process.

Panics shouldn't be used for normal errors but for failed assertions that indicate a bug in the code that panics.

yes so if we statically check that the called function can not panic error handling quality increase cause you can not panic but you have to manage all possible cases. And if you have to panic you do that outside of the safe_lock.

What if there is a bug in a function of an external library you are calling? Then panicking would be better than silently continuing with corrupted state.

Also for lock poisoning the expectation is that you will unwrap the error unless you actually want to recover from a panic and even then only at a point where you can safely recover because all inconsistent state has been thrown away. So for example in a web server you did recover at the main request loop and return a 500 internal server error for just this single request rather than continue with potentially inconsistent state and cause user data to be corrupted or the entire server to crash.

2 Likes

If there is a bug in an external library that I'm calling I will return an error, this error will be handled when the Mutex is not locked.

Also for lock poisoning the expectation is that you will unwrap the error unless you actually want to recover from a panic and even then only at a point where you can safely recover because all inconsistent state has been thrown away.

The fact is that with the above solution I can not have lock poisoning because I statically check at compile time that the closure that handle the MutexGuard can not panic.

If the external library itself detects the results of the bug the most sensible thing for it to do is to panic. This is incompatible with no-panic checks unless you can statically prove the absence of bugs, in which case all panics would trivially be unreachable. Also what if the bug resulted in an inconsistent state of a type owned by this external library? If this type is used as value of the mutex lock poisoning will ensure that you can't accidentally observe this inconsistent state, but are forced to either recover from it by eg throwing the value away and replacing it with a new instance or to propagate the panic.

1 Like

In general I do not want to call external libraries that panics inside the safe_lock closure. There are exception like Vec::new that I will either propagate outside the closure as a result or just abort.

Most external library are library that I can control (in the same workspace) and so I can just make sure that the function that I want to call inside the safe_lock do not panics

You're always going to be fighting the compiler to do this, since the compiler is not designed to prove such a thing (yet. It's definitely been discussed and wanted by kernel devs). You're relying on the optimizer to deduce that it can remove these code paths, which, in my experience, it does not 100% of the time. You definitely need the closure to be inlined, for example.

That said, one suggestion I have is that you might have some more luck when compiling with -Z build-std

2 Likes

You're always going to be fighting the compiler to do this

Ok It was too beautiful to be true.

-Z build-std

Not work.
Since updating rustc to last version seems to work what I will do is that.

  1. bump rustc to last version
  2. Add feature eg no-panic
  3. inside the safe_lock use the no panic trick only if the feature is enabled
  4. use the no-panic feature in the CI to check that I'm using closure that can not panic
  5. if at any time I will find that the compiler become too hard to fight I will just remove the no-panic check and fix it without hurry and without blocking everyone

Btw I'm still very curios if any one have any insight of why this

struct PanicableStruct<'a> {
    _a: Vec<Inner<'a>>, // Comment this line in order to be able to build the crate
    _b: Vec<Inner<'a>>,
}

do not compile.
But this:

struct PanicableStruct<'a> {
    _b: Vec<Inner<'a>>,
}

Does.

And thanks everyone for your answear and your time

I'm imagining that this is because the drop glue for PanicableStruct becomes sufficiently complex that the optimizer doesn't inline it and therefore doesn't recognize it as dead code.

1 Like

I'm not sure why you have so much #[inline(never)], as that's the opposite than you want here. In order for the optimizer to remove __NoPanic::drop_in_place (thus avoiding the error), you need the closure and anything potentially-panicking that Mutex::safe_lock calls (transitively) to be inlined. This includes (in my testing) new_fs, the closure itself (i.e. #[inline(always)] |_s| { ... }), and Mutex::execute. Marking those three as #[inline(always)] gets your example to compile for me.

This means that your users also have to ensure that everything potentially panicking is always inlined as well (in addition to LTO being enabled and opt-level=3).

I feel like it's worth noting that Vec::new doesn't itself allocate, so it should never panic.

2 Likes

I split that function (safe_lock) in 3 part (safe_lock execute lock) with inline never only to have a binary than when disassembled is easier to analyze. That do not seem to affect the ability of the compiler to optimize out code paths (if you try to put all of them in one function or just use inline always code still do not compile). But in the real code I just have one function.

new_fs must be inline never in order to have a min repr issue to show here. If I inline new_fs the compiler will just remove completely it cause it does nothing.

you are right wrong example. Btw there are a lot of thing from std lib that I want to use that allocate in that case I just ignore the panic. Please not that this will be done only in CI when I compile with the no-panic feature. The actual binary will still have panics for allocations issues, but this is not an issue for me.

new_fs does do something -- it drops its argument.

1 Like

I ended up solving it in another way POC statically check if paths are panic free

2 Likes

Wow, I'm stunned you figured this out so quickly in such a creative way, very cool!!

1 Like