Panicking: Should you avoid it?

But just to be clear: "not panicking" is not the same thing as "doesn't include any panics"


I guess it's too late to change the design of Rust, but I think panic!() should always report diagnostics (if at all possible) and then loop { std::thread::sleep(Duration::from_secs(1)) } forever. That way you could extract any information you need from the process (because it's not dead yet and the stack contains everything that was there at the moment of the panic), but it would definitely not do anything else. And in the case of multithreaded code, the current thread wouldn't be overwriting any memory. An alternative solution would be to send signal 9 (SIGKILL) to self (that's the POSIX way of killing the current process without any cleanup) and loop forever while waiting for the OS to complete the removal of the process.

And if you ever write panic!(...) in your code, that's the behavior you explicitly want. It would also match the logical behavior of panic() in the Linux kernel.

If you want to save user data, you have to do snapshots while the system is still in known good state. It's not safe to wait for panic and then try to supposedly / hopefully avoid losing data.

Had panic!() been defined this way, nobody would ever try to use it for real "error handling". Instead, everybody would be returning Option or Result and using the "?" shorthand everywhere unless you truly want to panic for real.

The explicit "?" in many places would add some extra clutter to source code but it would add an explicit mark to every possible place that can fail.

For example, I'd prefer that instead of a = b * c; I would need to write a = b * c?; (or maybe the syntax should be a = b *? c;?) to be explicit about the fact that the multiplication may overflow and the caller must handle that case. Of course, if I want to handle that case, I cannot use the shorthand. I think the C++ mechanism where literally anything can throw an exception is really powerful when used correctly (basically RAII everywhere) but without explicit markup in the code, you can never have any idea how many things can throw if you're reading the code written by somebody else. With explicit markup, seeing e.g. 13 "?" shorthands in a couple of lines of code, you'd be instantly aware that the code is probably very error prone.
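For what it's worth, something close to a = b * c?; can already be written today with the checked_* methods on the integer types. A minimal sketch (the Overflow type and the volume function are made-up names for illustration):

```rust
/// Hypothetical error type used only for this illustration.
#[derive(Debug, PartialEq)]
pub struct Overflow;

/// Multiplies three dimensions, marking every step that can overflow with `?`.
pub fn volume(w: i32, h: i32, d: i32) -> Result<i32, Overflow> {
    // `checked_mul` returns None on overflow; `ok_or` turns that into an Err.
    let base = w.checked_mul(h).ok_or(Overflow)?;
    let v = base.checked_mul(d).ok_or(Overflow)?;
    Ok(v)
}
```

Each "?" marks a place where the function can return early, which is the explicit markup described above, at the cost of the extra clutter.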

Of course, the type system could have been used to add a lot of optimizations. For example, the system could have had a SmallI32 type which would have been stored as an i32 but would be known to be small enough not to overflow when multiplied with another SmallI32. One could write a = b * c; where b and c are of type SmallI32 and a would be an i32, without any runtime checking. I think this idea would be very similar to NonZeroU32.
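A rough sketch of how such a SmallI32 could look as a library newtype. This is not an existing std type; the name comes from the post and the bound of 46340 is my own assumption, chosen because 46340 * 46340 = 2147395600 still fits in an i32:

```rust
use std::ops::Mul;

/// Hypothetical type: an i32 whose absolute value is at most `LIMIT`,
/// so the product of two of them always fits in an i32.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SmallI32(i32);

impl SmallI32 {
    /// 46_340 * 46_340 = 2_147_395_600, which is below i32::MAX.
    pub const LIMIT: i32 = 46_340;

    /// The range check happens once, at construction time.
    pub fn new(value: i32) -> Option<SmallI32> {
        if value >= -Self::LIMIT && value <= Self::LIMIT {
            Some(SmallI32(value))
        } else {
            None
        }
    }

    pub fn get(self) -> i32 {
        self.0
    }
}

impl Mul for SmallI32 {
    type Output = i32;

    fn mul(self, rhs: SmallI32) -> i32 {
        // The invariant enforced by `new` guarantees this cannot overflow.
        // Debug builds still compile in the usual overflow check for `*`,
        // but it can never fire here.
        self.0 * rhs.0
    }
}
```

The fallible step moves to SmallI32::new, so a = b * c; itself needs no "?".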

With such a design, everything that even potentially needs to allocate RAM or do any other stuff that might potentially fail would need to return Result instead of returning something else and more or less randomly panic!()ing.

However, I believe this could have been implemented without a huge overhead because the compiler would be aware of this pattern and basically do the same thing as corresponding well written C code would do: run the code and check the return value. Then unwrap() could have been unsafe and it would have always returned the Ok value (or the memory range that would have contained the Ok value if everything actually was successful) so it could have been used in performance critical code where it's somehow already known that the call cannot possibly fail. Of course, if compiler could compute at compile time that the function call cannot possibly fail, the error handling and error checking would naturally be totally optimized away without using any unsafe code so using Result instead of pure data as return value would have caused zero overhead whenever possible.
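As a side note, an unsafe variant along these lines does exist next to the regular unwrap(): Result::unwrap_unchecked and Option::unwrap_unchecked. A small sketch of how it might be used (first_byte and sum_first_bytes are invented for the example):

```rust
/// Fallible helper: returns the first byte or an error for empty input.
pub fn first_byte(bytes: &[u8]) -> Result<u8, &'static str> {
    bytes.first().copied().ok_or("empty input")
}

/// Sums the first byte of every non-empty chunk.
pub fn sum_first_bytes(chunks: &[&[u8]]) -> u32 {
    let mut total = 0u32;
    for chunk in chunks.iter().copied().filter(|c| !c.is_empty()) {
        // SAFETY: the filter above only lets non-empty chunks through,
        // so `first_byte` is guaranteed to return Ok here.
        let b = unsafe { first_byte(chunk).unwrap_unchecked() };
        total += u32::from(b);
    }
    total
}
```

Calling unwrap_unchecked on an Err is undefined behavior, so this only makes sense where the Ok case is already proven by surrounding code.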

And it would have been great if Err could contain full stack trace of the code location that emitted it but that might make sense for debug builds only to avoid having too much overhead for all use of Result type.
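Something like this can be approximated in user code with std::backtrace::Backtrace. A minimal sketch, assuming a hand-written error type (TracedError and parse_port are names I made up); whether a trace is actually collected by Backtrace::capture() depends on the RUST_BACKTRACE / RUST_LIB_BACKTRACE environment variables:

```rust
use std::backtrace::Backtrace;
use std::fmt;

/// Hypothetical error type that remembers where it was created.
#[derive(Debug)]
pub struct TracedError {
    message: String,
    backtrace: Backtrace,
}

impl TracedError {
    pub fn new(message: impl Into<String>) -> Self {
        TracedError {
            message: message.into(),
            // Captures a trace only if enabled through the environment;
            // otherwise this is a cheap, disabled placeholder.
            backtrace: Backtrace::capture(),
        }
    }

    pub fn backtrace(&self) -> &Backtrace {
        &self.backtrace
    }
}

impl fmt::Display for TracedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)
    }
}

impl std::error::Error for TracedError {}

/// Example of a fallible function that emits the traced error.
pub fn parse_port(text: &str) -> Result<u16, TracedError> {
    text.trim()
        .parse::<u16>()
        .map_err(|e| TracedError::new(format!("invalid port {:?}: {}", text, e)))
}
```

The caller can print err.backtrace() when reporting the error, which gives the code location that emitted the Err rather than the place where it was finally handled.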

While this may be a nice "trick" in a debugging scenario, it would be a big problem in production environments because, for example, file handles would not be properly closed. This could lead to "lingering" processes that never get terminated.

You can opt-in to abort on panic (instead of unwinding the stack), which sort-of achieves something similar, plus it creates a memory dump for debugging.

I believe it's also possible to run a process in a way that allows debugging once the process receives SIGABRT, but I'm not sure.


This may be helpful for debugging in certain restricted contexts (like certain embedded devices), but it would be absolutely disastrous in any other case. Imagine if an app you run, instead of crashing, just hangs indefinitely. You wouldn't even know whether it is hung or busy, and even if you did, you may not have the source code to debug it.

This would also forbid recovering safely from a panic in a thread. The thread would hang forever, consuming resources, and if you send SIGKILL to the process, then a single panicking thread means a crash of the whole process. Terrible design.
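To illustrate what recovering from a panic in a thread looks like today, here is a small sketch (run_worker is a made-up function). With the default unwinding behavior, the panic ends only the worker thread and join() hands the panic payload back to the parent as an Err:

```rust
use std::thread;

/// Runs a computation on a worker thread and turns a panic into an Err.
pub fn run_worker(input: i32) -> Result<i32, String> {
    let handle = thread::spawn(move || {
        if input < 0 {
            panic!("negative input: {}", input);
        }
        input * 2
    });

    // `join` returns Err(payload) if the thread panicked.
    handle.join().map_err(|payload| {
        // A formatted panic message arrives as a String,
        // a plain literal message as a &'static str.
        payload
            .downcast_ref::<String>()
            .cloned()
            .or_else(|| payload.downcast_ref::<&str>().map(|s| s.to_string()))
            .unwrap_or_else(|| "unknown panic".to_string())
    })
}
```

With a panic that hangs the thread forever, join() would never return and this kind of isolation would not be possible.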

Finally, maybe for your applications explicit overflow checks everywhere are ok, but outside of security-critical applications it's usually more trouble than it's worth. Just crash on unexpected inputs. Also, with your design it would be even more boilerplate to try to track the place where overflow happened, basically guaranteeing that error reporting will be as useful as returning NAN. I can't see how you can argue it to be better than a crash on the errored line with a full stack trace.

That would really annoy me. Because my code should never panic!. If some strange, rare, situation arises that causes it to panic I want it to exit as soon as possible and return control to systemd, or whoever ran it, where the panic is logged and the program can be restarted automatically. Having it hang in a loop would ruin everything.

If you want hang-on-panic for your embedded microcontroller, simply write the panic_semihosting example here: #[panic_handler] - The Rustonomicon

I have a difficult time imagining any common scenario in which I would want this. Probably you can do this for cases where you need/want it with a panic hook: set_hook in std::panic - Rust
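A rough sketch of what such a hook might look like. The install_hook function, its hang_on_panic flag, and the counter are my own inventions for the example; only std::panic::set_hook is the real API here:

```rust
use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;
use std::time::Duration;

/// Counts how many panics the hook has reported (illustration only).
static PANICS_SEEN: AtomicUsize = AtomicUsize::new(0);

pub fn panics_seen() -> usize {
    PANICS_SEEN.load(Ordering::SeqCst)
}

/// Installs a process-wide panic hook that reports the panic and,
/// if `hang_on_panic` is set, parks the panicking thread forever
/// so a debugger can be attached.
pub fn install_hook(hang_on_panic: bool) {
    panic::set_hook(Box::new(move |info| {
        PANICS_SEEN.fetch_add(1, Ordering::SeqCst);
        eprintln!("panic reported by custom hook: {}", info);
        if hang_on_panic {
            loop {
                thread::sleep(Duration::from_secs(1));
            }
        }
    }));
}
```

Calling install_hook(true) early in main would give the hang-on-panic behavior for that one program, without changing what panic!() means for everybody else.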

I think you're basically arguing for this (conceptual) style of error handling: Using unwrap() in Rust is Okay - Andrew Gallant's Blog

I talk about why it's not a good idea in the blog. For folks who insist otherwise, I generally ask to see real world production code that utilizes this style of error handling, and also ask about the resources required to write and maintain such code.

I've been pretty staunchly opposed to unwrap() and panic!(); the only places I've added them are where an "impossibility" has occurred. A hyperbolic example perhaps, but sort-of along the lines of:

let x = 1;
match x {
  1 => { /* .. do something .. */ },
  _ => panic!("We're *really* in trouble now")
}

i.e. basically in the domain of "galactic rays did it".

I'm currently working on a library crate where the programmer must pass in a "batch size", and it makes absolutely no sense to pass 0 (in fact, it would lead to interesting (undefined) breakage). Before reading this thread I would have translated this check into a Result, but after reading your blog I'm starting to realize that it's okay to panic (as long as it's clearly documented that it will panic if the batch size is zero). My reasoning against this has been "what if it's not the application developer passing in a hard-coded value, but they for some reason allowed the user to specify it, and the user passes a zero", but I've come around to thinking that if the application developer does that, it's up to them to validate the size (and return a Result as appropriate) before passing it on to a function that may panic [as documented].
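Roughly, the split I have in mind could look like the sketch below. The function names are placeholders, not the actual crate's API: the library function documents its panic, and the application validates user input before calling it.

```rust
/// Library side: number of batches needed to cover `len` items.
///
/// # Panics
///
/// Panics if `batch_size` is zero.
pub fn batch_count(len: usize, batch_size: usize) -> usize {
    assert!(batch_size != 0, "batch_size must be non-zero");
    let full = len / batch_size;
    if len % batch_size != 0 {
        full + 1
    } else {
        full
    }
}

/// Application side: validate untrusted input and return a Result,
/// so a bad value never reaches the function that may panic.
pub fn parse_batch_size(input: &str) -> Result<usize, String> {
    match input.trim().parse::<usize>() {
        Ok(0) => Err("batch size must be at least 1".to_string()),
        Ok(n) => Ok(n),
        Err(e) => Err(format!("invalid batch size {:?}: {}", input, e)),
    }
}
```

A zero that reaches batch_count is then a bug in the calling program, which is the case a panic is meant for.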


Note there is also the unreachable! macro.
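A small sketch of how it reads in an "impossible" match arm (parity_label is just an example function):

```rust
/// Labels a number as even or odd.
pub fn parity_label(n: u32) -> &'static str {
    match n % 2 {
        0 => "even",
        1 => "odd",
        // The compiler cannot see that `n % 2` is always 0 or 1,
        // so the catch-all arm is required; `unreachable!` documents
        // that reaching it would be a bug, and panics if it happens.
        _ => unreachable!("n % 2 is always 0 or 1"),
    }
}
```

It behaves like panic!() at runtime but states the intent more clearly to whoever reads the code.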


Note that you could also use std::num::NonZero{U32, U64, Usize, ...} for this purpose.
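For the batch size case that might look like the following sketch (batch_count_nz is a hypothetical name). The zero check moves to NonZeroUsize::new, which returns an Option, so the function itself has nothing left to panic about:

```rust
use std::num::NonZeroUsize;

/// Number of batches needed to cover `len` items.
/// A zero batch size cannot be expressed in the argument type.
pub fn batch_count_nz(len: usize, batch_size: NonZeroUsize) -> usize {
    let size = batch_size.get();
    let full = len / size;
    if len % size != 0 {
        full + 1
    } else {
        full
    }
}
```

The caller decides how to handle the None from NonZeroUsize::new(0), whether that is an unwrap() on a hard-coded value or a proper error for user input.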


Error handling libraries such as snafu and thiserror offer this feature.
