Negative views on Rust: panicking

As I have said: the solution is to use monads and higher kinded types. You handle lack of panic similarly to how you handle lack of NULL in Rust.
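For readers who have not seen the "lack of NULL" analogy spelled out, here is a minimal sketch of what it looks like in today's Rust: the failure case is carried in the return type (`Option`) and chained with `?`, instead of being raised as a panic. The function names `safe_div` and `ratio_of_ratios` are made up for illustration, and this only approximates the monadic style the post has in mind; it is not a claim about how a panic-free Rust would actually be designed.

```rust
/// Divide without panicking: `None` stands in for the failure case
/// (division by zero, or the `i32::MIN / -1` overflow).
fn safe_div(a: i32, b: i32) -> Option<i32> {
    a.checked_div(b)
}

/// Chain three fallible steps; the first `None` short-circuits the rest,
/// much like a monadic bind would.
fn ratio_of_ratios(a: i32, b: i32, c: i32, d: i32) -> Option<i32> {
    let x = safe_div(a, b)?;
    let y = safe_div(c, d)?;
    safe_div(x, y)
}
```

The caller is forced by the type to decide what a missing result means, which is the same discipline `Option` imposes in place of null pointers.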

But I think for now the compromise chosen in Rust (no higher kinded types thus no monads and panic as band-aid for lack of all that) is better: people already complain that it's hard to learn Rust, I think if it had all the proper machinery to live without panics people would have just never picked it up in the first place.

3 Likes

But you can define your own panic handlers?

1 Like

You can, but that means either bringing pretty heavy unwinding machinery to the microcontroller or adding threads to your firmware, which is even heavier.

A more realistic alternative is to structure your code in a way that panic is never triggered, but that's also quite non-trivial. And fragile: a change in a third-party crate may easily make your code uncompilable.

I want to clarify this a bit. I agree that there are costs to panics, though I don't know if the runtime costs when you don't use panics are as notable as you imply.

(Or put another way, I would say the cost of unused panics is similar to that of bounds checking, which is in a similar position: in a perfect world, unnecessary.)

First off: panics are (intended to be) not an error handling mechanism, and anyone using them as such is making an explicitly nonportable decision.

Instead, panics are an alternative to aborting. Rather than taking down the whole thread or process with an abort, unwinding allows destructors to run and for the program to only take down the logical inflight task. Panics are (supposed to be) non-recoverable errors; the only way to continue after a panic is to trash that task and move on. For short-lived programs that's effectively an abort (modulo the drop cleanup), but for long-lived programs that is an important feature to have.
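A small sketch of the "only take down the logical inflight task" idea, using a thread as the task boundary. It assumes the default panic=unwind; under panic=abort the whole process would stop instead. The helper name `run_task` is hypothetical.

```rust
use std::thread;

/// Run a job on its own thread and report whether it finished or panicked.
/// A panic unwinds only that thread; `join` surfaces it as an `Err`.
fn run_task<F>(job: F) -> Result<i32, String>
where
    F: FnOnce() -> i32 + Send + 'static,
{
    thread::spawn(job)
        .join()
        .map_err(|_| "task panicked".to_string())
}
```

Calling `run_task(|| -> i32 { panic!("boom") })` yields an `Err`, and the caller can go on to run the next task. Note that nothing here retries or repairs the failed task; it is simply discarded, which matches the "trash that task and move on" description.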

Also important is that while unwinding support is default, panic=abort is a normal compilation mode for Rust. The cost of an untaken panic is primarily one of code size. On Windows, it is purely code size; the metadata to allow for unwinding is required by the OS, and only the landing pads can be omitted. On other OSes, you can have non unwindable frames, so there is a bit larger cost to setting it up, but still almost entirely just in code size.

When you set panic=abort, Rust doesn't emit the unwind landing pads, so basically all of the language runtime costs of panics disappear. What remains afterwards is the checks to panic (and abort), and the implementation cost of being correct in the face of unwinding. If you know that you're only run with panic=abort, you can just ignore unwind correctness. For a general purpose library, though, you need to be unwind safe, so we'll focus on that under the "don't pay for what you don't use" principle.

(And I want to note that writing code to avoid panics is at worst as burdensome as writing code to avoid UB in C or C++. Rust is a younger language, so there's fewer best practices and less tooling to help with doing so, but it's effectively the same. With some Code Crimes™️, you can even effectively just make panics UB (by making the panic handler call unreachable_unchecked).)

Mutex poisoning is generally understood to be a library implementation mistake (well, suboptimality). Poisoning is not required for safety, and the parking_lot primitives don't have any panicking. Rather, poisoning is a lint against potentially logically incorrect state, that can be ignored, even for the std mutex.
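To make "can be ignored, even for the std mutex" concrete, here is a minimal sketch. `lock_ignoring_poison` and `poison` are names invented for this example; the only std APIs involved are `Mutex::lock`, `PoisonError::into_inner`, and `Mutex::is_poisoned`.

```rust
use std::sync::{Arc, Mutex, MutexGuard};
use std::thread;

/// Lock a mutex, ignoring poisoning: if a previous holder panicked,
/// take the guard out of the `PoisonError` and carry on.
fn lock_ignoring_poison<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    m.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Poison a mutex on purpose by panicking while holding its guard.
fn poison(m: Arc<Mutex<i32>>) {
    let _ = thread::spawn(move || {
        let mut guard = m.lock().unwrap();
        *guard += 1;
        panic!("panic while holding the lock");
    })
    .join();
}
```

After `poison` runs, a plain `lock()` returns `Err`, but the data is still there and still memory-safe to access; whether it is still logically consistent is exactly the question the poison flag asks the caller to think about.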

The better design would've been to have poisoning be a separate composable functionality, rather than baked into the design of Mutex. It won't change the API, but perhaps we'll be able to put a const bool onto Mutex to control whether it poisons in the future.

Only insomuch as they need to be unwind safe. If you're writing safe code, this is a non-problem, and you don't have to worry about it. If you're writing unsafe code, this basically falls to the "pre poop your pants" principle—any time you pass control flow back to downstream safe code, you should be in a safe state even if no more of your code is run. However, this is the case even without unwinding, as safe code can forget to run your code.

The only case where unwind safety is separate from forget safety is when you call a user closure (or trait impl, which does include basic operators on unknown types tbf). Here you are guaranteed that control flow returns to you, though it may do so via an unwind. The requirement is that you do your world fixup in a Drop handler (or have an abort-on-drop bomb), which I may note is probably a good idea anyway, so it's much more difficult to accidentally forget to perform the fixup. The cost when your code doesn't panic is just that of developer cost to be unwind safe. Not even unwind correct (where stuff is all dropped correctly), I might add, just unwind safe (where nothing is double dropped and things might get forgotten).
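A minimal sketch of "do your world fixup in a Drop handler", using a safe toy invariant (a busy flag) in place of real unsafe state. `ResetOnDrop` and `with_busy_flag` are hypothetical names; the pattern, not the API, is the point.

```rust
use std::cell::Cell;

/// Clears the flag when dropped, whether by normal return or by unwinding.
struct ResetOnDrop<'a>(&'a Cell<bool>);

impl Drop for ResetOnDrop<'_> {
    fn drop(&mut self) {
        self.0.set(false);
    }
}

/// Set `busy`, run the caller's closure, and guarantee `busy` is cleared
/// afterwards even if the closure panics.
fn with_busy_flag<R>(busy: &Cell<bool>, f: impl FnOnce() -> R) -> R {
    busy.set(true);
    // The fixup lives in the guard, so it cannot be skipped by an early
    // return, a `?`, or an unwind out of `f`.
    let _guard = ResetOnDrop(busy);
    f()
}
```

Wrapping a panicking call in `std::panic::catch_unwind` and checking the flag afterwards shows it has been reset. When `f` does not panic, the guard costs nothing beyond the write that would have been needed anyway.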

It's also very much worth noting "don't pay for what you don't use" only refers to runtime cost. It doesn't say anything about developer implementation costs, or compile time costs. In fact, basically as a rule, every feature which is runtime-free has non-negligible developer cost to use and compile time costs to optimize out.

The only example here really is implementing a += b; as ptr::write(a, ptr::read(a) + b);. This is fundamentally incompatible with a world where a + b could unwind. Instead, this has to be implemented primitively or as ptr::write(a, { let bomb = AbortOnDrop; let a = ptr::read(a) + b; forget(bomb); a }); to temporarily force unwinds to abort the process.
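Spelled out as a compilable sketch, the bomb version looks roughly like this. The type name `AbortOnDrop` comes from the post; the function name `add_assign_via_add` and the generic signature are assumptions made for the example.

```rust
use std::mem;
use std::ops::Add;
use std::process;
use std::ptr;

/// Aborts the process if dropped; it is only ever dropped when `+` unwinds.
struct AbortOnDrop;

impl Drop for AbortOnDrop {
    fn drop(&mut self) {
        process::abort();
    }
}

/// `*a += b` expressed through `Add`, moving the old value out of `*a`.
fn add_assign_via_add<T, U>(a: &mut T, b: U)
where
    T: Add<U, Output = T>,
{
    let p: *mut T = a;
    // SAFETY: `p` comes from a valid `&mut T`. Between the `read` and the
    // `write`, `*a` is logically moved-out; the bomb turns any unwind in
    // that window into an abort, so no one can observe or double-drop it.
    unsafe {
        let bomb = AbortOnDrop;
        let sum = ptr::read(p) + b;
        mem::forget(bomb); // the addition returned normally: defuse
        ptr::write(p, sum);
    }
}
```

For example, `add_assign_via_add(&mut s, "bar")` on a `String` holding `"foo"` leaves `"foobar"` in place. On the non-panicking path the bomb is a zero-sized value that is forgotten, so the cost is in the extra code, not in the work done at run time.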

(However, the presence of large types and the fact that move elision is spotty at best means that AddAssign would still want to be a separate trait, even if default implemented. And it's just as easy to point the default in the other direction, without any unwinding issues; a + b can be implemented as { a += b; a } just fine.)
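The "other direction" can be sketched with a toy type. `Vec2` is a hypothetical stand-in for something larger such as a matrix; `AddAssign` is the primitive and `Add` is derived from it.

```rust
use std::ops::{Add, AddAssign};

/// A toy 2-D vector; a stand-in for a larger type such as a matrix.
#[derive(Debug, Clone, PartialEq)]
struct Vec2 {
    x: i64,
    y: i64,
}

/// The primitive operation: update in place.
impl AddAssign for Vec2 {
    fn add_assign(&mut self, rhs: Vec2) {
        self.x += rhs.x;
        self.y += rhs.y;
    }
}

/// `a + b` defined as `{ a += b; a }`. There is no moved-out window here:
/// if `add_assign` unwound, `self` would still be an ordinary live value
/// and would be dropped normally.
impl Add for Vec2 {
    type Output = Vec2;

    fn add(mut self, rhs: Vec2) -> Vec2 {
        self += rhs;
        self
    }
}
```

No unsafe code and no abort bomb are needed in this direction, which is the point being made above.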


I just also want to point out one more time that every operation that panics in Rust is "just don't do that, it's UB" in C. In specific situations, it might be easier to accidentally panic in Rust or easier to accidentally execute UB in C, but this is a property of library design, not of the language. There are huge parts of libc which go unused due to safety issues or just that they're unsuitable to your problem, and even huger tracts of the C++ STL go unused. While Rust isn't perfect for a must-not-unwind environment (which might just be a kernel? since anything that isn't the unwind implementation itself can probably afford unwinding support for the extra reliability?) it's clearly at worst as-good-as C or C++ in these environments, just younger, so there's less knowledge to go around of language-specific best practices for writing such code.

If a library causes an operation that previously was valid to now panic, then that is either a) a clear bug in the library (which, I may note, is language independent), or b) a case where you did library UB, which is your fault in unsafe. [1]

In all cases, I would rather my program halt (hopefully with some sort of indication of why) than to execute UB and have some sort of unpredictable, broken behavior with minimal, if any, indication of why. Obviously, (hopefully a message plus) not taking down the whole program is better, either via a caught task side-channel unwind or main-channel error unwind.


One last thing to point out: you can't have subscript syntax without it having side-channel failure. A key feature of subscripting is that a subscript expression produces an lvalue in C/C++ terms, or a place in Rust terms. This is not a type you can wrap into an error handling reference type (easily... C++ has T& lvalue references, and while you can't write std::optional<T&> (and people complain about it), you do have std::reference_wrapper<T>).

The key problem is that arr[ix] is a move and &arr[ix] is a reference-to-place. On top of this, how you treat a place (by-move, by-ref, by-mut) is (in Rust) not a property of syntax, but inferred from how the place is used (via method autoref).

This is easily "resolvable" by just using methods to choose the behavior you want, rather than overloaded syntax. Perhaps it could even be served by an explicit return type overloading feature? But at the very least, it's a complicated problem space to address, without an existing solution.
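As a sketch of "just using methods": slices already expose `get` and `get_mut`, which pick the access mode by name and report failure in the return type. The wrappers `read_at` and `write_at` are hypothetical helpers written for this example.

```rust
/// Read by value: `None` instead of a panic when `ix` is out of range.
fn read_at(arr: &[i32], ix: usize) -> Option<i32> {
    arr.get(ix).copied()
}

/// Write in place: the by-mut access is chosen by calling `get_mut`,
/// and the failure is reported through the return value.
fn write_at(arr: &mut [i32], ix: usize, value: i32) -> Result<(), String> {
    let len = arr.len();
    match arr.get_mut(ix) {
        Some(slot) => {
            *slot = value;
            Ok(())
        }
        None => Err(format!("index {} out of range for length {}", ix, len)),
    }
}
```

Compared with `arr[ix]`, the by-ref versus by-mut choice is now visible in the method name rather than inferred from how the place is used, at the price of noisier call sites.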

On panic alternatives

I suspect that any such mechanism will look from a language level heavily like (potentially checked) unwinding. The fact that unwinding is implemented via OS unwinding is an implementation detail (kind of by definition in C++, but explicitly in Rust).

(Not-a-mod hat: yeah, pulling the how-to-avoid-panics talk into a different thread seems a good idea. If a mod splits it, sorry for mixing in this post...)


  1. I think you're more talking about the case where you're using panic=link-failure, and a library refactors such that a code path you're taking no longer manages to optimize out the panicking path.

    I believe that safe, (provably) unwind/abort free code clearly falls under "make hard things possible" in language design. Hard-realtime systems with such stringent requirements of proof of panic freedom (that I should add can't use C since it can't prove the lack of UB; if you would allow code review to prove the lack of UB in C, then you can do the same for lack of panics in Rust) are such a niche application, even in systems programming (non-safety-critical hard-realtime, where a (diagnosed) failure is "just" (expensive) downtime, can afford best-effort panic elimination rather than a full machine-checked proof) that it is unfair to push that requirement onto general purpose library authors. (The ability to reuse general purpose code for hard-realtime in the first place is a massive benefit of Rust!)

    And with that said, panics no longer being optimized out in a library is a likely indicator of a performance regression, so potentially a perf bug report to be made and fixed upstream.

15 Likes

OK, yep.

Having spent many years working on such embedded systems in the avionics, military and other industries...Absolutely yes.

Better the program dies immediately than try and limp along in some undefined state. Which is what a panic is indicating.

If things are that safety critical then:

  1. Typically a hardware watchdog trips out and restarts the system. Perhaps fast enough that it's almost unnoticeable that some control loop has been interrupted.

  2. There are multiple redundant systems.

  3. In cases like the Boeing 777 the pilot can shut down all the Primary Flight Computers and revert to analog electronics control. A bit harder to fly but doable.

  4. The amount of testing that goes on to try and cover every execution path and input data is immense (or at least it was back in the days I was involved in testing the 777's PFC's)

Perhaps. But this worries me. A panic indicates that the program is in an undefined state. How can one be sure the work saved is not corrupt?

Anyway, I'd still like to see someone post an example, hopefully not too huge and complex, of where a panic is problematic.

7 Likes

(documentation for future readers: this thread started as a discussion in Negative views on Rust ( not mine! ) and was moved here; this is the line where the posts in the new thread start)

The big reason to have AddAssign is to allow the writing of efficient and pretty code. Of course, if you don't want to use the += operator, you don't need it. But if you accept the value of += and want it to be efficient, then you can't implement it for large data (e.g. matrices) using Add.

2 Likes

These two sentences contradict each other. Either panics are non-recoverable or they are important for long-lived programs.

The truth is somewhere in the middle: panics are recoverable (and thus have nontrivial overhead), but they are not designed as a general-purpose error reporting mechanism, which makes them even more cumbersome (you pay a lot for something you very rarely use).

No. The main cost is the fact that it makes certain things unsound. One example was already shown: you can't move a value out of a variable, change it, and put it back, because operations in the middle may panic.

That's not 100% true: Native Client under Windows happily executes code which doesn't include any such metadata.

Of course if you want to use third-party DLLs then you may need these.

Unwinding tables, very often, also have non-trivial size. Sometimes larger than actual code. Or do you count them as “code size”, too?

How can I do that? With unsafe? Hardly a convenient way to go.

But that's the exact opposite of what you usually want to achieve. The goal is not to turn panic into UB (what good would it do for you?) but to ensure that certain code wouldn't ever trigger panic (this is, basically, the explicit request from Linus and in general makes sense for a system-level language)!

Rust doesn't offer any sensible way of achieving that (there are some hacks, but they are quite fragile).

That's quite a strange argument. Being as good as C is not an achievement. If I was happy with C I would have continued to use C!

Embedded doesn't like unwind tables in general. They tend to be quite big and flash space is scarce on many platforms.

This, again, contradicts everything you wrote before: if panic was treated like UB is treated in the C/C++ world and developers actively avoided it then yes, it would have been similar. But that's not how people use Rust. Most developers are quite happy to call a function which may panic in case of severe errors. Which, in turn, makes their crates unsuitable for cases where you cannot panic. And you may only learn about that after the fact, once the implementation is changed to call panic.

Yes, it doesn't have an easy solution in Rust. Haskell, OCaml (and other similar languages) use monads to resolve that issue.

Well… the main problem with panic is its “Schrödinger's nature”: you can't know whether certain code may panic unless you investigate the generated code (and sometimes not even then: the compiler may emit panic paths in places which may never, in reality, be executed).

I don't know what you mean by “heavily like (potentially checked) unwinding” thus it's hard for me to say if monadic error handling is what you had in mind or not.

2 Likes

I get more and more confused by this conversation.

My understanding is that a panic is a sure indication of UB. Like for example an out-of-bounds array access.

The great thing being that Rust panics and the program is aborted rather than limping along after UB, as C or C++ would do.

What am I missing here?

Still waiting for an example that demonstrates this.

1 Like

UB (in Rust similarly to C and C++) is something that should never happen. If it's triggered in a C, C++, or Rust program (in Rust this can only happen if you have incorrect unsafe code, though the symptoms may show up in safe code and much later) then literally anything at all may happen.

It's only UB if you access it without checking. In that case anything can happen: your program may return a random result, crash, or destroy the universe…

panic is most definitely not UB: Option::<i8>::None.unwrap() would never format your SSD.

The fact that Rust, too, has UB? Which is distinct from panic?

This is probably the simplest example I can think of:

fn main() {
    // UB on purpose: NonZeroI8 must never hold 0, and new_unchecked skips the check.
    let i = Some(unsafe { core::num::NonZeroI8::new_unchecked(0) });
    if let Some(_) = i {
        println!("i is Some: {:?}", i);
    } else {
        println!("i is None");
    }
}

Here you have Some which is also a None and in general you may have the exact same UB as in C/C++.

panic is most definitely not an UB, it's safe and sound… although it's not always convenient.

1 Like

I have actually implemented it for large data (bignums) by using Add, and I believe it's efficient.

Is it because you think that passing arguments by reference is more efficient than passing by value?

I don't believe this is true if the objects are just handles to large data, like Vec, String or BigInt. I can even think of reasons why it might be less efficient to pass by reference, due to an extra indirection.

If we're passing a large struct that stores the big data directly without indirection, I could believe it might potentially be true, but is it actually?

If it's true, it can potentially be solved with a better calling convention. Passing large objects by value can be done by just passing a pointer to the stack in a register, behind the scenes, rather than actually moving these bytes to a different place on the stack.

And regardless of the calling convention, it's likely that all the moves will be optimized out, if the delegation to Add is going to be inlined into AddAssign. The compiler should see directly that the input and output are ultimately in the same place.
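For handle-like types, the delegation can be written in safe code with `mem::take`, which sidesteps the unwinding problem discussed earlier in the thread. This is only a sketch of the ownership pattern: `Digits` is a hypothetical stand-in for a bignum, and the addition deliberately ignores carries.

```rust
use std::mem;
use std::ops::{Add, AddAssign};

/// Hypothetical handle type: the struct is three words, the digits live on the heap.
#[derive(Debug, Default, PartialEq)]
struct Digits(Vec<u32>);

/// Element-wise sum (no carry handling; this example is only about ownership).
impl Add for Digits {
    type Output = Digits;

    fn add(mut self, rhs: Digits) -> Digits {
        if self.0.len() < rhs.0.len() {
            self.0.resize(rhs.0.len(), 0);
        }
        for (d, r) in self.0.iter_mut().zip(rhs.0) {
            *d = d.wrapping_add(r);
        }
        self
    }
}

impl AddAssign for Digits {
    fn add_assign(&mut self, rhs: Digits) {
        // `take` leaves an empty `Vec` behind (which does not allocate), so
        // `*self` is a valid value at every point, even if `+` unwinds.
        *self = mem::take(self) + rhs;
    }
}
```

Only the three-word handle is moved; `add` works on the existing heap buffer and hands it back, growing it only when the right-hand side is longer. Whether the handle moves themselves are optimized away depends on inlining, as the post says, so treat any performance claim here as something to measure rather than assume.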

Totally with that idea.

Except I expect Rust code that does not use "unsafe" to panic on UB. Modulo integer overflow perhaps.

Thank you.

Ah, yes. You have "unsafe" in there. At which point I would say all bets are off regards UB or panic. With "unsafe" you have taken your life into your own hands. Good luck.

1 Like

Rust code that does not use unsafe never produces any UB. That's the whole point behind the distinction between safe and unsafe.

2 Likes

I would expect Native Client to generate the unwind info for you based on its understanding of the written x86 code. Native Client places a lot of restrictions on the x86 code that make this kind of analysis feasible, unlike normal code.

2 Likes

Rust will panic on things like out-of-bounds to prevent UB from happening. If Rust didn't panic and actually performed the buffer overflow, then that would qualify as UB, but simply trying to do it and panicking is not UB.
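A short sketch of that distinction: the out-of-range index is checked, the panic is raised before any memory is touched, and the panic itself can be observed as an ordinary value. It assumes the default panic=unwind; `element` and `try_element` are names made up for the example.

```rust
use std::panic::catch_unwind;

/// Plain indexing: safe Rust checks the bound and panics if `ix` is too
/// large, so no out-of-range memory is ever read.
fn element(v: &[i32], ix: usize) -> i32 {
    v[ix]
}

/// Observe the panic as `None` instead of letting it propagate.
fn try_element(v: Vec<i32>, ix: usize) -> Option<i32> {
    catch_unwind(move || element(&v, ix)).ok()
}
```

`try_element(vec![1, 2, 3], 3)` returns `None` every time, on every platform, which is what "defined behavior" means here. (In real code `v.get(ix)` is the idiomatic way to get an `Option`; `catch_unwind` is used only to show that the panic is well-defined.)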

7 Likes

Purely safe Rust code can not cause UB by itself, thus discussing what would happen on UB in purely safe code is a moot point: it just never happens.

Integer overflow is not UB in Rust, neither in safe nor unsafe code. Its behavior can be changed with compiler flags (usually it panics in debug mode and wraps around in release mode), but it's never UB.

Well, sure, but this makes the whole idea of turning panic (which, again, is not UB and can be easily caused by purely safe code) into UB a bit bizarre: why would you bring unsoundness into safe Rust on purpose? What would be the point? What may you try to achieve?

Certainly it does. Rust code that never uses "unsafe" can still result in generated code that tries to use an out-of-bounds array index, for example.
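Independently of the compiler flags, the integer types also offer methods that pin down one overflow behavior explicitly. A minimal sketch (the wrapper function `overflow_behaviors` is hypothetical; the four methods are standard):

```rust
/// Each method fixes one overflow behavior, regardless of build profile:
/// checked (Option), wrapping, saturating, and overflowing (value + flag).
fn overflow_behaviors(a: u8, b: u8) -> (Option<u8>, u8, u8, (u8, bool)) {
    (
        a.checked_add(b),
        a.wrapping_add(b),
        a.saturating_add(b),
        a.overflowing_add(b),
    )
}
```

For `250u8` and `10u8` this gives `(None, 4, 255, (4, true))`. None of the four outcomes is UB, and none depends on whether overflow checks are enabled.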

Luckily (well, by design) that UB I have written is caught by a run-time check.

All is well.

And hence I said: "I expect Rust code that does not use "unsafe" to panic on UB.".

Exactly.

2 Likes

Perhaps, but that's not what Native Client does.

For x86 code it would have been absolutely impossible (unwinding machinery in Windows is not ready to cope with non-zero CS/DS offsets) and x86-64 doesn't generate any such tables either.

When you say "by design" you mean it is defined to behave so by the language specification. Hence it is not undefined. Undefined behavior means precisely that -- the behavior of the system is not defined by the specification.

2 Likes