Catching panics in a Rust web service: is this a sound error handling strategy?

I've got a backend web service written in Rust, and I want to make sure I have a sound error handling strategy for both forms of Rust errors: panics and Results.

The preferred path for error handling is obviously Result, so all of the code I have written thus far bubbles errors up the stack with ?, using anyhow (https://docs.rs/anyhow/1.0.28/anyhow/). This is pretty nice, as it is similar to bubbling a Java runtime exception up. So I am good for 99% of my errors!
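
For concreteness, here is a minimal sketch of that bubbling pattern. To keep it dependency-free it uses the standard library's Box<dyn Error> as a stand-in for anyhow's error type; the names AppResult, InvalidPort, parse_port and load_config are made up for illustration and are not from my actual service.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical catch-all result type; `anyhow::Result<T>` plays this role
// in the real code.
type AppResult<T> = Result<T, Box<dyn Error + Send + Sync>>;

// Hypothetical domain error, only here to show a second error type.
#[derive(Debug)]
struct InvalidPort(u32);

impl fmt::Display for InvalidPort {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "port {} is out of range", self.0)
    }
}

impl Error for InvalidPort {}

fn parse_port(raw: &str) -> AppResult<u16> {
    // `?` converts the ParseIntError into the boxed error and returns early.
    let n: u32 = raw.trim().parse()?;
    if n == 0 || n > 65535 {
        return Err(InvalidPort(n).into());
    }
    Ok(n as u16)
}

fn load_config(raw: &str) -> AppResult<String> {
    // Errors from parse_port bubble up unchanged through `?`.
    let port = parse_port(raw)?;
    Ok(format!("listening on port {}", port))
}

fn main() {
    for raw in ["8080", "abc", "70000"] {
        match load_config(raw) {
            Ok(msg) => println!("{}", msg),
            Err(e) => println!("config error for {:?}: {}", raw, e),
        }
    }
}
```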

But what about panics? Many standard library functions can panic, and as the code base grows and pulls in more dependencies, the risk of the odd panic only grows. I think there are two options:

  • Abort the process on panics: I don't like this option. It will cause all other in-flight requests on the server to be cancelled, and it will trigger a restart that re-executes the init/warmup logic. The system I am working on can obviously handle a situation like this, as there is a cluster of many instances and the callers all have retry and failure handling logic, but I don't like that it needlessly rolls instances due to "background" levels of panics. It is even worse than that: if a bug is accidentally introduced that causes a panic for only 1% of callers, and those 1% make frequent enough calls, the entire fleet will start rolling over, effectively DoSing the entire service.

  • Catch panics: Configure the server so that panics are caught and logged, and an error is immediately returned to the user. I guess the only issue here is what happens with legitimately non-recoverable errors such as OOM. I would want to abort the process on OOM. In Java you can easily distinguish between non-recoverable (Error) and recoverable (Exception) throwables. In Rust, can I make sure non-recoverable panics will abort?

Typically, web servers use catch_unwind to catch any panic that happens while handling a connection, and then log it. Note that out-of-memory errors do not panic; they immediately abort the process.
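
As a rough sketch of what that looks like, assuming a synchronous handler: the Response type and the handle_request, panic_message and serve functions below are hypothetical stand-ins, not the API of any particular framework. It also assumes the default panic = "unwind" setting; if the binary is built with panic = "abort", catch_unwind will not catch anything.

```rust
use std::any::Any;
use std::panic::{self, AssertUnwindSafe};

// Hypothetical response type standing in for whatever the framework uses.
#[derive(Debug, PartialEq)]
enum Response {
    Ok(String),
    InternalError(String),
}

// Hypothetical handler that panics on one specific input.
fn handle_request(path: &str) -> String {
    if path == "/boom" {
        panic!("handler bug for {}", path);
    }
    format!("hello from {}", path)
}

// Panic payloads are usually a &str or a String; anything else is opaque.
fn panic_message(payload: &(dyn Any + Send)) -> String {
    if let Some(s) = payload.downcast_ref::<&str>() {
        s.to_string()
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else {
        "non-string panic payload".to_string()
    }
}

// Run the handler, turning a panic into an error response instead of
// letting it unwind any further.
fn serve(path: &str) -> Response {
    match panic::catch_unwind(AssertUnwindSafe(|| handle_request(path))) {
        Ok(body) => Response::Ok(body),
        Err(payload) => {
            let msg = panic_message(&*payload);
            eprintln!("request {} panicked: {}", path, msg);
            Response::InternalError(msg)
        }
    }
}

fn main() {
    // The second request panics; the third still gets served.
    println!("{:?}", serve("/ok"));
    println!("{:?}", serve("/boom"));
    println!("{:?}", serve("/ok"));
}
```

AssertUnwindSafe is a promise by the caller that no state shared with the closure is left half-updated by a panic, so anything the handler mutates across requests deserves a second look before wrapping it this way.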

