I've got a backend web service written in Rust, and I want to make sure I have a sound error-handling strategy for both forms of Rust errors: panics and Results.
The preferred path for error handling is obviously Result, so all of the code I have written thus far is based around bubbling errors up the stack with ? using anyhow (https://docs.rs/anyhow/1.0.28/anyhow/). This is pretty nice, as it is similar to bubbling a Java runtime exception up. So I am good for 99% of my errors!
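To make the bubbling pattern concrete, here is a minimal sketch. It uses std's `Box<dyn Error + Send + Sync>` as a stand-in for `anyhow::Error` so that it compiles without any dependencies; the names `AppResult`, `MissingField`, `parse_quantity`, and `handle_request` are hypothetical and exist only for this illustration.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical application error, used only for this illustration.
#[derive(Debug)]
struct MissingField(&'static str);

impl fmt::Display for MissingField {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "missing field: {}", self.0)
    }
}

impl Error for MissingField {}

// Stand-in for anyhow::Result<T>: any error type can be boxed and bubbled up.
type AppResult<T> = Result<T, Box<dyn Error + Send + Sync>>;

// Lowest layer: parsing can fail with std's ParseIntError.
fn parse_quantity(raw: &str) -> AppResult<u32> {
    // `?` converts the ParseIntError into the boxed error and returns early.
    let n = raw.trim().parse::<u32>()?;
    Ok(n)
}

// Middle layer: two different error types bubble up through the same `?`.
fn handle_request(body: Option<&str>) -> AppResult<u32> {
    let raw = body.ok_or(MissingField("quantity"))?;
    let qty = parse_quantity(raw)?;
    Ok(qty * 2)
}
```

The top-level request handler would then match on the returned Result once and map any `Err` to an error response, which is the part that resembles catching a runtime exception at the outermost layer.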
But what about panics? Many std library functions can panic, and as the code grows and pulls in more dependencies, the risk of the odd panic only grows. I think there are 2 options:
Abort the process on panics: I don't like this option. It will cause all other in-flight requests on the server to be cancelled, and it will trigger a restart, causing the init/warmup logic to be re-executed. The system that I am working on can obviously handle a situation like this, as there is a cluster of many instances and the callers all have retry and failure-handling logic, but I don't like that it causes a needless rolling of instances due to "background" levels of panics. And it is even worse than that. Suppose a bug is accidentally introduced that causes a panic affecting only 1% of callers. If that 1% calls frequently enough, the entire fleet will start rolling over, effectively DoSing the whole service.
Catch panics: Configure the server so that panics are caught and logged, and an error is immediately returned to the user. I guess the only issue here is what happens with legitimately non-recoverable errors such as OOM. I would want to abort the process on OOM. In Java you can easily distinguish between non-recoverable (Error, e.g. OutOfMemoryError) and recoverable (Exception) throwables. In Rust, can I make sure non-recoverable panics will abort?
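For reference, the "catch, log, return an error" option could be sketched with `std::panic::catch_unwind` roughly as follows. This is only an illustration under stated assumptions: the `Response` type and the `run_guarded` helper are hypothetical, and a real server framework may already provide an equivalent layer.

```rust
use std::panic::{self, AssertUnwindSafe};

// Hypothetical response type for this sketch; a real server would use its framework's type.
#[derive(Debug, PartialEq)]
enum Response {
    Ok(String),
    InternalError(String),
}

// Runs one request handler and converts an unwinding panic into an error response.
fn run_guarded<F>(handler: F) -> Response
where
    F: FnOnce() -> String,
{
    // AssertUnwindSafe is a promise that any state the handler touched is not reused after a panic.
    match panic::catch_unwind(AssertUnwindSafe(handler)) {
        Ok(body) => Response::Ok(body),
        Err(payload) => {
            // Panic payloads are usually &str or String; anything else is reported generically.
            let msg = if let Some(s) = payload.downcast_ref::<&str>() {
                (*s).to_string()
            } else if let Some(s) = payload.downcast_ref::<String>() {
                s.clone()
            } else {
                "unknown panic payload".to_string()
            };
            eprintln!("request handler panicked: {}", msg);
            Response::InternalError(msg)
        }
    }
}
```

Calling `run_guarded(|| -> String { panic!("boom") })` yields `Response::InternalError("boom".to_string())` instead of tearing down the thread. Note that `catch_unwind` only intercepts panics that unwind; it has no effect when the process aborts, for example in a binary built with `panic = "abort"`.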