Handling unintended panics in Tokio tasks gracefully

polybuildr · October 27, 2019, 9:56am

Hello!

I've been looking into using Tokio for a toy project and I'm trying to understand how to handle unintended panics in a Tokio task gracefully.

I understand that my code should never panic where it could use normal error handling instead, but I would still like to handle panics gracefully. They could come from my own code by accident, or from a crate that panics in cases I didn't properly account for.

As far as I understand, Tokio will by default let the thread crash, print the error to stderr and carry on. I noticed some work was being done to offer more control (tokio-rs/tokio#495, tokio-rs/tokio#700).

Ideally, I want to be able to see which tasks failed by panicking, log this somewhere, and possibly restart the failed task or start another task that serves the same purpose. Is the right way to do this to use catch_unwind on the Future? Is there some solution involving channels that would be the right way to do this? Am I trying to do too much, so that I would end up fighting with the Tokio runtime? Should I use another solution or crate? Any suggestions, advice or thoughts would be helpful, thanks!

alice · October 27, 2019, 12:01pm

You can call catch_unwind on your future before giving it to Tokio. This will turn it into a fallible future that results in an error when the inner future panics.
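To illustrate what "a fallible future that results in an error when the inner future panics" means, here is a minimal std-only sketch. It is not the futures crate's implementation: CatchPanic, Flaky, block_on, and panic_message are hypothetical names made up for this example, and block_on is a toy stand-in for a real runtime. With the futures crate, the same effect should be available as AssertUnwindSafe(fut).catch_unwind() via futures::FutureExt.

```rust
use std::any::Any;
use std::future::Future;
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical wrapper (not a Tokio type): polls the inner future inside
/// `std::panic::catch_unwind`, so a panic becomes an `Err` value.
struct CatchPanic<F> {
    inner: Pin<Box<F>>,
}

impl<F: Future> CatchPanic<F> {
    fn new(fut: F) -> Self {
        CatchPanic { inner: Box::pin(fut) }
    }
}

impl<F: Future> Future for CatchPanic<F> {
    type Output = Result<F::Output, Box<dyn Any + Send>>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let inner = self.inner.as_mut();
        // AssertUnwindSafe: we promise not to poll the future again after a panic.
        match catch_unwind(AssertUnwindSafe(|| inner.poll(cx))) {
            Ok(Poll::Pending) => Poll::Pending,
            Ok(Poll::Ready(value)) => Poll::Ready(Ok(value)),
            Err(payload) => Poll::Ready(Err(payload)),
        }
    }
}

/// Stand-in task: completes with 42, or panics on its first poll.
struct Flaky {
    fail: bool,
}

impl Future for Flaky {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        if self.fail {
            panic!("boom");
        }
        Poll::Ready(42)
    }
}

/// Extracts a readable message from a panic payload, if it is a string.
fn panic_message(payload: &(dyn Any + Send)) -> String {
    if let Some(s) = payload.downcast_ref::<&str>() {
        (*s).to_string()
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else {
        "non-string panic payload".to_string()
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor for the example only: polls the future until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}

/// Runs the stand-in task and reports a panic as an ordinary error string.
fn run(fail: bool) -> Result<u32, String> {
    block_on(CatchPanic::new(Flaky { fail })).map_err(|p| panic_message(&*p))
}
```

Calling run(false) returns the task's value, while run(true) returns the panic message as an Err instead of unwinding into the caller. The default panic hook still prints the message to stderr unless it is replaced.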

vorner · October 27, 2019, 12:14pm

It depends on what you mean by „gracefully“. I usually do one of two things:

1. Declare that panics are serious and not tolerated. In that case I compile with panic = "abort".
2. Accept that panics can happen sometimes. In that case I handle panics in „critical“ places (e.g. a singleton background thread that, if it dies, leaves the application in a half-dead state), where I abort manually. For the rest, I divert the panics to the log with log-panics. They then appear in the logs (unlike when printed to stderr) and can be picked up by some kind of monitoring system.
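Diverting panics to a log can be done with a panic hook, which is, as far as I know, how log-panics works. The sketch below shows the idea using only std::panic::set_hook. PANIC_LOG, install_panic_logger, and logged_panics are hypothetical names, and the in-memory Vec is an assumption standing in for a real logging backend.

```rust
use std::any::Any;
use std::panic;
use std::sync::Mutex;

/// Hypothetical in-memory sink standing in for a real logger.
static PANIC_LOG: Mutex<Vec<String>> = Mutex::new(Vec::new());

/// Installs a global panic hook that records every panic in PANIC_LOG.
/// This replaces the default hook, so the usual stderr message is no longer printed.
fn install_panic_logger() {
    panic::set_hook(Box::new(|info| {
        let payload: &(dyn Any + Send) = info.payload();
        let message = if let Some(s) = payload.downcast_ref::<&str>() {
            (*s).to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "non-string panic payload".to_string()
        };
        let location = info
            .location()
            .map(|l| format!("{}:{}", l.file(), l.line()))
            .unwrap_or_else(|| "unknown location".to_string());
        // Avoid panicking inside the hook if the lock happens to be poisoned.
        if let Ok(mut log) = PANIC_LOG.lock() {
            log.push(format!("panic at {}: {}", location, message));
        }
    }));
}

/// Returns a copy of everything the hook has recorded so far.
fn logged_panics() -> Vec<String> {
    PANIC_LOG.lock().map(|log| log.clone()).unwrap_or_default()
}
```

The hook runs for every panic in the process, whether or not something later catches it, so it combines with catch_unwind: the hook records the panic and the caller decides how to recover.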

I've also made a wrapper in a hyper-based service to turn panics into 500s, using catch_panic.

polybuildr · October 27, 2019, 1:03pm

Thanks for the responses!

@alice, thanks for confirming that the approach of catch_unwind on the Future passed to Tokio is a reasonable one.

@vorner, this is about the second case. I don't want to abort because I suspect some tasks will have panics that are recoverable with respect to what the system needs from them. I think the approach I'm considering is similar to yours: have some kind of "supervisor" that can deal with other tasks panicking, but if the supervisor itself panics for some reason, the program aborts. log-panics looks useful, thanks! Do you have any examples of the setup you've described that I can take a look at?
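A rough sketch of such a "supervisor" is below. It uses std threads rather than Tokio tasks as an assumption, to stay dependency-free. The shape should carry over, since in recent Tokio versions awaiting a JoinHandle yields an Err when the task panicked, much like thread::JoinHandle::join does here. The names supervise and run_flaky_example, and the restart limit, are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical supervisor: runs `make_task` on a worker thread and, if the
/// worker panics, starts it again, up to `max_restarts` additional attempts.
fn supervise<F, T>(name: &str, max_restarts: usize, make_task: F) -> Result<T, String>
where
    F: Fn() -> T + Send + Sync + 'static,
    T: Send + 'static,
{
    let make_task = Arc::new(make_task);
    for attempt in 0..=max_restarts {
        let task = Arc::clone(&make_task);
        let handle = thread::spawn(move || (*task)());
        match handle.join() {
            // The task finished normally: hand its result back.
            Ok(value) => return Ok(value),
            // The task panicked: record it and fall through to the next attempt.
            Err(_payload) => {
                eprintln!("task {} panicked on attempt {}", name, attempt + 1);
            }
        }
    }
    Err(format!("task {} still failing after {} restarts", name, max_restarts))
}

/// Example task that panics on its first two runs and then succeeds.
fn run_flaky_example() -> Result<usize, String> {
    let runs = Arc::new(AtomicUsize::new(0));
    let counter = Arc::clone(&runs);
    supervise("flaky", 3, move || {
        let n = counter.fetch_add(1, Ordering::SeqCst);
        if n < 2 {
            panic!("transient failure");
        }
        n
    })
}
```

The supervisor itself contains no catch_unwind, so a panic inside it propagates as usual. Combined with an abort at that level, this gives the "tasks may fail, the supervisor may not" split described above.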

vorner · October 27, 2019, 3:01pm

Sorry, no, the code's owned by the company I work for and is not public.

