Cost of `catch_unwind`

Can anyone explain to me the cost of catch_unwind on x86_64 and arm64 machines? Would it be fine to call it on every poll of the average I/O-bound Future? What about tight loops?

Does it bloat the ABI of the function (e.g. by adding extra parameters), add some instructions (hot/cold path?), or maybe just add something to .data?

playground::bar:
	pushq	%rax
	callq	*foo@GOTPCREL(%rip)
	xorl	%eax, %eax
	popq	%rcx
	retq
	movq	%rax, %rdi
	callq	*std::panicking::catch_unwind::cleanup@GOTPCREL(%rip)
	popq	%rcx
	retq
	callq	*core::panicking::panic_cannot_unwind@GOTPCREL(%rip)

So basically it adds a bit to the happy path to construct the Result (probably optimizes away if you immediately match on it) and some more code with a call for the exception path.
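
For reference, a minimal sketch of the kind of function that produces assembly shaped like this. It is not the source behind the listing above (which isn't shown); `foo` here is a hypothetical stand-in that panics on one input, and `bar` is the wrapper that turns a panic into an `Err`.

```rust
use std::panic;

// Hypothetical stand-in for the callee; panics on one specific input.
fn foo(input: u32) -> u32 {
    if input == 0 {
        panic!("input must be non-zero");
    }
    input * 2
}

// Happy path: call `foo` and wrap the value in `Ok`.
// Exception path: the unwinder lands in the catch block and we get `Err`.
fn bar(input: u32) -> Result<u32, ()> {
    panic::catch_unwind(|| foo(input)).map_err(|_payload| ())
}
```

Calling `bar(2)` goes through the happy path only; `bar(0)` prints the panic message via the default hook and then returns `Err(())`.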

For what it's worth, Tokio calls it for every poll of every future.

That's interesting. Do you know why?

You don't want panics in a single async task to tear down Tokio's runtime.

To ensure that tokio::spawn works similarly to std::thread::spawn by catching panics.
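
A rough sketch of what "catch panics on every poll" can look like, using only the standard library. This is not Tokio's actual code; `CatchPanic`, `NoopWaker`, and `poll_once` are hypothetical names made up for illustration.

```rust
use std::any::Any;
use std::future::Future;
use std::panic::{self, AssertUnwindSafe};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical wrapper: every poll of the inner future runs under `catch_unwind`.
struct CatchPanic<F> {
    inner: Pin<Box<F>>,
}

impl<F: Future> Future for CatchPanic<F> {
    type Output = Result<F::Output, Box<dyn Any + Send>>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let inner = self.inner.as_mut();
        // AssertUnwindSafe: we promise not to touch the future again after a panic.
        match panic::catch_unwind(AssertUnwindSafe(|| inner.poll(cx))) {
            Ok(Poll::Pending) => Poll::Pending,
            Ok(Poll::Ready(value)) => Poll::Ready(Ok(value)),
            Err(payload) => Poll::Ready(Err(payload)),
        }
    }
}

/// Waker that does nothing; enough to poll a future once by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` a single time, the way an executor would on each wake-up.
fn poll_once<F: Future>(fut: F) -> Poll<Result<F::Output, Box<dyn Any + Send>>> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut wrapped = CatchPanic { inner: Box::pin(fut) };
    Pin::new(&mut wrapped).poll(&mut cx)
}
```

A panicking task then surfaces as `Poll::Ready(Err(payload))` for that one task, and the code driving it keeps running.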

Apologies for reviving this old post. I was looking at the documentation of futures::FutureExt::catch_unwind and found this:

The important bit is:

It’s most commonly used within task executors. It’s not recommended to use this for error handling.

At first I thought the reason was performance, but if Tokio calls this for every single poll, that probably isn't the case. What then could be the reason? For example, axum just closes the TCP connection if a panic happens while handling the request. However, we can use a middleware that catches panics and returns a 500.

It's something that people disagree about, but many people recommend that panics are used only for bugs and that Result should be used for any errors expected during runtime.

One reason is that some environments enable abort on panic, which makes it impossible to catch them. Some libraries panic on invalid input, making them incompatible with such environments.
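
As a small illustration, a crate can check at compile time which panic strategy it was built with. This is only a sketch: `panics_are_catchable` and `run_guarded` are hypothetical helper names, and `cfg!(panic = "unwind")` needs a reasonably recent compiler (Rust 1.60 or later).

```rust
use std::panic;

/// True when this crate was compiled with unwinding panics.
fn panics_are_catchable() -> bool {
    cfg!(panic = "unwind")
}

/// Runs `f`, turning a panic into `None`.
/// Under panic=abort this never returns `None`: the process aborts instead.
fn run_guarded<T>(f: impl FnOnce() -> T + panic::UnwindSafe) -> Option<T> {
    panic::catch_unwind(f).ok()
}
```

With abort-on-panic enabled, `run_guarded` still compiles, but a panic inside `f` terminates the process before `catch_unwind` can return.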

Two more reasons not to use panics for error reporting:

  • It’s not statically typed — when you catch you get Box<dyn Any>, so you have no help in confirming what types you should be trying to downcast to at that point. Result gives you static error types.
  • If a panic occurs while unwinding is already in progress (i.e. in Drop code), it becomes an abort and cannot be caught. This means your application is more likely to abort in edge-case situations where, if you had been using Result instead, there still would have been a panic in Drop but the panic could have been caught.
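
To make the first point concrete, here is a sketch of what inspecting a caught payload looks like. `panic_message` and `describe` are hypothetical helpers; the point is that the catcher has to guess the payload type by trying downcasts.

```rust
use std::any::Any;
use std::panic;

/// Tries the two payload types that `panic!` itself produces.
fn panic_message(payload: &(dyn Any + Send)) -> Option<String> {
    if let Some(s) = payload.downcast_ref::<&'static str>() {
        Some(s.to_string()) // panic!("literal")
    } else if let Some(s) = payload.downcast_ref::<String>() {
        Some(s.clone()) // panic!("formatted {}", value)
    } else {
        None // panic_any(some_other_type): nothing tells us which type
    }
}

/// Runs `f` and describes the panic, if there was one.
fn describe<F: FnOnce() + panic::UnwindSafe>(f: F) -> Option<String> {
    match panic::catch_unwind(f) {
        Ok(()) => None,
        Err(payload) => Some(
            panic_message(&*payload).unwrap_or_else(|| "<non-string payload>".to_string()),
        ),
    }
}
```

Nothing in the signature of `f` says which of these branches can occur, whereas a `Result<T, E>` return type would name `E` directly.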

Note that this is a very good reason for libraries to avoid panicking in any situation that is not a bug, but an application — more precisely, a crate that is written to be compiled by its authors, not compiled as a dependency of someone else’s program — can reasonably decide “I only support these platforms and I don’t support being compiled with panic=abort”. If you take on those restrictions, and with consideration for the fact that double panics abort, you can successfully use unwinding for error handling in an application, and there can be good reason to do so (e.g. rare errors in deeply nested high-performance code where branching on Results is a significant cost).
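
A minimal sketch of that application-level pattern, assuming the program is always built with unwinding enabled. `NegativeInput`, `sum_non_negative`, and `try_sum` are hypothetical names: the inner loop carries no `Result` branching, and a typed payload is recovered once at the boundary.

```rust
use std::panic;

/// Hypothetical typed error carried as the panic payload.
#[derive(Debug, PartialEq)]
struct NegativeInput(i64);

/// Hot path: no `Result` plumbing, just an unwind on the rare bad value.
fn sum_non_negative(values: &[i64]) -> i64 {
    values
        .iter()
        .map(|&v| {
            if v < 0 {
                panic::panic_any(NegativeInput(v))
            }
            v
        })
        .sum()
}

/// Boundary: convert our own payload back into a `Result`,
/// and keep unwinding for anything else (i.e. real bugs).
fn try_sum(values: &[i64]) -> Result<i64, NegativeInput> {
    match panic::catch_unwind(|| sum_non_negative(values)) {
        Ok(total) => Ok(total),
        Err(payload) => match payload.downcast::<NegativeInput>() {
            Ok(err) => Err(*err),
            Err(other) => panic::resume_unwind(other),
        },
    }
}
```

The default panic hook still prints a message for each caught panic, and a panic raised while another is unwinding still aborts, so this only suits errors that really are rare.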

But, overall, it’s much more robust to use Result, and libraries should always do so.
