I'm trying to piece together a better mental model of FFI unwinding. The nomicon makes the following statement:
Rust's unwinding strategy is not specified to be fundamentally compatible with any other language's unwinding. As such, unwinding into Rust from another language, or unwinding into another language from Rust is Undefined Behavior. You must absolutely catch any panics at the FFI boundary! What you do at that point is up to you, but something must be done. If you fail to do this, at best, your application will crash and burn. At worst, your application won't crash and burn, and will proceed with completely clobbered state.
That certainly makes it sound like a panic in a C++-caller, Rust-callee setup would cause unpredictable behavior. But I haven't experienced this: in every case I've tried, it appears that C++ properly catches the Rust panic. In fact, it seems like the C++ stacktrace functionality actually works for Rust too:
# stderr
thread '<unnamed>' panicked at 'this is a panic from Rust', udf-examples/src/lookup.rs:29:9
# backtrace
Thread pointer: 0x7f7a04000c68
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f7a40117c78 thread_stack 0x49000
Printing to addr2line failed
# below functions are from Rust library (truncated repeated files)
mariadbd(my_print_stacktrace+0x32)[0x563385202882]
mariadbd(handle_fatal_signal+0x488)[0x563384cd9178]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f7a56d94520]
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7f7a56d7a7f3]
/usr/lib/mysql/plugin/libudf_examples.so(+0x56ba7)[0x7f7a4007aba7]
/usr/lib/mysql/plugin/libudf_examples.so(lookup6_init+0x35d)[0x7f7a4003fc2d]
# below this point is C++
mariadbd(_ZN11udf_handler10fix_fieldsEP3THDP16Item_func_or_sumjPP4Item+0x369)[0x563384d52909]
mariadbd(_ZN13Item_udf_func10fix_fieldsEP3THDPP4Item+0x26)[0x563384d670a6]
# ...
(side note - how do you decode those addresses in the backtrace?)
So, what is going on here? Are there specific cases that cause unwinding into C or C++ to be unreliable? (I haven't tried anything with vanilla C).
If these operations are truly unsafe in all circumstances, what is the correct behavior of catch_unwind? Should every exposed C>Rust FFI boundary have a catch_unwind call that turns panics into aborts? Or is there a way to do this automatically?
Rust panics are just exceptions, where the exception object being thrown is whatever you pass to std::panic::panic_any().
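To make the "exception object" idea concrete, here is a minimal sketch that stays entirely inside Rust: it panics with a custom payload and recovers it with std::panic::catch_unwind. The MyError type and the throw_and_catch function are hypothetical names invented for this illustration, and it assumes the default panic=unwind setting.

```rust
use std::panic;

/// Hypothetical payload type, used only for this illustration.
#[derive(Debug, PartialEq)]
struct MyError {
    code: i32,
}

/// Panics with a custom payload, then recovers that payload via `catch_unwind`.
fn throw_and_catch() -> Option<MyError> {
    let result = panic::catch_unwind(|| {
        // Any `Any + Send + 'static` value can serve as the panic payload.
        panic::panic_any(MyError { code: 42 });
    });
    match result {
        Ok(()) => None,
        // The payload arrives as `Box<dyn Any + Send>`; downcast to get it back.
        Err(payload) => payload.downcast::<MyError>().ok().map(|boxed| *boxed),
    }
}
```

Calling throw_and_catch() should hand back the same MyError value that was thrown. Nothing here crosses a language boundary, so it is fully defined behaviour.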
Under the hood it uses whatever mechanism LLVM uses for exceptions on that platform, which is why C++ can catch Rust exceptions. However, none of that is guaranteed and it's just an implementation detail.
The nomicon is saying that Rust currently declares unwinding into C to be undefined behaviour, and the compiler is free to generate code under that assumption.
In theory, yes. In practice, you can skip the catch_unwind() if you know the code being called will never ever panic (e.g. it might just allocate a Vec or return a pointer to a field).
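For reference, a manual guard might look roughly like the sketch below. The function names (checked_div, example_div, example_div_or_abort) and the choice of i32::MIN as an error sentinel are assumptions made up for this example, not anything from the original poster's library.

```rust
use std::panic;
use std::process;

/// Hypothetical Rust logic that can panic (integer division by zero panics).
fn checked_div(a: i32, b: i32) -> i32 {
    a / b
}

/// Variant 1: turn a panic into a sentinel return value for the C caller.
/// A real export would also carry `#[no_mangle]` (spelled
/// `#[unsafe(no_mangle)]` in the 2024 edition); it is left out here so the
/// sketch compiles under any edition.
pub extern "C" fn example_div(a: i32, b: i32) -> i32 {
    match panic::catch_unwind(|| checked_div(a, b)) {
        Ok(value) => value,
        // Assumed convention: i32::MIN signals "the Rust side panicked".
        Err(_) => i32::MIN,
    }
}

/// Variant 2: turn a panic into a deliberate abort instead of letting it
/// unwind into the foreign caller.
pub extern "C" fn example_div_or_abort(a: i32, b: i32) -> i32 {
    panic::catch_unwind(|| checked_div(a, b)).unwrap_or_else(|_| process::abort())
}
```

The closures only capture two i32 values, so no AssertUnwindSafe wrapper is needed here. Which variant fits depends on whether the C side has any way to handle an error code.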
I've also seen a fair amount of production code that is happy to take on the risk of UB because it simplifies the code or the practical fallout of UB is minimal[1].
Yep. Crates like safer-ffi will automatically wrap each FFI function with catch_unwind().
[1] The language pedants will probably crucify me here, but programs containing undefined behaviour won't actually cause demons to fly out of your nose. Often the code will still do what you would expect and you'll get away with it.
Under the hood it uses whatever mechanism LLVM uses for exceptions on that platform, which is why C++ can catch Rust exceptions. However, none of that is guaranteed and it's just an implementation detail.
That does make sense - does that mean that in some cases (e.g. GCC/LLVM mix, or vanilla C) the behavior would likely be something like a stuck thread where the program doesn't terminate at all? Or would the panic theoretically propagate past any calling C code and hit some syscall mechanism to abort? (unreliable of course, just wondering what the common failure modes look like)
In theory, yes. In practice, you can skip the catch_unwind() if you know the code being called will never ever panic (e.g. it might just allocate a Vec or return a pointer to a field).
It seems like this is fairly common. I really haven't seen it used in many of the Rust-C libraries that I've delved into, which sort of prompted this question. Probably just lack of awareness, plus awkwardness - catch_unwind, with AssertUnwindSafe and all, is really not at all ergonomic.
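One way to take some of the sting out of this is a small helper that hides the AssertUnwindSafe wrapping. This is only a sketch: ffi_guard and push_checked are hypothetical names, and using AssertUnwindSafe is a promise by the author that nothing relies on half-updated state after a panic.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical helper: runs `body` and returns `fallback` if it panics.
fn ffi_guard<T>(fallback: T, body: impl FnOnce() -> T) -> T {
    // `AssertUnwindSafe` is needed because closures that capture `&mut`
    // references are not `UnwindSafe` on their own.
    panic::catch_unwind(AssertUnwindSafe(body)).unwrap_or(fallback)
}

/// Hypothetical use: append to a buffer, reporting 0 on success and -1 if
/// the closure panicked (here, because the assumed size limit was hit).
fn push_checked(buf: &mut Vec<u8>, byte: u8, limit: usize) -> i32 {
    ffi_guard(-1, || {
        assert!(buf.len() < limit, "buffer full");
        buf.push(byte);
        0
    })
}
```

The body of each exported function then becomes a single ffi_guard call, with the fallback value acting as the error code seen by the C side.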
Yep. Crates like safer-ffi will automatically wrap each FFI function with catch_unwind().
Thanks for the link, that's easier than doing it manually. It almost seems like there could be a panic=unwind-abort-at-extern compiler option to make this not necessary, but at least the crate is cleaner than doing it manually.
The intent is that, eventually, panicking out through an extern "C" fn will abort the program. If you want to unwind out of Rust code into FFI, you'll need to use extern "C-unwind" in the future.
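As a rough sketch of what opting in looks like, assuming a toolchain where the "C-unwind" ABI is available (it was stabilized in Rust 1.71) and the default panic=unwind setting: the function names below are hypothetical, and the caller is Rust here only so the example stays self-contained.

```rust
use std::panic;

/// Hypothetical function using the "C-unwind" ABI: a panic is allowed to
/// unwind out of it instead of being treated as UB or an abort.
extern "C-unwind" fn may_unwind(fail: bool) -> i32 {
    if fail {
        panic!("unwinding through a C-unwind boundary");
    }
    7
}

/// Calls through the boundary and converts a panic into an error message.
fn call_through_boundary(fail: bool) -> Result<i32, String> {
    panic::catch_unwind(|| may_unwind(fail)).map_err(|payload| {
        // Panic messages are usually `&str` or `String` payloads.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "non-string panic payload".to_string()
        }
    })
}
```

With a plain extern "C" signature the same panic would fall under the rules described in this post rather than being a supported unwind.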
On current Rust, extern "C" acts roughly like extern "C-unwind" will, but this isn't guaranteed; we are already telling LLVM that calling an extern "C" function can't unwind.
Rust catching a foreign unwind will always abort, and a Rust unwind caught in a foreign language will best-effort abort (if the foreign side does what it's supposed to with a foreign unwind). Either kind of unwind passing over the other language's frames, and just running stack destructor cleanup functions along the way, works as expected.
As for what unwinding over C frames will do, ask your compiler vendor what a C++ unwind over code compiled as C will do. The two likely answers you might get are "don't do that" or "compile the C code with -fexceptions and it'll cross over safely."
TL;DR: cross-language extern "C" unwinds are in "formally UB but mostly works" territory for the time being. Panicking out of Rust takes no unsafe code, so unwinds out of Rust through extern "C" will eventually be guaranteed to abort. Unwinds into Rust through extern "C" will remain UB.
extern "C" declarations have not emitted the nounwind LLVM attribute since PR 86155, unless #![feature(c_unwind)] is specified, which makes the compiler add an abort-on-panic shim instead. (It's surprisingly tricky to find the PR that finally fixed that, since there was a big tug-of-war at the time over adding the abort-on-panic shim and removing nounwind for the time being.) To repeat the summary there:
This PR fixes a soundness hole
The fix is temporary which is to assume that "C" functions can unwind
In the future "C-unwind" will be stable and "C" will no longer be assumed to unwind
In the future programs will break if they assume unwinding through extern "C"-defined functions in Rust will work. These functions will need to switch to extern "C-unwind"