How panic! calls drop functions

When Rust encounters a panic!, it calls Drop::drop for every value that is still live. A basic example shows this clearly:

struct MyStruct {}

impl Drop for MyStruct {
	fn drop(&mut self) {
		println!("{:?}", "dropped!");
	}
}

fn main() {
	// `_a` is still live when the panic starts, so its drop runs during unwinding.
	let _a = MyStruct {};
	panic!("{:?}", "panic!");
}

The Rustonomicon says that if your program doesn't panic, being ready to unwind has no runtime cost:

Rust's current unwinding implementation is heavily optimized for the "doesn't unwind" case. If a program doesn't unwind, there should be no runtime cost for the program being ready to unwind.

So if there is no overhead, I wonder how it knows which values have to be dropped and where their drop implementations are.

It drops it by calling the drop function.

For sure, but how does it know which structs need to be dropped?

I mean, the compiler looks at the shape of the code and figures out which ones are in scope.

The code that performs the unwinding is stored next to the actual function code, and doesn't run unless the function panics. Of course, when unwinding up the stack, it needs to know which method call the panic happened inside, but it needs to store that information anyway — otherwise how could it continue running from the right place when it returns normally?
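
To watch the two paths side by side, here is a minimal sketch that is not from the thread. The Noisy type, the inner, outer and run functions and the shared log are hypothetical names made up for illustration. It uses std::thread::panicking() inside drop to record whether each drop ran on the normal path or during unwinding, and std::panic::catch_unwind only so that the program can keep going and print the log. It assumes the default panic = "unwind" setting.

```rust
use std::cell::RefCell;
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Hypothetical helper type: records its name into a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        // `panicking()` is true while this thread is unwinding from a panic.
        let how = if std::thread::panicking() { "unwind" } else { "normal" };
        self.log.borrow_mut().push(format!("{} ({})", self.name, how));
    }
}

fn inner(log: &RefCell<Vec<String>>, should_panic: bool) {
    let _b = Noisy { name: "inner", log };
    if should_panic {
        panic!("boom");
    }
}

fn outer(log: &RefCell<Vec<String>>, should_panic: bool) {
    let _a = Noisy { name: "outer", log };
    inner(log, should_panic);
}

/// Runs `outer` and returns the order in which the values were dropped.
fn run(should_panic: bool) -> Vec<String> {
    let log = RefCell::new(Vec::new());
    // The panic (if any) is contained here so that the log can be inspected.
    let _ = catch_unwind(AssertUnwindSafe(|| outer(&log, should_panic)));
    log.into_inner()
}

fn main() {
    println!("{:?}", run(false)); // ["inner (normal)", "outer (normal)"]
    println!("{:?}", run(true)); // ["inner (unwind)", "outer (unwind)"]
}
```

In both runs the values are dropped innermost frame first. The only difference is which code performs the drop: the ordinary end-of-scope code, or the cleanup code reached while unwinding. The second run also prints the usual panic message to stderr.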

We can make a case study with this code:

pub struct MyStruct {}

impl Drop for MyStruct {
	fn drop(&mut self) {
		println!("{:?}", "dropped!");
	}
}

extern "Rust" {
    fn unknown_api();
}

pub fn f() -> MyStruct {
	let a = MyStruct {};
	unsafe {
    	unknown_api();
	}
	a
}

This code is not intended to run, just to compile. I've inserted a call to a function Rust knows nothing about, unknown_api. Rust has to assume it can panic/unwind, so it inserts the necessary cleanup code into the f function to handle this case.

Select the Release profile, compile the code to MIR, and look at the output.

For the f function we can see that the call to unknown_api has both a normal return edge and an "unwind edge" which leads to cleanup. See the unknown_api line.

fn f() -> MyStruct {
    let mut _0: MyStruct;                // return place in scope 0 at src/lib.rs:13:15: 13:23
    let _1: MyStruct;                    // in scope 0 at src/lib.rs:14:6: 14:7
    let _2: ();                          // in scope 0 at src/lib.rs:16:6: 16:19
    scope 1 {
        debug a => _1;                   // in scope 1 at src/lib.rs:14:6: 14:7
        scope 2 {
        }
    }

    bb0: {
        StorageLive(_1);                 // scope 0 at src/lib.rs:14:6: 14:7
        StorageLive(_2);                 // scope 2 at src/lib.rs:16:6: 16:19
        _2 = unknown_api() -> [return: bb1, unwind: bb2]; // scope 2 at src/lib.rs:16:6: 16:19
                                         // mir::Constant
                                         // + span: src/lib.rs:16:6: 16:17
                                         // + literal: Const { ty: unsafe fn() {unknown_api}, val: Value(Scalar(<ZST>)) }
    }

    bb1: {
        StorageDead(_2);                 // scope 2 at src/lib.rs:16:19: 16:20
        _0 = const MyStruct {  };        // scope 1 at src/lib.rs:18:2: 18:3
                                         // ty::Const
                                         // + ty: MyStruct
                                         // + val: Value(Scalar(<ZST>))
                                         // mir::Constant
                                         // + span: src/lib.rs:18:2: 18:3
                                         // + literal: Const { ty: MyStruct, val: Value(Scalar(<ZST>)) }
        StorageDead(_1);                 // scope 0 at src/lib.rs:19:1: 19:2
        return;                          // scope 0 at src/lib.rs:19:2: 19:2
    }

    bb2 (cleanup): {
        drop(_1) -> bb3;                 // scope 0 at src/lib.rs:19:1: 19:2
    }

    bb3 (cleanup): {
        resume;                          // scope 0 at src/lib.rs:13:1: 19:2
    }
}

You can also look at the asm output of function f:

playground::f:  # @playground::f
# %bb.0:
	pushq	%rbx
	callq	*unknown_api@GOTPCREL(%rip)
# %bb.1:
	popq	%rbx
	retq
	movq	%rax, %rbx
	callq	core::ptr::drop_in_place
	movq	%rbx, %rdi
	callq	_Unwind_Resume@PLT
	ud2

As you can see, there is a code size cost to this feature: everything after retq is cleanup code that has been added, but it sits outside the main flow of the code. Code size is partly a runtime cost, but it's hard to quantify and it's not an "instructions executed" cost.

How does it all work? It's platform specific, and there are some links in this file for this implementation: rust/gcc.rs at 1.49.0 · rust-lang/rust · GitHub

To expand on @bluss's answer, or rather, to illustrate it, I like looking at the graph representation of the MIR:

We can distinctly see the check & branch on the "return" status of unknown_api(): such a function can return, and then the left branch is taken (which does not drop _1: MyStruct), or it can unwind, in which case the "cleanup all the non-dropped locals in scope" branch is taken, which does drop it.
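
As a rough source-level analogy for those two edges, here is a hand-written sketch. It is not what rustc emits: the compiler uses a table-driven landing pad, whereas std::panic::catch_unwind and resume_unwind are ordinary library calls. The names f_by_hand, well_behaved, panics, drops_inside and the DROPS counter are hypothetical, and the api parameter stands in for unknown_api.

```rust
use std::panic::{catch_unwind, resume_unwind};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many times `MyStruct::drop` has run.
static DROPS: AtomicUsize = AtomicUsize::new(0);

pub struct MyStruct {}

impl Drop for MyStruct {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Two hypothetical stand-ins for `unknown_api`.
fn well_behaved() {}
fn panics() {
    panic!("simulated unwind");
}

/// Hand-written analogue of `f`, with both edges spelled out.
pub fn f_by_hand(api: fn()) -> MyStruct {
    let a = MyStruct {};
    match catch_unwind(api) {
        // "return" edge (bb1): `a` is moved out to the caller, nothing is dropped here.
        Ok(()) => a,
        // "unwind" edge (bb2, then bb3): drop `a`, then continue unwinding.
        Err(payload) => {
            drop(a);
            resume_unwind(payload)
        }
    }
}

/// Returns how many drops happened inside `f_by_hand` itself.
fn drops_inside(api: fn()) -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    let result = catch_unwind(move || f_by_hand(api));
    let after = DROPS.load(Ordering::SeqCst);
    drop(result); // a returned MyStruct is dropped only after measuring
    after - before
}

fn main() {
    println!("drops inside f on return: {}", drops_inside(well_behaved)); // 0
    println!("drops inside f on unwind: {}", drops_inside(panics)); // 1
}
```

Unlike this sketch, the compiled f does no extra work on the return path, as far as the asm above shows: the cleanup block is only reached when the unwinder jumps to it.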

On most platforms (the biggest exception being Windows), unwinding happens with the help of DWARF unwind tables stored in the .eh_frame section of the executable. For each function that can unwind there is an FDE, which describes, for each point at which the function can unwind, where to find the value each register had before the function was called, if that value is still known. This information can be used to unwind the stack frame by frame.

In addition, each group of FDEs is preceded by a CIE, which holds information common to the FDEs that follow. Among other things, the CIE can contain a pointer to a personality function and the encoding of the LSDA pointer; the LSDA pointer itself is stored in each FDE. The personality function is called every time a frame is unwound and, based on the LSDA and the instruction pointer, can decide to run the cleanup code. The personality function can also decide to stop unwinding (std::panic::catch_unwind).
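
From the Rust side, the "stop unwinding" case looks roughly like the sketch below. The risky and try_risky functions are hypothetical, made up for illustration. The sketch only uses std::panic::catch_unwind and Any::downcast_ref from the standard library, and it assumes the default panic = "unwind" setting.

```rust
use std::panic::catch_unwind;

/// Hypothetical function that panics for one particular input.
fn risky(n: u32) -> u32 {
    if n == 0 {
        panic!("n must be non-zero");
    }
    100 / n
}

/// Stops the unwinding at this frame and turns the panic payload into an error string.
fn try_risky(n: u32) -> Result<u32, String> {
    catch_unwind(|| risky(n)).map_err(|payload| {
        // A panic payload is usually a `&'static str` or a `String`,
        // depending on how the message was built.
        if let Some(s) = payload.downcast_ref::<&str>() {
            s.to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "unknown panic payload".to_string()
        }
    })
}

fn main() {
    println!("{:?}", try_risky(4)); // Ok(25)
    println!("{:?}", try_risky(0)); // Err("n must be non-zero")
}
```

Frames between the panic and the catch_unwind call still get their cleanup code run; only the frames above the catch are left alone.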

A little bit off-topic – any idea why there are two instances of MyStruct there? – _0 is used for returning and _1 is used only in case of unwind-drop. I needed to stop and think a little bit to understand what's going on. Curiously, this weird split seems to go away if MyStruct stops being zero-sized.

What do FDE, CIE, and LSDA mean?

They mean Frame Description Entry, Common Information Entry and Language-Specific Data Area (different languages can use different formats, but rustc uses the same format as gcc and clang, as that is what LLVM implemented).

I guess _0 = const MyStruct {} is the result of const-propagating _0 = _1 in the originally constructed MIR.
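
For context on the zero-sized remark, a small sketch: MyStruct as written in the thread has no fields, so it occupies zero bytes and any value of it can be written as a constant. MyStructWithField is a hypothetical variant added here only for comparison.

```rust
use std::mem::size_of;

pub struct MyStruct {}

impl Drop for MyStruct {
    fn drop(&mut self) {}
}

/// Hypothetical variant that is no longer zero-sized.
pub struct MyStructWithField {
    _id: u64,
}

fn main() {
    println!("MyStruct: {} bytes", size_of::<MyStruct>()); // 0
    println!("MyStructWithField: {} bytes", size_of::<MyStructWithField>()); // 8
}
```

Note that implementing Drop does not change the size: the type stays zero-sized, but it still needs its drop call on the unwind path.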
