View type generated by capturing closure

The reference says that capturing closures are similar to generating an anonymous struct with the captured data. I'm curious how this works, so is there a way to view the generated struct?

I've tried Godbolt, but all type data is erased at the assembly level so I just have to guess. (Not to mention using println bloats the output with formatting calls.) I've tried using mem::size_of_val on a closure variable, but it just returns the size of a regular pointer. I've tried looking at the MIR and HIR, but I'm not familiar enough with either format to properly understand them.

Do you have any tips that might help me? Thanks :slight_smile:

This means either that your closure captures exactly one variable by reference, or that you passed a reference to a reference to size_of_val. A closure is as large as everything it captures (plus any padding). In most cases, this means it'll hold one reference for each captured variable. If you add the move keyword, it'll hold the captured values themselves.
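To make the size rule concrete, here is a minimal sketch that measures a few closures with size_of_val. The function name closure_sizes is made up for illustration, and closure layout is not formally specified, so the numbers are what current rustc produces in practice rather than a guarantee.

```rust
use std::mem::size_of_val;

/// Returns the sizes of: a by-reference closure, a `move` closure,
/// a non-capturing closure, and a *reference to* the `move` closure.
/// (`closure_sizes` is a hypothetical helper, not a std API.)
fn closure_sizes() -> (usize, usize, usize, usize) {
    let a = 1u8;
    let b = 2u8;

    // Captures `&a` and `&b`: two references.
    let by_ref = || a + b;
    // Captures copies of `a` and `b`: two u8 values.
    let by_move = move || a + b;
    // Captures nothing at all.
    let no_capture = || 42u8;
    // The pitfall from the question: measuring a reference to the closure.
    let reference = &by_move;

    (
        size_of_val(&by_ref),
        size_of_val(&by_move),
        size_of_val(&no_capture),
        size_of_val(&reference), // size of `&closure`, i.e. one pointer
    )
}
```

On a typical 64-bit target this should come out as two pointers' worth of bytes, then 2, then 0, then one pointer's worth; the last value is the "size of a regular pointer" you get when an extra reference sneaks in.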

An important thing to note is that while closures are “similar” to generating a struct, they don’t literally generate code for a struct. Even on the MIR level, there’s still, to some degree, a special notion of closure, I believe (from glancing at MIR printouts… I should probably at some point read more about MIR to be sure).

Nonetheless, I feel like the MIR shows a lot about the struct-like structure of the thing.

Taking something like

fn test(x: i32) -> i32 {
    let a = 1;
    let b = 2;
    let f = |c, d| a + b + c + d;
    f(x, 42)
}

in the playground, and generating the MIR there, gives

// WARNING: This output format is intended for human consumers only
// and is subject to change without notice. Knock yourself out.
fn test(_1: i32) -> i32 {
    debug x => _1;                       // in scope 0 at src/lib.rs:1:9: 1:10
    let mut _0: i32;                     // return place in scope 0 at src/lib.rs:1:20: 1:23
    let _2: i32;                         // in scope 0 at src/lib.rs:2:9: 2:10
    let mut _5: &i32;                    // in scope 0 at src/lib.rs:4:13: 4:33
    let mut _6: &i32;                    // in scope 0 at src/lib.rs:4:13: 4:33
    let mut _7: &[closure@src/lib.rs:4:13: 4:19]; // in scope 0 at src/lib.rs:5:5: 5:6
    let mut _8: (i32, i32);              // in scope 0 at src/lib.rs:5:5: 5:13
    scope 1 {
        debug a => const 1_i32;          // in scope 1 at src/lib.rs:2:9: 2:10
        let _3: i32;                     // in scope 1 at src/lib.rs:3:9: 3:10
        scope 2 {
            debug b => const 2_i32;      // in scope 2 at src/lib.rs:3:9: 3:10
            let _4: [closure@src/lib.rs:4:13: 4:19]; // in scope 2 at src/lib.rs:4:9: 4:10
            scope 3 {
                debug f => _4;           // in scope 3 at src/lib.rs:4:9: 4:10
            }
        }
    }

    bb0: {
        _2 = const 1_i32;                // scope 0 at src/lib.rs:2:13: 2:14
        _3 = const 2_i32;                // scope 1 at src/lib.rs:3:13: 3:14
        _5 = &_2;                        // scope 2 at src/lib.rs:4:13: 4:33
        _6 = &_3;                        // scope 2 at src/lib.rs:4:13: 4:33
        _4 = [closure@src/lib.rs:4:13: 4:19] { a: move _5, b: move _6 }; // scope 2 at src/lib.rs:4:13: 4:33
                                         // closure
                                         // + def_id: DefId(0:4 ~ playground[e813]::test::{closure#0})
                                         // + substs: [
                                         //     i8,
                                         //     extern "rust-call" fn((i32, i32)) -> i32,
                                         //     (&i32, &i32),
                                         // ]
        _7 = &_4;                        // scope 3 at src/lib.rs:5:5: 5:6
        _8 = (_1, const 42_i32);         // scope 3 at src/lib.rs:5:5: 5:13
        _0 = <[closure@src/lib.rs:4:13: 4:19] as Fn<(i32, i32)>>::call(move _7, move _8) -> bb1; // scope 3 at src/lib.rs:5:5: 5:13
                                         // mir::Constant
                                         // + span: src/lib.rs:5:5: 5:6
                                         // + literal: Const { ty: for<'a> extern "rust-call" fn(&'a [closure@src/lib.rs:4:13: 4:19], (i32, i32)) -> <[closure@src/lib.rs:4:13: 4:19] as FnOnce<(i32, i32)>>::Output {<[closure@src/lib.rs:4:13: 4:19] as Fn<(i32, i32)>>::call}, val: Value(<ZST>) }
    }

    bb1: {
        return;                          // scope 0 at src/lib.rs:6:2: 6:2
    }
}

fn test::{closure#0}(_1: &[closure@src/lib.rs:4:13: 4:19], _2: i32, _3: i32) -> i32 {
    debug c => _2;                       // in scope 0 at src/lib.rs:4:14: 4:15
    debug d => _3;                       // in scope 0 at src/lib.rs:4:17: 4:18
    debug a => (*((*_1).0: &i32));       // in scope 0 at src/lib.rs:2:9: 2:10
    debug b => (*((*_1).1: &i32));       // in scope 0 at src/lib.rs:3:9: 3:10
    let mut _0: i32;                     // return place in scope 0 at src/lib.rs:4:20: 4:20
    let mut _4: i32;                     // in scope 0 at src/lib.rs:4:20: 4:29
    let mut _5: i32;                     // in scope 0 at src/lib.rs:4:20: 4:25
    let mut _6: i32;                     // in scope 0 at src/lib.rs:4:20: 4:21
    let mut _7: i32;                     // in scope 0 at src/lib.rs:4:24: 4:25
    let mut _8: (i32, bool);             // in scope 0 at src/lib.rs:4:20: 4:25
    let mut _9: (i32, bool);             // in scope 0 at src/lib.rs:4:20: 4:29
    let mut _10: (i32, bool);            // in scope 0 at src/lib.rs:4:20: 4:33
    let mut _11: &i32;                   // in scope 0 at src/lib.rs:4:13: 4:33
    let mut _12: &i32;                   // in scope 0 at src/lib.rs:4:13: 4:33

    bb0: {
        _11 = deref_copy ((*_1).0: &i32); // scope 0 at src/lib.rs:4:20: 4:21
        _6 = (*_11);                     // scope 0 at src/lib.rs:4:20: 4:21
        _12 = deref_copy ((*_1).1: &i32); // scope 0 at src/lib.rs:4:24: 4:25
        _7 = (*_12);                     // scope 0 at src/lib.rs:4:24: 4:25
        _8 = CheckedAdd(_6, _7);         // scope 0 at src/lib.rs:4:20: 4:25
        assert(!move (_8.1: bool), "attempt to compute `{} + {}`, which would overflow", move _6, move _7) -> bb1; // scope 0 at src/lib.rs:4:20: 4:25
    }

    bb1: {
        _5 = move (_8.0: i32);           // scope 0 at src/lib.rs:4:20: 4:25
        _9 = CheckedAdd(_5, _2);         // scope 0 at src/lib.rs:4:20: 4:29
        assert(!move (_9.1: bool), "attempt to compute `{} + {}`, which would overflow", move _5, _2) -> bb2; // scope 0 at src/lib.rs:4:20: 4:29
    }

    bb2: {
        _4 = move (_9.0: i32);           // scope 0 at src/lib.rs:4:20: 4:29
        _10 = CheckedAdd(_4, _3);        // scope 0 at src/lib.rs:4:20: 4:33
        assert(!move (_10.1: bool), "attempt to compute `{} + {}`, which would overflow", move _4, _3) -> bb3; // scope 0 at src/lib.rs:4:20: 4:33
    }

    bb3: {
        _0 = move (_10.0: i32);          // scope 0 at src/lib.rs:4:20: 4:33
        return;                          // scope 0 at src/lib.rs:4:33: 4:33
    }
}

Important things we can see here:

The closure is constructed here

        _2 = const 1_i32;
        _3 = const 2_i32;
        _5 = &_2;
        _6 = &_3;
        _4 = [closure@src/lib.rs:4:13: 4:19] { a: move _5, b: move _6 };

from which it’s clear that this closure is just a struct containing two fields that each hold a &i32 value.

And it is called afterwards

        _7 = &_4;
        _8 = (_1, const 42_i32);
        _0 = <[closure@src/lib.rs:4:13: 4:19] as Fn<(i32, i32)>>::call(move _7, move _8) -> bb1;

which does make a call to an invisible Fn implementation which will apparently be automagically filled in by the compiler.

What’s passed to this call to “call” is

  • a reference to the closure, and
  • a tuple of the arguments, one being _1 from the surrounding function’s argument, the other the constant 42.

If you know a bit about closures and the Fn… traits, you’ll know that the first point can differ from closure to closure and from Fn… trait to Fn… trait: calling a closure by shared reference (Fn), by mutable reference (FnMut), or by giving up ownership of the closure (FnOnce) is all within the realm of what’s possible.
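The three ways of accessing the closure can be sketched on stable Rust with generic helpers. All function names here (call_by_ref, call_by_mut, call_by_value, demo) are invented for this illustration; only the Fn, FnMut and FnOnce bounds are the real traits.

```rust
// Fn: the closure is only accessed through `&self`.
fn call_by_ref<F: Fn(i32) -> i32>(f: &F, x: i32) -> i32 {
    f(x)
}

// FnMut: the closure is accessed through `&mut self`.
fn call_by_mut<F: FnMut(i32) -> i32>(f: &mut F, x: i32) -> i32 {
    f(x)
}

// FnOnce: the closure is consumed (`self`) by the call.
fn call_by_value<F: FnOnce(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

fn demo() -> (i32, i32, i32) {
    let offset = 10;
    // Only reads its capture, so it implements Fn.
    let add = |x: i32| x + offset;

    let mut total = 0;
    // Mutates its capture, so it implements FnMut but not Fn.
    let mut accumulate = |x: i32| {
        total += x;
        total
    };

    let name = String::from("abc");
    // Moves its capture out when called, so it implements only FnOnce.
    let consume = move |x: i32| {
        let owned = name;
        x + owned.len() as i32
    };

    let r1 = call_by_ref(&add, 1);
    call_by_mut(&mut accumulate, 2);
    let r2 = call_by_mut(&mut accumulate, 3);
    let r3 = call_by_value(consume, 4);
    (r1, r2, r3)
}
```

Calling demo() should give (11, 5, 7): the Fn closure adds the offset, the FnMut closure keeps a running total across two calls, and the FnOnce closure can be called exactly once.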

Ultimately, the call to the automagically inserted, invisible call method lands back in code that we can see, the

fn test::{closure#0}(_1: &[closure@src/lib.rs:4:13: 4:19], _2: i32, _3: i32) -> i32 {

part. Here the arguments, previously bundled in a tuple, arrive unbundled again. If we had used a FnMut or FnOnce abstraction at the call site, we’d probably (I haven’t tested it) still see a &… reference here, so the automagically inserted, invisible call implementations can probably also “downgrade” the self-access.

Anyway, inside this function, the only notable thing is that the places where the code accessed a and b are replaced by a field access on the closure plus a dereference of the reference indirection the compiler introduced for us.

_11 = deref_copy ((*_1).0: &i32);
_6 = (*_11);

extracts a. (I don’t really understand MIR syntax either, but this clearly accesses the first field of the closure in _1, and then dereferences the resulting &i32 in the next line)

And b right afterwards:

_12 = deref_copy ((*_1).1: &i32);
_7 = (*_12);

Then everything is just added and returned, nothing interesting anymore.

Now, if we wanted to write our own code that’s similar to this, here is what it would look like. Still for the same example

fn test(x: i32) -> i32 {
    let a = 1;
    let b = 2;
    let f = |c, d| a + b + c + d;
    f(x, 42)
}

The result would be something like

#![feature(fn_traits)]
#![feature(unboxed_closures)]

fn test(x: i32) -> i32 {
    let a = 1;
    let b = 2;
    let f = ClosureInTest(&a, &b);
    f(x, 42)
}

struct ClosureInTest<'a>(&'a i32, &'a i32);
impl ClosureInTest<'_> {
    fn implementation(&self, c: i32, d: i32) -> i32 {
        *self.0 + *self.1 + c + d
    }
}

impl FnOnce<(i32, i32)> for ClosureInTest<'_> {
    type Output = i32;
    extern "rust-call" fn call_once(self, args: (i32, i32)) -> i32 {
        self.implementation(args.0, args.1)
    }
}
impl FnMut<(i32, i32)> for ClosureInTest<'_> {
    extern "rust-call" fn call_mut(&mut self, args: (i32, i32)) -> i32 {
        self.implementation(args.0, args.1)
    }
}
impl Fn<(i32, i32)> for ClosureInTest<'_> {
    extern "rust-call" fn call(&self, args: (i32, i32)) -> i32 {
        self.implementation(args.0, args.1)
    }
}

though without the whole Fn… trait ceremony, you could also think of it as simply

fn test(x: i32) -> i32 {
    let a = 1;
    let b = 2;
    let f = ClosureInTest(&a, &b);
    f.implementation(x, 42)
}

struct ClosureInTest<'a>(&'a i32, &'a i32);
impl ClosureInTest<'_> {
    fn implementation(&self, c: i32, d: i32) -> i32 {
        *self.0 + *self.1 + c + d
    }
}

If you generate the MIR for either of these, you’ll see a high degree of similarity; the only differences are that for the first the Fn… implementations are also visible, and for the second the arguments are not packed into a tuple.
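The similarity can also be checked on stable Rust without reading MIR. The following sketch reuses the ClosureInTest struct from above and compares it with the real closure; the compare function is a made-up helper, and equal sizes are what current rustc produces, not something the language promises.

```rust
use std::mem::size_of_val;

// Hand-written stand-in for the closure: one `&i32` field per capture.
struct ClosureInTest<'a>(&'a i32, &'a i32);

impl ClosureInTest<'_> {
    fn implementation(&self, c: i32, d: i32) -> i32 {
        *self.0 + *self.1 + c + d
    }
}

/// Returns (closure size, struct size, closure result, struct result).
/// (`compare` is a hypothetical helper for this illustration.)
fn compare(x: i32) -> (usize, usize, i32, i32) {
    let a = 1;
    let b = 2;
    // The real closure: captures `a` and `b` by reference.
    let closure = |c: i32, d: i32| a + b + c + d;
    // The hand-written equivalent.
    let by_hand = ClosureInTest(&a, &b);
    (
        size_of_val(&closure),
        size_of_val(&by_hand),
        closure(x, 42),
        by_hand.implementation(x, 42),
    )
}
```

For example, compare(7) should report the same size twice and the result 52 twice.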

After some more playing around, I ran this program:

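#![feature(fn_traits)] // nightly only: needed to call call_once directly

use std::mem;

fn main() {
    let (a, b, c) = (4u8, 3u8, 2u8);
    let func = move || a + b + c;

    // Prints 3: the closure holds exactly the three captured u8 values
    println!("Closure Size: {}", mem::size_of_val(&func));

    // Reinterpret the closure's bytes as the captured values.
    // Closure and tuple layouts are unspecified, so the field order seen
    // here is an observation about this compiler, not a guarantee.
    let (d, e, f): (u8, u8, u8) = unsafe { mem::transmute_copy(&func) };

    // Printed (4, 3, 2) when I ran it
    println!("def: ({}, {}, {})", d, e, f);

    // Prints 9
    println!("Closure Result: {}", func.call_once(()));
}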

My conclusion is that the value produced by a capturing closure expression is really just the data it captures. The code is not stored in that value; to get at the actual function, you go through the closure type's Fn… implementation, e.g. FnOnce::call_once. Thank you all for your responses, they were very helpful! :smiley:
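As a rough stable-Rust check of that conclusion, one can compare the closure's size with things that really do carry a code pointer. The helper name pointer_sizes is invented here, and the sizes are what current rustc gives rather than documented guarantees.

```rust
use std::mem::size_of_val;

/// Returns the sizes of: the capturing closure itself, a `&dyn Fn` trait
/// object pointing at it, and a plain `fn` pointer.
/// (`pointer_sizes` is a hypothetical helper for this illustration.)
fn pointer_sizes() -> (usize, usize, usize) {
    let (a, b, c) = (4u8, 3u8, 2u8);
    let func = move || a + b + c;

    // A trait object adds a vtable pointer next to the data pointer;
    // that vtable is where the code address lives for dynamic dispatch.
    let as_dyn: &dyn Fn() -> u8 = &func;

    // Only a non-capturing closure can coerce to a bare function pointer.
    let plain: fn() -> u8 = || 9;

    (
        size_of_val(&func),   // just the three captured bytes
        size_of_val(&as_dyn), // data pointer + vtable pointer
        size_of_val(&plain),  // one code pointer
    )
}
```

The closure value stays at 3 bytes with no room for a code address, while the trait object is two pointers wide and the fn pointer is one. When the closure is called directly, the compiler picks the function from the closure's type at compile time.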