Common type of closure

In the following code, if I let the compiler infer the element type of the Vec, line A does not compile, so I need to specify the element type explicitly. But I don't know how. Is there a common type for closures, or any other workaround?

fn vec() {
  let mut v : Vec<_> = Vec::new();
  v.push(||{});
  v.push(||{});   // A: does not compile
}

You can use a boxed closure. E.g.:

fn vec() {
  let mut v : Vec<Box<dyn Fn()>> = Vec::new();
  v.push(Box::new(||{}));
  v.push(Box::new(||{}));
}

Closures in general do not have any common type, even if they have the same signature. This is because they may capture different variables, and thus the storage required for the closure is different — and they execute different code. In Rust, unlike many other languages, calling a closure is, by default, statically dispatched — it does not involve any function pointer or other indirection. So, if you want to collect different closures, you need to add something that supports dynamic dispatch. alice's example of boxed closures introduces it via the type Box<dyn Fn()> which contains a pointer to the closure's data and a pointer to the closure's code (as a vtable).
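To make the point about storage concrete, here is a small sketch. The function name `boxed_closures` and the captured values are made up for illustration. Two closures with the same signature capture different amounts of data, so their sizes differ, yet both can be stored behind `Box<dyn Fn() -> usize>`:

```rust
use std::mem::size_of_val;

// Hypothetical helper: returns the sizes of two closures and the results of
// calling them through dynamic dispatch.
fn boxed_closures() -> (usize, usize, Vec<usize>) {
    let small: u8 = 3;
    let big: [u64; 4] = [1, 2, 3, 4];

    // Same signature, different captured data, hence different types and sizes.
    let a = move || small as usize;
    let b = move || big.iter().sum::<u64>() as usize;
    let (size_a, size_b) = (size_of_val(&a), size_of_val(&b));

    // Boxing erases the concrete types; each call goes through the vtable.
    let mut v: Vec<Box<dyn Fn() -> usize>> = Vec::new();
    v.push(Box::new(a));
    v.push(Box::new(b));

    let results = v.iter().map(|f| f()).collect();
    (size_a, size_b, results)
}
```

Here the closure capturing the array needs more storage than the one capturing a single `u8`, which is why a plain `Vec<_>` of the closures themselves cannot hold both.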

Another option is to use a function pointer type:

fn vec() {
    let mut v: Vec<fn()> = Vec::new();
    v.push(|| {});
    v.push(|| {});
}

This is similar to the dyn Fn option in that it coerces multiple types of closures to the same type, but it avoids needing the individual Box allocations. The price is that these closures must not capture any variables: that is, they must be functions that you could have written as named fn functions. This is because function pointers point to code only — they have no place to store captured data.
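As a rough illustration of that trade-off (the names `named` and `fn_pointers` are invented for this sketch), a named function and non-capturing closures can share one `Vec<fn() -> i32>`, while a capturing closure is rejected:

```rust
fn named() -> i32 {
    1
}

// Hypothetical demo: named functions and non-capturing closures both coerce
// to the function pointer type `fn() -> i32`.
fn fn_pointers() -> Vec<i32> {
    let mut v: Vec<fn() -> i32> = Vec::new();
    v.push(named);
    v.push(|| 2);
    v.push(|| 3);

    // A capturing closure would not compile here, because a function pointer
    // has nowhere to store `n`:
    // let n = 4;
    // v.push(move || n);

    v.iter().map(|f| f()).collect()
}
```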


But it seems we can cheat the compiler this way:

fn vec() {
  let mut v : Vec<_> = Vec::new();
  for i in 1..=2 {
    v.push(move|| println!("{}",i));
  }
}

You aren't cheating the compiler. Each time you write a closure in the code, that's a separate type, but the same written closure results in the same type, so it's accepted.
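One way to observe this, as a sketch: compare `TypeId`s. The helper `type_id_of` and the function `closure_type_ids` are hypothetical names introduced here, not part of the original discussion.

```rust
use std::any::TypeId;

// Hypothetical helper: the `TypeId` of whatever type was inferred for `T`.
fn type_id_of<T: 'static>(_: &T) -> TypeId {
    TypeId::of::<T>()
}

fn closure_type_ids() -> (bool, bool) {
    // One closure expression, evaluated twice: a single type.
    let mut ids = Vec::new();
    for i in 1..=2 {
        let c = move || i;
        ids.push(type_id_of(&c));
    }
    let same_in_loop = ids[0] == ids[1];

    // Two closure expressions with identical text: two distinct types.
    let a = || 0;
    let b = || 0;
    let same_when_written_twice = type_id_of(&a) == type_id_of(&b);

    (same_in_loop, same_when_written_twice)
}
```

The first flag comes out `true` and the second `false`, matching the explanation: the type is tied to where the closure is written, not to what it looks like.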

I found the rule: if there is only one inference point (a single closure expression),

for i in 1..=2 {
  vec.push(||{}); // ok
  vec.push(||{}); // not ok
}

or if the pushed values can all be inferred to have the same type,

let x = ||{};
vec.push(x);
vec.push(x); // ok

then it is OK.
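A side note on the `let x` case: pushing `x` twice works there because a closure that captures nothing is `Copy`. A closure is `Copy` or `Clone` only if everything it captures is, so with a captured `String` an explicit `clone()` is needed. A minimal sketch, with an invented function name:

```rust
fn push_same_closure_twice() -> Vec<usize> {
    let name = String::from("closure"); // `String` is `Clone` but not `Copy`
    let x = move || name.len();

    let mut v = Vec::new();
    v.push(x.clone()); // a plain `v.push(x)` here would move `x` away
    v.push(x); // same type as the clone, so the Vec accepts it

    v.iter().map(|f| f()).collect()
}
```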

It's not a cheat; it has to be that way, because Rust is statically typed. If it weren't that way, what would the type of c be here?

for i in 1..=2 {
    let c = move || println!("{}", i);
}

Or of this return value?

fn f() -> impl Fn() {
    || {}
}

(You can't name the types, but they have to be the same each iteration/call, because Rust is statically typed.)
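A consequence of the return-value case, sketched with a made-up factory function `make_adder`: because every call returns the same opaque type, the results can go into a `Vec<_>` without boxing.

```rust
// Hypothetical factory: every call returns the same (unnameable) closure type.
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

fn adders() -> Vec<i32> {
    // The element type is inferred as the opaque return type of `make_adder`.
    let mut v = Vec::new();
    v.push(make_adder(1));
    v.push(make_adder(10));

    v.iter().map(|f| f(100)).collect()
}
```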

To expand on this, consider the expression:

// global/compile-time type definition
// vvvvvvvvvvvvvvvvv
 { pub struct Foo(); Foo() }
//                   ^^^^^
//                   runtime instantiation of that type
Or, similarly, the expression:

    {
        fn foo() {} // <- compile-time: "global" function definition (a constant and a hidden type; let us call its type `Foo`)
        foo // <- runtime: refer to this very function to "instantiate" it 
    }
    

Each time the expression is written, a new unique Foo type is defined, and you get an instance of that type.

So you can't do:

let mut vec = Vec::new();
vec.push({ struct Foo(); Foo() }); // l2:
vec.push({ struct Foo(); Foo() }); // l3: Foo@l3 ≠ Foo@l2 => Error!

But you can do:

let mut vec = Vec::new();
for _ in 0 .. 2 {
    vec.push({
        pub struct Foo(); // Single global/compile-time definition
        Foo() // runtime: instanced multiple times
    });
}

since, modulo scoping, this is equivalent to:

mod compiler_generated {
    pub mod for_loop_body {
        pub struct Foo();
    }
}

fn … (…)
  -> …
{
    …
    let mut vec = Vec::new();
    for _ in 0 .. 2 {
        vec.push(compiler_generated::for_loop_body::Foo());
    }
    …
}

Compare that to the "unsugaring" of the one with the two struct Foo(); definitions:

mod compiler_generated {
    pub mod l2 {
        pub struct Foo();
    }
    pub mod l3 {
        pub struct Foo();
    }
}

fn … (…)
  -> …
{
    …
    let mut vec = Vec::new();
    vec.push(compiler_generated::l2::Foo()); // l2
    vec.push(compiler_generated::l3::Foo()); // l3: Error!
    …
}

Well, this is the exact same mechanism with closures: every time one writes a literal |…| … closure, it's actually sugar for

{
    struct Closure<…> { /* captured environment / "upvars" */ }
    impl<…> Fn…(…) -> … for Closure<…> {
        /* Make it `()`-callable */
    }
    Closure { /* upvars */ }
}

If the literal is written once inside a for loop, then there is only one global/compile-time definition (struct Closure… and impl … for Closure…) and one Closure { /* upvars */ } runtime-instantiation expression, which runs once per iteration of the loop.

Whereas with vec.push(|| ()); vec.push(|| ());, we have two literal closures, and so, two global/compile-time struct Closure… definitions, and thus, two distinct types (each runtime-instanced once).
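The desugaring can be imitated by hand, as a rough sketch. The `Fn` traits cannot be implemented manually on stable Rust, so this uses a stand-in trait `Call`; that trait and the struct names are assumptions made for illustration, not what the compiler actually generates.

```rust
// Stand-in for the `Fn` traits (hypothetical, for illustration only).
trait Call {
    fn call(&self) -> i32;
}

// Roughly what one `move || a + b` literal might expand to.
struct ClosureSum {
    a: i32,
    b: i32,
}
impl Call for ClosureSum {
    fn call(&self) -> i32 {
        self.a + self.b
    }
}

// Roughly what a second, separately written `move || a` might expand to.
struct ClosureFirst {
    a: i32,
}
impl Call for ClosureFirst {
    fn call(&self) -> i32 {
        self.a
    }
}

fn desugared() -> (Vec<i32>, Vec<i32>) {
    // One definition, instantiated once per iteration: a plain Vec works.
    let mut same = Vec::new();
    for i in 1..=2 {
        same.push(ClosureSum { a: i, b: 10 });
    }

    // Two definitions: mixing them needs dynamic dispatch, as with `Box<dyn Fn()>`.
    let mut mixed: Vec<Box<dyn Call>> = Vec::new();
    mixed.push(Box::new(ClosureSum { a: 1, b: 2 }));
    mixed.push(Box::new(ClosureFirst { a: 5 }));

    (
        same.iter().map(|c| c.call()).collect(),
        mixed.iter().map(|c| c.call()).collect(),
    )
}
```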
