Borrowing parameter to flat_map closure in inner closure, what's the difference between these two?

Hello!

I'm trying to get an intuitive understanding of the difference between the two examples below, the first of which fails with a borrow checker error. Note that I'm not really looking for ways to make the first example compile (e.g. using move or calling .collect on the inner iterator); I want to understand why it doesn't compile.

In the non-compiling example the compiler seems to treat name: &String as data owned by the outer closure, which I guess makes sense if we're referring to the reference value itself. But the second example also borrows name, with the only differences being the call to .as_str and collecting into a Vec<(&str, &i32)>. The one reasonable idea I've managed to come up with is that the compiler somehow is able to use the call to .as_str to infer the lifetime as a borrow of the arg function input parameter rather than the name closure parameter.

Playground Link

use std::collections::HashMap;

fn borrow_error(arg: HashMap<String, Vec<i32>>) {
    let all_tasks: Vec<(&String, &i32)> = arg
        .iter()
        .flat_map(|(name, values)| {
            values.iter().map(|x| (
                           // ^^^ may outlive borrowed value `name`
                name,
             // ---- `name` is borrowed here
                x,
            ))
        })
        .collect();
}

fn works(arg: HashMap<String, Vec<i32>>) {
    let all_tasks: Vec<(&str, &i32)> = arg
        .iter()
        .flat_map(|(name, values)| {
            values.iter().map(|x| (
                name.as_str(),
                x,
            ))
        })
        .collect();
}

I tried to implement the closures by hand, the way closure desugaring is usually explained (a struct holding the captured variables plus the Fn trait impls): Playground Link. However, this attempt failed to reproduce the borrowing error.

Any help appreciated! Thanks.

I can't explain why, but reborrowing name as follows causes the first case to compile. However, then I get the clippy::borrow_deref_ref warning, so that would have to be suppressed.

                &*name,

The way I look at it: Sometimes the borrow checker needs a little help.

At the end of the compile error message is a hint that helps a bit:

error[E0373]: closure may outlive the current function, but it borrows `name`, which is owned by the current function
  --> src/lib.rs:6:31
   |
6  |             values.iter().map(|x: &i32| (
   |                               ^^^^^^^^^ may outlive borrowed value `name`
7  |                            // ^^^ may outlive borrowed value `name`
8  |                 name,
   |                 ---- `name` is borrowed here
   |
…
help: to force the closure to take ownership of `name` (and any other referenced variables), use the `move` keyword
   |
6  |             values.iter().map(move |x: &i32| (
   |                               ++++

and that also helps to resolve the problem:

values.iter().map(move |x| (…))
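For completeness, here is a minimal sketch of the whole function with the suggested move applied. The name with_move is made up for illustration, and unlike the original it takes the map by reference and returns the Vec, purely so the result can be inspected from outside:

```rust
use std::collections::HashMap;

// Hypothetical variant of `borrow_error` with the compiler's `move` hint applied.
// Assumption: takes `&HashMap` and returns the Vec so callers can look at it.
fn with_move(arg: &HashMap<String, Vec<i32>>) -> Vec<(&String, &i32)> {
    arg.iter()
        .flat_map(|(name, values)| {
            // `move` copies the `&String` into the inner closure, so the
            // closure no longer borrows the local variable `name`.
            values.iter().map(move |x| (name, x))
        })
        .collect()
}
```

Since a shared reference is Copy, moving name into the inner closure just copies the pointer; nothing is taken away from the HashMap.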

When you use name in the inner closure without move then it effectively takes a reference to name of type &'a &'b String (with 'a being bounded by the lifetime of the name argument to the outer closure and 'b being bounded by the lifetime of the HashMap).

You then try to return the Map<impl Iterator, {closure with &name}> created by values.iter().map(…) from the outer closure and because the name variable only exists for the duration of the outer closure, the borrow checker isn't happy:

fn borrow_error(arg: HashMap<String, Vec<i32>>) {
    let all_tasks: Vec<(&String, &i32)> = arg
        .iter()
        .flat_map(|(name, values)| {
                 // ^^^^ lifetime of `name: &String` starts here
            values.iter().map(|x| (
                name,
             // ---- `name` is borrowed here
             // -> the closure contains a `&&String`
                x,
            ))
          // ^ the closure gets returned (wrapped in a `Map<…>`) here
          //   but the reference `name` doesn't exist anymore after
          //   the outer closure returns -> borrow checker error
        })
        .collect();
}

Once you reborrow with &*name, or implicitly with name.as_str(), the inner closure only uses the place *name rather than the variable name itself. The compiler can then capture a reborrow of *name, which is valid for as long as the HashMap borrow, instead of a reference to the local reference. I think this works because of the minimal capturing rules introduced in RFC 2229.

@cg909's analysis is correct, and I'm just adding some side notes.


Heuristically the compiler prefers to capture by shared reference if it can, then by exclusive reference, and by move only if it feels it "has to". But the heuristic isn't perfect, as this case demonstrates: when T: Copy, a &T is enough to read out and return a T, so &T is what gets captured. Here (with T = &String) that was the wrong choice, because the captured &T borrows the local variable name.
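The same heuristic can be seen in a much smaller setting. The sketch below uses made-up functions (make_adder and make_pairer are not from the thread): in both, the closure only needs to read a Copy value, so without move it would capture a reference to the local parameter and could not be returned (E0373).

```rust
// Hypothetical example: `n` is a local `i32` (Copy).
// Without `move`, the closure would capture `&n` and fail with E0373.
fn make_adder(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

// Same situation with T = &String, as in the question: `name` is a local
// variable holding a reference, and `move` copies that reference into the closure.
fn make_pairer<'a>(name: &'a String) -> impl Fn(i32) -> (&'a String, i32) + 'a {
    move |x| (name, x)
}
```

In both cases move changes what is stored in the closure (the value itself instead of a reference to the local), not what the closure body does.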

Here's a playground using some internal features to show what was captured in a different form.

// Erroring version
note: Min Capture name[] -> Immutable

// Working version (`name.as_str()` or `&*name`)
note: Min Capture name[Deref] -> Immutable

You can run it on edition 2018 to "disable" RFC 2229 and see both name[Deref] versions revert to name[] versions which fail to compile.

Lifetime parameters on functions are chosen by the caller and can never be as short as the unnameable, inferred, shorter-than-the-function-body lifetimes that correspond to borrows of local variables. So you don't want 'a here:

fn borrow_error<'a>(arg: HashMap<String, Vec<i32>>) {

And this:

struct Outer<'a, T: 'a> {
    _marker: std::marker::PhantomData<&'a T>,
}

Would just be struct Outer {}.

But there's no way to explicitly write the attempted Fn implementations for Outer {} given what was inferred for Inner...:

struct Inner<'a, 'tmp> {
    name: &'tmp &'a String,
}

...because there's no way to name the lifetime that would need to be in the Output type in a way that accurately portrays the implementation attempt, similar to why the lifetime of fn borrow_error<'a> can't represent a local borrow.

But here's a version of the failing code with only Inner implemented "manually".
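As a rough illustration of the same idea (my own sketch, not the linked playground), the code below writes the inner closure out as plain structs with an ordinary call method instead of the Fn traits. Inner mirrors the struct above; InnerByValue, eager and lazy are hypothetical names. Only the compiling combinations are shown: Inner works as long as it is used up before the local name goes away, and returning a lazy iterator needs the by-value variant.

```rust
// Stand-in for the inner closure as inferred in the failing example:
// it holds a reference to the local variable `name`.
struct Inner<'a, 'tmp> {
    name: &'tmp &'a String,
}

impl<'a, 'tmp> Inner<'a, 'tmp> {
    // Plays the role of the closure body `|x| (name, x)`.
    fn call<'x>(&self, x: &'x i32) -> (&'a String, &'x i32) {
        (*self.name, x)
    }
}

// Hypothetical stand-in for what `move` captures: the `&'a String` itself.
struct InnerByValue<'a> {
    name: &'a String,
}

impl<'a> InnerByValue<'a> {
    fn call<'x>(&self, x: &'x i32) -> (&'a String, &'x i32) {
        (self.name, x)
    }
}

// Plays the role of the outer closure body; `name` is a local variable here.
fn eager<'a>(name: &'a String, values: &'a [i32]) -> Vec<(&'a String, &'a i32)> {
    let inner = Inner { name: &name };
    // Fine: `inner` is fully used before `name` goes out of scope.
    let pairs: Vec<_> = values.iter().map(|x| inner.call(x)).collect();
    pairs
}

// Returning the iterator lazily only works with the by-value capture;
// moving an `Inner { name: &name }` into the returned closure would be
// rejected because it still borrows the local `name`.
fn lazy<'a>(
    name: &'a String,
    values: &'a [i32],
) -> impl Iterator<Item = (&'a String, &'a i32)> + 'a {
    let inner = InnerByValue { name };
    values.iter().map(move |x| inner.call(x))
}
```

Note that the items produced by Inner::call only mention 'a, not 'tmp. The problem in the failing example is the closure value itself outliving the local, not the tuples it produces.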
