Lifetime problem with curried function

fn foo<'a, 'b>(_a: &'a str, b: &'b str) -> &'b str {
    b
}

fn bar<'a, 'b>(a: &'a str) -> impl Fn(&'b str) -> &'b str {
    move |b| foo(a, b)
}

(Playground)

I don’t understand why this doesn’t compile. I mean, I understand that the compiler is saying that the lifetime 'a must include the lifetime 'b in the second function, but I don’t get why that’s necessary.

Quick answer:

fn foo<'a, 'b>(_a: &'a str, b: &'b str) -> &'b str {
    b
}

fn bar<'a>(a: &'a str) -> impl for<'b> Fn(&'b str) -> &'b str + 'a {
    move |b| foo(a, b)
}

I'll try to give a detailed one later.

When you make a closure, the closure itself captures a to have it available for the foo(a, b) call sometime later. It returns something like struct Fn<'a> {&'a str, function pointer}, so the returned value is limited by a's lifetime.

The borrow checker checks interfaces, not implementations, so even though foo doesn't use the _a arg, it has it in its interface, and the borrow checker will still enforce that it's valid.
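To make the "struct plus function pointer" picture concrete, here is a minimal hand-written sketch of roughly what the closure amounts to. The names Partial and call are made up for illustration (stable Rust does not let you implement the Fn traits by hand), so treat this as an analogy rather than the compiler's actual desugaring:

```rust
fn foo<'a, 'b>(_a: &'a str, b: &'b str) -> &'b str {
    b
}

// Hypothetical stand-in for the closure `move |b| foo(a, b)`.
struct Partial<'a> {
    a: &'a str, // the captured reference; it ties the whole struct to 'a
}

impl<'a> Partial<'a> {
    // Plays the role of `Fn::call`: 'b is picked fresh at every call.
    fn call<'b>(&self, b: &'b str) -> &'b str {
        foo(self.a, b)
    }
}

// The returned value mentions 'a in its type, so it cannot outlive `a`.
fn bar<'a>(a: &'a str) -> Partial<'a> {
    Partial { a }
}
```

The 'a on Partial<'a> is what the `+ 'a` in the impl Trait signature expresses, while the 'b on call corresponds to the for<'b> part.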

fn bar<'a>(a: &'a str) -> impl for<'b> Fn(&'b str) -> &'b str + 'a {
    move |b| foo(a, b)
}

Is the difference between my version and this one that I made a polymorphic (in 'b) function that returns a monomorphic closure, whereas yours is a function that doesn’t depend on 'b, but returns a polymorphic closure?

I get that the returned closure is limited by the lifetime 'a, but that doesn’t really explain why that lifetime needs to be longer than that of the argument of the closure. foo<'a, 'b> is well typed regardless of what 'a and 'b are, as evidenced by the fact that foo compiles.

foo is a distraction here. What you have is:

fn bar<'a, 'b>(a: &'a str) -> impl Fn(&'b str) -> &'b str {
    move |b| {drop(a); b}
}

or even can be simplified down to:

fn bar<'a, 'b>(a: &'a str) -> impl Fn(&'b str) {
    move |_| drop(a)
}

so what it actually returns is (handwaving syntax):

fn bar<'a, 'b>(a: &'a str) -> struct SomeFn {
    SomeFn {
      context: a,
      callback: fn(&self) {drop(self.context)}
    }
}

So SomeFn needs to be valid for lifetime 'a.

BTW, impl Fn is always generic, regardless of whether you use for<'a> syntax to explain lifetimes. Rust doesn't really have a distinction between monomorphic/polymorphic closures. At best, closures without a context can decay to bare function pointers, like this:

fn foo<'a, 'b>(_a: &'a str, b: &'b str) -> &'b str {
    b
}

fn bar<'a, 'b>(a: &'a str) -> fn(&'b str) -> &'b str {
    move |b| foo("", b)
}
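As a small sketch of that decay (the function names no_capture and with_capture are invented for this example), a closure that captures nothing can be returned as a plain fn pointer, while one that captures a has to stay an opaque closure type bounded by 'a:

```rust
fn foo<'a, 'b>(_a: &'a str, b: &'b str) -> &'b str {
    b
}

// Captures nothing, so the closure can coerce to a bare function pointer.
fn no_capture() -> fn(&str) -> &str {
    |b| foo("", b)
}

// Captures `a`, so it stays an opaque closure type and carries the 'a bound.
fn with_capture<'a>(a: &'a str) -> impl for<'b> Fn(&'b str) -> &'b str + 'a {
    move |b| foo(a, b)
}
```

Both can be called the same way; only the second one is tied to the lifetime of the string it captured.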

I’ll have to think about this. If I may ask another question, can closures returned from functions be polymorphic in types as well as lifetimes? What I mean is something like:

fn make_id() -> impl Fn(T) -> T { //Where do I put the <T>?
    |x| x
}

such that make_id() itself is not polymorphic. In other words, its type should be Unit → ∀X. X → X, not ∀X. Unit → X → X.

We can't express generic closures yet, so this isn't possible. We would need higher-ranked bounds over types; the existing higher-ranked trait bounds (HRTB, the for<'a> syntax) only quantify over lifetimes.
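One way to approximate the type Unit → ∀X. X → X today is to return a named struct with a generic method instead of a closure. This is only a sketch: Id and call are hypothetical names, and the result is called as id.call(x) rather than id(x):

```rust
// Hypothetical unit struct standing in for a "closure" of type ∀X. X → X.
struct Id;

impl Id {
    // The type parameter lives on the method, not on `make_id`.
    fn call<T>(&self, x: T) -> T {
        x
    }
}

// `make_id` itself is not generic; the value it returns is.
fn make_id() -> Id {
    Id
}
```

The same value can then be applied at several different types, which a single closure value cannot do.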

I see. Unfortunately, in the specific case I have, I can’t move the type parameter from the closure to the function because it depends on the lifetime parameter in the closure. I suppose this construction just can’t be done right now.

Can you show exactly what you are trying to do? I may be able to show a workaround.

You’re right, I should be more specific. I’m trying to implement the regular expression parsers in nom, which are currently macros, as functions.

For example, there is the re_match parser that returns the entire input if a part of it matches the given regular expression. A naive implementation might have the following signature:

pub fn re_match<'a, E: ParseError<&'a str>>(re: &'a Regex) -> impl Fn(&'a str) -> IResult<&'a str, &'a str, E>

This sort of works, but the lifetime of the references to the regex and the string to be parsed are unnecessarily coupled. This makes it impossible to implement one of the old macros in terms of this combinator, as far as I can tell. Therefore, I would like to do this:

pub fn re_match<'a>(re: &'a Regex) -> impl for<'b> Fn(&'b str) -> IResult<&'b str, &'b str, E> + 'a

The problem is that there’s no place to put the trait bound on E. It can’t go in the impl because that’s not supported and it can’t go in re_match's type parameters because it depends on 'b and that’s not in scope at that point.

I don't think you can generalize the error type in this way because the error type almost always contains the input type. If the error type doesn't contain the input then you could get away with more higher rank lifetime bounds, like so

pub fn re_match<E>(re: &Regex) -> impl for<'a> Fn(&'a str) -> IResult<&'a str, &'a str, E>
where E: for<'a> ParseError<&'a str> {}

But this won't work for most error types.
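Here is a self-contained sketch of that higher-ranked bound on E. It does not use nom or regex: IResult and ParseError below are simplified local stand-ins, and Opaque and contains_match are hypothetical names. It only works because Opaque does not store the input:

```rust
// Simplified local stand-ins for nom's types (assumptions, not nom's real definitions).
type IResult<I, O, E> = Result<(I, O), E>;

trait ParseError<I> {
    fn from_input(input: I) -> Self;
}

// An error type that does NOT keep the input, so it can implement
// `ParseError<&'a str>` for every 'a at once.
#[derive(Debug, PartialEq)]
struct Opaque;

impl<'a> ParseError<&'a str> for Opaque {
    fn from_input(_input: &'a str) -> Self {
        Opaque
    }
}

// Hypothetical parser: returns the whole input if it contains `needle`.
fn contains_match<E>(
    needle: &'static str,
) -> impl for<'a> Fn(&'a str) -> IResult<&'a str, &'a str, E>
where
    E: for<'a> ParseError<&'a str>,
{
    move |input| {
        if input.contains(needle) {
            Ok(("", input))
        } else {
            Err(E::from_input(input))
        }
    }
}
```

An error type that holds a &'a str borrowed from the input has no single type E satisfying for<'a> ParseError<&'a str>, which is why this approach breaks down for most real error types.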


It looks like you are trying to overgeneralize the error type. Why can't you just pick some custom error type to use?

All combinators in nom work this way so that one can choose a more or less verbose error type in each individual case.

Sometimes a counter-example can be more helpful than a lengthy explanation.

Let's start from bar and imagine it being used:

let s = String::from("Hello, World!");
let partial
    : _ // some type Anon : for<'b> Fn(&'b str) -> &'b str
    = bar(&s) // `partial` closure contains `a = &s` // ---+
; //                                                       | lifetime 'a
drop(s); // make `&s` dangle <-----------------------------+
partial("Hello"); // Error, cannot use an object with a dangling reference

So, as you can see, although there may be a second lifetime 'b involved for the final returned thing, the closure object resulting from a partial application, which captures the input a: &'a str, is itself "infected" with the 'a lifetime bound.

That is, the Anon type has an Anon : 'a bound.

In other words,

  • not only do we have:
    Anon : Fn(&'_ str) -> &'_ str (i.e., Anon : for<'b> Fn(&'b str) -> &'b str)

  • we also have:
    Anon : 'a

Combined:

Anon : 'a + for<'b> Fn(&'b str) -> &'b str,

which written in an existential manner leads to bar() return type being:

impl 'a + for<'b> Fn(&'b str) -> &'b str

Hence the need to add a 'a bound to the impl ... returned existential type.

This can be even more confusing given that it is customary in Rust (rather arbitrarily, truth be told) to write the lifetime bounds after the trait bounds, resulting in

impl for<'b> Fn(&'b str) -> &'b str + 'a

which visually seems to say that there is a lifetime 'a bound attached to the returned &'b str.


When you don't specify a lifetime bound on an impl existential type, i.e., when you elide the lifetime bounds implicitly, then Rust uses the 'static lifetime "to fill the gap / hole / elided lifetime":

impl /*     */ for<'b> Fn(&'b str) -> &'b str
==
impl 'static + for<'b> Fn(&'b str) -> &'b str
  • Playground with the same error message as in the OP, but using 'static explicitly

So, your error message was just telling you that in your signature you were (implicitly) claiming that the closure returned by the partial application was not borrowing any local ('static bound), which was a lie given that it captured the local a: &'a str, which was itself borrowing a potentially local str.
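A complementary sketch: if the closure owns what it captures instead of borrowing it, the unannotated signature is honest and nothing extra is needed. The names bar_owned, assert_static and call_on_thread are hypothetical helpers for this example:

```rust
fn foo<'a, 'b>(_a: &'a str, b: &'b str) -> &'b str {
    b
}

// The closure owns its `String`, so it borrows nothing local and
// no `+ 'a` bound is required on the return type.
fn bar_owned(a: String) -> impl for<'b> Fn(&'b str) -> &'b str {
    move |b| foo(&a, b)
}

// Hypothetical helper that only accepts values satisfying `T: 'static`.
fn assert_static<T: 'static>(value: T) -> T {
    value
}

// `thread::spawn` also requires 'static, so the closure may move to another thread.
fn call_on_thread() -> usize {
    let partial = assert_static(bar_owned(String::from("owned")));
    let handle = std::thread::spawn(move || partial("hello").len());
    handle.join().unwrap()
}
```

With the borrowing version of bar, the same assert_static call would be rejected unless the captured string itself were 'static.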

I know, and you can just pick an error type for your project and stick to it. I normally just create a custom error type and a corresponding Result type-def when I work with nom. Trying to overgeneralize doesn't work that well.

Sorry, I was being unclear. I’m working on this issue in nom: Regex parsers to be ported from macros to functions · Issue #978 · Geal/nom · GitHub

Thank you. The version of bar that returns a lifetime-polymorphic closure is clear to me now. I’m still confused about the version in my original post (simplified down to the essentials as per @kornel’s comments):

fn bar<'b, 'a>(a: &'a str) -> impl Fn(&'b str) {
    move |b| {drop(a)}
}

This fails to compile, but succeeds if I add the bound 'a: 'b. Unless I’m mistaken, this means that a lives at least as long as b, but I don’t understand what difference that makes. Moreover, I tested the “corrected” version on the playground in a case where 'b = 'static and 'a < 'static, which should violate the constraint, and yet it compiles. Clearly I’m misunderstanding some aspect of this.

This should compile

fn bar<'b, 'a>(a: &'a str) -> impl Fn(&'b str) + 'a {
    move |b| {drop(a)}
}

It doesn’t compile for me on the playground:

Compiling playground v0.0.1 (/playground)
error[E0482]: lifetime of return value does not outlive the function call
 --> src/lib.rs:1:31
  |
1 | fn bar<'b, 'a>(a: &'a str) -> impl Fn(&'b str) + 'a {
  |                               ^^^^^^^^^^^^^^^^^^^^^
  |
note: the return value is only valid for the lifetime 'b as defined on the function body at 1:8
 --> src/lib.rs:1:8
  |
1 | fn bar<'b, 'a>(a: &'a str) -> impl Fn(&'b str) + 'a {
  |        ^^

error: aborting due to previous error

error: Could not compile `playground`.

To learn more, run the command again with --verbose.

I'm a bit surprised by this too. My guess is that because you have 'b on the "wrong" function, the compiler finds some implied relationship with 'b at the time bar is called, and ties it to lifetime of the returned value (so it means that the closure can be called with arguments that are valid for bar's scope, not just any arg).

The more technically accurate version compiles (here 'b is not dependent on bar):

fn bar<'a>(a: &'a str) -> impl for<'b> Fn(&'b str) + 'a {
    move |b| {drop(a)}
}

and also this compiles, but would probably be annoyingly hard to use because of the extra bound:

fn bar<'a, 'b: 'a>(a: &'a str) -> impl Fn(&'b str) + 'a {
    move |b| {drop(a)}
}
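To illustrate why the for<'b> version is the convenient one, here is a sketch in which the closure is created first and then called with strings that each live for only one loop iteration. This is a variant of bar that returns a usize so the effect is observable; total_len is a hypothetical helper:

```rust
// Variant of `bar` that returns a length instead of dropping `a`.
fn bar<'a>(a: &'a str) -> impl for<'b> Fn(&'b str) -> usize + 'a {
    move |b| a.len() + b.len()
}

fn total_len(prefix: &str, words: &[&str]) -> usize {
    // The closure is created once, before any argument exists.
    let f = bar(prefix);
    let mut total = 0;
    for w in words {
        // A fresh String that is dropped at the end of each iteration.
        let owned = w.to_uppercase();
        // 'b is chosen anew for every call, so the short-lived borrow is fine.
        total += f(owned.as_str());
    }
    total
}
```

With a single fixed 'b in the closure's type, every argument would have to be valid for that one lifetime, which is what makes the versions with 'b declared on bar awkward to use in a loop like this.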