Suddenly some lifetimes appear

I'm writing these iterators just fine without lifetimes, but now all of a sudden:

fn check_year_record<'a>(
    input: &'a str,
    field: &str,
    lower_bound: u16,
    upper_bound: u16,
) -> IResult<&'a str, u16> {
    let field = field.to_string() + ":";

    let (_, number) = terminated(
        preceded(
            tag(field.as_str()),
            map_res(digit1, |s: &str| s.parse::<u16>()),
        ),
        multispace0,
    )(input)?;

    if number >= lower_bound && number <= upper_bound {
        return Ok(("", number));
    } else {
        return Err(nom::Err::Error(nom::error::Error::new(
            "",
            nom::error::ErrorKind::Digit,
        )));
    }
}

This does not compile anymore without them. I introduced the field that is concatenated, so is that allocation forcing me to add these?

I don't understand how adding this suddenly breaks the entire thing. It worked fine before and the allocation of field + ":" has no consequence for any of the other fields.

Output lifetimes can only be elided when there is one input lifetime. Since you take two &strs (each of which has an elided lifetime), Rust doesn't know which string the output should borrow from, so it requires you be explicit.

No. Nothing in the implementation of a function affects its public signature. This is an essential feature of abstraction – if Rust didn't work this way, changing the body of a function could change its interface, which would be a huge pain.

The reason you are now forced to write lifetimes is that if you have two &str parameters, and the output is also &str, then the compiler has no way of inferring to which of these two you want to tie the lifetime of the return value.

If there is only one reference parameter, then the compiler assumes that you meant fn foo<'a>(&'a str) -> &'a str, because that is usually what you want – but you can still override this using explicit lifetimes if you need to.
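As a rough illustration of the elision rule described above, here is a minimal sketch. The function names `first_word` and `strip_field` are made up for this example and do not come from the original code.

```rust
// One reference parameter: the output lifetime can be elided.
// This is shorthand for `fn first_word<'a>(s: &'a str) -> &'a str`.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// Two reference parameters: elision no longer applies to the output,
// so the signature states that the result borrows from `input` only.
// `prefix` keeps its own, unrelated (elided) lifetime.
fn strip_field<'a>(input: &'a str, prefix: &str) -> &'a str {
    input.strip_prefix(prefix).unwrap_or(input)
}

fn main() {
    assert_eq!(first_word("byr:1937 iyr:2017"), "byr:1937");
    assert_eq!(strip_field("byr:1937", "byr:"), "1937");
}
```

Removing `<'a>` and both `'a` annotations from `strip_field` should make the compiler ask for a lifetime specifier, while `first_word` compiles without any.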

Also note that lifetimes don't appear; they are always there, although they can often be elided.

So the change that made this happen is that there are two borrowed parameters instead of one?

I don't understand how to semantically resolve this. The output probably depends on the input and not on the field, or maybe it depends on both?

But how is the compiler supposed to know that? It can't just guess the intended high-level semantics of your function. For all it knows you might as well want to return a slice into the second parameter. That's why you have to specify it.

Well, at the moment the function can only return an empty string, so the output doesn't depend on anything.

Sure, but how am I supposed to know this? I don't know how the parser works. I guess it returns slices out of the input?

Sorry, which parser are you referring to? That of the Rust compiler, or something else?

The input here is passed to nom. Could I figure out what it does with it based on a function signature?

The signature is like this:

pub fn terminated<I, O1, O2, E: ParseError<I>, F, G>(
  mut first: F,
  mut second: G,
) -> impl FnMut(I) -> IResult<I, O1, E>

So very difficult to understand and no lifetime parameters.

Ah, that is pretty tricky to read. The terminated function returns a function. You then call that function with input. The function's argument is I, and since input has type &'a str, we thus assign I = &'a str.

The first value in the tuple (_, number), which you are currently ignoring, has type &'a str, so if you tried to return that value, you would need to use the 'a lifetime on the returned value.
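To see how a signature without any lifetime parameters can still carry 'a, here is a rough sketch built around a hypothetical stand-in for terminated. The names `keep_first`, `digits`, `spaces`, and `year` are invented for illustration and are not part of nom, and `Option` replaces `IResult` so the example has no dependencies.

```rust
// Hypothetical stand-in for a nom-style combinator: no lifetime parameters,
// only a generic input type `I` that is threaded through to the result.
fn keep_first<I, O1, O2>(
    mut first: impl FnMut(I) -> Option<(I, O1)>,
    mut second: impl FnMut(I) -> Option<(I, O2)>,
) -> impl FnMut(I) -> Option<(I, O1)> {
    move |input: I| {
        let (rest, out) = first(input)?;
        let (rest, _) = second(rest)?;
        Some((rest, out))
    }
}

// Splits the leading ASCII digits off the front of the input.
fn digits(input: &str) -> Option<(&str, &str)> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        None
    } else {
        Some((&input[end..], &input[..end]))
    }
}

// Consumes leading whitespace (possibly none).
fn spaces(input: &str) -> Option<(&str, ())> {
    Some((input.trim_start(), ()))
}

// Calling the returned closure with a `&'a str` fixes `I = &'a str`,
// so the remaining input comes back with lifetime `'a`.
fn year<'a>(input: &'a str) -> Option<(&'a str, u16)> {
    let (rest, text) = keep_first(digits, spaces)(input)?;
    Some((rest, text.parse::<u16>().ok()?))
}

fn main() {
    assert_eq!(year("1937  rest"), Some(("rest", 1937)));
    assert_eq!(year("abc"), None);
}
```

The lifetime never shows up in `keep_first` itself; it arrives through the type that `I` is instantiated with at the call site.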

No, and you don't need to. That's what I'm saying. Your problem and the solution to it are both the consequence of the signature of your function, check_year_record(). Neither of them is a consequence of the implementation of your function.

To demonstrate this, here's a playground, which does not use Nom at all, fails to compile with exactly the same error, and compiles if you add the exact same lifetime annotations.
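The following is not the linked playground, only a minimal sketch in the same spirit. The function `check_field` and its error strings are hypothetical; the point is just that a Nom-free function with the same shape of signature needs the same annotations.

```rust
// Hypothetical, Nom-free function with the same signature shape:
// two `&str` parameters and a `&str` inside the return type.
// Deleting `<'a>` and both `'a` annotations is expected to produce
// "missing lifetime specifier" (E0106), whatever the body looks like.
fn check_field<'a>(input: &'a str, field: &str) -> Result<(&'a str, u16), &'static str> {
    let rest = input
        .strip_prefix(field)
        .ok_or("field name does not match")?;
    let rest = rest.strip_prefix(':').ok_or("missing colon")?;
    let number = rest
        .trim_end()
        .parse::<u16>()
        .map_err(|_| "not a number")?;
    // Like the original function, this returns an empty remainder.
    Ok(("", number))
}

fn main() {
    assert_eq!(check_field("byr:1937", "byr"), Ok(("", 1937)));
    assert!(check_field("iyr:2017", "byr").is_err());
}
```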


The point at which the signature of Nom parser combinators becomes relevant is when the compiler checks that your usage of Nom parsers is in accordance with the function signature. But you still need to specify the function signature first, and then write its body so that it compiles, and not the opposite.

Yes.

To elaborate on what H2CO3 has said, not inferring the lifetimes from the body is important for the same reasons as not inferring the types from the body. It'd work fine if people only ever wrote correct-and-complete code, but I certainly can't do that 🙃

The compiler is intentionally doing things this way so that you can, for example, just write this at first, and fill in the details later:

fn check_year_record<'a>(
    input: &'a str,
    field: &str,
    lower_bound: u16,
    upper_bound: u16,
) -> IResult<&'a str, u16> {
    todo!()
}

And you still want the caller to get lifetime errors, rather than the compiler saying "well, whatever, it's not like the function you're calling uses the stuff anyway" until you put a real implementation in.
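A small sketch of what that looks like from the caller's side, assuming a hypothetical helper `after_field` (not from the original code): the borrow checker judges the call against the signature alone.

```rust
// The signature promises that the result borrows from `input` only,
// never from `field`. Callers are checked against this promise.
fn after_field<'a>(input: &'a str, field: &str) -> &'a str {
    input.strip_prefix(field).unwrap_or(input)
}

fn main() {
    let input = String::from("byr:1937");
    let rest;
    {
        let field = String::from("byr:");
        rest = after_field(&input, &field);
    } // `field` is dropped here

    // Accepted: `rest` is tied to `input`, which is still alive.
    assert_eq!(rest, "1937");

    // If the signature were instead
    //     fn after_field<'a>(input: &'a str, field: &'a str) -> &'a str
    // this same caller would be rejected because `field` does not live
    // long enough, even if the body never returned anything from `field`.
}
```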
