Lifetime specifier question


#1

Hi,

I ran into a lifetime issue with some seemingly simple code.

The actual code:

use rustc_serialize::json::Json;
pub fn parse_request_type<'a, T: From<&'a str>>(json: &'a str) {
    let data = Json::from_str(json).unwrap();
    let request_type = {
        let object = data.as_object().unwrap();
        let request_type_str = object.get("test").unwrap()
                                         .as_string().unwrap();
        T::from(request_type_str)
    };
}

The error message:

src\utils\parsing_utils.rs:6:22: 6:26 error: `data` does not live long enough
src\utils\parsing_utils.rs:6         let object = data.as_object().unwrap();
                                                  ^~~~
src\utils\parsing_utils.rs:3:64: 11:2 note: reference must be valid for the lifetime 'a as defined on the block at 3:63...
src\utils\parsing_utils.rs:3 pub fn parse_request_type<'a, T: From<&'a str>>(json: &'a str) {
                                                                                            ^
src\utils\parsing_utils.rs:4:46: 11:2 note: ...but borrowed value is only valid for the block suffix following statement 0 at 4:45
src\utils\parsing_utils.rs:4     let data = Json::from_str(json).unwrap();

mbrubeck on IRC was kind enough to help me change the code, and now it compiles.

New version:

pub fn parse_request_type<'a, T: for<'b> From<&'b str>>(json: &'a str) {
    let data = Json::from_str(json).unwrap();
    let request_type = {
        let object = data.as_object().unwrap();
        let request_type_str = object.get("test").unwrap()
                                                  .as_string().unwrap();
        T::from(request_type_str)
    };
}

However, neither mbrubeck nor I actually understand the difference or why it helps. I’m writing this question hoping that someone can explain it:

  • why the original code did not compile
  • why the new code compiles
  • what the actual difference is between the two lifetime specifications/declarations/whatever they are

Thanks in advance.


#2

The original code specifies T: From<&'a str>… which means that T::from has precisely the type fn(&'a str) -> T. The string you pass into it has a lifetime shorter than 'a, so it gets rejected. This is required for correctness; for example:

pub fn g<'a, T: From<&'a str>>(j: &'a str) -> T {
    T::from(j)
}
pub fn h<'a>(x: &'a str) -> &'a str {
    // The value returned from g must have lifetime 'a
    g(x)
}

The new code is different; it says that T must implement From for any lifetime, i.e. the lifetime of T is independent of the input string. For example, String implements for<'b> From<&'b str>.
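
To make the difference concrete, here is a minimal sketch using only the standard library. The names Borrowed, from_input and from_local are made up for illustration and do not appear in the original code. Borrowed keeps the reference it was built from, so it only fits the single-lifetime bound, while String copies the data and therefore fits the for<'b> bound.

```rust
// Hypothetical type that keeps the borrowed string, so its lifetime is tied to the input.
#[derive(Debug, PartialEq)]
pub struct Borrowed<'a>(pub &'a str);

impl<'a> From<&'a str> for Borrowed<'a> {
    fn from(s: &'a str) -> Self {
        Borrowed(s)
    }
}

// Single-lifetime bound: T may hold on to the `&'a str` it was built from.
pub fn from_input<'a, T: From<&'a str>>(input: &'a str) -> T {
    T::from(input)
}

// Higher-ranked bound: T must be constructible from a `&str` of any lifetime,
// so it can be built from a value that lives only inside this function.
pub fn from_local<T: for<'b> From<&'b str>>(input: &str) -> T {
    let local = input.to_uppercase(); // owned String, dropped at the end of the function
    T::from(local.as_str())
}
```

Calling from_local with T = Borrowed should be rejected, because Borrowed<'a> only implements From<&'a str> for its own 'a, not for every lifetime.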


#3

Thanks, now I understand why the first version does not work (data really has a shorter lifetime than the input parameter; I specified something other than what I actually wanted). But I still have some doubts about what the second version is doing. You said that the for syntax tells the compiler that the lifetime of T is independent of the input string. That sounds somewhat logical, but it’s not crystal clear. What does this tell the compiler, then?

pub fn parse_request_type<'a, 'b, T: From<&'b str>>(json: &'a str) {

And why is the compiler trying to figure out the lifetime of T anyway? How did that come into the picture? According to my basic understanding, the From<&'b str> part tells nothing about the lifetime of T, it only tells something about the lifetime of the parameter of that trait method so why does it suddenly care about the lifetime of T? It should only matter if T actually stores that &str but that’s not the case.


#4

Hmm… maybe it’s clearer if we get the trait out of the way:

pub fn x<'a, T>(s: &'a str, convert: fn(&'a str) -> T) -> T {
    convert(s)
}
pub fn y<'b, T>(s: &str, convert: fn(&'b str) -> T) -> T {
    convert("asdf")
}
// Equivalent to y, but with the implicit lifetime spelled out
pub fn y2<'a, 'b, T>(s: &'a str, convert: fn(&'b str) -> T) -> T {
    convert("asdf")
}
pub fn z<T>(s: &str, convert: fn(&str) -> T) -> T {
    convert(&"zxcv".to_owned())
}
// Equivalent to z, but with the implicit lifetimes spelled out
pub fn z2<'a, T>(s: &'a str, convert: for<'b> fn(&'b str) -> T) -> T {
    convert(&"zxcv".to_owned())
}

We have three functions, each taking a similar function pointer.

The function pointer in x expects an input of lifetime 'a, so the lifetime of the function pointer’s input must exactly match the outer function’s input string. Note that this is actually a little more flexible than it might seem; Rust will automatically reborrow to fix up lifetimes in common cases, so you can still pass in a string constant or something like that.

The function pointer in y expects an input with an arbitrary lifetime 'b. The only lifetime which outlives every lifetime is 'static, so this formulation is basically useless; you can just write 'static explicitly.

The function pointer in z takes an input string with any lifetime. This is sort of the obvious case: the lifetime of the function pointer’s input is independent of any lifetime associated with the outer function. for<'b> is the syntax to name this sort of lifetime.
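
As a rough illustration of what the x form is good for, the sketch below passes a converter whose result borrows from its input. The helper first_word is hypothetical, chosen only for this example. Because 'a is shared between s and convert, T can be &'a str.

```rust
// Same shape as `x` above: the function pointer's input lifetime is tied to `s`.
pub fn x<'a, T>(s: &'a str, convert: fn(&'a str) -> T) -> T {
    convert(s)
}

// Hypothetical converter whose result borrows from its input.
pub fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}
```

Passing first_word to z should not compile: there T is a single type fixed by the caller of z, so it cannot mention the per-call lifetime 'b that the returned reference would need.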


#5

Hmm, I might be too dense, or my understanding of English is seriously lacking, but I still don’t understand it fully.

I can understand the first sentence, let’s say that y expects an input with lifetime 'b. Why does this imply that it has to outlive everything?

input with an arbitrary lifetime 'b.

an input string with any lifetime

To me these two phrases are the same, except that the first one names the lifetime parameter. What’s the difference between “arbitrary” and “any”?


#6

In y, there is only one lifetime 'b, chosen by the caller of y, which is live throughout the body of y. In z, the lifetime is a parameter of convert, so we can instantiate a different lifetime each time convert is called.

If that doesn’t make sense, I’m not sure how to explain it; maybe someone else can step in. It’s one of those things that’s difficult to explain, but completely obvious once you get it.
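
One way to see "a different lifetime each time convert is called" is the sketch below. The names apply_twice and byte_len are invented for this example. The same function pointer is called once with the caller's string and once with a string that exists only inside the function body.

```rust
// `fn(&str) -> T` is shorthand for `for<'b> fn(&'b str) -> T`,
// so each call to `convert` may use its own lifetime.
pub fn apply_twice<T>(s: &str, convert: fn(&str) -> T) -> (T, T) {
    let local = format!("{}!", s); // lives only inside this function
    let first = convert(s);        // called with the caller's lifetime
    let second = convert(&local);  // called with a shorter, local lifetime
    (first, second)
}

// Hypothetical converter that works for a `&str` of any lifetime.
pub fn byte_len(s: &str) -> usize {
    s.len()
}
```

With the y-style signature (a single 'b declared on the outer function), the second call would be rejected, since local cannot live as long as a lifetime chosen by the caller.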


#7

:( It’s probably similar to monads: once someone understands what a monad is, they lose the ability to explain it. Anyway, I’ll just ignore this and keep on writing code; hopefully it will click some day.

Thanks anyway!

P.S.: If anyone has a good idea for how to explain it, feel free to try :)


#8

I’ll give it a try…

  • For each instantiation of y/y2, convert takes a function with an arbitrary concrete lifetime, i.e. convert is not generic over the lifetime of the reference ('b). 'b is determined by the caller of y, not by y itself.
  • Inside z/z2 however, convert is still a template and the concrete lifetime is determined by the argument that is passed to convert.

This is how y is called:

// This function is *not* generic over the lifetime of mystr
fn convertY(mystr: &'static str) -> usize { ... }

let size = y("asdf", convertY);

And this is how z is called:

// This function *is* generic over the lifetime of mystr
fn convertZ<'a>(mystr: &'a str) -> usize { ... }

let size = z("asdf", convertZ);

Essentially, when implementing y, you know nothing about what lifetime 'b will be in the end, and thus you have to assume it could be 'static.
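
Here is a compilable sketch of the two calls above. The function bodies are filled in with mystr.len() purely as an assumption, and the converters are renamed to snake_case (convert_y, convert_z) to avoid compiler warnings.

```rust
// `y` and `z` with the same signatures as in the earlier post.
pub fn y<'b, T>(_s: &str, convert: fn(&'b str) -> T) -> T {
    convert("asdf") // a literal is `&'static str`, valid for whatever 'b the caller picks
}

pub fn z<T>(_s: &str, convert: fn(&str) -> T) -> T {
    let local = String::from("zxcv"); // lives only inside `z`
    convert(&local)
}

// Not generic over the lifetime: accepts only `&'static str`.
pub fn convert_y(mystr: &'static str) -> usize {
    mystr.len()
}

// Generic over the lifetime: accepts a `&str` of any lifetime.
pub fn convert_z(mystr: &str) -> usize {
    mystr.len()
}
```

y accepts both converters, because a lifetime-generic function can always be used where one specific lifetime is expected. z accepts only convert_z; z("asdf", convert_y) should fail to compile, since convert_y cannot handle the short-lived local string.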

It’s a bit difficult to explain, because the difference between lifetime variables and concrete lifetimes is not obvious. The only concrete lifetime I know is 'static.

It’s probably easier to understand with types instead of lifetimes:

fn convertU32(mynumber: u32) {...}
fn convertStr(mystr: &str) {...}
fn convertString(mystring: String) {...}

pub fn impossible<T>(convert: fn(T));

Now try to implement a function impossible that can be called like this:

impossible(convertStr);
impossible(convertU32);
impossible(convertString);

This is not possible because there is no value that is a &str, a String and a u32 at the same time. The only way to make it work would be for the caller to also provide the argument to convert, which would be the equivalent of the x function.

For lifetimes it is possible, because there is one lifetime which encompasses all others, namely 'static.
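
The "caller also provides the argument" variant can be sketched as follows. The name possible, the return type R and the three converters are assumptions added for illustration; the original converters return nothing.

```rust
// The caller supplies both the value and the converter, so T is always
// a type for which a value actually exists (the analogue of `x`).
pub fn possible<T, R>(value: T, convert: fn(T) -> R) -> R {
    convert(value)
}

// Hypothetical converters, one per argument type.
pub fn double(n: u32) -> u32 {
    n * 2
}

pub fn str_len(s: &str) -> usize {
    s.len()
}

pub fn string_len(s: String) -> usize {
    s.len()
}
```

Each call picks its own T, and the function body never has to invent a value of an unknown type.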


#9

If this function is generic over the lifetime then why do we need the for<> syntax? If the for<> syntax is more generic then why do we need the less generic one? What are the advantages or disadvantages of them?

I’m totally lost. I swear if I ever get this I’ll write a table with examples and detailed explanation.

Anyway, tomorrow I’m going to read all this again a few times. :)


#10

One is a function declaration and the other is a function pointer type.

Function declaration:

fn convertZ<'a>(mystr: &'a str) -> usize { ... }

and that function (or rather a function pointer to that function) is of type

for<'a> fn(&'a str) -> usize

I think this is just a syntax thing. You need a place to declare the lifetime.

However, I don’t know why the for<...> syntax was chosen and not something like this:

fn<'a>(&'a str) -> T

probably because it would not work for “normal” (i.e. non-function) types.
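
A small sketch showing that the declaration and the pointer type line up. The body of convert_z and the helper demo are assumptions made for this example. The elided form fn(&str) -> usize and the explicit for<'a> form name the same type.

```rust
// A function declaration that is generic over the lifetime of its argument.
fn convert_z<'a>(mystr: &'a str) -> usize {
    mystr.len()
}

// Hypothetical helper that stores the function in explicitly typed pointers.
pub fn demo() -> (usize, usize) {
    // The explicit higher-ranked function pointer type.
    let explicit: for<'a> fn(&'a str) -> usize = convert_z;
    // The same type with the lifetime elided.
    let elided: fn(&str) -> usize = explicit;

    let local = String::from("hello"); // a short-lived string
    (explicit("asdf"), elided(&local))
}
```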


EDIT:

fn(usize, &str) -> String 

is just fancy syntax for something like:

FunctionPointer<String, usize, &str> 

and having this in mind, for<...> syntax actually makes sense.

You cannot just write:

pub fn y<T>(s: &str, convert: FunctionPointer<T, &'b str>) -> T {
    convert("asdf")
}

without having declared 'b.
To specify that convert is generic, you have to declare 'b like this:

pub fn y<T>(s: &str, convert: for<'b> FunctionPointer<T, &'b str>) -> T {
    convert("asdf")
}

#11

I’ll try another tack. Let’s put aside the question of why the compiler needs to calculate the lifetime of T (potential mutability? invariance?). It needs to, and, as you’ve seen, the signature <'a, T: From<&'a str>>(json: &'a str) constrains that lifetime to one that exceeds that of the function, so the code gets rejected. So far so good.

Now, let’s see what the signature <'a, 'b, T: From<&'b str>>(json: &'a str) tells the compiler: that the parameter to the trait has a lifetime that’s independent of the json parameter, but still exceeds that of the function body, so it will again get rejected. In other circumstances, that lifetime parameter setup would have resulted in an error like “cannot infer appropriate lifetime”, precisely because 'a and 'b are completely unrelated.

That leaves the for<'b> variant: it doesn’t tie the lifetime to one that exceeds the function body (as all those declared in the function signature must), so, barring other problems, the compiler can work out the solution.
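
The last two variants can be compared in a std-only sketch. The names two_lifetimes and higher_ranked are made up, and the trimmed String merely stands in for the parsed Json value of the original code.

```rust
// 'b is declared in the signature, so the caller chooses it and it outlives
// the function body. Only data valid for every possible 'b can be passed,
// which in practice means 'static data such as a string literal.
pub fn two_lifetimes<'a, 'b, T: From<&'b str>>(_json: &'a str) -> T {
    T::from("test")
}

// 'b is bound by `for<'b>`, so it is chosen at each use of `T::from`
// and may be as short as a local borrow.
pub fn higher_ranked<T: for<'b> From<&'b str>>(json: &str) -> T {
    let data = json.trim().to_owned(); // stand-in for the locally owned parsed data
    T::from(data.as_str())
}
```

Moving the local data into two_lifetimes and calling T::from(data.as_str()) there should reproduce the "does not live long enough" error from the first post.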