When do you find lifetime annotations helpful?

Dear fellow rustians, this is my first post here and I'm sorry that this is basically just another rant about lifetime annotations. But after more than a year of learning Rust, lifetime annotations simply don't stop bothering me. I simply don't get them, regardless of how hard I try. For me, lifetime annotations are the worst part of the language. If I had just one wish for the 2021 edition, it would be to get rid of them (edit: to have better lifetime elision).

Again, I've spent the past few days learning all I can about lifetimes to fight my fear of them. I understand the syntax and I understand how to solve a lifetime problem in isolation. But when the code gets more complex, I am simply afraid of anything that has to do with them. A lot of other people have asked similar questions, but none of the answers seem satisfying to me. People start talking about variance and covariance etc., but rarely say why lifetime annotations are actually useful. In every discussion I can clearly see how confusing lifetimes can be and that they are often very hard to explain to newcomers. But still, for some undefined reason, people want them to stay. Sometimes I think that I need to become a type theory expert to fully understand their usefulness.

IMHO the book does not explain lifetimes well enough and doesn't even mention anonymous lifetimes. For me, anonymous lifetimes are built around some intuition that some of you may have, but I don't. And there is no clear definition of when it is possible to use them. The static lifetime can also be used in a lot more situations than the book mentions.

Writing lifetimes often feels like doing my computer's job for it, when all I want is for the compiler to warn me when a value doesn't live long enough. The first time I heard how Rust manages memory, I thought, wow, this is great, and I simply assumed that Rust infers these things for me. But it doesn't. And I saw that this was the case in the earlier days? Why was it changed? Lifetimes make me afraid of using references at all, which in my opinion leads to less performant code. Lifetime-heavy code becomes very noisy and unreadable. Lifetimes also have a lot of subtle differences depending on where you put them, and that makes the code hard to follow.

When I'm writing a function that needs some lifetime annotations, I often feel like I need to know where it will be used, but why should I? Maybe I want my function to be as reusable as possible and can't assume which lifetime is longer or shorter. Why do we need to type the lifetimes out when the compiler could analyze all possible variants and warn me when it finds a violation? This is what the computer is for! Or is this exactly the problem that would make compile times much longer?

And there are a lot of misconceptions around lifetimes. This article ("Common Rust Lifetime Misconceptions" in the pretzelhammer/rust-blog repository on GitHub) made me realize how confusing and counterintuitive lifetimes are, and that I'm not the only one who falls for all of those misconceptions.

In my eyes there are so many disadvantages to lifetime annotations that there must be one or more >>huge<< advantages that justify their existence. So please help me to get this once and for all. What is so helpful about lifetime annotations?

Edit:

I've changed the title from "Why do you like lifetime annotations?" to "When do you find lifetime annotations helpful?", because it caused some confusion. I was specifically interested in the situations where lifetime annotations are most beneficial, not in the likability of the concept itself.

I'm sorry about this rant. Lifetime annotations can be tricky, but now I do understand that they fulfill an important role.


As a first approximation, a lifetime annotation is the name of some stack frame. Any variable or data structure that carries that name around with it will be destroyed before the locals in that stack frame are. This lets us store and work with references to local variables in an extremely flexible way, with the compiler making sure we don’t make a mistake and end up with a security vulnerability.
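To make that approximation concrete, here is a minimal sketch. The names `first_word` and `caller` are made up for illustration and do not come from any library. The lifetime `'a` stands for "some frame that owns the text", and the returned reference carries that name with it.

```rust
// Hypothetical helper: the result borrows from whatever `text` borrows from.
fn first_word<'a>(text: &'a str) -> &'a str {
    text.split_whitespace().next().unwrap_or("")
}

// Hypothetical caller: `sentence` is a local of this stack frame.
fn caller() -> usize {
    let sentence = String::from("hello world");
    // `word` carries the lifetime of the borrow of `sentence`,
    // so it must be gone before `sentence` is dropped.
    let word = first_word(&sentence);
    word.len()
    // Returning `word` itself from `caller` would be rejected by the
    // compiler, because it would outlive the frame that owns `sentence`.
}
```

Calling `caller()` only hands back the length, which is an owned value, so nothing that borrows from the frame escapes it.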


For a rough comparison, lifetime annotations are a lot like type annotations.

Why lifetimes and types exist

Lifetimes and types allow a compiler to do static analysis of your code, after which you can avoid run-time checks without compromising safety. With types, those run-time checks are what you get in dynamic languages: mismatching types are detected and reported at run time. With lifetimes, it is more about memory management. Languages that don’t have them, while still being memory safe, usually lean heavily on some form of garbage collection. So in effect, both lifetimes and types can offer good performance benefits.

Why lifetimes and types are explicitly written

Now the above was about lifetimes and types themselves. What about lifetime annotations and type annotations? The first reason why you have those: compilers are just not smart enough, or there might even be a fundamental reason why not all types and lifetimes can be inferred automatically. The details of course differ from type system to type system. The second reason is: you want to specify your interface using those annotations.

Benefits of specifying your interfaces with explicit types and lifetimes

I suppose this interface aspect is the most prevalent reason why Rust has and even forces type annotations as well as lifetime annotations. When a library provides its API, it is useful that it explicitly defines the interface of its datatypes and functions/methods in terms of type signatures. First up, this serves as great documentation. Of course, you still want to have comments as well to keep the docs human friendly. Comments on their own, however, tend to be vague or inaccurate and can thus be confusing. As far as I know, in many untyped languages good doc comments often also describe the type signature in one way or another. Secondly, explicit annotations make it easy for the library author to retain stability. If you change your function in a way that changes its signature, you get an error, because the previous types are still there in your annotations.

Now all of this applies to lifetime annotations, too. If you learn to read them, they serve as great documentation of how ownership works in the API you’re trying to understand, and if they were implicit it would be even easier than it is with types to mess things up for library authors and change behavior in a breaking way (since lifetimes indeed are less intuitive than ordinary types).

As a final thing that both type and lifetime annotations offer: you can restrict your API further than necessary. An inferred type signature would always be the most general signature possible, which could be more general than what you actually want to offer. In particular with the interaction of safe and unsafe code in Rust, you sometimes need to specify lifetimes explicitly because the compiler does not know the guarantees that you are willing to give about the unsafe code that you wrote. Another relevant scenario is when you’re still developing and only have stub implementations for some functions/methods. The value in being able to type-check (and borrow-check) your program before you’re done writing it is pretty high in my opinion.
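Both points can be sketched in a few lines. The functions `pick`, `longest_line` and `caller_compiles` are hypothetical names invented for this illustration. `pick` promises less than its current body would allow, and `longest_line` is a stub whose callers can already be borrow-checked against the signature alone.

```rust
// The body only ever returns `primary`, so the most general signature would
// tie the result to `primary` alone. Tying it to both inputs keeps the door
// open to returning `fallback` in a later version without breaking callers.
fn pick<'a>(primary: &'a str, fallback: &'a str) -> &'a str {
    let _ = fallback; // not used yet
    primary
}

// A stub: there is no body to infer anything from, but the signature already
// says that the result borrows from `text`.
fn longest_line<'a>(text: &'a str) -> &'a str {
    let _ = text;
    todo!("not written yet")
}

// This caller type-checks and borrow-checks against the stub's signature,
// even though calling it would panic until the stub is filled in.
fn caller_compiles(text: &str) -> usize {
    longest_line(text).len()
}
```

With inferred lifetimes, neither the deliberately narrow promise of `pick` nor the signature of the unfinished `longest_line` could be expressed.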

Specifying types / lifetimes in a lot of places also makes errors more local. Global / top-level inference tends to generate out-of-place or misleading error messages. When you can see everywhere in your code how the lifetimes work, you can actually fix errors more easily. Imagine what it would be like if the compiler complained about lifetimes that didn’t come from you but that it inferred itself. I would assume those kinds of errors à la “I made up these lifetimes for you and they don’t work” would be orders of magnitude harder to understand and fix than what we currently have.


Why do you like lifetime annotations?

And why do we like function names too, for example? It's so annoying that we have to specify which function to call, while it's just common sense, the compiler really shouldn't force us to tell which function to call!

The point is: we might not like them but they are sometimes necessary.


@krevativ it's not clear to me whether you are actually talking about lifetimes in general or only about lifetime annotations.


For curiosity's sake, what languages were you experienced with before you started learning Rust?

References are a sharp tool and there are roughly three different approaches to sharp tools.

  1. Don't give programmers sharp tools. They may make mistakes and cut their fingers off. This is the Java/Python/Perl/Ruby/PHP... approach.
  2. Give programmers all the sharp tools they want. They are professionals and if they cut their fingers off it's their own fault. This is the C/C++ approach.
  3. Give programmers sharp tools, but put guards on them so they can't accidentally cut their fingers off. This is Rust's approach.

Lifetime annotations are a safety guard on references. Rust's references have no synchronization and no reference counting -- that's what makes them sharp. References in category-1 languages (which typically do have synchronization and reference counting) are "blunted": they're not really quite as effective as category-2 and -3 references, but they don't cut you, and they still work; they might just slow you down a bit.

So, frankly, I like lifetime annotations because they prevent me from cutting my fingers off.

"This would be a great circular saw if only it didn't have this safety guard" is not something you will hear a whole lot of carpenters say.

What does "get rid of them" mean exactly? Do you have an alternative in mind that will also prevent chopped fingers, or do you mean you wish Rust were in category 1, or do you wish that it were in category 2?

I mean, that (inferring ownership and lifetimes based on usage) is just what (most) category-1 languages do already, right? There's nothing really novel about that. Most JVM languages do it pretty well, I think. If you're looking for a statically-typed safe language without lifetime annotations, have you considered maybe Kotlin or Scala?


The core proposition behind Rust is that it should be possible to verify memory safety at compile time, even without a garbage collector. To do that while still allowing programs that do moderately complicated things with memory, you're going to need something in your language that is at least as complicated as lifetimes.

One of the core design principles behind the rules of the language is that verification must be possible using local reasoning. This means that when compiling a function, you should never have to look inside the body of any other function; the function signature should be enough. This has a very nice property: changing the implementation of a function can never break code that uses it, as long as the signature is left unchanged. This is absolutely essential for making libraries with backwards compatibility possible at all, and I generally find it comforting in my own code too.

This poses a challenge, because if lifetimes did not exist, then the things I described above would not be possible without significantly limiting what can be written. Consider these two functions:

// substr_until("foobar", "bar") = "foo"
fn substr_until<'a>(haystack: &'a str, needle: &str) -> &'a str {
    let idx = haystack.find(needle).unwrap_or(haystack.len());
    &haystack[..idx]
}

// get_shortest("foobar", "bar") = "bar"
fn get_shortest<'a>(str_a: &'a str, str_b: &'a str) -> &'a str {
    if str_a.len() < str_b.len() {
        str_a
    } else {
        str_b
    }
}

Besides the lifetime annotations, these two functions have the exact same signature. That means that without lifetimes, any code that compiled with one of them should still compile if changed to use the other function (or equivalently, if the body of one is replaced with the body of the other).

Consider this function:

// substr_until_char("foobar", 'b') = "foo"
fn substr_until_char(haystack: &str, needle: char) -> &str {
    let needle_str = needle.to_string();
    substr_until(haystack, &needle_str)
}

This compiles just fine. However, consider what happens if the call is replaced with get_shortest:

fn substr_until_char(haystack: &str, needle: char) -> &str {
    let needle_str = needle.to_string();
    get_shortest(haystack, &needle_str)
}

error[E0515]: cannot return value referencing local variable `needle_str`
  --> src/lib.rs:18:5
   |
18 |     get_shortest(haystack, &needle_str)
   |     ^^^^^^^^^^^^^^^^^^^^^^^-----------^
   |     |                      |
   |     |                      `needle_str` is borrowed here
   |     returns a value referencing data owned by the current function

The needle_str variable is destroyed when it goes out of scope inside substr_until_char just before returning, so if the returned reference pointed into that String, we would return an invalid reference, and therefore the code does not compile.

Sure, you could have a language that didn't need these lifetimes, but this has consequences. One option would be to have a simpler system, but then substr_until_char would likely be impossible to write. Another option is to remove the need for lifetimes and look inside the body of functions, but then:

  1. You can no longer write backwards compatible libraries.
  2. Compilation time would drastically worsen, as you can no longer type-check functions independently (and in parallel!), as you have to look at every function at the same time, including the functions in every dependency.

You could also use a GC, but then it would no longer be Rust.

In short: Lifetimes allow the compiler to know what is inside other functions without looking inside those other functions.


This is the kind of thinking where the "heap-tracing garbage collectors are great!" arguments come from. You may not want to care about how a borrow will be used, but something needs to care. From the borrow checker's point of view, accessing borrowed memory that is invalid is not memory safe (and therefore a compile error). And this makes sense on the surface, but what you seem to be missing is that the compiler does not do non-local evaluation of borrow lifetimes. (If it did, the build times would be astronomical, perhaps even running afoul of the halting problem.) Because lifetime checking is only done local to the function or type, lifetime annotations are needed to tie together the external linkage with that type.

I don't particularly like lifetime annotations. They are crucial, however; a necessary evil. I suspect there will never be a world where Rust does not have some kind of explicit lifetime annotation. This doesn't mean that the current situation will not improve. In fact, I believe that there are several small paper cuts that can be patched up to provide a much better developer experience. But I do not think the solution is "get rid of them."

Take for instance the cases where lifetime elision is not possible; a function accepting two borrows and returning one. The syntax needs to be clear about which of the two inputs relates to the output. This is inescapable. One could argue that the current lifetime annotation syntax is ugly, yes, but it gets the idea across and it works in practice. Making improvements to the syntax might be difficult (due to the existing corpus of code) but also may be worth additional discussion.

Similarly, the case where a borrow is being stored within a data structure, the developer needs some mechanism of bounding what can actually be borrowed. And again, it might be a case where clearer syntax could help... I don't really know.
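As a rough sketch of a borrow stored in a data structure, consider a tiny whitespace tokenizer. `Parser` and its methods are hypothetical names for this illustration, not an existing API. The parameter `'src` bounds what the struct may borrow, and it also lets `next_token` say that tokens borrow from the source text rather than from the parser itself.

```rust
// Hypothetical tokenizer holding a borrow of the text it reads.
struct Parser<'src> {
    input: &'src str,
    pos: usize,
}

impl<'src> Parser<'src> {
    fn new(input: &'src str) -> Self {
        Parser { input, pos: 0 }
    }

    // The token is tied to 'src, not to the `&mut self` borrow,
    // so it may outlive the parser.
    fn next_token(&mut self) -> Option<&'src str> {
        let input = self.input; // copy the shared reference out of `self`
        let rest = input[self.pos..].trim_start();
        if rest.is_empty() {
            return None;
        }
        let start = input.len() - rest.len();
        let len = rest.find(char::is_whitespace).unwrap_or(rest.len());
        self.pos = start + len;
        Some(&input[start..start + len])
    }
}
```

Without the annotation on the return type, elision would tie each token to the `&mut self` borrow, and a second call to `next_token` would be rejected while the first token is still in use.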

My point here is that the syntax could potentially use some love. It isn't always clear what the annotation is conveying, even though an experienced Rustacean will be able to work through it based on experience alone (even if I can't always describe to others why it works, for example).

I think the more complex uses of lifetime bounds in generics and traits are where things get really tricky; HRTB and so on. I have no other opinions here.

It is clear that regardless of whether people "want lifetime annotations to stay", they are absolutely required for writing correct code where borrows are involved. They may not always exist in the state they do now, but lifetimes will not be going away.


Dear rustians, first, thank you for all your answers. You are awesome! I’m actually a bit embarrassed that I wrote this rant. I was a bit emotional and frustrated because I can’t grasp some concepts, like anonymous lifetimes for example. I do understand and appreciate the general concept of lifetimes and lifetime annotations, and I don't want to replace them with a GC. I never wanted to suggest getting rid of all of that. I only wished Rust would be a bit smarter about lifetimes sometimes, because I'm losing myself in all the options that I have, and investing energy in something that seems so easy for a computer is a bit annoying. But it only seems easy because I am used to GC-based languages.

And I’m sorry for all the bad wording like "get rid of …" and "why do you like …". I think this led to a lot of confusion. I hope my rant will not scare anybody off, and I will edit my post accordingly. My question should have been "When do you find explicit lifetime annotations helpful?"

@steffahn you made some good points. There is only one thing I don’t agree with: I'm not sure that compiler-generated lifetimes would be more misleading than handwritten ones. Rust already generates lifetimes for simple cases, and the result is understandable.

@troplin I’m talking explicitly about lifetime annotations ('a, '_, ...). Using lifetimes for memory management in general is a great idea.

@trentj My main background is JavaScript/TypeScript, but I also have some experience with Java, Python, and Ruby. So all of those without the sharp tools :) Plus a bit of C/C++ and Objective-C from university, where I did my master's, but it's really not much. I hated being left alone in C/C++ because I know that I am only human. That is why I actually like Rust very much. (My colleagues at work have to hear that every day :D)

@Alice thank you, that was very helpful.

@parasyte You’ve made exactly the point that I was missing:

"..., but what you seem to be missing is that the compiler does not do non-local evaluation of borrow lifetimes"

Because what I’ve imagined was something like this:

fn sim(o1: &str, o2: &str) -> &str {
    let r: u8 = rand::random();
    if r < 50 {
        o1
    } else {
        o2
    }
}

fn main() {
    // case 1
    let o1 = String::from("o1");
    let o2 = String::from("o2");
    let res = sim(&o1, &o2);

    // case 2
    let o1 = String::from("o1");
    {
        let o2 = String::from("o2");
        let res = sim(&o1, &o2);
    }
}

And that Rust automatically generates the lifetime annotations for case 1 and case 2, and forces me to write them out only if I want to enforce some specific case. But for this to happen, a non-local evaluation of borrow lifetimes would be necessary. You are totally right!
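For comparison, here is a sketch of how both cases go through once `sim` carries one explicit lifetime. To keep the example self-contained, the random choice is replaced by a `pick_first` flag, which is an assumption of this sketch and not part of the code above, and `both_cases` is a made-up wrapper.

```rust
// One annotation covers both call sites: 'a is chosen per call as a region
// in which both borrows are valid.
fn sim<'a>(o1: &'a str, o2: &'a str, pick_first: bool) -> &'a str {
    if pick_first {
        o1
    } else {
        o2
    }
}

fn both_cases() -> (String, String) {
    // case 1: both strings live equally long
    let o1 = String::from("o1");
    let o2 = String::from("o2");
    let res1 = sim(&o1, &o2, true).to_owned();

    // case 2: the inner `o2` lives shorter than `o1`, so 'a shrinks to the
    // inner scope and `res` may not leave it
    let res2 = {
        let o2 = String::from("o2");
        let res = sim(&o1, &o2, false);
        res.to_owned()
    };

    (res1, res2)
}
```

The signature is written once, and the compiler still picks a suitable `'a` separately for each call, using only the signature and the calling code.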

So thanks again to all of you. I'm now looking at lifetime annotations with different eyes.


I won't attempt to describe my understanding of lifetime annotations. That would likely just add to the confusion.

I do agree that all those tick marks and angle brackets can make code look like random "line noise" and make it very hard to read, never mind write yourself. That kind of syntactic noise stalls my mind.

Perhaps amazingly after tinkering with Rust for almost a year I don't recall ever actually having to write any lifetime annotations into my own code. Maybe I'm not trying to write anything with complex enough data structures to require them. But that code gets the job done and has been running in production for many months without issue.

The most significant thing I ever read about lifetimes went something like this:

"Lifetime annotations do not specify the lifetime of any data in your program; your program's structure does that. Lifetime annotations are only there to help the compiler understand your intentions."

With that in mind we see that lifetime annotations are like the need for specifying the types of function parameters. Surely if I always call a function with an i64 the compiler should guess the parameter type and save me having to write it in the function declaration?

Well, likely it could, if it could read every call ever made, anywhere, to that function at the time it compiles it. Clearly not reasonable. Not even possible in the case of a library function where the code that will use it has not even been written yet!

Anyway, the best explanation and demonstration of the use of lifetime annotations you are ever likely to see is the YouTube video "Crust of Rust: Lifetime Annotations" by Jon Gjengset. Well worth watching.


I think this is something people sometimes overlook:

It's very possible to write idiomatic, safe, sound, and fast Rust code without ever using a lifetime annotation explicitly. You might end up cloning some things that you don't need to, but cloning is usually cheap, especially if you use Arc to share data without the bother of explicit lifetimes. It's easy to get carried away with putting references everywhere because you can, but if specifying the lifetimes is burdensome, don't lose sight of the fact that you don't have to.
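A small sketch of that style, assuming a made-up `Config` type and `spawn_workers` function: the shared data lives in an `Arc`, every worker thread gets its own cheap clone of the pointer, and no lifetime annotation appears anywhere.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical shared data.
struct Config {
    name: String,
}

// Each thread owns an `Arc<Config>`, so nothing borrows from the caller.
fn spawn_workers(config: Arc<Config>, count: usize) -> Vec<String> {
    let handles: Vec<_> = (0..count)
        .map(|i| {
            let cfg = Arc::clone(&config); // bumps a reference count, copies no data
            thread::spawn(move || format!("{}-{}", cfg.name, i))
        })
        .collect();

    // Join in spawn order so the results come back in a predictable order.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker panicked"))
        .collect()
}
```

The price is an atomic reference count per clone, which is often negligible next to the work each thread does.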


It's funny you should mention that, since I dabble in programming language design as a hobby, and I've spent years trying to figure out if it's possible to combine a Rust-like lifetime system with full type inference. Unfortunately, I suspect that inference is undecidable in the general case, but I think you could make a decent system with relatively minimal annotations.

That being said, be careful what you wish for. Even if it were technically possible, I'm not convinced that it would necessarily even be desirable. Rust is already a bit too magical for my tastes. Most of the time, the magic does what you want, and it's great and convenient, but when it does go against you, it makes for very frustrating errors.


If it were Polonius, how would you solve this lifetime inference?

@serak so there is an alternative borrow checker called Polonius? That's interesting. Will the current borrow checker ever be replaced by it, or is it just an experiment? I can't really answer your question because I think it's beyond my level of expertise, and I don't think that I fully understand Polonius either. But I can try: my naive solution would be to copy my own thought process. I wouldn't do any local evaluation of a function, because I would assume that the lifetime of a parameter is only known when the function is used. Instead, I would derive the lifetimes from the usage of that function and check whether the returned value outlives its uses (I hope that is understandable). Polonius seems to do something similar, judging by the blog post. But as some people here have already suggested, the downside of this approach would be worse compile-time performance.

Polonius will most likely end up in the compiler eventually in some form, but the exact details are not pinned down yet. Note that Polonius does not do any form of non-local reasoning that requires it to look at multiple functions at once, just like the current borrow checker.


I thought of yet another reason that lifetime annotations cannot be inferred globally: unsafe code.

Global lifetime inference depends on being able to analyze code to decide how it uses lifetimes. Code in an unsafe block, however, cannot be fully and correctly analyzed by the compiler: if it could, it wouldn't need to be marked unsafe. So the compiler can't look inside an unsafe block to determine how it uses lifetimes.

For example, consider Cell::get_mut. Cell is built on UnsafeCell and get_mut just uses its raw pointer API. But raw pointers have no lifetimes, so the compiler has no way to know that get_mut's output lifetime is derived from its input lifetime. In other words, instead of this signature:

fn get_mut<'a>(&'a mut self) -> &'a mut T    // unelided

the compiler might as well infer this one:

fn get_mut<'a, 'b>(&'a mut self) -> &'b mut T

Which would have disastrous results, since the soundness of the API depends on the lifetimes being connected.

There's no way to fix this by making the compiler smarter because the correct lifetime signature of get_mut is not derived from the code itself; it's derived from the intent of the code. And since (as far as the compiler knows) any function with an unsafe block in it could be intending to uphold some lifetime relationship, all functions that contain unsafe code must either have explicit lifetime annotations on them, or be transitively marked unsafe themselves.

And that, in turn, means that you can't add an unsafe block even inside a function marked pub without breaking backwards compatibility. Because if you add an unsafe block, now your function has to have explicit lifetime parameters where it didn't before, and those lifetime parameters can't necessarily express all the flexibility that you had with global inference, because they are constrained to only local reasoning.

This is really just a special case of "what if you want to change the behavior backwards-compatibly?" but with the added complication that even if you don't change the behavior of a function with respect to its lifetimes, the presence of an unsafe block makes that opaque to the compiler.
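The same situation can be reproduced with a small made-up type. `Slot` below is a hypothetical wrapper invented for this sketch, not the real implementation of `Cell`. Its `get_mut` goes through a raw pointer, so the link between the input and output lifetimes exists only because the signature states it.

```rust
// Hypothetical owner of one heap-allocated value, stored as a raw pointer.
struct Slot<T> {
    ptr: *mut T,
}

impl<T> Slot<T> {
    fn new(value: T) -> Self {
        Slot { ptr: Box::into_raw(Box::new(value)) }
    }

    // Nothing in the body connects the result to `self`: a raw pointer has
    // no lifetime. The connection is a promise made by the signature.
    fn get_mut<'a>(&'a mut self) -> &'a mut T {
        // SAFETY: `ptr` came from `Box::into_raw` in `new` and is freed only
        // in `drop`; `&'a mut self` guarantees exclusive access for 'a.
        unsafe { &mut *self.ptr }
    }
}

impl<T> Drop for Slot<T> {
    fn drop(&mut self) {
        // SAFETY: `ptr` was produced by `Box::into_raw` and is freed exactly once.
        unsafe { drop(Box::from_raw(self.ptr)) }
    }
}
```

With the unconnected `<'a, 'b>` form of the signature, the same body would still compile, but callers could hold two live mutable references to the same value or keep one past the drop of the `Slot`.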
