Why does the borrow checker let me do this?

I recently ran into a situation where after tearing my hair out for some time saying "Why won't the borrow checker let me do this!?" I found a solution — but I'm now wondering "Why does the borrow checker let me do this?"

Here's a simplified version: If text is a &str reference to the contents of a text file, we want to update a HashSet by inserting the lines of the file — but sometimes we want a HashSet<&'a str> of string slices borrowed from text and sometimes a HashSet<String> of heap-allocated copies of those slices. I defined a trait to allow implementation sharing. (In the larger problem we also want operations other than insertion.)

Two things puzzle me about the solution below. First, I'm not sure why it's OK to say

impl<'a> LineSet<'a> for VecSet { ... }

I'd been under the assumption that the 'a in LineSet<'a> referred to the lifetime of the LineSet. Is it simply a lifetime parameter that can be used fairly arbitrarily in the trait and impl definitions?

Second, I wondered why it's OK to say:

impl<'a> LineSet<'a> for SliceSet<'a> {
    fn update_with(&mut self, line: &'a str) {
        self.insert(line);
    }
}

Here the lifetime of line could be longer than the lifetime of self, rather than identical to it. I'd expected to need this:

fn update_with<'b: 'a>(&mut self, line: &'b str)

This second version, explicitly specifying the larger lifetime, also works. Does the compiler desugar the first version into the second? What are the rules about such desugaring (if that's in fact what's happening)?

[Edit: I'd mistakenly used the playground version with <'b: 'a>, but meant to use the one without.]
(Playground)

use std::collections::HashSet;

trait LineSet<'a> {
    fn update_with(&mut self, line: &'a str);
    fn update_with_all(&mut self, text: &'a str) {
        for line in text.lines() {
            self.update_with(line);
        }
    }
}

type SliceSet<'a> = HashSet<&'a str>;
impl<'a> LineSet<'a> for SliceSet<'a> {
    fn update_with(&mut self, line: &'a str) {
        self.insert(line);
    }
}

type VecSet = HashSet<String>;
impl<'a> LineSet<'a> for VecSet {
    fn update_with(&mut self, line: &'a str) {
        self.insert(String::from(line));
    }
}

fn contents() -> String { String::from("abc\ndef\n") }

fn main() {
    let saved = contents();
    let mut slice_set = SliceSet::default();
    slice_set.update_with_all(saved.as_str());         // `saved` outlives `slice_set`
    // slice_set.update_with_all(contents().as_str()); // `contents().as_str()` does not
    println!("{:?}", slice_set);

    let mut vec_set = VecSet::default();
    vec_set.update_with_all(contents().as_str());
    println!("{:?}", vec_set);
}

I'll try and address your questions.

Is it simply a lifetime parameter that can be used fairly arbitrarily in the trait and impl definitions?

Here, impl<'a> defines a lifetime to be used in the block. In this, you're literally saying, "for some lifetime 'a implement LineSet (belonging to that lifetime 'a) for VecSet".
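To make that concrete, here is a minimal sketch. The trait (trimmed to one method) and the VecSet alias are taken from the question; collect_two is a hypothetical helper I added for illustration. Since VecSet mentions no lifetime itself, the impl covers every possible 'a, so the same set can be fed from unrelated, short-lived borrows:

```rust
use std::collections::HashSet;

trait LineSet<'a> {
    fn update_with(&mut self, line: &'a str);
}

type VecSet = HashSet<String>;

// `'a` is a parameter of the impl, not a property of `VecSet`:
// this implements `LineSet<'a>` for `VecSet` for every `'a`.
impl<'a> LineSet<'a> for VecSet {
    fn update_with(&mut self, line: &'a str) {
        self.insert(String::from(line));
    }
}

// Hypothetical helper: feeds the set from two strings that are
// dropped long before the set itself.
fn collect_two() -> VecSet {
    let mut set = VecSet::default();
    {
        let first = String::from("abc");
        set.update_with(first.as_str()); // `'a` is a short borrow of `first`
    } // `first` is dropped here; the set holds its own copy
    {
        let second = String::from("def");
        set.update_with(second.as_str()); // a different, unrelated `'a`
    }
    set
}

fn main() {
    let set = collect_two();
    println!("{}", set.len());
}
```

Nothing here ties 'a to how long the VecSet lives; it only describes the line being passed in.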

Here the lifetime of line could be longer than the lifetime of self, rather than identical to it.

You're right that line could live longer than self, and that's totally fine. Imagine calling the update_with method with a 'static string. The compiler tries to guess the lifetime (based on elision rules), and in your case, since line lives at least as long as self, we're good.

This second version, explicitly specifying the larger lifetime, also works.

Same as above. line (of lifetime 'b) lives at least as long as self (of lifetime 'a).

Does the compiler desugar the first version into the second?

I don't think so, because the lifetimes of line are different.

Now, coming to code:

slice_set.update_with_all(saved.as_str());

Here, saved lives until the end of the main function, and so does slice_set. So the set never outlives the string it borrows from.

slice_set.update_with_all(contents().as_str());

In this case, contents() returns an owned String, and you're immediately taking a reference to it. This means the owned string will be dropped / destroyed after executing update_with_all (and the references in the SliceSet would then be dangling), which cannot be allowed, because the data a SliceSet's contents borrow from must live at least as long as the set itself.

This doesn't matter for VecSet because it only needs the string slice to live as long as update_with_all, which it does.


The technical detail is Variance, but don't expect reading about it to help much in explaining how it applies in practice.

Avoid saying "lifetime of"; it leads to confusion. This is saying LineSet is bound by the lifetime: there is a borrow that constrains how much the structure can move about.
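A rough sketch of how variance shows up in the question's case. The free function add_line and the helper mixed_len are hypothetical names, standing in for update_with on SliceSet. A &'static str is accepted where a &'a str is expected, because a shared reference with a longer lifetime can be used as one with a shorter lifetime:

```rust
use std::collections::HashSet;

// Stand-in for `update_with` on `HashSet<&'a str>`.
fn add_line<'a>(set: &mut HashSet<&'a str>, line: &'a str) {
    set.insert(line);
}

// Hypothetical helper: mixes slices of a local String with a literal.
fn mixed_len() -> usize {
    let saved = String::from("abc\ndef\n");
    let mut set: HashSet<&str> = HashSet::new();
    for line in saved.lines() {
        add_line(&mut set, line); // `'a` is a borrow of `saved`
    }
    let forever: &'static str = "ghi";
    // No `'b: 'a` parameter is needed: `&'static str` is shortened
    // to `&'a str` at the call site.
    add_line(&mut set, forever);
    set.len()
}

fn main() {
    println!("{}", mixed_len());
}
```

So the version without <'b: 'a> already accepts longer-lived lines; the shortening happens at the call site rather than in the signature.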


Indeed, I feel like this terminology leads to a lot of problems.

A lifetime is more like a set of read/write locks. It describes a set of objects which must not be touched in order for the data in some other object (the one with the lifetime) to remain valid.

  • These locks are created whenever something is borrowed immutably or mutably in a function body.
  • T: 'a means that, for as long as T exists, all of the locks in 'a will be held.
  • 'b: 'a means that every read/write lock held by 'a is also held by 'b.

This much is easy to explain, but it is perhaps hard to perceive.


(Edit from me 11 days in the future: The second bullet is only half true, and the third bullet is completely backwards! I'm currently writing a blog post that will beat this topic to death.)
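For reference, a small sketch of how the 'b: 'a bound behaves in code (shorten and pick are hypothetical names). The bound reads "'b outlives 'a", so a &'b str can be handed out as a &'a str, but not the other way around:

```rust
// `'b: 'a` means `'b` outlives `'a`, so the longer-lived reference
// may be returned at the shorter lifetime.
fn shorten<'a, 'b: 'a>(long: &'b str) -> &'a str {
    long
}

// Returns one of the two inputs at the shorter lifetime `'a`.
fn pick<'a, 'b: 'a>(short: &'a str, long: &'b str, use_long: bool) -> &'a str {
    if use_long { shorten(long) } else { short }
}

fn main() {
    let long_lived = String::from("outer");
    let result_len;
    {
        let short_lived = String::from("in");
        // The result is only usable while `short_lived` is alive.
        let r = pick(&short_lived, &long_lived, true);
        result_len = r.len();
    }
    println!("{}", result_len);
}
```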


Not in this case. If you replace &mut self with &'a mut self, the code won't compile. It's because Self is HashSet<&'a str> that line can be stored inside it. The elided lifetime is just the duration of the mutable borrow of self, which can be shorter than 'a.
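Here is a sketch of that point using a free function instead of the trait method (insert_line and collect_lines are hypothetical names). The elided lifetime on &mut self corresponds to 's below, which is independent of 'a:

```rust
use std::collections::HashSet;

// Spelled-out form of the elided signature: `'s` is only the duration
// of the mutable borrow of the set, and can be much shorter than `'a`.
fn insert_line<'s, 'a>(set: &'s mut HashSet<&'a str>, line: &'a str) {
    set.insert(line);
}

// Hypothetical helper mirroring `update_with_all`.
fn collect_lines(text: &str) -> HashSet<&str> {
    let mut set = HashSet::new();
    for line in text.lines() {
        // Each call borrows `set` mutably only for the call itself,
        // so the next iteration can borrow it again.
        insert_line(&mut set, line);
    }
    set
}

fn main() {
    let saved = String::from("abc\ndef\n");
    let set = collect_lines(&saved);
    println!("{}", set.len());
}
```

If the first parameter were written as &'a mut HashSet<&'a str>, the mutable borrow would have to last for all of 'a, and I'd expect the loop to be rejected on its second iteration, which matches the failure described above.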


Oh right! Sorry, yes. That was wrong.

Ah! That cleared up a lot of my confusion. In code like this:

let saved = contents();
let mut slice_set = SliceSet::default();
slice_set.update_with_all(saved.as_str());
println!("{:?}", slice_set);

the slices stored inside slice_set reference pieces of saved, and thus can be used as long as saved exists.

Thanks to everybody else too! @jonh's link to Variance was particularly helpful. I'd skimmed it before, but this time I had the motivation to study it.
