Lifetime Mismatches in Rust

Hello,

I am a Rust newbie, trying to learn how lifetimes work in Rust. I assume that the example below does not compile because the lifetime of the &mut self borrow does not match the lifetime parameter 'a.

struct A(String);
struct B<'a>(A, &'a String);

impl<'a> B<'a> {
    fn change(&mut self) {
        self.1 = &self.0.0;
    }
}

fn main() {
    let a = A(String::new());
    let c = String::new();
    let mut b = B(a, &c);
    b.change();
}
   Compiling playground v0.0.1 (/playground)
error[E0495]: cannot infer an appropriate lifetime for borrow expression due to conflicting requirements
 --> src/main.rs:6:18
  |
6 |         self.1 = &self.0.0;
  |                  ^^^^^^^^^
  |
note: first, the lifetime cannot outlive the anonymous lifetime defined here...
 --> src/main.rs:5:15
  |
5 |     fn change(&mut self) {
  |               ^^^^^^^^^
note: ...so that reference does not outlive borrowed content
 --> src/main.rs:6:18
  |
6 |         self.1 = &self.0.0;
  |                  ^^^^^^^^^
note: but, the lifetime must be valid for the lifetime `'a` as defined here...
 --> src/main.rs:4:6
  |
4 | impl<'a> B<'a> {
  |      ^^
note: ...so that reference does not outlive borrowed content
 --> src/main.rs:6:18
  |
6 |         self.1 = &self.0.0;
  |                  ^^^^^^^^^

For more information about this error, try `rustc --explain E0495`.
error: could not compile `playground` due to previous error

So, if I set the lifetime of the &mut self borrow to 'a, the example compiles:

struct A(String);
struct B<'a>(A, &'a String);

impl<'a> B<'a> {
    fn change(&'a mut self) {
        self.1 = &self.0.0;
    }
}

fn main() {
    let a = A(String::new());
    let c = String::new();
    let mut b = B(a, &c);
    b.change();
}

However, if I try to print b.1, I get an error:

struct A(String);
struct B<'a>(A, &'a String);

impl<'a> B<'a> {
    fn change(&'a mut self) {
        self.1 = &self.0.0;
    }
}

fn main() {
    let a = A(String::new());
    let c = String::new();
    let mut b = B(a, &c);
    b.change();
    println!("{}", b.1);
}
   Compiling playground v0.0.1 (/playground)
error[E0502]: cannot borrow `b.1` as immutable because it is also borrowed as mutable
  --> src/main.rs:15:20
   |
14 |     b.change();
   |     ---------- mutable borrow occurs here
15 |     println!("{}", b.1);
   |                    ^^^
   |                    |
   |                    immutable borrow occurs here
   |                    mutable borrow later used here
   |
   = note: this error originates in the macro `$crate::format_args_nl` (in Nightly builds, run with -Z macro-backtrace for more info)

For more information about this error, try `rustc --explain E0502`.
error: could not compile `playground` due to previous error

What is the lifetime of the b mutable borrow here? I thought that because of NLL and reborrowing, the lifetime of the b mutable borrow is restricted to that line (b.change()).

I would greatly appreciate any response/clarity. Thank you for your time, support, and help.

The whole pattern is problematic; no choice of lifetime annotations will help.

You’re trying to put a reference to the String in one field of a struct into another field of the same struct. This is called a “self-referential” datatype, and Rust doesn’t support those (at least out-of-the-box, without somewhat more complicated-to-use helper crates).

One of the main reasons (but not the only reason) why self-referential datatypes like this one are problematic is moving. In Rust, every value can be moved around (e.g. by passing it to a function, assigning it to a variable, etc.; there are lots of move operations). As implied by this terminology, moving a value can change its location in memory. This means in particular that if a value of type B were moved, the contained String would move with it, and the &String reference (references in Rust are implemented using pointers) would no longer point to the correct location after such a move.
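To make the "moving changes the location" point concrete, here is a minimal sketch. The helper names addr_of_string and move_changes_address are made up for illustration, and the actual addresses depend on the platform and build, so none are shown.

```rust
struct A(String);

// Address of the `String` value itself (its pointer/length/capacity triple),
// not of the heap buffer that holds the characters.
fn addr_of_string(a: &A) -> usize {
    &a.0 as *const String as usize
}

// Hypothetical helper: records where the `String` lives before and after a move.
fn move_changes_address() -> (usize, usize) {
    let a = A(String::from("hello"));
    let before = addr_of_string(&a);
    // Moving `a` into a `Box` relocates it from the stack to the heap.
    let boxed: Box<A> = Box::new(a);
    let after = addr_of_string(&boxed);
    (before, after)
}

fn main() {
    let (before, after) = move_changes_address();
    // A `&String` taken at `before` would now point at stale memory.
    println!("address changed: {}", before != after);
}
```

A reference stored inside the same struct would have been computed from the old address, which is why the borrow checker refuses to let such a struct exist in a usable form.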

Lifetime parameters on a struct in Rust will always refer to the lifetime of borrows of values outside of that same struct. So struct B<'a>(A, &'a String); means that the &'a String reference points to some existing String outside of the B struct. Typical examples for this are e.g. iterators like std::slice::Iter<'a> where 'a is the lifetime of a &'a [T] reference to an existing slice that is owned by someone else, not by the iterator, and this reference was used to create the iterator.
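As a small sketch of the intended use of B<'a>, assuming the same A and B as in the question: when the reference points at a String outside the struct, moving the struct is harmless. The functions relocate and demo are hypothetical names used only here.

```rust
struct A(String);
struct B<'a>(A, &'a String);

// Moving `b` into a `Box` relocates the struct, but not the `String`
// that `b.1` points at, because that `String` lives outside of `b`.
fn relocate(b: B<'_>) -> Box<B<'_>> {
    Box::new(b)
}

fn demo() -> bool {
    let c = String::from("outside");
    let b = B(A(String::new()), &c);
    let boxed = relocate(b);
    // The reference still points at `c` after the move.
    std::ptr::eq(boxed.1, &c)
}

fn main() {
    println!("still points at c: {}", demo());
}
```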

Typically, you’ll want to simply avoid bundling together a value and a reference to that value into the same struct. This can often be avoided, and if it can, that’s the easiest way to solve your problem. Other approaches -- if you do absolutely need such a datatype -- include using Arc<…> instead of &… references (and the “owner” field would be an Arc, too, in this case); or the aforementioned “somewhat more complicated-to-use helper crates”, e.g. ouroboros.
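A minimal sketch of the Arc approach, assuming it is acceptable to change both fields of the original example to Arc<String>. This is one possible shape, not the only one.

```rust
use std::sync::Arc;

struct A(Arc<String>);
// No lifetime parameter: the second field shares ownership instead of borrowing.
struct B(A, Arc<String>);

impl B {
    fn change(&mut self) {
        // Cloning an `Arc` bumps a reference count; the `String` is not copied.
        self.1 = Arc::clone(&self.0.0);
    }
}

fn main() {
    let a = A(Arc::new(String::from("owned by a")));
    let c = Arc::new(String::from("c"));
    let mut b = B(a, c);
    b.change();
    // `b` stays fully usable, and both fields now share one allocation.
    println!("{}", b.1);
    println!("same allocation: {}", Arc::ptr_eq(&b.0.0, &b.1));
}
```

Because the String lives in its own heap allocation managed by the Arc, moving b around does not invalidate anything.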

Thanks for the detailed response. Would I use Rc in a single-threaded scenario instead of Arc?

Yes, in a single-threaded scenario, you can use Rc instead of Arc as an optimization, since Rc should be slightly faster to clone and drop.

@steffahn has supplied a great practical answer, and this post is just about explaining (hopefully) what is going on with the lifetimes themselves for learning purposes.

Not quite: the borrow is not restricted to the b.change() line (and it didn't work that way pre-NLL either). Although it's not exactly the same situation, let me run through a more typical explanation. What do you know about the lifetime here?

fn foo(&mut self) { /* ... */ }

That's pretty much the same as:

fn foo<'s>(&'s mut self) { /* ... */ }

That is, the function is generic over 's. And all you [1] know about the lifetime is that it's valid for the body of foo. Other than that, the caller chooses the lifetime, and it could be arbitrarily long -- however long the caller needs it to be. The programmer is declaring that foo can handle any lifetime that the caller can name/create/encounter.
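Here is a rough sketch of "the caller chooses the lifetime". Counter, bump and demo are made-up names; the same method is called once with a borrow that ends immediately and once with a borrow the caller keeps alive for a few lines.

```rust
struct Counter(u32);

impl Counter {
    // Written out explicitly; `fn bump(&mut self) -> &mut u32` means the same.
    fn bump<'s>(&'s mut self) -> &'s mut u32 {
        self.0 += 1;
        &mut self.0
    }
}

fn demo() -> u32 {
    let mut c = Counter(0);

    // Short borrow: the returned reference is discarded right away,
    // so `c` can be read directly on the next line.
    c.bump();
    let snapshot = c.0;

    // Longer borrow: the caller holds on to the reference, so `'s`
    // has to cover every line that uses `r`.
    let r = c.bump();
    *r += 10;
    *r += snapshot;

    // Last use of `r` was above, so `c` is free again here.
    c.0
}

fn main() {
    println!("{}", demo());
}
```

The body of bump is identical in both calls; only the call site decides how long 's is.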

Now back to change:

impl<'a> B<'a> {
    fn change(&'a mut self) {

When you call change, 'a is going to be the same as the lifetime on B<'_>. However, it's still the case that that lifetime is beyond the control of change -- it's decided by some outside situation. You [2] don't know how long it is. Just like with foo<'s>, there are no bounds on the lifetime of impl<'a> B<'a>. If the compiler lets someone call change, you know it's valid for the function body, but that's all you know.

In particular, the lifetimes in both scenarios don't need to end immediately after the call to the method. That's the shortest they can be, but it's a lower-bound only.


Ok, so, to your question -- what is the lifetime of the borrow? It's going to be determined by the needs of the callsite, and as per the declaration, it's going to be the same as the lifetime on the B<'_>. And we have:

fn main() {
    let a = A(String::new());
    let c = String::new();
    let mut b = B(a, &c);    // `B<'a>` created, references `c`
    b.change();              // `&'a mut b` used here
    println!("{}", b.1);     // `B<'a>` accessed here
}

The lifetime on the B<'a> has to be valid everywhere it's used -- so the lifetime is at least valid for the last three lines of main. (And generally, inferred lifetimes are as short as they can be, within the capabilities of the compiler [3].)

But here's the problem: on the b.change() line, you've mutably -- or more accurately, exclusively -- borrowed b for the [4] duration of 'a. When you hit the println!, b is still exclusively borrowed -- that line is part of 'a -- so you can't create a new, overlapping borrow (by printing b.1). Hence the error.

This is why &'x mut Thing<'x> is an anti-pattern: once you create it, Thing<'x> is exclusively borrowed for the remainder of its valid existence. You can't use it any more at all, except through the &'x mut in some way. You can't even call its destructor if it has one, so often this pattern can't compile at all.
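For contrast, a sketch of the non-antipattern version: &mut self keeps its own short lifetime, and only the stored reference uses 'a. The method point_at and the function demo are hypothetical, and the new target has to live outside the struct.

```rust
struct B<'a>(String, &'a String);

impl<'a> B<'a> {
    // `&mut self` has its own anonymous lifetime, unrelated to `'a`,
    // so the exclusive borrow can end as soon as the call returns.
    fn point_at(&mut self, target: &'a String) {
        self.1 = target;
    }
}

fn demo() -> String {
    let c = String::from("c");
    let d = String::from("d");
    let mut b = B(String::from("owned"), &c);
    b.point_at(&d);
    // `b` is not exclusively borrowed any more, so reading it is fine.
    format!("{} {}", b.0, b.1)
}

fn main() {
    println!("{}", demo());
}
```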


OK, but sometimes it can compile -- it compiles if you remove the println! for example. What's up with that?

Well, there's no violation of the exclusive borrow if you just don't use the B<'a> ever again [5]. You exclusively borrow b for the rest of b's validity -- if there's no uses (including no destructor) after this point, that's fine, it's possible to create the borrow -- and then you call change. You've met the requirements to call it, so change can run perfectly fine.

You can just never use b again.

You may often see "it's impossible to create a self-referential struct using references in safe Rust". This situation is the loophole. But it is so rarely useful, &'x mut Thing<'x> is such an antipattern, the people running into this (or at least asking about it) are almost always in the learning stage, and probably even most seasoned Rust programmers don't realize it's possible, that we just generally state "it's not possible". You can create one, but then you can never use it.

[6]
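A sketch of what "using it only through the exclusive borrow" can look like, under the assumption that change is altered to return a shared reference derived from its &'a mut self. The return type and the demo function are additions for illustration, not part of the original example.

```rust
struct A(String);
struct B<'a>(A, &'a String);

impl<'a> B<'a> {
    // Same body as in the question, but the borrow is handed back to the
    // caller as a shared reference that lives for all of `'a`.
    fn change(&'a mut self) -> &'a B<'a> {
        self.1 = &self.0.0;
        self
    }
}

fn demo() -> bool {
    let c = String::from("outside");
    let mut b = B(A(String::from("inside")), &c);
    let view = b.change();
    // Naming `b` directly from here on would be rejected by the borrow checker;
    // going through `view` is allowed because it derives from the borrow.
    std::ptr::eq(view.1, &view.0.0)
}

fn main() {
    println!("self-referential: {}", demo());
}
```

Note that b still cannot be moved or mutated afterwards, which is why this remains a curiosity and not a practical pattern.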


  1. as the author of foo ↩︎

  2. as the author of change ↩︎

  3. which is perhaps why your intuition said the borrow ended immediately after the call to change ↩︎

  4. remaining ↩︎

  5. or more accurately, if you only use it through the exclusive borrow in some way ↩︎

  6. Except via the exclusive borrow in some sense -- again, a very niche application you may never encounter outside of "playing around". ↩︎

For what it's worth, I have a suspicion that an enum is the solution.

FYI, the usual approach to avoiding a self-referential datatype is to keep the owned data and the referencing data in two separate datastructures. This does often mean some API changes, and the resulting API might feel slightly more clunky, but the benefit is that you avoid the overhead (run-time overhead, and/or overhead of a more complicated implementation) of implementing a self-referential data-type.

E.g. API designs such as group_by in Itertools are representative of the kinds of changes you might get in order to avoid self-referencing datatypes, even though that’s not quite the most canonical example since a self-referential alternative implementation of this would also need to deal with shared ownership amongst the items. Anyways… the API change here is what I wanted to present: the returned GroupBy<_, _, _> struct is not an iterator itself, but instead the user/caller will need to store this struct somewhere (in a variable, or in a temporary), create a reference to it &GroupBy<_, _, _>, and turn that reference into an iterator. I.e. the GroupBy<_, _, _> struct is the owned part, and the iterator created from the reference is the referencing part, and the two are not combined into a single datastructure. By avoiding self-referencing datastructures here, the implementation can avoid the need for (additional) allocations, e.g. for Arcs. (To the point that .group_by – if consumed in-order – creates no additional allocations at all!)
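The same split can be sketched with only the standard library. Document, Words and longest_word below are hypothetical names and have nothing to do with the Itertools implementation; they only show the shape of "owned part in one struct, referencing part in another".

```rust
use std::str::SplitWhitespace;

// Owned part: holds the data and is not itself an iterator.
struct Document {
    text: String,
}

// Referencing part: borrows from a `Document` that is stored elsewhere.
struct Words<'a> {
    inner: SplitWhitespace<'a>,
}

impl Document {
    // The caller keeps the `Document` alive and asks it for a borrowing iterator.
    fn words(&self) -> Words<'_> {
        Words { inner: self.text.split_whitespace() }
    }
}

impl<'a> Iterator for Words<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        self.inner.next()
    }
}

fn longest_word(doc: &Document) -> Option<&str> {
    doc.words().max_by_key(|w| w.len())
}

fn main() {
    let doc = Document { text: String::from("a bbb cc") };
    println!("{:?}", longest_word(&doc));
    // The owned part is untouched and can hand out further iterators.
    println!("{}", doc.words().count());
}
```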

I was thinking self-referential was simply unnecessary. The code implies an either-or choice.
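One possible reading of the enum suggestion, as a sketch only: B records whether it currently points at its own string or at an outside one, and resolves that on access. Target and current are invented names, and whether this fits the original use case is an assumption.

```rust
struct A(String);

// Hypothetical: which string `B` is currently pointing at.
enum Target<'a> {
    Own,
    Other(&'a String),
}

struct B<'a>(A, Target<'a>);

impl<'a> B<'a> {
    // No self-reference is stored; we only remember the choice.
    fn change(&mut self) {
        self.1 = Target::Own;
    }

    // Resolves the choice to an actual reference on demand.
    fn current(&self) -> &String {
        match self.1 {
            Target::Own => &self.0.0,
            Target::Other(s) => s,
        }
    }
}

fn main() {
    let c = String::from("c");
    let mut b = B(A(String::from("a")), Target::Other(&c));
    println!("{}", b.current());
    b.change();
    println!("{}", b.current());
}
```

Since the reference into the struct's own field is only created inside current, tied to the short &self borrow, b can still be moved and mutated freely between calls.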