Access to implicit lifetime of containing object (aka 'self lifetime)


#1

Hi,
In my relatively simple app I’ve experienced the exact same problem very well articulated here:

I.e. I had nested structures 3 levels deep, and I had to introduce a reference to the innermost struct.
As a result I had to add an explicit 'a lifetime to all the parent structs, which was not necessary before. That’s the unfortunate “contagious” effect described in that post.

Since that post is 3 years old, I wonder whether anything has changed in this regard, and whether an elegant solution is available now.

P.S. I’ve opportunistically :) tried to use 'self just in case it was added since then, but it does not work.


#2

The good news is that an upcoming version of Rust will expand lifetime elision, so the various <'a>s will be needed less often.

The bad news is that <'a> in structs isn’t part of that. The idea of lifetime elision in struct definitions was shot down, as many users really want to see in a noisy explicit way whether a struct may contain a reference.

The 'self lifetime might also suggest it’s a lifetime for self-referential structs. Note that such structs are not possible in safe Rust, since moving such a struct to a different memory address (something that Rust allows and does frequently) would invalidate all of its internal references.


#3

Note that I’m not asking for any additional automatic elision, just a way to access implicit auto-generated lifetime.
To be specific, at first I had this (removing Window's parent for brevity):

struct Window {
    logic: Logic,
}

pub struct Logic {
    time_sec: i32,
}

and then I needed to add a &str to Logic, so I was forced to add <'a> all around:

struct Window<'a> {
    logic: Logic<'a>,
}

pub struct Logic<'a> {
    time_sec: i32,
    name: &'a str,
}

just to be able to write name: &'a str.

Now, I’m not suggesting the following should work:

pub struct Logic {
    time_sec: i32,
    name: &str,
}

where magically str's lifetime would be auto-assigned to be the same as Logic's. That would be true elision.

I still want to be explicit. All I need is some way of accessing Logic's lifetime that does exist, but was auto-generated by compiler and is anonymous. Something like this (with major goal of leaving Window and its “parents” unaffected):

pub struct Logic {
    time_sec: i32,
    name: &'self str,
}

Obviously it does not have to be 'self, it could be

    name: &'default str

or

    name: & lifetime_of(self) str

or

    name: & lifetime_of(Logic) str

etc. I don’t know Rust well enough to dare to suggest new syntax.
Just some way of making it possible.

At least to me, this sounds like a very scoped feature, and it is not actually a new elision rule (as no additional elision is happening here). Just providing access to compiler-generated lifetime.
But then again, that’s my amateur viewpoint, and I realize I can be totally wrong, so please forgive my ignorance.


#4

It sounds like you’re asking for a self-referential struct, which is a struct holding a reference to part of itself. This isn’t supported by Rust, although there are some crates (e.g. rental) that allow a limited form of it.


#5

I don’t think I do. name: &str is not a reference to another field inside Logic; it points to a String that lives outside.


#6

In that case I don’t understand why you call it “self”, if it explicitly isn’t lifetime of the struct that holds the name.

BTW, there is no implicit lifetime for owned values.


#7

Also, do you have a reason to use a temporary &str instead of an owned String? The reference isn’t merely syntactic noise: it ties the struct to wherever the name came from, and freezes the owner of the name as read-only for as long as this reference exists.
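
For comparison, here is a minimal sketch of the owned-String variant, reusing the Window and Logic names from earlier in the thread. The make_window helper is hypothetical and only there for illustration. Because the name is copied into Logic, no lifetime parameter appears anywhere, and the original string stays free to change:

```rust
// Hypothetical variant of the thread's structs: `name` is an owned String,
// so neither Logic nor Window needs a lifetime parameter.
pub struct Logic {
    pub time_sec: i32,
    pub name: String,
}

pub struct Window {
    pub logic: Logic,
}

// Hypothetical helper: copies the text into a new String owned by Logic.
fn make_window(name: &str) -> Window {
    Window { logic: Logic { time_sec: 0, name: name.to_owned() } }
}

fn main() {
    let mut source = String::from("clock");
    let window = make_window(&source);
    // `source` is not borrowed by `window`, so it can still be mutated.
    source.push_str(" (renamed)");
    println!("{} at {}s, source is now {:?}", window.logic.name, window.logic.time_sec, source);
}
```

The cost is one extra allocation and copy per name, which is often acceptable outside of hot paths.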


#8

Ok, I was confused because you said you didn’t want elision and kept referring to lifetime of self. As has been noted, the lifetime parameter isn’t of self, it’s of whatever reference(s) you hold. The lifetime of Logic has nothing to do with lifetime of any references it holds beyond Logic not being able to outlive them. If you hold references, you must put a lifetime parameter into Logic; in some cases, you need multiple lifetime parameters, possibly with bounds across them. This is no different than using generic type parameters: Logic<T>. The compiler can later infer what T is when Logic is used with a concrete type, just like it automatically fills in the concrete lifetime of whatever references you put into it.
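
A small sketch of that analogy, assuming the Logic<'a> struct from earlier in the thread. Wrapper<T> and name_of are hypothetical names added for illustration. Note that the returned &str outlives the Logic value itself, because 'a describes the borrowed string rather than Logic:

```rust
// Logic is generic over a lifetime 'a in the same way Wrapper is generic over a type T.
pub struct Logic<'a> {
    pub time_sec: i32,
    pub name: &'a str,
}

pub struct Wrapper<T> {
    pub value: T,
}

// Hypothetical helper: the result carries 'a (the borrow stored inside Logic),
// not the lifetime of the `&Logic` reference passed in.
pub fn name_of<'a>(logic: &Logic<'a>) -> &'a str {
    logic.name
}

fn main() {
    let owner = String::from("clock");
    // Neither 'a nor T is written out here; the compiler fills both in.
    let logic = Logic { time_sec: 0, name: &owner };
    let wrapped = Wrapper { value: 42u8 };

    let name = name_of(&logic);
    let secs = logic.time_sec;
    drop(logic); // Logic is gone, but `name` is still valid: it borrows from `owner`.
    println!("{} {} {}", name, secs, wrapped.value);
}
```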


#9

To answer both of your questions:

  1. It is the lifetime of the Logic struct that holds name. I was answering Vitaly’s question of whether I was trying to make name point inside the Logic struct, which is not the case.
  2. Making the original String containing the name read-only is actually desired in my case and is on purpose.

Basically, I want to say to the compiler “don’t worry, the name reference is valid at least as long as Logic exists”.


#10

There’s a mismatch between your mental model of lifetimes in Rust and what they actually mean. Values do not have lifetimes in Rust. That’s the kind of model that seems to be in the works for C++, but C++ lifetimes only care about use-after-free. Rust’s model is more sophisticated because it is also used to prevent aliasing.

So it is not values that have lifetimes, but rather borrows (by which I mean &T). Those lifetimes help the borrow checker determine which loans are held by a value (“loans” being the specific things that the borrow checker cares about, such as “the borrow of local variable x beginning on line 13”), so that it can determine when two loans conflict.


Writing 'self would communicate nothing to the compiler because it is already the case that all borrows are held for as long as the value exists. What it would do is hide the lifetime from the user, leading to confusing borrowck errors when they don’t realize that you are holding onto their strings.
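
To make the notion of a loan concrete, here is a rough sketch. It assumes the Logic<'a> struct from earlier in the thread and a compiler with non-lexical lifetimes (the 2018 edition or later); rename_after_use is a hypothetical function name:

```rust
pub struct Logic<'a> {
    pub name: &'a str,
}

// Hypothetical function showing where a shared loan of `owner` begins and ends.
fn rename_after_use() -> (usize, String) {
    let mut owner = String::from("clock");

    let logic = Logic { name: &owner }; // a shared loan of `owner` starts here
    // Calling owner.push_str("!") at this point would be rejected: the mutable
    // borrow would conflict with the loan that `logic` is still holding.
    let seen = logic.name.len(); // last use of `logic`, so the loan can end here

    owner.push_str("!"); // accepted: no live value holds the loan any more
    (seen, owner)
}

fn main() {
    let (seen, owner) = rename_after_use();
    println!("{} {}", seen, owner);
}
```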


#11

Thanks, your reply and Konel’s earlier comment that

BTW, there is no implicit lifetime for owned values

convince me that my understanding of how lifetimes work is wrong.
Sorry about the confusion.


#12

A common misconception is to think that lifetimes do something, and that they make the compiler manage memory based on them.

In fact it’s the opposite: lifetimes only describe what actually happens. The compiler only checks (like an assertion) whether this information is true, and then throws all lifetimes away. The code that runs is not aware of lifetimes.
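
One way to see that nothing survives to run time is a sketch assuming the Logic<'a> struct from earlier in the thread. The equality below holds on current compilers, although the layout of a plain struct is not formally guaranteed:

```rust
use std::mem::size_of;

pub struct Logic<'a> {
    pub name: &'a str,
}

fn main() {
    // The lifetime parameter adds no field and no bookkeeping: on current
    // compilers Logic occupies exactly the space of the &str it wraps.
    assert_eq!(size_of::<Logic<'static>>(), size_of::<&str>());

    let logic = Logic { name: "clock" };
    println!("Logic takes {} bytes, name = {}", size_of::<Logic<'static>>(), logic.name);
}
```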


#13

In your case, where there’s a single read-only string living in two places, Rc<String> may be appropriate.


#14

Don’t forget about Rc<str> in these cases :)
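
A minimal sketch of what that could look like with the Window and Logic structs from earlier in the thread (make_window is a hypothetical helper). Cloning the Rc only bumps a reference count, and no lifetime parameter is needed on either struct:

```rust
use std::rc::Rc;

pub struct Logic {
    pub time_sec: i32,
    pub name: Rc<str>,
}

pub struct Window {
    pub logic: Logic,
}

// Hypothetical helper: shares the existing allocation instead of borrowing or copying it.
fn make_window(name: &Rc<str>) -> Window {
    Window { logic: Logic { time_sec: 0, name: Rc::clone(name) } }
}

fn main() {
    let name: Rc<str> = Rc::from("clock");
    let window = make_window(&name);

    // Both handles point at the same string data.
    assert!(Rc::ptr_eq(&name, &window.logic.name));
    println!("{} has {} owners", window.logic.name, Rc::strong_count(&name));
}
```

Since Rc<str> gives no mutable access while shared, the string stays read-only, which matches the stated requirement.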


#15

After more doc reading and thinking about it some more, I’ll give it another try :)
I’ll use the following code to check my understanding:

#![allow(unused)]

pub struct Logic<'a> {
    name: &'a str,
}

fn case1_shorter_lifetime() {
    let logic: Logic = {
        let n: String = String::from("n");
        Logic { name: &n } // NOT ALLOWED
    };
}

fn case2_same_lifetime() {
    let outer_ref: &str;
    {
        let inner_name: String = String::from("n");
        let logic: Logic = Logic { name: &inner_name };
        let inner_ref: &str = logic.name;
        outer_ref = logic.name; // NOT ALLOWED
    }
}

fn case3_longer_lifetime() {
    let outer_name: String = String::from("n");
    let outer_ref: &str;
    {
        let logic: Logic = Logic { name: &outer_name };
        let inner_ref: &str = logic.name;
        outer_ref = logic.name; // FINE
    }
}

fn main() {}

Case 1 demonstrates that the lifetime of the name field must be longer than the lifetime of the containing object logic when the field is assigned (“on the way in”). This prevents structs from holding dangling pointers, great.
Case 2 demonstrates a check “on the way out” (when reading the field) and is nothing special, since the lifetime of name was already decided when logic was constructed (is that true?).
Case 3 shows how structs can have ref fields with lifetimes longer than that of the containing struct.

If we were to assume that handling cases 1 and 2 is the more important goal, as they are more common, and that case 3 is rarer and must always be explicit, would the following suggestion make sense?
Syntax like:

pub struct Logic {
    name: &'owner str,
}

or even declaration without lifetime at all:

pub struct Logic {
    name: &str,
}

would mean: the lifetime of name is exactly the same as the lifetime of the object containing it. So case 3 is not supported with this syntax and must be handled by using an explicit <'a>, as today.


#16

Really, part of what my post was trying to say is that hiding lifetimes is undesirable. Your examples focused on the use-after-free, but as I said, lifetimes are used for more than that. In order for users to know how to use your API, they need to know if you are borrowing their data.

Imagine if you had two functions with identical looking signatures:

pub fn make_foo(name: &str) -> Foo;
pub fn make_bar(name: &str) -> Bar;

…but one of them copies the data, while the other borrows it:

fn main() {
    let mut s = String::from("hello");
    
    {
        let foo = make_foo(&s);
        s.push_str(" world"); // ok
        println!("{:?}", foo);
    }
    
    {
        let bar = make_bar(&s);
        s.push_str(" world"); // ERROR: mutated while borrowed immutably
        println!("{:?}", bar);
    }
}

Actually, stuff like that is possible today, and we’re trying to phase it out, because it’s confusing! Assuming the types are defined as follows:

#[derive(Debug)]
pub struct Foo {
    name: String,
}

#[derive(Debug)]
pub struct Bar<'a> {
    name: &'a str,
}

then in particular, today you are allowed to write Bar (with no lifetime) in a function signature. However, this is no longer considered idiomatic, and in future editions it will be warned against in favor of explicitly using an anonymous lifetime:

// what you are now encouraged to write.
pub fn make_foo(name: &str) -> Foo;
pub fn make_bar(name: &str) -> Bar<'_>;

Now it is clear even from the function signatures alone why the second example has a borrow error while the first one does not!
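
For anyone who wants to try this, here is a self-contained sketch of the two functions. The bodies are my assumption of the obvious implementations (one copies, one borrows), and the fields are made pub only to keep the example short. The conflicting line is left commented out so that the sketch compiles:

```rust
#[derive(Debug)]
pub struct Foo {
    pub name: String,
}

#[derive(Debug)]
pub struct Bar<'a> {
    pub name: &'a str,
}

// Copies the data: the result is independent of the argument.
pub fn make_foo(name: &str) -> Foo {
    Foo { name: name.to_owned() }
}

// Borrows the data: the '_ in the return type says so in the signature.
pub fn make_bar(name: &str) -> Bar<'_> {
    Bar { name }
}

fn main() {
    let mut s = String::from("hello");

    {
        let foo = make_foo(&s);
        s.push_str(" world"); // ok: `foo` owns its own copy
        println!("{:?}", foo);
    }

    {
        let bar = make_bar(&s);
        // s.push_str(" world"); // rejected if uncommented: `s` is still borrowed by `bar`
        println!("{:?}", bar);
    }
}
```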

If we introduced an invisible lifetime like 'self or 'owner we’d be back to square one.


#17

Thanks, I understand.

This non-local nature of lifetimes, is it considered a composability issue?
In my original example, while I wanted to modify just one struct I had to modify 3 instead.


#18

In some ways, yes. There is little doubt that lifetimes have a maintenance cost associated with them.

There are techniques I use to eliminate them in some places:

Abstracting over lifetimes with generics

Here’s something I occasionally do for types with public fields that represent a file format, so that both deserialization is simple and serialization can be done with whatever is at hand:

use std::borrow::Borrow;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Poscar<
    Comment = String,
    Coord = Coords,
    Elements = Vec<Element>,
> {
    pub comment: Comment,
    pub coords: Coord,
    pub elements: Elements,
}

impl<Comment, Coord, Elements> Poscar<Comment, Coord, Elements>
where
    Comment: AsRef<str>,
    Coord: Borrow<Coords>,
    Elements: AsRef<[Element]>,
{
    // ... methods ...
}

This way, Poscar can be written without any explicit parameters, and it will refer to the type where all fields own their data (which is what is most commonly returned from a function). But somebody can also construct one with e.g. a &str for comment if they need to, and these cases are generally not inconvenient since oftentimes the type doesn’t need to be written.
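
Applied to the Logic struct from earlier in the thread, the same idea might look like the sketch below. This is a hypothetical reduction and not code from the Poscar project; the label method is invented for illustration:

```rust
// Hypothetical reduction of the Poscar idea: the name type is a parameter
// that defaults to the owned String.
#[derive(Debug, Clone, PartialEq)]
pub struct Logic<Name = String> {
    pub time_sec: i32,
    pub name: Name,
}

impl<Name: AsRef<str>> Logic<Name> {
    // Works for any name type that can be viewed as a &str.
    pub fn label(&self) -> String {
        format!("{} @ {}s", self.name.as_ref(), self.time_sec)
    }
}

// Plain `Logic` means Logic<String>, so Window needs no lifetime parameter.
pub struct Window {
    pub logic: Logic,
}

fn main() {
    let owned = Window { logic: Logic { time_sec: 1, name: String::from("clock") } };

    let text = String::from("timer");
    // Inferred as Logic<&str>; the type never has to be written out.
    let borrowed = Logic { time_sec: 2, name: text.as_str() };

    println!("{}", owned.logic.label());
    println!("{}", borrowed.label());
}
```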

Refcounting in high-level code

I used to deal with owned vs borrowed types throughout my entire code base, but this gets very onerous in high-level application code that needs to be revised frequently.

When writing code that sits at the top of the food chain, don’t be afraid to use Rc<str> and Rc<[T]> as your “standard” string and vector types (once they no longer need to be mutated, of course). It will save you a lot of headache when refactoring.
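
As a rough illustration of the “freeze, then share” pattern for vectors, the sketch below builds a Vec, converts it to Rc<[f64]> once mutation is finished, and hands it to two consumers. Plot, Report and freeze are hypothetical names:

```rust
use std::rc::Rc;

// Hypothetical application-level types that share one frozen list of samples.
pub struct Plot {
    pub samples: Rc<[f64]>,
}

pub struct Report {
    pub samples: Rc<[f64]>,
}

// Hypothetical helper: converts the Vec into a shared, immutable slice.
fn freeze(samples: Vec<f64>) -> Rc<[f64]> {
    Rc::from(samples)
}

fn main() {
    // Mutation happens while the data is still a Vec...
    let mut building = Vec::new();
    building.push(1.0);
    building.push(2.5);

    // ...and sharing happens once it is frozen.
    let frozen = freeze(building);
    let plot = Plot { samples: Rc::clone(&frozen) };
    let report = Report { samples: frozen };

    println!("{} samples, total {}", plot.samples.len(), report.samples.iter().sum::<f64>());
}
```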

Places where 'static can be used

Yesterday I was writing something that needs an RAII guard. The standard library’s MutexGuard borrows from its Mutex, but as long as that Mutex lives in a static, safe mutation of the Mutex itself is impossible (so aliasing is not a concern) and the guard can be given the 'static lifetime.

use lazy_static::lazy_static;
use std::sync::{Mutex, MutexGuard};

lazy_static! {
    /// Guarantees that only one instance of Lammps may exist on a process,
    /// if constructed through safe APIs.
    pub static ref INSTANCE_LOCK: Mutex<InstanceLock> = Mutex::new(InstanceLock(()));
}

/// Proof that no instance of Lammps currently exists within the current process.
pub struct InstanceLock(());

/// A Lammps is built directly with the MutexGuard wrapper (rather than a reference)
/// to dodge an extra lifetime parameter.
pub type InstanceLockGuard = MutexGuard<'static, InstanceLock>;
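
Here is a std-only sketch of how such a guard might be used. It assumes Rust 1.63 or later, where Mutex::new is a const fn and a plain static can replace lazy_static. The Instance type and its try_new constructor are hypothetical stand-ins for the real Lammps wrapper:

```rust
use std::sync::{Mutex, MutexGuard, TryLockError};

/// Proof that no instance currently exists within the current process.
pub struct InstanceLock(());

// Assumes Rust 1.63+, where Mutex::new can be called in a static initializer.
static INSTANCE_LOCK: Mutex<InstanceLock> = Mutex::new(InstanceLock(()));

pub type InstanceLockGuard = MutexGuard<'static, InstanceLock>;

/// Hypothetical stand-in for Lammps: it stores the guard by value,
/// so the struct needs no lifetime parameter.
pub struct Instance {
    _guard: InstanceLockGuard,
}

impl Instance {
    /// Returns None while another Instance is alive.
    pub fn try_new() -> Option<Instance> {
        match INSTANCE_LOCK.try_lock() {
            Ok(guard) => Some(Instance { _guard: guard }),
            Err(TryLockError::WouldBlock) => None,
            // A poisoned lock still proves exclusivity, so recover the guard.
            Err(TryLockError::Poisoned(poisoned)) => Some(Instance { _guard: poisoned.into_inner() }),
        }
    }
}

fn main() {
    let first = Instance::try_new();
    assert!(first.is_some());
    assert!(Instance::try_new().is_none()); // a second instance is refused
    drop(first);                            // releases the lock
    assert!(Instance::try_new().is_some());
}
```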