Force/override lifetime?

struct Data {
    val: String,
}

struct Item<'a> {
    data: &'a Data,
}

struct Container<'a> {
    src: Vec<Data>,
    items: Vec<Item<'a>>,
}

impl<'a> Container<'a> {
    fn new() -> Container<'a> {
        Container {
            src: vec![Data { val: String::from("test") }],
            items: Vec::new(),
        }
    }

    fn build(&mut self) {
        let first = &self.src[0];
        self.items.push(Item { data: first });
        self.items.push(Item { data: first });
    }
}

The compiler reports that first cannot outlive the anonymous lifetime '_ defined on the method body. That makes sense, because elements of self.src may be deleted, so their minimum lifetime is '_.

But I know I will never delete any element from self.src, so first lives as long as 'a. The Rust compiler doesn't know that, so it complains about the conflicting lifetimes '_ and 'a.

How can I tell the Rust compiler that first lives as long as 'a? Or is there a better way to solve the problem?

Or because you may push a new element to self.src, reallocating it and leaving all previously created references dangling.

You could put src and items separately (not in the same data structure), and then construct items after src is completely constructed. But src will then be read-only so long as items is around.

You could store indices instead of references.
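
As a rough sketch of the index idea (assuming the Data and Container types from the question; the values helper is hypothetical and only there to show how an index is turned back into data when needed):

```rust
struct Data {
    val: String,
}

struct Container {
    src: Vec<Data>,
    // Positions into `src` instead of `&Data` borrows, so the struct
    // needs no lifetime parameter and can be moved freely.
    items: Vec<usize>,
}

impl Container {
    fn new() -> Container {
        Container {
            src: vec![Data { val: String::from("test") }],
            items: Vec::new(),
        }
    }

    fn build(&mut self) {
        // Both items refer to the first element of `src`.
        self.items.push(0);
        self.items.push(0);
    }

    // Hypothetical helper: resolve each stored index on demand.
    // Indices that are out of range are skipped rather than panicking.
    fn values(&self) -> Vec<&str> {
        self.items
            .iter()
            .filter_map(|&i| self.src.get(i))
            .map(|d| d.val.as_str())
            .collect()
    }
}
```

The borrow of src only exists while values is being used, so nothing in the struct borrows from the struct itself.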

Putting src and items in different structures is a little awkward, they are logically together.

Is there any other way to tell the Rust compiler that src is never going to change once constructed, like the "immutable" concept in some other languages?

You are trying to build what's known as a self-referential struct, i.e. a struct where one field contains references into the struct itself. This is not possible in safe Rust code.

Consider storing indexes instead of references?

There's no canonical, built-in support for this. You can leak the memory and carry around 'static references to it. If you ever want to free the memory, you'll have to write a Drop implementation (destructor), use unsafe, and be very careful not to cause undefined behavior.
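
A minimal sketch of the leaking variant, assuming it is acceptable for src to never be freed. It uses Box::leak from the standard library; the struct layout is adapted from the question and is only one possible shape:

```rust
struct Data {
    val: String,
}

struct Item {
    data: &'static Data,
}

struct Container {
    src: &'static [Data],
    items: Vec<Item>,
}

impl Container {
    fn new() -> Container {
        // Leak the allocation on purpose: it is never freed, so
        // references into it are valid for the rest of the program.
        let src: &'static [Data] =
            Box::leak(vec![Data { val: String::from("test") }].into_boxed_slice());
        Container { src, items: Vec::new() }
    }

    fn build(&mut self) {
        // Copy the shared reference out first, so the borrow of the
        // element is tied to 'static and not to `&mut self`.
        let src: &'static [Data] = self.src;
        let first = &src[0];
        self.items.push(Item { data: first });
        self.items.push(Item { data: first });
    }
}
```

Every call to new leaks one allocation, so this only fits data that is created a bounded number of times and is meant to live until the process exits.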

Or, you could build your src in a separate struct (or just a vec) and then build something like:

struct Container<'a> {
    src: &'a [Data],
    items: Vec<Item<'a>>,
}

But you'll have to keep the vec around at least as long as the Container, which may or may not be a good fit for your use case.
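
For illustration, here is how that split might look in use. The build_values function is a hypothetical caller, not part of the original code; it owns the vec and lends it to the Container:

```rust
struct Data {
    val: String,
}

struct Item<'a> {
    data: &'a Data,
}

struct Container<'a> {
    src: &'a [Data],
    items: Vec<Item<'a>>,
}

impl<'a> Container<'a> {
    fn new(src: &'a [Data]) -> Container<'a> {
        Container { src, items: Vec::new() }
    }

    fn build(&mut self) {
        // `self.src` is a shared reference valid for 'a, so elements
        // borrowed through a copy of it are also valid for 'a.
        let src: &'a [Data] = self.src;
        let first = &src[0];
        self.items.push(Item { data: first });
        self.items.push(Item { data: first });
    }
}

// Hypothetical caller: the Vec must outlive the Container borrowing it.
fn build_values() -> Vec<String> {
    let src = vec![Data { val: String::from("test") }];
    let mut container = Container::new(&src);
    container.build();
    container.items.iter().map(|item| item.data.val.clone()).collect()
}
```

While container exists, src is borrowed immutably, so the compiler rejects any attempt to push to or move the vec.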

As @alice says, though, the standard way of doing things in Rust is to just not have a self-referential struct.

Using an index instead of a reference is not safe either: the index may be out of range once src is changed. I think I'll have to use a separate struct instead.

I wish Rust provided a readonly and immutable concept for storing data that is never going to change once initialized. For example:

struct Container<'a> {
    readonly src: Collection<Data>, // Collection implements trait Immutable
    items: Vec<Item<'a>>, // Can contain references of data in src because they live as long as 'a
}

This has nothing to do with mutability. Lifetimes are orthogonal to mutability. Lifetimes are about ownership and references (regardless of the kind), and they keep track of when data is moved while references to that data exist.

In this case, the problem is that a reference into src also indirectly references src itself, and moving src would invalidate the reference into it. That is what happens in new, and it is why the borrow checker deduces that the lifetime 'a must outlive the struct. The moment you try to push a reference that lives only as long as the struct into the vector, it complains that the reference does not live long enough.

You can use .get(index) to ergonomically check. Or even if you use [index], it's not unsafe in terms of violating Rust's guarantees. It will panic if out of range, but it won't access memory outside of the Vec's current length or otherwise cause undefined behavior.
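
A small sketch of the checked lookup, using a plain slice of String for brevity (the lookup function is hypothetical):

```rust
// Returns None for an out-of-range index instead of panicking.
fn lookup(src: &[String], idx: usize) -> Option<&str> {
    src.get(idx).map(|s| s.as_str())
}
```

Writing src[idx] with an out-of-range idx would panic at run time instead, which is still memory safe.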

If you're not comfortable using indices for your use case in particular, you definitely shouldn't want to use references. (Indices being able to change is also at odds with saying src doesn't change once constructed.)

It's interesting to think about. The only data that could be referenced would have to be on the heap (or 'static), and there would have to be a way to associate "borrows in this field are from that other field" (so if you copy items[2] somewhere, say, the borrow checker knows it's referring to src). And src would have to always own the memory on the heap in question.

Lifetimes are an analysis on blocks of code, not some property that values carry around with them, so that would be an entirely new mechanism. Even then, the compiler has no way to distinguish a reference into the heap from a reference elsewhere. (Imagine Collection<Data> used TinyVec, say; then the Item references could point into the stack and would become invalid whenever the container moved, e.g. when it was returned from new().)

So I don't think there's a practical way to accomplish it without some sort of run-time system, like perhaps reference counting or dynamic generation of the references, etc. None of which seem better than just storing indices to me, for this use case.

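If it's just the ergonomics that bug you, consider writing accessor methods on Container that dynamically translate an index into an Item<'_>, and/or methods that return impl Iterator<Item = Item<'_>>, etc. (An Index<usize> impl is also possible, but it can only hand back a reference such as &Data, because Index::index returns &Self::Output.)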
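
One way such accessors could look, as a sketch only. The method names item and iter_items are made up for this example, and the Container stores indices as suggested earlier in the thread:

```rust
struct Data {
    val: String,
}

struct Item<'a> {
    data: &'a Data,
}

struct Container {
    src: Vec<Data>,
    items: Vec<usize>,
}

impl Container {
    // Translate the n-th stored index into a borrowed Item on demand.
    fn item(&self, n: usize) -> Option<Item<'_>> {
        let idx = *self.items.get(n)?;
        self.src.get(idx).map(|data| Item { data })
    }

    // Iterate over all items as borrowed views; stale indices are skipped.
    fn iter_items(&self) -> impl Iterator<Item = Item<'_>> + '_ {
        self.items
            .iter()
            .filter_map(move |&idx| self.src.get(idx))
            .map(|data| Item { data })
    }
}
```

The returned Item values borrow from the Container only for as long as the caller holds them, so the Container itself stays free of self-references.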

Very interesting and constructive discussion. Thanks everyone!

When I said "unsafe" I was being too vague. In the reference scenario "unsafe" means a dangling reference, while in the index scenario "unsafe" means a panic. I agree a dangling reference is more dangerous. But neither is intended behavior, since we don't want to access a piece of data that is no longer there. Using indexes just bypasses the compiler's borrow checker and lets the run-time bounds check do its job.

The reason I think mutability and lifetime are related here is that we would be giving the compiler a hint to do a proper borrow check.

struct Container<'a> {
    immutable src: Vec<Data>, // Hint: no mutable borrow allowed once initialized
    items: Vec<Item<'a>>,
}

With the immutable keyword as a hint, the compiler would not allow a mutable borrow of src (after src is initialized). Thus, items could contain references to elements in src. As you mentioned, the compiler can do lifetime analysis on the block of code (the "hint" is part of the code, not a property carried by the src value).

As for the TinyVec scenario, the compiler could require the types of immutable fields to satisfy a certain trait bound (e.g. disallow the Copy trait?).

This is not correct: mutability is not required to invalidate the reference. Moving the Container is enough.

Makes sense. Looks like heap allocated memory is needed to keep the reference valid.

This isn’t quite right: Vec keeps its data on the heap and will only move it when the capacity changes, even when its container moves. If the contents can’t change, into_boxed_slice() will remove the possibility of modification, fixing the data in place in memory (until the Box is destroyed).
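
The address stability can be observed from safe code by comparing raw pointers without dereferencing them. This is only an illustrative sketch; the function names are made up, and Container here is reduced to the src field:

```rust
struct Data {
    val: String,
}

struct Container {
    src: Vec<Data>,
}

// True if the Vec's heap buffer keeps its address while the Container
// that owns it is moved (first by value, then into a Box).
fn buffer_survives_moves() -> bool {
    let src = vec![Data { val: String::from("test") }];
    let before: *const Data = src.as_ptr();
    let container = Container { src };
    let boxed = Box::new(container);
    before == boxed.src.as_ptr()
}

// Same check for a boxed slice, which has no push or other resizing API.
fn boxed_slice_survives_move() -> bool {
    let fixed: Box<[Data]> =
        vec![Data { val: String::from("test") }].into_boxed_slice();
    let before: *const Data = fixed.as_ptr();
    let moved = fixed;
    before == moved.as_ptr() && moved[0].val == "test"
}
```

Note that the pointer is taken after into_boxed_slice(), since that call may itself reallocate when the Vec has spare capacity.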

You can then use unsafe code to soundly keep pointers into this data, as long as you use some other mechanism to ensure that Item will never outlive the Container it was generated for (or, more precisely, its src).


In addition to the other solutions above, you can also use Rc or Arc to share src with the Items in a couple of different ways. You can have one reference count per item, which lets each Item live independently at the cost of a heap allocation for each:

use std::rc::Rc;

struct Item {
    data: Rc<Data>
}

struct Container {
    src: Vec<Rc<Data>>,
    items: Vec<Item>,
}

impl Container {
    fn new() -> Container {
        Container {
            src: vec![Rc::new(Data { val: String::from("test") })],
            items: Vec::new(),
        }
    }

    fn build(&mut self) {
        let first = &self.src[0];
        self.items.push(Item { data: first.clone() });
        self.items.push(Item { data: first.clone() });
    }
}

Or, you can have a single reference count for all of src, which avoids the double indirection but will keep the entire allocation alive, even if only one element is in use:

use std::rc::Rc;

struct Item {
    src: Rc<[Data]>,
    idx: usize,
}

struct Container {
    src: Rc<[Data]>,
    items: Vec<Item>,
}

impl Container {
    fn new() -> Container {
        Container {
            src: vec![Data { val: String::from("test") }].into_boxed_slice().into(),
            items: Vec::new(),
        }
    }

    fn build(&mut self) {
        self.items.push(Item { src: self.src.clone(), idx: 0 });
        self.items.push(Item { src: self.src.clone(), idx: 0 });
    }
}
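
To show what the shared count buys you, here is a sketch of the second variant with a hypothetical data accessor on Item and a hypothetical function that keeps an Item alive after its Container has been dropped. It builds the Rc<[Data]> with Rc::from on a Vec, which is equivalent for this purpose:

```rust
use std::rc::Rc;

struct Data {
    val: String,
}

struct Item {
    src: Rc<[Data]>,
    idx: usize,
}

impl Item {
    // Hypothetical accessor: resolve the index against the shared slice.
    fn data(&self) -> &Data {
        &self.src[self.idx]
    }
}

struct Container {
    src: Rc<[Data]>,
    items: Vec<Item>,
}

impl Container {
    fn new() -> Container {
        Container {
            src: Rc::from(vec![Data { val: String::from("test") }]),
            items: Vec::new(),
        }
    }

    fn build(&mut self) {
        self.items.push(Item { src: self.src.clone(), idx: 0 });
        self.items.push(Item { src: self.src.clone(), idx: 0 });
    }
}

// The Item holds its own strong reference, so it remains usable
// after the Container is gone.
fn item_after_container_dropped() -> String {
    let mut container = Container::new();
    container.build();
    let item = container.items.pop().expect("build pushed two items");
    drop(container);
    item.data().val.clone()
}
```

Cloning an Rc only bumps a counter, so each Item costs one pointer, one length, and one index rather than a separate allocation.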