Lifetime explosion in structs with mutable references

crawdad · June 17, 2022, 3:17pm

I'm trying to build some structs that point to each other behind mutable references:

struct A { b: &mut B }
struct B { c: &mut C }
struct C { d: &mut D }
struct D;

Obviously we need some llifetimes, or it won't compile. So let's add them.

struct A<'b, 'c, 'd> { b: &'b mut B<'c, 'd> }
struct B<'c, 'd> { c: &'c mut C<'d> }
struct C<'d> { d: &'d mut D }
struct D;

That got real ugly all of a sudden. It pollutes the public API, and most of those lifetimes are pointless to have to keep track of. In A<'b, 'c, 'd>, it's the shortest of the three lifetimes that impacts whether I can access the field. The other two have no effect at all. So why do I need to write them?

I tried collapsing the lifetimes into one, but with mutable references it forces lifetimes to be exactly equal instead of combining them, so the resulting structs are impossible to use in practice, for example you can't even do this:

fn foo(a: &mut A) {
    let _ = A { b: a.b };
}

Playground link

I don't want to constrain the lifetimes to be exactly the same. I just want to trim the boilerplate and only annotate lifetimes that are actually useful. The struct A effectively has a single lifetime, so I'd love if I could write it that way as A<'b> somehow.

Is there a way to stop this explosion of lifetimes or at least abstract it away
so it doesn't pollute the public API?

RedDocMD · June 17, 2022, 3:22pm

Yeah, lifetime bounds in mut references is usually a pain to deal with.
I would say use Rc<RefCell<T>> to avoid having to store mutable references.

kpreid · June 17, 2022, 3:34pm

They do have an effect. Mutable referrences allow mutation, and given A<'b, 'c, 'd>,

a.b = new_b;

is valid if and only if new_b is an &'b mut B<'c, 'd> — and so on for every nested struct field. Thus, it matters what 'c and 'd are (to the type system, even if not to your intended application).

Perhaps you should tell us about why you're building this chain of mutable references — then we can suggest solutions which avoid it entirely. Using references in structs is often a mistake, in that people often try to use references where ownership is what is needed. So, tell us more context. What are A, B, C, and D? What are the requirements — what operations does A need to offer?

One possible category of solution is to make some of the structs generic; for example if A is just struct A<T: Bee> { b: T } and we have impl Bee for &mut B to convey the B-specific parts, then code can be written that's generic over <T: Bee> instead of 'b, 'c, 'd, which might be an improvement in readability — or it might be weird and complicating; the details will depend on what these structs are actually needed for.

steffahn · June 17, 2022, 3:46pm

Trait objects can work around this kind of problem pretty well (at least in some use-cases):


struct A<'b> { b: &'b mut dyn B_ }
struct B<'c> { c: &'c mut dyn C_ }
struct C<'d> { d: &'d mut D }
struct D;

trait C_ {
    fn d(&mut self) -> &mut D;
}
impl C_ for C<'_> {
    fn d(&mut self) -> &mut D {
        self.d
    }
}

trait B_ {
    fn c(&mut self) -> &mut dyn C_;
}
impl B_ for B<'_> {
    fn c(&mut self) -> &mut dyn C_ {
        self.c
    }
}

fn foo(a: &mut A) {
    let _ = A { b: a.b };
    let _ = A { b: &mut B { c: a.b.c() }};
}

It’s unfortunate that the same thing isn’t currently possible in Rust without also introducing the overhead of dynamic function calls, despite the fact that those aren’t even needed for this use-case.

The relevant detail/feature here is that &'a mut dyn C_ can be coerced into &'b mut dyn C_ if 'a: 'b, despite the fact that &'a mut dyn C_ is short for &'a mut (dyn C_ + 'a), and &'b mut dyn C_ is short for &'b mut (dyn C_ + 'b), so this coercion is not due to ordinary covariance+subtyping, since covariance for &'a mut T only supports changing the 'a, not the T.

kornel · June 17, 2022, 4:28pm

I used to complain about the same thing, and only later realized that this syntax is fine, because it should be discouraging use of references in structs. Most of the time it's a design mistake to use them. If it causes an explosion of lifetimes you don't want nor care about, then you may not have a valid use-case for putting references in structs.

To have fields you can mutate, with data stored by reference, use Box.

Rust's references aren't for merely referring to objects. They are temporary scope-bound loans, with mut adding exclusive access restriction on top of that. Their temporary nature and limited scope has to infect everything they touch. It's not an unfortunate side effect of a noisy syntax. Being severely limiting is their purpose. They are supposed to prevent unrestricted use of the structs. They are supposed to prevent careless spread of references that may not live long enough.

As much as you can, change your design to make structs own their data.

struct A { b: B } // these are mutable!
struct B { c: C }
struct C { d: D }
struct D;

or

struct A { b: Box<B> } // these are mutable and by reference!
struct B { c: Box<C> }
struct C { d: Box<D> }
struct D;

If you need shared mutable access to the same data from multiple structs, consider Arc<Mutex<T>>. It's not temporary or bound to a scope, and therefore it doesn't need any lifetime annotations.

crawdad · June 17, 2022, 6:23pm

Thanks for the responses. The real code is, I guess you could call it a document processor. A document has pages, a page has groups/text/images, etc. The document owns everything, but we construct temporary cursors into the document while editing it. Simplified:

struct Binder { documents: HashMap<Id, Document> }
struct Document { pages: HashMap<Id, Page> }
struct Page { images: HashMap<Id, Image> }

let cursor = binder.find_document("cover").page(1).find_image_by_name("handshake.jpg");

A cursor looks like:

ImageCursor {
    parent: &mut PageCursor {
        parent: &mut DocumentCursor {
            parent: &mut Binder,
            id: Id,
        },
        id: Id,
    },
    id: Id,
}

These cursors let us do things like Node.nextSibling and Node.parent from OO languages that don't make sense in Rust. There's a clear hierarchy of ownership so I was hoping to avoid Rc<RefCell<>>. And the cursors do not own the document, so I think references are the right tool to use here.

The ImageCursor ends up with about 5 lifetimes tacked on. That's what I'm trying to avoid.

steffahn: I like how that looks, but I don't like having to add dynamic dispatch if I don't have to.

steffahn · June 17, 2022, 9:10pm

This might be overkill and also it has the disadvantage that the cursor types get larger (on the stack) with more layers as more references are involved, but nonetheless something like this might work for you:

#![allow(unused)]

use std::collections::HashMap;

// todo function; unlike the macro, this does not interfere
// with the borrow-checker at call-site
fn todo<T>() -> T {
    todo!();
}

type Id = usize;
type Image = ();

pub struct Binder {
    documents: HashMap<Id, Document>,
}
pub struct Document {
    pages: HashMap<Id, Page>,
}
pub struct Page {
    images: HashMap<Id, Image>,
}

// ==================================================

// trait for abstracting over `Id` vs. `&mut Id`

// (no need to expose the `Id` type if you don't want to...)

// could also be sealed if you like
pub trait Storage {
    fn borrow(&mut self) -> Borrowed<'_>;
}
pub struct Owned(Id);
pub struct Borrowed<'a>(&'a mut Id);
impl Storage for Owned {
    fn borrow(&mut self) -> Borrowed<'_> {
        Borrowed(&mut self.0)
    }
}
impl Storage for Borrowed<'_> {
    fn borrow(&mut self) -> Borrowed<'_> {
        Borrowed(self.0)
    }
}

// ==================================================

pub struct DocumentCursor<'a, S: Storage> {
    parent: &'a mut Binder,
    id: S,
}
impl<S: Storage> DocumentCursor<'_, S> {
    fn borrow(&mut self) -> DocumentCursor<'_, Borrowed<'_>> {
        DocumentCursor {
            parent: self.parent,
            id: self.id.borrow(),
        }
    }
    fn some_method(&mut self) {
        println!("called some method");
    }
    fn parent(&mut self) -> &mut Binder {
        self.parent
    }
    fn page(&mut self, number: usize) -> PageCursor<'_, Owned> {
        let id: Id = todo();
        PageCursor {
            parent: self.borrow(),
            id: Owned(id),
        }
    }
}

// ==================================================

// instead of borrowing a `DocumentCursor`, we just
// borrow its `Id` and re-borrow the `Binder`, i.e. we take an
// owned `DocumentCursor<'a, Borrowed<'a>>` field

// this way the additional lifetime argument is avoided

pub struct PageCursor<'a, S: Storage> {
    parent: DocumentCursor<'a, Borrowed<'a>>,
    id: S,
}
impl<S: Storage> PageCursor<'_, S> {
    fn borrow(&mut self) -> PageCursor<'_, Borrowed<'_>> {
        PageCursor {
            parent: self.parent.borrow(),
            id: self.id.borrow(),
        }
    }
    fn some_other_method(&mut self) {
        println!("called some other method");
    }
    fn parent(&mut self) -> DocumentCursor<'_, Borrowed<'_>> {
        self.parent.borrow()
    }
}

// ==================================================

pub struct ImageCursor<'a, S: Storage> {
    parent: PageCursor<'a, Borrowed<'a>>,
    id: S,
}
impl<S: Storage> ImageCursor<'_, S> {
    fn borrow(&mut self) -> ImageCursor<'_, Borrowed<'_>> {
        ImageCursor {
            parent: self.parent.borrow(),
            id: self.id.borrow(),
        }
    }
    fn some_other_method(&mut self) {
        println!("called some other method");
    }
    fn parent(&mut self) -> PageCursor<'_, Borrowed<'_>> {
        self.parent.borrow()
    }
}

trentj · June 17, 2022, 11:58pm

So far, so good...

crawdad:

A cursor looks like:

ImageCursor {
    parent: &mut PageCursor {
        parent: &mut DocumentCursor {
            parent: &mut Binder,
            id: Id,
        },
        id: Id,
    },
    id: Id,
}

This structure violates the "document owns everything" principle, because cursors are dependent not just on the document itself but on other cursors which it borrows from. This is confusing.

I would expect an ImageCursor not to borrow a PageCursor, but just to contain it. This is easy enough to do, and you can flatten the lifetimes as you go by reborrowing the inner reference, which doesn't violate any borrowing rules.

If you want to split borrows, e.g. have two PageCursors pointing to the same Document and use them simultaneously, then you'll have to use a different data structure - you can't have two mutable references to Binder at the same time, so you'll have to get creative.^[1] But as long as you deal with one cursor at a time, you're fine.

One option would be to hold Binder by shared reference and turn the maps into something with interior mutability, but I haven't thought this through. ↩︎

crawdad · June 20, 2022, 8:38pm

Thanks both of you for taking the time to write all that out! I noticed the common idea between both your suggestions was a method for reborrowing. I implemented it and it worked great. There's always more to learn when it comes to Rust...

system · September 18, 2022, 8:39pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
References in struct's giving lifetime conflicts help	7	562	April 4, 2022
Annotating lifetime of references in a struct help	14	1020	October 13, 2023
Struct member lifetimes help	3	251	August 9, 2023
Lifetimes with mutable reference to self in function call help	10	1686	January 12, 2023
Confusion about lifetime of struct and struct field	4	475	September 19, 2019

Lifetime explosion in structs with mutable references

Related Topics