Another lifetimes + mutable reference question

I'm working on something rather complex, and have been running into a problem I can't seem to fully resolve. This is not a new problem: there are plenty of specific questions here and on SO, but so far I have found nothing that explains to me how to solve or avoid this (seeming) anti-pattern in general.

Long story short, I'm doing complex things and would like to cache the results. Unfortunately, part of the complex things involves an external library that also does complex things and returns its results through a struct with a lifetime parameter. That means if I want to cache those, my cache struct now also needs a lifetime parameter. And suddenly everything breaks.

Here's a simplified example, with every extraneous thing omitted.

First, the before-times. When all was well.

#[derive(Debug)]
struct MyThing(String);

#[derive(Debug)]
struct StructWithoutLifetime {
    my_stuff: Vec<MyThing>,
}

impl StructWithoutLifetime {
    fn very_complex_function_to_process_my_thing(&mut self, thing: MyThing) {
        // Such complex, much work
        self.my_stuff.push(thing);
    }
    
    fn process_lots_of_things(&mut self, things: Vec<MyThing>) {
        for thing in things {
            self.very_complex_function_to_process_my_thing(thing);
        }
    }
}

fn main() {
    let mut struct_without_lifetime = StructWithoutLifetime {
        my_stuff: Vec::new(),
    };
    
    struct_without_lifetime.very_complex_function_to_process_my_thing(MyThing("one".to_string()));
    
    println!("{:?}", struct_without_lifetime);
}

(playground)

This works quite nicely.

Next, I introduce the external library, and the entire world explodes.

// Imagine the external library function that creates these is very expensive,
// so I'd like to store them instead of recalculating them
#[derive(Debug)]
struct FromExternalLibrary<'a> {
    nr: &'a i32
}

#[derive(Debug)]
struct MyThing(String);

#[derive(Debug)]
struct StructWithLifetime<'a> {
    my_stuff: Vec<MyThing>,
    external_library_stuff: Option<FromExternalLibrary<'a>>
}

impl<'a> StructWithLifetime<'a> {
    fn very_complex_function_to_process_my_thing(&'a mut self, thing: MyThing) {
        // Such complex, much work
        self.my_stuff.push(thing);
    }
    
    fn process_lots_of_things(&'a mut self, things: Vec<MyThing>) {
        for thing in things {
            self.very_complex_function_to_process_my_thing(thing);
        }
    }
}

fn main() {
    let mut struct_with_lifetime = StructWithLifetime {
        my_stuff: Vec::new(),
        external_library_stuff: None
    };
    
    struct_with_lifetime.very_complex_function_to_process_my_thing(MyThing("one".to_string()));
    
    println!("{:?}", struct_with_lifetime);
}

(playground)

Boom:

error[E0499]: cannot borrow `*self` as mutable more than once at a time
  --> src/main.rs:25:13
   |
17 | impl<'a> StructWithLifetime<'a> {
   |      -- lifetime `'a` defined here
...
25 |             self.very_complex_function_to_process_my_thing(thing);
   |             ^^^^-------------------------------------------------
   |             |
   |             `*self` was mutably borrowed here in the previous iteration of the loop
   |             argument requires that `*self` is borrowed for `'a`

error[E0502]: cannot borrow `struct_with_lifetime` as immutable because it is also borrowed as mutable
  --> src/main.rs:38:22
   |
36 |     struct_with_lifetime.very_complex_function_to_process_my_thing(MyThing("one".to_string()));
   |     -------------------- mutable borrow occurs here
37 |     
38 |     println!("{:?}", struct_with_lifetime);
   |                      ^^^^^^^^^^^^^^^^^^^^
   |                      |
   |                      immutable borrow occurs here
   |                      mutable borrow later used here
   |
   = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)

Now, in my actual case, it would be quite a lot of work to refactor the two methods in my struct. But suppose I do so anyway, and end up with something like this:

// Imagine the external library function that creates these is very expensive,
// so I'd like to store them instead of recalculating them
#[derive(Debug)]
struct FromExternalLibrary<'a> {
    nr: &'a i32
}

#[derive(Debug)]
struct MyThing(String);

#[derive(Debug)]
struct StructWithLifetime<'a> {
    my_stuff: Vec<MyThing>,
    external_library_stuff: Option<FromExternalLibrary<'a>>
}

impl<'a> StructWithLifetime<'a> {
    fn process_any_amount_of_things(&'a mut self, things: Vec<MyThing>) {
        for thing in things {
            // Such complex, much work
            self.my_stuff.push(thing);
        }
    }
}

fn main() {
    let mut struct_with_lifetime = StructWithLifetime {
        my_stuff: Vec::new(),
        external_library_stuff: None
    };
    
    struct_with_lifetime.process_any_amount_of_things(vec![MyThing("one".to_string())]);
    
    println!("{:?}", struct_with_lifetime);
}

(playground)

This solves one problem, as I'm no longer borrowing in a loop. However, the error in main remains:

error[E0502]: cannot borrow `struct_with_lifetime` as immutable because it is also borrowed as mutable
  --> src/main.rs:34:22
   |
32 |     struct_with_lifetime.process_any_amount_of_things(vec![MyThing("one".to_string())]);
   |     -------------------- mutable borrow occurs here
33 |     
34 |     println!("{:?}", struct_with_lifetime);
   |                      ^^^^^^^^^^^^^^^^^^^^
   |                      |
   |                      immutable borrow occurs here
   |                      mutable borrow later used here
   |
   = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)

It's as if, as soon as a lifetime parameter is introduced, once you mutably borrow a struct it can never be borrowed again in the same scope?

As I said before, this is definitely not the first time this kind of question gets asked, either here or on SO, but every time the answer concerns some trivial "gotcha", or consists of "here's specific code to make this specific example work." But fundamentally, I still don't really get the problem here. I kind of have an idea about what's going wrong in the loop (but still don't really know why adding a lifetime bound triggers it), but I have no idea what is going wrong in the main function.
I tried wrapping the mutating process... call inside its own { } scope, screaming at the compiler NO WE'RE DONE MUTATING IT'S FINE. But he's just not having it.

Or is this the part where I'm supposed to just give up and slap a RefCell on all the fields in my struct and whistle past the compiler going "noo officer, only immutable borrows here, yessir, definitely nothing getting mutated in this here function!"

&'a mut self is a massive footgun.

It makes this function possible to call only once ever, in the entire lifetime of the object. This is because StructWithLifetime<'a> makes 'a mean "borrowed longer than the struct exists", and then you make 'a apply to self, which borrows it exclusively forever.

The <'a> definition always refers to something external, living longer than the scope you put it on. When it's defined on a function, at minimum it applies to the function call. When it's on the struct, it has to refer to something created before the struct existed, and staying alive even after the struct is destroyed. Everything marked with the same lifetime name will virally spread that restriction.
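As a minimal sketch of what that means for the code in the question: the types below are the ones from the original post, and the only change is that both methods take `&mut self` instead of `&'a mut self`. The `demo_short_borrows` function is a hypothetical helper added just to exercise them.

```rust
#[allow(dead_code)]
#[derive(Debug)]
struct FromExternalLibrary<'a> {
    nr: &'a i32,
}

#[allow(dead_code)]
#[derive(Debug)]
struct MyThing(String);

#[derive(Debug)]
struct StructWithLifetime<'a> {
    my_stuff: Vec<MyThing>,
    external_library_stuff: Option<FromExternalLibrary<'a>>,
}

impl<'a> StructWithLifetime<'a> {
    // `&mut self` gets a fresh, short lifetime on every call,
    // so the exclusive borrow ends when the call returns.
    fn very_complex_function_to_process_my_thing(&mut self, thing: MyThing) {
        self.my_stuff.push(thing);
    }

    fn process_lots_of_things(&mut self, things: Vec<MyThing>) {
        for thing in things {
            // Borrowing again on each iteration is fine now.
            self.very_complex_function_to_process_my_thing(thing);
        }
    }
}

// Hypothetical helper: calls both methods, then borrows the struct immutably.
fn demo_short_borrows() -> usize {
    let mut s = StructWithLifetime {
        my_stuff: Vec::new(),
        external_library_stuff: None,
    };
    s.very_complex_function_to_process_my_thing(MyThing("one".to_string()));
    s.process_lots_of_things(vec![
        MyThing("two".to_string()),
        MyThing("three".to_string()),
    ]);
    println!("{:?}", s);
    s.my_stuff.len()
}
```

This only covers the case shown in the question, where `external_library_stuff` is never filled in from data the struct itself owns.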


So a struct that's supposed to be mutable can just never contain fields whose type has a lifetime parameter?

In my particular case, it is even more frustrating, because all the (referenced) data is in fact owned by the same struct, and thus all of the lifetimes are more or less tied together (though obviously the compiler doesn't/can't understand that). To continue with my abstract examples, imagine that the FromExternalLibrary struct actually looks like this:

pub struct FromExternalLibrary<'a> {
    value_from_slice: &'a [MyThing],
    // ... more fields that depend on the input slice
}

impl<'a> FromExternalLibrary<'a> {
    pub fn from_my_things(data: &'a [MyThing]) -> Self {
        // ... calculate stuff value and return Self ...
    }
}

In reality, Vec<MyThing> is actually a very large Vec<u8> loaded from files (i.e. not something you'd want to clone() all over the place), but the principle is the same. The struct owns a Vec<MyThing>, from which it takes a &[MyThing] slice to construct a FromExternalLibrary struct, which it then also stores. All of the lifetimes are tied together, but I cannot find a way to tell the rust compiler that the slice in the external thing I create is in fact tied to the Vec that I do in fact own. Or something like that.

By the same struct that is holding the references? Then that's the main problem - self-referential structs are, strictly speaking, possible in safe Rust, but almost always unusable in practice.

That's lovely.

I've encountered such answers before. Not necessarily about self-referential structs, but on the general theme of &'a mut self, the answers regularly boil down to "you cannot do that in Rust." And then someone slaps a big ol' green checkmark on that reply. Problem successfully identified! We're done here!

I do apologize for the snark, but you can imagine it is hard to not be somewhat frustrated when you're stuck on a project you've been working on for weeks and the answer to your problem is "Oh yeah, that. Classic. You can't do that in Rust." Cool. Cool cool cool.

No, that is possible. What you can't usefully do is use that same lifetime parameter as the lifetime of a mutable borrow of the struct itself, which is what writing &'a mut self means in this case. That is not because of a specific rule against it, but because the logical consequence of those two uses being combined is the single borrow extending forever.


That’s fair; nobody’s responded to try to solve the problem beyond explaining the error. Let me give that a shot.

There are libraries to construct useful self-referential structs, like ouroboros, yoke, and self_cell. However, in order to figure out whether and how they can be applied to your problem, we need more information about how external_library_stuff is actually filled in; the code you've provided so far never sets the value to anything other than None. If that were all you need, then you could just delete the 'a from each &'a mut self, but presumably it isn't. Please show us

  • the signature of the function that constructs FromExternalLibrary (or, even better, the documentation of the actual library you are using — there may be better solutions)
  • where in StructWithLifetime's methods that function is called (is it called after all things have been pushed, or once per thing?)

The long and short of it is that I'm doing things with fonts and glyph rendering (yes, there are many libraries that do that, but for educational reasons I want to do things myself). I load (potentially many) font files from disk, and would like to cache data for later use.

So my FontCache keeps one big Vec<u8>, which contains all the data of all the loaded files consecutively, and a Vec<Range<usize>>, with which I can obtain the slice for the data of a particular font file. I'm simplifying a lot, but that is essentially step one of the problem.
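To make step one concrete, here is a rough sketch of that layout. The field and method names (`data`, `ranges`, `load_font`, `font_data`) are placeholders of mine, not the real code. The point is that the cache stores only owned data plus index ranges, so it needs no lifetime parameter of its own, and a slice is only borrowed for as long as a caller holds it.

```rust
use std::ops::Range;

#[derive(Default)]
struct FontCache {
    // All font files, stored back to back.
    data: Vec<u8>,
    // One byte range into `data` per loaded font file.
    ranges: Vec<Range<usize>>,
}

impl FontCache {
    /// Appends one file's bytes and returns the index of the new font.
    fn load_font(&mut self, bytes: &[u8]) -> usize {
        let start = self.data.len();
        self.data.extend_from_slice(bytes);
        self.ranges.push(start..self.data.len());
        self.ranges.len() - 1
    }

    /// Borrows the bytes of one font for as long as `&self` is borrowed.
    fn font_data(&self, index: usize) -> Option<&[u8]> {
        let range = self.ranges.get(index)?.clone();
        self.data.get(range)
    }
}
```

A slice returned by `font_data` keeps the cache immutably borrowed, which is exactly why something built from that slice cannot be stored back into the same struct.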

Step two is that I make use of existing font/glyph libraries to interpret these font files, such as harfrust and skrifa. Under the hood, these make use of read_fonts's FontRef<'a> struct, which is the cause of this entire mess. FontRef<'a>'s are constructed with FontRef::from_index(data: &'a [u8], index: u32). For the purposes of this problem index is an implementation detail that can be ignored, the important part is that FontRef wants the data for a specific font file as a &'a [u8] slice.

Now, constructing a FontRef does not actually do all that much, and as long as I was just caching fonts, I was happy to "reconstruct" it from the data in the cache whenever I needed it. And for some of the simpler "ref'd" structs in these libraries I made my own "owned" variants, but this is not feasible for the more complex ones.

The problem is that now I'm working on rasterization and rendering. For that, skrifa provides (for example) OutlineGlyphCollection<'a>. This is created from (you guessed it) a FontRef<'a>, and has the same reliance on the slice of data. Constructing this one, while not a huge amount of work, is still non-trivial. And given that it is static data for a given font (file), and that I'm working in an IMGUI/game dev context, it feels very wasteful to reconstruct this for every glyph, 60 times per second.

And that is where I am now. I'd like to (conditionally) cache an OutlineGlyphCollection<'a>, but that seems to not be possible at this juncture. (This is just one example among others: most of these libraries seem to be fond of doing seemingly expensive calculations and data loads, and then storing the results next to a lifetime-bound reference to the source data. I'm sure they have perfectly good reasons, but it's... painful to work with.)

I have also tried to put these lifetime-bound structs into some sort of separate "sub-cache" as it were, but that simply moves the problem. The data they need is in FontCache. It would not make sense to clone the actual cache, so any other structure that wants to store more data now needs a &FontCache reference, with a lifetime bound that is (I think?) still the same as the bound for the bounded skrifa/harfrust structs I'd like to cache, since everything still depends on the same (slices of the) Vec<u8> that is in the cache. So now I've just moved the problem.

I hope this makes some sense. It is not possible to paste the entire code here, it's huge and messy.

Ideally, Rust would have built-in mechanisms for self-reference so this sort of thing can be written without caveat, but it currently does not. Since that is the case, in my opinion, part of the blame should be laid on the libraries you are using. Libraries should offer alternatives to using only borrowed data whenever a data structure might be long-lived. In this application, bytes might be a good choice.

But that doesn’t help you now, so let's look at solving the problem.

First question: are you loading fonts lazily, i.e. is the Vec<u8> extended after it has been borrowed for the first time? If so, you cannot use a Vec<u8>, because pushing data into it can reallocate the buffer and invalidate previous references. In that case, you need an append-only non-moving data structure like typed-arena.

If you are loading all the data up front and creating FontRefs for it, then you can just use a Vec<u8>, as long as you fill it out completely before borrowing it.

Either way, you can then use ouroboros or similar to express the self-reference. Here’s your code roughly edited into that shape:

#[derive(Debug)]
struct FontRef<'a> {
    r: &'a [u8],
}

#[ouroboros::self_referencing]
struct FontCache {
    data: typed_arena::Arena<u8>,
    #[borrows(data)]
    #[covariant]
    fonts: Vec<FontRef<'this>>,
}

impl FontCache {
    fn load_font(&mut self, thing: Vec<u8>) {
        self.with_mut(|b| {
            let ref_to_data = b.data.alloc_extend(thing);
            b.fonts.push(FontRef { r: ref_to_data });
        })
    }

    fn load_all_fonts(&mut self, things: Vec<Vec<u8>>) {
        for thing in things {
            self.load_font(thing);
        }
    }
}

fn main() {
    let mut struct_with_lifetime = FontCache::new(typed_arena::Arena::new(), |_| Vec::new());

    struct_with_lifetime.load_all_fonts(vec![vec![0, 1, 2]]);

    println!("{:?}", struct_with_lifetime.borrow_fonts());
}

Thank you for your input. I'm not yet sure that this will work, but I'm already happy I have something more to work with than a simple "nope". Refactoring and experimenting will likely still take me a while, but if it works I will try to remember to flag this as a solution :slight_smile:

If you never get rid of the loaded font data, storing it globally (outside of any struct) may also be an option.

The rest of this post will be more "explaining the error" with no more "what can you do about it". Please skip it if you're not interested.

But in case you're still interested in "why"... consider this data structure:

struct Ex1<'a> {
    data: [i32; 10],
    current: &'a i32,
}

where current has been made to point to something in data. Rust has no relative pointers or copy constructors, so it must not be possible to move a value so constructed (the reference would dangle). So while you can create that situation with a &'a mut Ex1<'a> method, it must then be "borrowed forever" (as you can't move something borrowed).[1]

The same thing happens when you have a Vec<i32> instead, in part because Rust references don't distinguish between pointing to the heap or not (and it would be non-breaking from a language level perspective if Vec<_> indexing started returning some inline address, etc). But even if they did, it wouldn't be a full solution on its own due to aliasing and mutability concerns:

You can't take a &mut _ to something borrowed. However, if you could create the self-referential value without borrowing forever, you could then get a &mut to the contents of data while also using current. Aliasing a &mut _ is instant UB. But even if it weren't, this would also allow you to call data.clear() and make current dangle.

So the struct ends up borrowed forever to prevent these aliasing and reference invalidation possibilities.

Ultimately Rust will need some new type of functionality, like a new type of reference or something, to natively support these self-referential patterns in safe code with some sort of simple pointer type. It is a desired feature, but also an indefinitely long way off.

Unfortunately, reasoning about how long values stick around when dealing with borrow checker errors like the OP can lead to the wrong conclusions about what should work.

Despite the unfortunate naming, Rust lifetimes (those '_ things) denote the duration of borrows, not the liveness scope of values. And being borrowed "guards against" more than no longer existing, such as the "can't take a &mut" restriction. And borrow checking is a pass-or-fail check; it can't change the semantics of the program by making values live longer, for example.

In this case, the borrow in the struct being a borrow of something else owned by the struct is what caused the problem, not a solution to it.


  1. Or until the reference is overwritten to point somewhere else, logically, but Rust doesn't have a way to track that at compile time. ↩︎


'static is extremely useful; we use it in our embedded codebase all the time, usually together with static_cell, which also gives mutable references. I believe it may be the best solution to the OP's problem!

I want to ask the obvious question; I probably missed it while reading. Why do you need &'a mut self instead of &mut self? Those are completely different statements.

They were trying to do something like bite here, which only compiles with &'a mut self (but then leads to other horrible errors that make you pull your hair out, and then you create a forum thread like this one).

Also, if you don’t specifically need a global and are already handling loading the data only once, then you can use Vec::leak() to convert Vec<u8> into &'static [u8], which you can then use to create a FontRef<'static> without lifetime parameters. Of course, you need to make sure to do this only once, or you have a memory leak.

I once made a PR for yoke to allow self-referential structs with mutable access, but for my use cases &'static mut was good enough and they had an API that was directly contradicting that addition. Maybe it would be worth it to push it forward again?

Thanks again for all the input and insights.

For my actual, real problem, I think the solution will be to basically just not do it, after all. I can't cache OutlineGlyphCollection<'a>, but I can potentially "draw" all the glyphs in the collection (i.e. record the unscaled vector path commands) and cache those.
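A rough sketch of what such a cache of recorded commands might look like. Everything here is hypothetical: `PathCommand`, `OutlineCache`, and `get_or_draw` are names made up for illustration and are not skrifa's API, and the `draw` closure stands in for whatever code walks the real glyph outline. The idea is only that the cached values are fully owned, so the cache has no lifetime parameter.

```rust
use std::collections::HashMap;

/// Hypothetical owned path command; not a type from skrifa.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum PathCommand {
    MoveTo(f32, f32),
    LineTo(f32, f32),
    QuadTo(f32, f32, f32, f32),
    Close,
}

/// Cache keyed by (font index, glyph id). It owns everything it stores.
#[derive(Default)]
struct OutlineCache {
    outlines: HashMap<(usize, u32), Vec<PathCommand>>,
}

impl OutlineCache {
    /// Returns the cached outline, calling `draw` only on a cache miss.
    fn get_or_draw<F>(&mut self, font: usize, glyph: u32, draw: F) -> &[PathCommand]
    where
        F: FnOnce() -> Vec<PathCommand>,
    {
        self.outlines.entry((font, glyph)).or_insert_with(draw)
    }
}
```

In this shape the borrowed library structs only need to exist inside the `draw` closure, for the duration of a cache miss.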

However, since nobody is paying me for this, and since I'm a sucker for detours into irrelevant rabbit holes, I've still been tinkering with what might be possible here.

Some of the walls I've bashed my head into today:

  • First, to reiterate the nasty bits of the problem (and abstract away the whole font-business a bit), the following illustrates the external library structs I would potentially be interested in caching. I cannot change these, and can only access the pub fns. For the purpose of this exercise, let's say I'm chiefly interested in ExternalDerivedRef.
#[derive(Debug, Clone)]
struct ExternalRef<'a> {
    data: &'a [u8]
}

impl<'a> ExternalRef<'a> {
    pub fn from_data(data: &'a [u8]) -> Self {
        Self { data }
    }
    
    pub fn derived(&self) -> ExternalDerivedRef<'a> {
        ExternalDerivedRef::new(self)
    }
}

#[derive(Debug, Clone)]
struct ExternalDerivedRef<'a> {
    ext: ExternalRef<'a>
}

impl<'a> ExternalDerivedRef<'a> {
    fn new(ext: &ExternalRef<'a>) -> Self {
        Self {
            ext: ext.clone()
        }
    }
}
  • I'm not averse to taking the raw data out of the cache and storing it in a global static. Ideally the data would need to be able to grow after initialization, though. Although typed-arena was suggested, it seems somewhat difficult to work with due to the nature of returning &mut references after allocation. I did get it to sort of work using a LazyLock<Mutex<Arena<Arc<[u8]>>>> construction, which allows me to clone() the resulting &mut Arc<[u8]>, but that then has its own issues. I'm also looking at elsa::FrozenVec, which seems more ergonomic for this "use-case."
  • I've looked at yoke quite a bit, but I don't think it is useable here (granted, I barely understand what's going on). For one, it's obviously not possible to impl Yokeable on an external library struct. I tried making a "container" for the external library struct(s) and yoking that to an Arc<[u8]> obtained from global static data, but I can't get that to work.
  • In general, regardless of how I try approaching this so far, the fact that I'm chiefly interested in ExternalDerivedRef<'a> seems to make this a lot more complicated than it already is - if not flat out impossible. It does (seemingly) not matter that everything depends on &'static source data, the result of ExternalRef::from_data(&static_data).derived() is always something that does not live long enough.
  • I'm not averse to unsafe suggestions either, btw. I'm also not knowledgeable enough to know whether or not I would be screwing up something, so I've not experimented with that as of yet.

Anyway, as I said I don't have an immediate problem anymore, but if anyone is interested in this or has insights or wants to tell me everything I'm doing wrong, I would still be interested regardless :grinning_face_with_smiling_eyes:

You should be able to trace the lifetime annotations back to see why that is happening. If you pass a &'static [u8] to from_data, you should get an ExternalRef<'static>, and when you call derived on that you should get an ExternalDerivedRef<'static>. So at a guess you're not managing to get ahold of a &'static [u8].

Example that leaks data to get &'static [u8]s.
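Since the linked example does not survive in this text, here is a minimal sketch along those lines (not necessarily the same code). `ExternalRef` and `ExternalDerivedRef` are the stand-in types from earlier in the thread, with `derived` simplified to build the struct directly; the `Cache` struct and its `add_data` method are assumptions for illustration. Note that the leaked bytes are never freed.

```rust
// Stand-ins for the external library types, as given earlier in the thread.
#[derive(Debug, Clone)]
struct ExternalRef<'a> {
    data: &'a [u8],
}

impl<'a> ExternalRef<'a> {
    pub fn from_data(data: &'a [u8]) -> Self {
        Self { data }
    }

    // Returns a value tied to `'a` (the data), not to the borrow of `self`.
    pub fn derived(&self) -> ExternalDerivedRef<'a> {
        ExternalDerivedRef { ext: self.clone() }
    }
}

#[derive(Debug, Clone)]
struct ExternalDerivedRef<'a> {
    ext: ExternalRef<'a>,
}

// Hypothetical cache: no lifetime parameter, because everything is 'static.
#[derive(Debug, Default)]
struct Cache {
    ref_collection: Vec<ExternalDerivedRef<'static>>,
}

impl Cache {
    fn add_data(&mut self, data: Vec<u8>) {
        // Intentionally leaks the buffer: it lives until the process exits.
        let leaked: &'static [u8] = data.leak();
        self.ref_collection
            .push(ExternalRef::from_data(leaked).derived());
    }
}
```

Because `add_data` takes a plain `&mut self`, it can be called any number of times, and each call leaks one more buffer.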

You can have structs that hold temporary loans (references) in their fields, and you can mutate the structs and the fields.

The problem you have isn't a limitation of the language, but an incorrect declaration of lifetimes that asked the compiler to add an extra unnecessary severe restriction.

It's like adding final on a class instead of on a method. It doesn't mean that types can't be mutated, it means you unintentionally told the compiler to forbid mutation in this case.

Unfortunately, these <'a> declarations are very important in Rust and affect the types in major ways. You can't just slap them anywhere to make the compiler stop complaining. Other languages don't declare temporary loans like that, and don't have different references for owning and borrowing, so this is an unintuitive part of the language that's necessary to learn.

BTW, in Rust there is no concept of a "mutable struct". Mutability is not a property of the data itself. It's a property of how you can access the data, which can vary in time and place — structs can be borrowed as exclusive/mutable in one place, and as shared/immutable in another place. Struct owner can choose how to lend them, and ownership can change too as the data is moved around.

When you call a method with &'a mut self, you're asking to obtain an exclusive (mutable) access to self for the entire duration of loan/scope marked as 'a. By default (if you write &mut self) the loan duration is automatically inferred to be usually just the duration of the single function call, so the access is granted for the call, and ends after the call returns, and can be granted to another call afterwards. But if you accidentally make 'a mean "forever", you get forever-exclusive access granted to the first method call, and it won't end, so no other call will get access.

Unfortunately the footgun you've run into is just a particularly bad combination of declarations, which in other cases are useful and necessary (structs that borrow data from somewhere else need the <'a> to signal they're not normal self-contained types, and to distinguish between loans from different places, and &/&mut in function arguments support explicit lifetimes to declare which arguments are compatible with which struct fields or returned references, especially when multiple arguments can be references).
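A small sketch of the useful side of those declarations, reusing the struct names from the question. The `set_external`, `push`, and `external_nr` methods and the `demo_argument_lifetime` helper are made up for illustration. Here `'a` appears on the argument that ends up stored in the field, while `self` is borrowed with an ordinary short `&mut self`.

```rust
#[derive(Debug)]
struct FromExternalLibrary<'a> {
    nr: &'a i32,
}

#[derive(Debug)]
struct StructWithLifetime<'a> {
    my_stuff: Vec<String>,
    external_library_stuff: Option<FromExternalLibrary<'a>>,
}

impl<'a> StructWithLifetime<'a> {
    // `nr: &'a i32` says: this loan must last as long as the field needs it.
    // `&mut self` keeps its own short, inferred lifetime.
    fn set_external(&mut self, nr: &'a i32) {
        self.external_library_stuff = Some(FromExternalLibrary { nr });
    }

    fn push(&mut self, thing: &str) {
        self.my_stuff.push(thing.to_string());
    }

    fn external_nr(&self) -> Option<i32> {
        self.external_library_stuff.as_ref().map(|ext| *ext.nr)
    }
}

// Hypothetical helper: the borrowed number lives outside the struct.
fn demo_argument_lifetime() -> (usize, Option<i32>) {
    let number = 7;
    let mut s = StructWithLifetime {
        my_stuff: Vec::new(),
        external_library_stuff: None,
    };
    s.set_external(&number);
    s.push("one");
    s.push("two");
    println!("{:?}", s);
    (s.my_stuff.len(), s.external_nr())
}
```

This works because `number` is owned outside the struct; it stops working once the borrowed data is owned by the same struct that stores the reference.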

It turns out I somehow had an error in my own local test project that wasn't present in the example I provided here. The ExternalRef::derived() and ExternalDerivedRef::new(...) functions looked like this:

pub fn derived(&self) -> ExternalDerivedRef {
    ExternalDerivedRef::new(self.clone())
}

fn new(ext: ExternalRef<'a>) -> Self {
    Self { ext: ext.clone() }
}

This obviously can't work. Thankfully, I double-checked the actual library I'm using, and (unsurprisingly) it uses the correct version that's actually useful. With that fixed I did manage to get something functional based on your example, using Vec::leak() (which appears to be the crucial puzzle piece that I missed - thank you both!).

Now I'm trying to push this a bit further, and cobble together a static variable that can grow on demand (i.e. doesn't have to pre-load everything at once). I have something that compiles and seems to work, but it requires a line of unsafe, and, well, it wouldn't surprise me if it turns out that I'm doing something very silly :grinning_face_with_smiling_eyes:

Libraries used:

  • parking_lot: Accessing a value behind an ordinary Mutex requires going through a MutexGuard, which, as far as I can tell, "eats" the 'static lifetime. parking_lot::Mutex allows (unsafely) accessing the contained data directly.
    • If I ever do get to actual multithreading, reads would likely (far) outstrip writes, so technically a RwLock would make more sense, but alas, FrozenVec is not Sync, and I have not found a data structure that is and that also does everything else I want (they tend to use UnsafeCell under the hood).*
  • elsa: elsa::FrozenVec, for ergonomics mostly. It has a push_get method that returns a ref to the pushed data. It also allows the unsafe block to use &* instead of &mut *, which makes me feel slightly better. elsa is not available in the playground, so there I use Vec<Box<>> instead, which I think behaves similarly enough for these purposes.
use parking_lot::Mutex;
use elsa::FrozenVec;

static DATA: Mutex<FrozenVec<&'static [u8]>> = Mutex::new(FrozenVec::new());

#[derive(Default, Debug)]
struct Cache {
    ref_collection: Vec<ExternalDerivedRef<'static>>,
}

impl Cache {
    fn add_data(&mut self, data: Vec<u8>) {
        let _lock = DATA.lock();
        // Safety: we hold the lock, so I think this is fine? Also, this would be
        // the only place where new DATA gets push'ed
        let raw = unsafe { &*DATA.data_ptr() };
        let data_ref = raw.push_get(data.leak());
        self.ref_collection.push(ExternalRef::from_data(data_ref).derived());
    }
}

(full playground example, using Vec<Box<>> instead of FrozenVec)

*While finishing this post, I realize that the Vec<Box<>> version probably does support RwLock... Also, I can't get footnotes to work properly it seems :grinning_face_with_smiling_eyes:
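
For comparison, here is a std-only sketch that avoids the unsafe line. It swaps in a different technique from the code above: a plain std::sync::Mutex around a Vec of already-leaked slices, instead of parking_lot plus FrozenVec. It relies on the assumption that each buffer is leaked before it is stored, so the stored `&'static [u8]` is `Copy` and can be copied out of the guard instead of being borrowed through it. The function names are placeholders.

```rust
use std::sync::Mutex;

// Every slice stored here was leaked first, so it is `&'static` and `Copy`.
static DATA: Mutex<Vec<&'static [u8]>> = Mutex::new(Vec::new());

/// Leaks `data`, records it in the global list, and returns the leaked slice.
fn add_data(data: Vec<u8>) -> &'static [u8] {
    let leaked: &'static [u8] = data.leak();
    DATA.lock().unwrap().push(leaked);
    leaked
}

/// Copies the `index`-th reference out of the guard, so the result
/// does not depend on how long the lock is held.
fn get_data(index: usize) -> Option<&'static [u8]> {
    DATA.lock().unwrap().get(index).copied()
}

/// Number of buffers registered so far.
fn loaded_count() -> usize {
    DATA.lock().unwrap().len()
}
```

The Vec may reallocate when it grows, but that only moves the stored references, not the leaked bytes they point to. As with Vec::leak() in general, nothing is ever freed.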