Confusion with Lifetimes of Struct-Embedded References

I'm trying to understand the borrow checker errors that I'm getting with this code, reproduced below for convenience:

use std::marker::PhantomData;

#[derive(Debug, Default)]
struct Token {}

#[derive(Debug)]
struct DataHolder<'a> {
    data: String,
    ref_holder: PhantomData<&'a Token>
}

#[derive(Debug)]
struct DataSource {
    string_vec: Vec<String>,
    token: Token
}
impl DataSource {
    pub fn new() -> DataSource {
        DataSource {
            string_vec: Vec::new(),
            token: Token::default()
        }
    }
    pub fn insert_element<'a>(&'a mut self, str: &str) -> DataHolder<'a> {
        self.string_vec.push(str.to_owned());
        DataHolder {
            data: str.to_owned(),
            ref_holder: PhantomData::default()
        }
    }
}

fn main() {
    let mut src = DataSource::new();
    let handle1 = src.insert_element("abc");
    let handle2 = src.insert_element("def");
    println!("{:?}", src);
    println!("{:?}", handle1);
    println!("{:?}", handle2);
}

I am getting two errors from the above code snippet: error[E0499]: cannot borrow `src` as mutable more than once at a time --> src/main.rs:36:19 and error[E0502]: cannot borrow `src` as immutable because it is also borrowed as mutable --> src/main.rs:37:22.

My goal with the Token type and fields using it is to statically guarantee that the DataHolder does not outlive the DataSource (the actual code I extracted this MCVE from uses a SlotMap in the equivalent of the DataSource, with SlotMap keys stored in the DataHolder). If this approach is infeasible, what would be the recommended way of accomplishing this instead?

The issue here is that your signature for DataSource::insert_element

fn insert_element<'a>(&'a mut self, s: &str) -> DataHolder<'a>

(which, by lifetime elision, is equivalent to (&mut self, &str) -> DataHolder<'_>) ties the lifetime parameter of the returned DataHolder to the mutable, i.e. exclusive, borrow of self. Because this lifetime originates from an exclusive borrow, you cannot do anything that might invalidate that borrow between its creation

        let handle1 = src.insert_element("abc");

and its later use

        println!("{:?}", handle1);

and borrowing src again

        let handle2 = src.insert_element("def");

violates that condition. See here for more on this kind of problem.

Now, the reason that insert_element needs &mut self is to push the given string onto its string_vec. One way to work around this is with interior mutability: store string_vec: RefCell<Vec<String>> and add .borrow_mut() before the push. This allows insert_element to take just &self, and then your code compiles. That's not a great solution, because RefCell::borrow_mut will panic if the value is already borrowed at that point (there's a fallible version, try_borrow_mut, but how are you going to handle the error?). An alternative would be to use Cell<Vec<String>> and swap an empty Vec into place as a sentinel while you're doing the modification in insert_element.
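
Here is a minimal sketch of the RefCell workaround applied to the types from the question. The parameter name s and the CellSource struct are illustrative choices of mine, not part of the original code; CellSource is only there to show roughly what the Cell alternative could look like.

```rust
use std::cell::{Cell, RefCell};
use std::marker::PhantomData;

#[derive(Debug, Default)]
struct Token {}

#[allow(dead_code)]
#[derive(Debug)]
struct DataHolder<'a> {
    data: String,
    ref_holder: PhantomData<&'a Token>,
}

#[allow(dead_code)]
#[derive(Debug)]
struct DataSource {
    // Interior mutability: the Vec can be modified through a shared reference.
    string_vec: RefCell<Vec<String>>,
    token: Token,
}

impl DataSource {
    pub fn new() -> DataSource {
        DataSource {
            string_vec: RefCell::new(Vec::new()),
            token: Token::default(),
        }
    }

    // Takes &self, so the returned DataHolder only holds a shared borrow of the source.
    pub fn insert_element<'a>(&'a self, s: &str) -> DataHolder<'a> {
        // borrow_mut panics if string_vec is already borrowed;
        // the guard is released at the end of this statement.
        self.string_vec.borrow_mut().push(s.to_owned());
        DataHolder {
            data: s.to_owned(),
            ref_holder: PhantomData,
        }
    }
}

// Hypothetical variant illustrating the Cell alternative.
struct CellSource {
    string_vec: Cell<Vec<String>>,
}

impl CellSource {
    pub fn insert_element(&self, s: &str) {
        // take() moves the Vec out and leaves an empty Vec behind as the sentinel.
        let mut v = self.string_vec.take();
        v.push(s.to_owned());
        self.string_vec.set(v);
    }
}

fn main() {
    let src = DataSource::new();
    let handle1 = src.insert_element("abc");
    let handle2 = src.insert_element("def");
    println!("{:?}", src);
    println!("{:?}", handle1);
    println!("{:?}", handle2);

    let cell_src = CellSource { string_vec: Cell::new(Vec::new()) };
    cell_src.insert_element("ghi");
    println!("{:?}", cell_src.string_vec.take());
}
```

Both handles still carry the lifetime of the borrow of src, so they cannot outlive it, but because that borrow is now shared, the calls and the println! of src can be interleaved freely.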

Thanks for your reply. Is there any way to change the function signature of insert_element while preserving the exclusivity of the &mut self call, or would I have to use &self and interior mutability (as you suggested) to make this work?

As long as you have &'a mut self in the arguments and DataHolder<'a> for the return type, you will not be able to interleave calls to insert_element and accesses in the way your sample code does.

Keep in mind that the borrow checker checks against interfaces, not against what the code does.

Even though the PhantomData reference is not real, and invalidating it would not cause unsafety, the lifetimes you've declared protect against a very real bug that could otherwise happen:

pub fn insert<'a>(&'a mut self) -> &'a str {
    self.string = String::from("hello"); // assignment drops previous value
    &self.string
}

If you call it once, then it's perfectly safe:

let s = d.insert();
println!("{}", s);

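but if you were allowed to call it twice while the first result is still in use, it would be a use-after-free vulnerability (this is exactly what the borrow checker rejects):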

let s1 = d.insert();
let s2 = d.insert(); // destroyed s1 and replaced it with s2
println!("{}", s1); // UAF bug
println!("{}", s2);

So &'a mut self intentionally freezes the whole object for the entire duration of the 'a lifetime in order to prevent anything touching 'a from being invalidated.
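
For contrast, here is a small sketch of the same method used in a way the borrow checker accepts. The Data struct is a hypothetical stand-in for whatever type owns the string; the only point is that the second call happens after the last use of the first result.

```rust
// Hypothetical owner type for the `insert` method discussed above.
struct Data {
    string: String,
}

impl Data {
    pub fn insert<'a>(&'a mut self) -> &'a str {
        self.string = String::from("hello"); // assignment drops previous value
        &self.string
    }
}

fn main() {
    let mut d = Data { string: String::new() };

    let s1 = d.insert();
    println!("{}", s1); // last use of s1, so the exclusive borrow ends here

    let s2 = d.insert(); // fine: nothing derived from the first borrow is still alive
    println!("{}", s2);
}
```

Moving the first println! below the second insert call brings back error E0499, because s1 would then keep the first exclusive borrow alive across the second one.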

If you don't intend to invalidate the references, then use &'a self instead. mut is not for mutation, but for exclusivity. You can mutate via a shared &, e.g. if you use an arena.

Would you mind elaborating on how arenas have interior mutability? I'm using SlotMap for the code that I extracted my example from, and SlotMap::get_mut (and the other methods for getting mutable references to the items stored in the SlotMap) requires a mutable borrow of the SlotMap in order to provide a mutable reference to the items within it.

Append-only arenas are able to give out mutable references to newly added elements from a shared borrow, because they never remove or move an element, so every reference they give out stays valid for the entire lifetime of the arena.

The implementation of arenas has to internally use unsafe, because the borrow checker doesn't understand this concept (example)
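
To make the idea concrete, here is a rough sketch of what a very simple append-only arena could look like. Everything in it (the Arena type, its items field, the alloc method) is made up for illustration and is not how any particular arena crate is implemented. It assumes each value gets its own heap allocation, is never moved or freed before the arena is dropped, and is only ever handed out once.

```rust
use std::cell::RefCell;

// Hypothetical append-only arena. Each value lives in its own heap allocation,
// so its address stays stable even when the Vec of pointers reallocates.
struct Arena<T> {
    items: RefCell<Vec<*mut T>>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { items: RefCell::new(Vec::new()) }
    }

    // Takes &self, yet returns &mut T: the returned borrow lives as long as
    // the shared borrow of the arena.
    fn alloc(&self, value: T) -> &mut T {
        let ptr = Box::into_raw(Box::new(value));
        self.items.borrow_mut().push(ptr);
        // SAFETY (assumptions of this sketch): ptr comes from a fresh Box, it is
        // handed out exactly once, and it is only freed in Drop, which cannot run
        // while any borrow of the arena is still alive.
        unsafe { &mut *ptr }
    }
}

impl<T> Drop for Arena<T> {
    fn drop(&mut self) {
        for ptr in self.items.get_mut().drain(..) {
            // SAFETY: every pointer was created by Box::into_raw in alloc
            // and is freed exactly once here.
            unsafe { drop(Box::from_raw(ptr)) };
        }
    }
}

fn main() {
    let arena = Arena::new();
    let a = arena.alloc(String::from("abc"));
    let b = arena.alloc(String::from("def")); // a is still usable
    a.push_str("!");
    b.push_str("?");
    println!("{} {}", a, b);
}
```

Note that this only works because there is no way to get at an element a second time; something like SlotMap::get_mut, which looks up an existing element by key, could hand out two mutable references to the same item if it took &self, which is why it needs &mut self.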
