Help in understanding lifetime errors

Hello, I am a beginner trying to learn Rust by building a simple interpreted programming language. While programming blocks for the language, I ran into this lifetime error which I have trouble understanding.

So here is the code for the Block struct:

#[derive(Debug)]
pub struct Block<'a> {
    pub stmts: Vec<Stmt>,
    //The list of variables in the scope of the current block
    pub vars: HashMap<String, Literal>,
    pub parent: Option<Box<&'a mut Block<'a>>>,
}

impl<'a> Block<'a> {
    pub fn new(stmts: Vec<Stmt>, parent: Option<&'a mut Block<'a>>) -> Self {
        let parent = parent.map(Box::new);
        Self {
            stmts,
            vars: HashMap::new(),
            parent,
        }
    }

    pub fn execute(&mut self, print_expr_result: bool) {
        //execute all statements in the block
        let stmts = self.stmts.clone();
        for stmt in stmts.iter() {
            self.execute_stmt(stmt, print_expr_result);
        }
    }

    fn execute_stmt(&mut self, stmt: &Stmt, print_expr_result: bool) {
        //Execute a single statement
        match stmt {

            //Other statements match, some of them require &mut self
            //...

            Stmt::Block(stmts) => {
                let mut block = Block::new(stmts.clone(), Some(self));
                block.execute(print_expr_result);
            }
        }
    }

    //Other methods
    //...

}

Trying to compile this code gives the following error:

I am struggling to understand why the &mut self doesn't outlive the block's lifetime. Wouldn't the block go out of scope before the reference?

fn execute_stmt(&mut self, stmt: &Stmt, print_expr_result: bool) {
    match stmt {

        //...

        Stmt::Block(stmts) => {
            let mut block = Block::new(stmts.clone(), Some(self)); //<---- Block created here
            block.execute(print_expr_result);
        } //<---- Block goes out of scope here
    }
}//<---- &mut self goes out of scope here

At this point I am almost certain I need to use RefCell or Rc to fix this, but any help in understanding why this causes an error would be appreciated!

3 Likes

Any time you see &'x mut SomeType<'x> (specifically, a mutable borrow with a lifetime also used by the type being borrowed), that's a decent sign you're about to be in a world of hurt.

Let's go through this. The argument for Block::new is Option<&'a mut Block<'a>>, so the lifetime of the borrow must match the lifetime parameter of Block. That implies that '1 (the compiler's label for the otherwise unnamed lifetime of the &mut self borrow in execute_stmt) is 'a. But the signature of execute_stmt doesn't say this; it doesn't constrain '1 in any way whatsoever. It's not about what makes intuitive sense for this block, it's about what you're requiring the compiler to prove, and what you've required it to prove can't be done. I don't think changing to &'a mut self will help, because I suspect that will cause calling execute_stmt to permanently borrow self inside of execute, which will break the for loop.

Tying the lifetimes of multiple, unrelated mutable borrows together is just asking for problems. I'm honestly not sure how you'd make this work in practice with mutable borrowing. If blocks need to hold on to their parent, then that sounds like shared access and, yes, you'll probably need something like Rc<RefCell<Block>>.

But that, in itself, is a code smell in Rust. Sometimes necessary, but often worth avoiding by changing the design. I've had cases in the past where I needed to pass down some kind of "fallback" chain for things like lookups, and that was done by passing that reference down on a per-call basis, not by trying to maintain it in something like Block directly.

*Looks at example code again.* Oh, that is what you're doing. Should pay more attention. :stuck_out_tongue:

In short: Rust loves for things to be owned in trees, and to never, ever use links going back up the tree. Design the structures with that in mind, and you can mostly avoid problems with the borrow checker. I'd change execute/execute_stmt to take parent: Option<&mut Block> as an argument and ditch the parent field entirely.

1 Like

I didn't look into it, but this type looks highly suspicious. The type &'a mut Block<'a> will almost certainly cause problems, see:

3 Likes

use std::collections::HashMap;

#[derive(Debug)]
pub struct Block<'a> {
    pub vars: HashMap<String, usize>,
    pub parent: Option<Box<&'a mut Block<'a>>>,
}

impl<'a> Block<'a> {
    pub fn new(parent: Option<&'a mut Block<'a>>) -> Self {
        let parent = parent.map(Box::new);
        Self {
            vars: HashMap::new(),
            parent,
        }
    }

    pub fn execute(&mut self, print_expr_result: bool) {
        //todo...
    }

    fn execute_stmt<'b>(&'b mut self, print_expr_result: bool) where 'b: 'a {
        let mut block = Block::new(Some(self));
        block.execute(print_expr_result);
    }
}

fn main() {
    let mut block = Block {
        vars: Default::default(),
        parent: None,
    };

    let mut block_parent = Block {
        vars: Default::default(),
        parent: None,
    };
    block.parent = Some(Box::new(&mut block_parent));

    println!("{:?}", block);
}

Thanks for the detailed explanation! It seems I have misunderstood lifetimes a bit :sweat_smile:. I thought the child block's 'a is independent of the parent's 'a and all it did was make sure that the parent's reference (&'a mut Block<'a>) lived at least as long as the child block. So does it instead mean that the child block will live as long as the parent and keep the reference forever?

In short: Rust loves for things to be owned in trees, and to never, ever use links going back up the tree. Design the structures with that in mind, and you can mostly avoid problems with the borrow checker. I'd change execute/execute_stmt to take parent: Option<&mut Block> as an argument and ditch the parent field entirely.

I did consider doing that, but a parent can also have its own parent. For example I omitted these methods above:

impl<'a> Block<'a> {

    //...

    pub fn get_var(&self, name: &str) -> Option<&Literal> {
        if self.vars.contains_key(name) {
            self.vars.get(name)
        } else if let Some(ref parent) = self.parent {
            parent.get_var(name)
        } else {
            None
        }
    }

    //Reassign a variable in the block's scope
    //If not found, check the parent scope and reassign
    //Return true if the variable was found and modified
    pub fn insert_if_exists(&mut self, name: &str, value: Literal) -> bool {
        if self.vars.contains_key(name) {
            self.vars.insert(name.to_owned(), value);
            true
        } else if let Some(ref mut parent) = self.parent {
            parent.insert_if_exists(name, value)
        } else {
            false
        }
    }
}

So if a block doesn't have a variable, it asks its parent, which can in turn ask its own parent, and so on. If the parent is passed as an argument instead, the parent may not be able to provide its own parent's variables to its child.

This is tricky to explain because 'a is being used in multiple contexts. In other circumstances, the 'a in the Block::new call would be different to the 'a that's coming from the implementation of Block::execute_stmt. The problem here is that because of the specific code you've written, you have unintentionally forced them to be the same.

And, yes, in this specific case the moment self.execute_stmt is called, self would probably get borrowed forever. I say "probably" because in matters such as these, I only trust the compiler, never myself. :slight_smile:

Aah yes, a classic case of Object-Oriented-Brain. Don't worry, there are treatments available that can correct the considerable damage it causes.

All jokes aside: what makes you think Block has to be the one to implement get_var? That's where it would go in an OOP design, but Rust isn't really an OOP language. You can have a free-standing get_var(name: &str, search: &BlockChain) -> Option<&Literal> that implements the traversal logic directly. This does mean you can't just do a lookup directly via a Block without going through its inheritance chain, but that's part of designing for Rust.
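
A minimal sketch of the free-standing lookup described above. It assumes BlockChain is simply a stack of scopes with the innermost last; that layout, the Literal stand-in, and the helper methods push_scope, pop_scope and declare are all hypothetical, since the thread does not define them.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the interpreter's value type.
#[derive(Debug, Clone, PartialEq)]
pub enum Literal {
    Int(i64),
}

// One scope's variables; no parent pointer is stored anywhere.
type Scope = HashMap<String, Literal>;

// Assumed representation of `BlockChain`: all live scopes, outermost first.
pub struct BlockChain {
    scopes: Vec<Scope>,
}

impl BlockChain {
    // Start with a single global scope.
    pub fn new() -> Self {
        Self { scopes: vec![Scope::new()] }
    }

    // Called when the interpreter enters a nested block.
    pub fn push_scope(&mut self) {
        self.scopes.push(Scope::new());
    }

    // Called when the interpreter leaves a nested block.
    pub fn pop_scope(&mut self) {
        self.scopes.pop();
    }

    // Declare a variable in the innermost scope.
    pub fn declare(&mut self, name: &str, value: Literal) {
        if let Some(scope) = self.scopes.last_mut() {
            scope.insert(name.to_owned(), value);
        }
    }
}

// Free-standing lookup: walk from the innermost scope outwards.
pub fn get_var<'c>(name: &str, search: &'c BlockChain) -> Option<&'c Literal> {
    search.scopes.iter().rev().find_map(|scope| scope.get(name))
}

// Reassign the nearest existing binding; returns true if one was found.
pub fn insert_if_exists(name: &str, value: Literal, search: &mut BlockChain) -> bool {
    for scope in search.scopes.iter_mut().rev() {
        if let Some(slot) = scope.get_mut(name) {
            *slot = value;
            return true;
        }
    }
    false
}

fn main() {
    let mut chain = BlockChain::new();
    chain.declare("x", Literal::Int(1));
    chain.push_scope(); // enter a nested block
    assert_eq!(get_var("x", &chain), Some(&Literal::Int(1)));
    assert!(insert_if_exists("x", Literal::Int(2), &mut chain));
    chain.pop_scope(); // leave the nested block
    assert_eq!(get_var("x", &chain), Some(&Literal::Int(2)));
}
```

Because the chain owns every scope, nothing here borrows a parent, and execute can hold a single &mut BlockChain for the whole run.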

If you really need to track the parent in Block, you can restructure the code to have the parent field be Option<BlockId>, then keep all blocks in a central BlockId -> Block map of some kind, then pass a reference to that map into get_var.
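
One way the BlockId idea could look, as a sketch only. The Blocks store (a Vec indexed by BlockId), its method names, and the Literal stand-in are assumptions made for illustration; here the lookup functions are methods on the store rather than free functions taking a reference to it, which amounts to the same thing.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the interpreter's value type.
#[derive(Debug, Clone, PartialEq)]
pub enum Literal {
    Int(i64),
}

// Index into the central store; plain data, so no lifetime is involved.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlockId(usize);

#[derive(Debug)]
pub struct Block {
    pub vars: HashMap<String, Literal>,
    pub parent: Option<BlockId>,
}

// Assumed central BlockId -> Block map, backed by a Vec.
#[derive(Debug, Default)]
pub struct Blocks {
    store: Vec<Block>,
}

impl Blocks {
    // Create a block and return its id.
    pub fn add(&mut self, parent: Option<BlockId>) -> BlockId {
        self.store.push(Block { vars: HashMap::new(), parent });
        BlockId(self.store.len() - 1)
    }

    // Declare a variable directly in the given block.
    pub fn declare(&mut self, id: BlockId, name: &str, value: Literal) {
        self.store[id.0].vars.insert(name.to_owned(), value);
    }

    // Walk up the parent ids until the name is found or the chain ends.
    pub fn get_var(&self, mut id: BlockId, name: &str) -> Option<&Literal> {
        loop {
            let block = &self.store[id.0];
            if let Some(value) = block.vars.get(name) {
                return Some(value);
            }
            id = block.parent?;
        }
    }

    // Reassign the nearest existing binding; returns true if one was found.
    pub fn insert_if_exists(&mut self, mut id: BlockId, name: &str, value: Literal) -> bool {
        loop {
            let block = &mut self.store[id.0];
            if let Some(slot) = block.vars.get_mut(name) {
                *slot = value;
                return true;
            }
            match block.parent {
                Some(parent) => id = parent,
                None => return false,
            }
        }
    }
}

fn main() {
    let mut blocks = Blocks::default();
    let root = blocks.add(None);
    let child = blocks.add(Some(root));
    blocks.declare(root, "x", Literal::Int(1));
    assert!(blocks.insert_if_exists(child, "x", Literal::Int(2)));
    assert_eq!(blocks.get_var(child, "x"), Some(&Literal::Int(2)));
}
```

The parent link is just a number, so the borrow checker only ever sees one borrow of the store at a time.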

Or, yes, just accept the performance cost and irritation of Rc<RefCell<Block>> / Weak<RefCell<Block>> (remember that Rust doesn't have a tracing GC!) everywhere in exchange for direct parent pointers.
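
For comparison, a rough sketch of the Rc<RefCell<Block>> route with direct parent pointers. The Literal stand-in and the constructor shape are assumptions. Note that get_var returns a clone rather than a reference in this sketch, because a reference cannot outlive the temporary RefCell borrow guard.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical stand-in for the interpreter's value type.
#[derive(Debug, Clone, PartialEq)]
pub enum Literal {
    Int(i64),
}

#[derive(Debug)]
pub struct Block {
    pub vars: HashMap<String, Literal>,
    // A child keeps its parent alive through a shared, strong handle.
    pub parent: Option<Rc<RefCell<Block>>>,
}

impl Block {
    // Blocks are always created behind Rc<RefCell<..>> so they can be shared.
    pub fn new(parent: Option<Rc<RefCell<Block>>>) -> Rc<RefCell<Block>> {
        Rc::new(RefCell::new(Block { vars: HashMap::new(), parent }))
    }

    // Look the name up locally, then in the parent chain; returns a clone.
    pub fn get_var(&self, name: &str) -> Option<Literal> {
        match self.vars.get(name) {
            Some(value) => Some(value.clone()),
            None => self.parent.as_ref()?.borrow().get_var(name),
        }
    }

    // Reassign the nearest existing binding; returns true if one was found.
    pub fn insert_if_exists(&mut self, name: &str, value: Literal) -> bool {
        if let Some(slot) = self.vars.get_mut(name) {
            *slot = value;
            true
        } else if let Some(parent) = &self.parent {
            // Panics at runtime if the parent is already borrowed elsewhere.
            parent.borrow_mut().insert_if_exists(name, value)
        } else {
            false
        }
    }
}

fn main() {
    let root = Block::new(None);
    root.borrow_mut().vars.insert("x".to_owned(), Literal::Int(1));
    let child = Block::new(Some(Rc::clone(&root)));
    assert!(child.borrow_mut().insert_if_exists("x", Literal::Int(2)));
    assert_eq!(child.borrow().get_var("x"), Some(Literal::Int(2)));
}
```

Only child-to-parent links exist here, so there is no reference cycle; Weak would matter once a parent also stores handles to its children.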

Another way to look at it: in OOP, you often design code bottom-up based on viewing the universe from the perspective of individual objects. In Rust, it can be much easier to design code top-down based on viewing the universe from the perspective of the system as a whole.

I'm not 100% happy with this reply, but it's 2:30 am and I need to get to bed. :slight_smile:

1 Like

I didn't attempt anything beyond compilation, but sometimes it's possible to hide an arbitrary chain of lifetimes behind type erasure (here, dyn BlockStuff):

pub struct Block<'a> {
    pub stmts: Vec<Stmt>,
    pub vars: HashMap<String, Literal>,
    pub parent: Option<Box<&'a mut dyn BlockStuff>>,
}

pub trait BlockStuff: fmt::Debug {
    fn execute(&mut self, print_expr_result: bool);
    fn execute_stmt(&mut self, stmt: &Stmt, print_expr_result: bool);
    fn get_var(&self, name: &str) -> Option<&Literal>;
    fn insert_if_exists(&mut self, name: &str, value: Literal) -> bool;
}

impl BlockStuff for Block<'_> {
    // ...
}

impl<'a> Block<'a> {
    // This one stays an inherent method
    pub fn new(stmts: Vec<Stmt>, parent: Option<&'a mut dyn BlockStuff>) -> Self {
        // ...
    }
}
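
A filled-in variant of the outline above, as a sketch under several assumptions: Stmt and Literal are reduced, hypothetical stand-ins, the assignment semantics (reassign the nearest binding, otherwise declare locally) are invented for the demo, execute_stmt is folded into execute, and the Box around the reference is dropped because it is not needed for this to compile.

```rust
use std::collections::HashMap;
use std::fmt;

// Hypothetical stand-ins for the interpreter's own types.
#[derive(Debug, Clone, PartialEq)]
pub enum Literal {
    Int(i64),
}

#[derive(Debug, Clone)]
pub enum Stmt {
    Assign(String, Literal),
    Block(Vec<Stmt>),
}

pub trait BlockStuff: fmt::Debug {
    fn execute(&mut self);
    fn get_var(&self, name: &str) -> Option<&Literal>;
    fn insert_if_exists(&mut self, name: &str, value: Literal) -> bool;
}

#[derive(Debug)]
pub struct Block<'a> {
    pub stmts: Vec<Stmt>,
    pub vars: HashMap<String, Literal>,
    // The parent's own lifetime parameter is erased behind `dyn`.
    pub parent: Option<&'a mut dyn BlockStuff>,
}

impl<'a> Block<'a> {
    pub fn new(stmts: Vec<Stmt>, parent: Option<&'a mut dyn BlockStuff>) -> Self {
        Self { stmts, vars: HashMap::new(), parent }
    }
}

impl BlockStuff for Block<'_> {
    fn execute(&mut self) {
        for stmt in self.stmts.clone() {
            match stmt {
                // Assumed semantics: reassign if bound anywhere, else declare here.
                Stmt::Assign(name, value) => {
                    if !self.insert_if_exists(&name, value.clone()) {
                        self.vars.insert(name, value);
                    }
                }
                Stmt::Block(stmts) => {
                    // A short reborrow of `self`, coerced to a trait object.
                    let parent: &mut dyn BlockStuff = &mut *self;
                    let mut child = Block::new(stmts, Some(parent));
                    child.execute();
                }
            }
        }
    }

    fn get_var(&self, name: &str) -> Option<&Literal> {
        match self.vars.get(name) {
            Some(value) => Some(value),
            None => self.parent.as_ref()?.get_var(name),
        }
    }

    fn insert_if_exists(&mut self, name: &str, value: Literal) -> bool {
        if let Some(slot) = self.vars.get_mut(name) {
            *slot = value;
            true
        } else if let Some(parent) = self.parent.as_mut() {
            parent.insert_if_exists(name, value)
        } else {
            false
        }
    }
}

fn main() {
    let program = vec![
        Stmt::Assign("x".to_owned(), Literal::Int(1)),
        Stmt::Block(vec![Stmt::Assign("x".to_owned(), Literal::Int(2))]),
    ];
    let mut root = Block::new(program, None);
    root.execute();
    assert_eq!(root.get_var("x"), Some(&Literal::Int(2)));
}
```

The child only needs the parent for the duration of one nested execute call, so each nested block borrows its parent briefly and releases it when it is dropped.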

1 Like

&'a mut Block<'a> is probably the problem, try changing it to &'a mut Block<'b>. The reason it is problematic is that you are (essentially) saying that the reference to the block must live as long as the block itself.

And because the data is self, then the reference must also live as long as self lives, which it clearly doesn't, so you want two independent lifetimes. In most cases, a reference to some data will live shorter than that data.

I think this maybe could work with shared references because of "lifetime shortening", but I can't find a link to the article explaining it at the moment.

In the type &'a mut Block<'a>, the lifetime variable 'a certainly denotes the same lifetime in both positions. I'm not sure why you would expect them to be independent. If you declared a variable and referred to it by its name in two places, would you also expect the two accesses to refer to different variables? The whole point of giving things a name is to refer to them unambiguously; suddenly breaking the consistency of notation would be a very bad surprise and nothing would ever work correctly.

Anyway, what comes into play here is invariance. Normally, behind a non-mutable reference, &'a Block<'a> is OK because immutable references are covariant in their referent type: a &'_ T<'long> can be converted to &'_ T<'short>.

But with mutable references, it's different: converting a &'_ mut T<'long> into a &'_ mut T<'short> would be very obviously unsound, because you could then write a short-lived object through the reference while the owner of the referred value still expects a long-lived object, and that is an instant use-after-free (UaF) bug. Thus, &'a mut Block<'a> only ever denotes the single, exact type &'a mut Block<'a>, and never &'a mut Block<'shorter_than_a>, which means that you are asking the Block to be borrowed for precisely the entirety of its valid lifetime, effectively rendering it useless.
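
To illustrate the covariant half of this, here is a small sketch of a read-only scope chain that uses &'a Scope<'a> for the parent. Scope, its i64 values, and lookup_through_three_levels are hypothetical names for the demo. It builds three nested levels because each longer-lived parent can be viewed through a shorter 'a; the &mut counterpart cannot be shortened in this way.

```rust
use std::collections::HashMap;

// Read-only scope chain: the parent is held by shared reference.
pub struct Scope<'a> {
    pub vars: HashMap<String, i64>,
    pub parent: Option<&'a Scope<'a>>,
}

impl<'a> Scope<'a> {
    // Look the name up locally, then walk up the parent chain.
    pub fn get_var(&self, name: &str) -> Option<i64> {
        match self.vars.get(name) {
            Some(value) => Some(*value),
            None => self.parent?.get_var(name),
        }
    }
}

// Builds root -> middle -> inner and looks "x" up from the innermost scope.
pub fn lookup_through_three_levels() -> Option<i64> {
    let mut root = Scope { vars: HashMap::new(), parent: None };
    root.vars.insert("x".to_owned(), 7);

    // `&root` is a `&Scope<'long>`; covariance lets it stand in for the
    // shorter-lived `&'a Scope<'a>` that each child expects.
    let middle = Scope { vars: HashMap::new(), parent: Some(&root) };
    let inner = Scope { vars: HashMap::new(), parent: Some(&middle) };

    inner.get_var("x")
}

fn main() {
    assert_eq!(lookup_through_three_levels(), Some(7));
}
```

This only helps when children never need to mutate their ancestors, which is not the case for insert_if_exists in the original question.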

1 Like
