Beginner question - cannot borrow as mutable more than once

Hi all, I'm new to Rust and coming from Python; I've never written any C or C++ before. (Yes, it's really hard for me, but I'm slowly progressing.)

Anyway, for learning purposes I started implementing a simple language interpreter while reading the Orendorff/Blandy book, but I keep running into a pattern that I don't know how to handle in Rust:

use crate::ast;
use crate::ast::Program;
use crate::lexer::Lexer;
use crate::token::Token;

struct Parser<'a>{
    l: Lexer<'a>,
    cur_token: Token<'a>,
    peek_token: Token<'a>,
}

impl <'a>Parser<'a> {
    fn new(l: Lexer<'a>) -> Parser<'a> {
        let mut p:Parser = Parser {
            l,
            cur_token: Token::new(),
            peek_token: Token::new(),
        };
        p.next_token();
        p.next_token();
        p

    }
    fn next_token(&'a mut self) {
        self.cur_token = self.peek_token.clone();
        self.peek_token = self.l.next_token();
    }
    fn parse_program() -> ast::Program<'a> {
        todo!()
    }
}

In the Parser::new method, I can't call p.next_token() more than once and then return p. I'm getting:

cannot borrow p as mutable more than once at a time

What is the best way to deal with such a pattern (calling a &mut self method more than once and then returning the object) in idiomatic Rust?

You basically never want the struct lifetime on a &mut self or &self. Doing that borrows the struct forever, because a lifetime annotated on a struct is always larger than or equal to the struct's lifetime. Hence your error. The first call to .next_token() borrows p until p is destroyed, but the second call happens before p is destroyed, so you have overlapping mutable borrows.
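As a minimal sketch of that point: the types below are hypothetical, stripped-down stand-ins for the Lexer and Parser in the question (tokens are reduced to whitespace-separated `&str` words, and the names `next_word` and `advance` are made up for illustration). The only thing that matters is that `advance` takes a plain `&mut self`, so each borrow of `p` ends when the call returns.

```rust
// Hypothetical stand-in for the question's Lexer: it only splits on whitespace.
struct Lexer<'a> {
    input: &'a str,
    pos: usize,
}

impl<'a> Lexer<'a> {
    fn new(input: &'a str) -> Self {
        Lexer { input, pos: 0 }
    }

    // Returns the next whitespace-separated word, or "" at end of input.
    // The result borrows from the input text ('a), not from `&mut self`.
    fn next_word(&mut self) -> &'a str {
        let input: &'a str = self.input;
        let trimmed = input[self.pos..].trim_start();
        let start = input.len() - trimmed.len();
        let len = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
        self.pos = start + len;
        &input[start..start + len]
    }
}

// Hypothetical stand-in for the question's Parser.
struct Parser<'a> {
    lexer: Lexer<'a>,
    cur: &'a str,
    peek: &'a str,
}

impl<'a> Parser<'a> {
    fn new(lexer: Lexer<'a>) -> Self {
        let mut p = Parser { lexer, cur: "", peek: "" };
        p.advance(); // this mutable borrow ends when the call returns
        p.advance(); // so a second call is accepted
        p
    }

    // Plain `&mut self`: the borrow lasts only for the duration of the call.
    // Writing `&'a mut self` here would keep `p` borrowed for all of 'a.
    fn advance(&mut self) {
        self.cur = self.peek;
        self.peek = self.lexer.next_word();
    }
}

fn main() {
    let p = Parser::new(Lexer::new("let five = 5;"));
    println!("{} {}", p.cur, p.peek);
}
```

Changing `advance` to take `&'a mut self` should bring back the "cannot borrow as mutable more than once" error in `new`.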

When you write <'a> on a struct, you are telling the compiler "This struct contains references to a value stored somewhere outside this struct, and you should use this lifetime to track that external location to ensure the reference stays valid". If those things marked with <'a> don't actually point to outside the struct, but to another field in the same struct, then you will run into unfixable lifetime issues.

That said, I can't tell if that is the case here, as you have not posted the definition of those other structs.
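To make the meaning of <'a> on a struct concrete, here is a small sketch. `Cursor` and `first_word` are hypothetical names, not anything from the posted code. The struct holds a reference to text owned by a variable outside it, and a value it hands out with lifetime 'a stays valid even after the struct itself is gone, because 'a tracks the outside owner.

```rust
// Hypothetical struct that borrows text owned somewhere else.
struct Cursor<'a> {
    text: &'a str,
}

impl<'a> Cursor<'a> {
    // The returned slice is tied to 'a (the external text), not to `&self`.
    fn first_word(&self) -> &'a str {
        let text: &'a str = self.text;
        text.split_whitespace().next().unwrap_or("")
    }
}

fn main() {
    // The owner of the data lives outside the struct.
    let source = String::from("let ten = 10;");
    let word;
    {
        let cursor = Cursor { text: &source };
        word = cursor.first_word();
    } // `cursor` is dropped here, but `word` still points into `source`
    println!("{}", word);
}
```

If `text` instead had to point at another field of the same struct, there would be no outside owner for 'a to track, which is the unfixable case described above.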

Here are Lexer and Token structs, I'm not sure if it is helpful:

pub struct Lexer<'a> {
    input: &'a str,
    position: usize,      // current position in input (points to current char)
    read_position: usize, // current reading position in input (after current char)
    ch: char,             // current char under examination
}

#[derive(Debug, Clone)]
pub struct Token<'a> {
    pub ttype: TokenType<'a>,
    pub literal: String,
}

#[derive(PartialEq, Eq, Debug, Clone)]
pub enum TokenType<'a> {
    ILLEGAL(&'a str),
    EOF(&'a str),
    // Identifiers + literals
    IDENT(&'a str), // add, foobar, x, y, ..
    INT(&'a str),   // 1343456
    // Operators
    ASSIGN(&'a str),
    EQ(&'a str),
    NOTEQ(&'a str),
    PLUS(&'a str),
    MINUS(&'a str),
    BANG(&'a str),
    ASTERISK(&'a str),
    SLASH(&'a str),
    LT(&'a str),
    GT(&'a str),
    // Delimiters
    COMMA(&'a str),
    SEMICOLON(&'a str),
    LPAREN(&'a str),
    RPAREN(&'a str),
    LBRACE(&'a str),
    RBRACE(&'a str),
    // Keywords
    FUNCTION(&'a str),
    LET(&'a str),
    TRUE(&'a str),
    FALSE(&'a str),
    IF(&'a str),
    ELSE(&'a str),
    RETURN(&'a str),
}

I'm sure I'm doing something extremely wrong here, because I still don't have a good in-depth understanding of lifetimes. I arrived at those lifetimes mainly by following the compiler's suggestions.

When you write <'a> on a struct, you are telling the compiler "This struct contains references to a value stored somewhere outside this struct, and you should use this lifetime to track that external location to ensure the reference stays valid".

So if I have a struct that contains another struct, does this mean that the value is stored outside the main struct?

Where is the data in input stored?

It comes from here. I decided to use &str instead of String, though I don't know if that is the best idea :)

    #[test]
    fn next_token() {
        let input = r#"let five = 5;
                            let ten = 10;
                            let add = fn(x, y) {
                            x + y;
                            };

                            let result = add(five, ten);
                            !-/*5;
                            5 < 10 > 5;

                            if (5 < 10) {
                                return true;
                            } else {
                                return false;
                            }

                            10 == 10;
                            10 != 9;
                            "#;

As long as it points into a variable outside of your struct, it is fine. Can you show your next_token method? What happens without the lifetime?

fn next_token(&mut self) {
    self.cur_token = self.peek_token.clone();
    self.peek_token = self.l.next_token();
}

Can you show your next_token method? What happens without the lifetime?

I'm getting:

error[E0495]: cannot infer an appropriate lifetime for autoref due to conflicting requirements
  --> src/parser.rs:26:34
   |
26 |         self.peek_token = self.l.next_token();
   |                                  ^^^^^^^^^^
   |
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the method body at 24:5...
  --> src/parser.rs:24:5
   |
24 | /     fn next_token(&mut self) {
25 | |         self.cur_token = self.peek_token.clone();
26 | |         self.peek_token = self.l.next_token();
27 | |     }
   | |_____^
note: ...so that reference does not outlive borrowed content
  --> src/parser.rs:26:27
   |
26 |         self.peek_token = self.l.next_token();
   |                           ^^^^^^
note: but, the lifetime must be valid for the lifetime `'a` as defined on the impl at 12:7...
  --> src/parser.rs:12:7
   |
12 | impl <'a>Parser<'a> {
   |       ^^
note: ...so that the expression is assignable
  --> src/parser.rs:26:27
   |
26 |         self.peek_token = self.l.next_token();
   |                           ^^^^^^^^^^^^^^^^^^^
   = note: expected  `token::Token<'a>`
              found  `token::Token<'_>`

error: aborting due to previous error

Here is the Lexer impl:

impl<'a> Lexer<'a> {
    pub fn new(input: &'a str) -> Self {
        let mut l = Lexer {
            input,
            position: 0,
            read_position: 0,
            ch: '\0',
        };
        l.read_char();
        l
    }

    // Read the current character
    fn read_char(&mut self) {
        if self.read_position >= self.input.chars().count() {
            self.ch = '\0';
        } else {
            self.ch = self
                .input
                .chars()
                .nth(self.read_position)
                .unwrap()
        }
        self.position = self.read_position;
        self.read_position += 1;
    }

    // Skip the whitespace characters in the input
    fn skip_whitespace(&mut self) {
        while self.ch == ' ' || self.ch == '\t' || self.ch == '\n' || self.ch == '\r' {
            self.read_char();
        }
    }

    // Main method which return the next token from the input.
    pub fn next_token(&mut self) -> Token {
        self.skip_whitespace(); // We need to skip the whitespace and the new lines from the input

        match self.ch {
            '=' => {
                if self.peek_char() == '=' {
                    let ch = self.ch;
                    self.read_char();
                    let mut tok = self.new_token(EQ, ch);
                    let mut new_literal = ch.to_string();
                    new_literal.push(self.ch);
                    tok.literal = new_literal;
                    self.read_char();
                    tok
                } else {
                    let tok = self.new_token(ASSIGN, self.ch);
                    self.read_char();
                    tok
                }
            }
            ';' => {
                let tok = self.new_token(SEMICOLON, self.ch);
                self.read_char();
                tok
            }
            '(' => {
                let tok = self.new_token(LPAREN, self.ch);
                self.read_char();
                tok
            }
            ')' => {
                let tok = self.new_token(RPAREN, self.ch);
                self.read_char();
                tok
            }
            ',' => {
                let tok = self.new_token(COMMA, self.ch);
                self.read_char();
                tok
            }
            '+' => {
                let tok = self.new_token(PLUS, self.ch);
                self.read_char();
                tok
            }
            '*' => {
                let tok = self.new_token(ASTERISK, self.ch);
                self.read_char();
                tok
            }
            '-' => {
                let tok = self.new_token(MINUS, self.ch);
                self.read_char();
                tok
            }
            '!' => {
                if self.peek_char() == '=' {
                    let ch = self.ch;
                    self.read_char();
                    let mut tok = self.new_token(NOTEQ, ch);
                    let mut new_literal = ch.to_string();
                    new_literal.push(self.ch);
                    tok.literal = new_literal;
                    self.read_char();
                    tok
                } else {
                    let tok = self.new_token(BANG, self.ch);
                    self.read_char();
                    tok
                }
            }
            '/' => {
                let tok = self.new_token(SLASH, self.ch);
                self.read_char();
                tok
            }
            '<' => {
                let tok = self.new_token(LT, self.ch);
                self.read_char();
                tok
            }
            '>' => {
                let tok = self.new_token(GT, self.ch);
                self.read_char();
                tok
            }
            '{' => {
                let tok = self.new_token(LBRACE, self.ch);
                self.read_char();
                tok
            }
            '}' => {
                let tok = self.new_token(RBRACE, self.ch);
                self.read_char();
                tok
            }
            // This happens when there is no more characters i.e. end of the input
            '\0' => Token {
                ttype: EOF,
                literal: String::from(""),
            },
            _ => {
                let mut tok = Token::new();

                if self.ch.is_alphabetic() {
                    // Some words are specific for the language(keywords) and we need to distinguish
                    // that from the identifiers chosen by the user(function names, variables, etc).
                    // We need to lookup every word if it matches any of the keywords
                    tok.literal = self.read_identifier();
                    tok.ttype = Token::lookup_ident(&tok.literal);
                    tok
                } else if self.ch.is_numeric() {
                    // Any consecutive digits(0-9) are matched as single INT token
                    tok.ttype = INT;
                    tok.literal = self.read_number();
                    tok
                } else {
                    // Map any unrecognizable char as illegal
                    let tok = self.new_token(ILLEGAL, self.ch);
                    self.read_char();
                    tok
                }
            }
        }
    }

    // Return a number if consecutive digits(0-9) are found
    fn read_number(&mut self) -> String {
        let position = self.position;

        while self.ch.is_numeric() {
            self.read_char()
        }

        let result = String::from(self.input);
        result[position..self.position].to_string()
    }

    // Underscore is also treated as a letter
    fn is_letter(&self, ch: char) -> bool {
        ch.is_ascii_alphabetic() || ch == '_'
    }

    // Read consecutive letters and return identifier
    fn read_identifier(&mut self) -> String {
        let position = self.position;

        while self.is_letter(self.ch) {
            self.read_char();
        }

        // Slice the input directly instead of copying the whole input first
        self.input[position..self.position].to_string()
    }

    fn new_token(&self, ttype: TokenType<'a>, ch: char) -> Token<'a> {
        Token {
            ttype,
            literal: ch.to_string(),
        }
    }
    fn peek_char(&self) -> char {
        if self.read_position >= self.input.chars().count() {
            '\0'
        } else {
            self.input.chars().nth(self.read_position).unwrap()
        }
    }
}

Sorry if this is too much code or confusing. I'm not looking for a specific fix for my case; I just want to know why it is happening and to learn valuable lessons from this.

Does it compile with Token<'a> as the return value of next_token? I can talk a bit about why it failed later this evening.

Yes, it works :)

I can talk a bit about why it failed later this evening.

Sure, no hurry.

So the returned Token<'a> will live as long as the Lexer and the Parser, because the Parser contains the Lexer and they have the same lifetime <'a>?

And maybe the main lesson is to never put the struct's lifetime on &self or &mut self...
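On the question of how long the returned Token<'a> lives, a small sketch may help. The `Token` and `Lexer` below are hypothetical, simplified versions of the posted ones, with a single `text` field and whitespace splitting only. Under Rust's lifetime elision rules, a return type written as plain `Token` in a `&mut self` method takes its lifetime from the `&mut self` borrow, which is why the compiler reported `Token<'_>` where `Token<'a>` was expected. Writing `Token<'a>` ties the token to the input string instead, not to the Lexer or the Parser.

```rust
// Hypothetical simplified token: just a slice of the input.
#[derive(Debug, Clone, PartialEq)]
struct Token<'a> {
    text: &'a str,
}

// Hypothetical simplified lexer: holds the not-yet-consumed input.
struct Lexer<'a> {
    rest: &'a str,
}

impl<'a> Lexer<'a> {
    // With `-> Token` the elided lifetime would be that of `&mut self`:
    //     fn next_token<'s>(&'s mut self) -> Token<'s>
    // and every token would keep the lexer mutably borrowed.
    // Naming 'a makes the token borrow from the input text only.
    fn next_token(&mut self) -> Token<'a> {
        let rest: &'a str = self.rest;
        let rest = rest.trim_start();
        let end = rest.find(char::is_whitespace).unwrap_or(rest.len());
        let (word, tail) = rest.split_at(end);
        self.rest = tail;
        Token { text: word }
    }
}

fn main() {
    let mut lexer = Lexer { rest: "10 == 10;" };
    let first = lexer.next_token();
    let second = lexer.next_token(); // `first` is still usable here
    println!("{:?} {:?}", first, second);
}
```

With the elided signature, holding `first` across the second call would be rejected, because `first` would still be borrowing `lexer` mutably.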