Traversing AST in VM

Hi, I am writing an VM in Rust. I want to traverse the AST tree like that:

struct Function { name: String, test_value: bool }
struct Program { functions: Vec<Function> }
struct SemCheck { program: Program, errors: Vec<String> }

impl SemCheck {
    fn check_program(&mut self) {
        for fct in &mut self.program.functions {
            self.check_fct(fct); // ERROR
        }
    }
    
    fn check_fct(&mut self, fct: &mut Function) {
        fct.test_value = true; // I want to change fct
        self.errors.push("test".to_string()); // but also change SemCheck
    }
}

fn main() {
    let check = SemCheck { program:
        Program { functions: vec![] }, errors: vec![] };
    check.check_program();
}

Of course, the borrow checker does not allow that. But what is the best (rust idiomatic) way to achieve something like that? How does rustc solve that? Can you even point me to some corresponding rustc source code?

1 Like

https://github.com/rust-lang/rust/blob/master/src/libsyntax/visit.rs is how rustc does it, but note that rustc's AST is essentially immutable. There is a separate fold for creating a new AST from an old one.

Thanks, creating a new AST from the old one seems to be rather expensive though (Although probably not really important in my tiny project). So the other 2 options I have are essentialy: (1) store additional information somewhere else (2) use unsafe code.

Also thanks for the link. I read this code and wondered how they would change the AST, I didn't know that the AST is immutable.

Rather than use unsafe code, you can probably accomplish what you want with various designs using RefCell and Cell.

I think that you can avoid the problems by removing the field named program from SemCheck.

struct Function { _name: String, test_value: bool }
struct Program { functions: Vec<Function> }
struct SemCheck {errors: Vec<String> }

impl SemCheck {

    fn new (program: &mut Program) ->SemCheck {
        let mut semcheck = SemCheck { errors: vec![] };
        for fct in &mut program.functions {
            semcheck.check_fct(fct); // ERROR
        }
        semcheck
    }
    
    fn check_fct(&mut self, fct: &mut Function) {
        fct.test_value = true; // I want to change fct
        self.errors.push("test".to_string()); // but also change SemCheck
    }
}

fn main() {
    let mut program = Program { functions: vec![] };
    let _check = SemCheck::new(&mut program);
}

EDIT: I guess that it is bad style for a constructor to mutate it's arguments, but I think that the general idea is sound.

I think using RefCell isn't really possible: Rust Playground. I would still get an error on the same line, I can not borrow &RefCell and &mut self at the same time.

@nielsle: This works in this case but in my real program Function also stores an Expr-AST. So I guess I would have the same problem, just one level deeper.

I think i will assume the AST to be immutable but just in case: How would I do that with unsafe code? The easiest way seems to be transmuting &Function to &mut Function. But this could cause undefined behavior according to the compiler:

let fct : &mut Function = unsafe { std::mem::transmute(fct) }; // error by default

I can avoid this error with:

let fct : &mut Function = unsafe { std::mem::transmute(fct as *const Function) };

So this would work: Rust Playground. But I am unsure about the possibility of undefined behavior, since I probably only avoided the error message of the compiler. What does that mean in this case? As long as I stick to some rules, like not adding more functions to the Vec I should be safe, am I not?

The function declaration fn check_mutable(&mut self......), requires SemVer to be mutable. Otherwise you should have written fn check_mutable(&self......). This is problematic because the fields of SemVer inherit their mutability behavior from the parent. Therefore you end up requiring Program to be mutable.

This can be solved in several ways. The easiest solution is to remove the field named program from SemVer. Here is a version where the AST is immutable.

Actually I removed all fields from Semver to have better control of what is mutable and immutable, so you could argue that the struct named SemVer is now redunant.

1 Like

Thanks, seems my problem was that I stored Program in the struct. I agree that this shouldn't be necessary - even in the full program.

1 Like