Reference to ancestor in AST

New Rust user here. As an exercise when I learn a new language, I tend to implement the front end of the dragon book (you can find the code here). So far everything has work well until I found an impase with implementing the break statement.

The break statement holds a reference to its enclosing while or do-while bloc. This reference is used in the code generation phase, so after building the AST. The code looks as follows:

// statement.rs
pub trait Statement {
  fn generate(&mut self, b: &mut String, begin: i64, after: i64) -> Result<()>;
  fn label_after(&self) -> i64;
}

struct WhileStmt {
  cond: Box<dyn Expression>,
  body: Box<dyn Statement>,
  after: i64,
}

impl Statement for WhileStmt {
  fn generate(&mut self, b: &mut String, begin: i64, after: i64) -> Result<()> {
    self.after = after;
    cond.generate(b, 0, after);
    let lbl = new_label();
    body.generate(b, lbl, after);
    // ...
    Ok(())
  }

  fn label_after(&self) -> i64 { self.label }
}

pub struct BreakStmt<'a> {
  enclosing: &'a Box<dyn Statement>
}

impl Statement for BreakStmt<'_> {
  fn generate(&mut self, b: &mut String, begin: i64, after: i64) -> Result<(), String> {
    emit(b, format!("goto L{}", self.enclosing.after()));
    Ok(())
  }
}

Now my parser should do something like this

// parser.rs

pub struct Parser {}

impl Parser {
  // ...
  fn stmt(&mut self, enclosing: &Box<dyn Statement>) -> Result<Box<dyn Statement>> {
    match self.lookahead.tag() {
      WHILE => {
        let wh = Box::new(WhileStmt::emtpy());
        // ...
        let body = self.stmt(wh)?; // THIS NEVER WORKS
        // ...
        Ok(wh)
      },
      BREAK => {
        let brk = BreakStmt::new(encstmt)?;
        Ok(Box::new(brk))
      }
      // ...
    }

    fn program() -> Result<()> {
      let nullstmt: Box<dyn Statement> = Box::new(NullStmt::new());
      self.stmt(&nullstmt)?
      Ok(())
    }
  }

The code above never works for different reasons, depending on the changes I made either because wh is not areference to a Box<dyn Statement> but a Box<WhileStmt>.
If I modify the code so no reference to a Box is used, but &dyn Statement then I cannot get the lifetimes parameters right no matter what I do.

I know I can modify the code, so that the after label gets trickled down during code generation, but If I want to be trustful to the original implementation, what am I missing?

Self-referential types using plain references are not possible in safe Rust. (The reason for this is pretty trivial – if such a type were moved, self-references would be invalidated.)

In an AST, it is unusual to create such semantic bindings so early. Make your AST a pure tree instead (ie., no backwards edges), and resolve scopes using a separate scope stack data structure at codegen time instead.

1 Like

The fact that Rust's ownership and borrow system does for the data (affine type system is the language theory name) does to data what structured programming did to code.

Like with strctured programming there are a problem: while there are exist algorithms which may transform your spaghetti code program into structured code purely mechanically, without changes… they tend to generate fake-structure to achieve that.

That is: mechanically converted program works like the original one, technically, but it's even harder to understand how it works than with original, spaghetti code/data program!

Similarly with Rust: mistake #6 (apply best practices from other languages) is half of the article “How not to learn Rust”.

Essentially Rust strongly dislikes designs which violate Hoare Property and thus you have to ask yourself in such cases: do you want to push ahead, break the limitation which Rust imposes on you and get faithful representation of the foreign code in Rust… or you do want to try to write idiomatic Rust code.

If you want the latter then yes, you would have to change the structure of the code, if you want to former then you go with Rc/RefCell/Weak or Arc/Mutex/Weak with interior mutability.

Also, please, remember that while it's always beneficial to think about whether your design makes sense it doesn't mean you have to use references as much as possible (that's actually mistake #7 from the aforementioned article): sometimes changing the structure of your program to make sure silly mistakes are not possible just makes you spend so much time and effort that accepting that you have to be careful in a few trickly places in your code makes the whole thing more robust.

Standard library wouldn't have included refcounted data structures if that wouldn't have been so!

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.