Is there a better way to represent an abstract syntax tree?

Here are a couple of other options:

Instead of Binary(Box<Expr>, Operator, Box<Expr>) you can do something like

enum Expr {
    ...
    Binary(Box<BinaryExpr>)
}

struct BinaryExpr {
    op: Operator
    lhs: Expr, rhs: Expr,
}

This allows you to implement specific methods for BinaryExprs, and also allows to keep the size of the Expr itself small. I've used this representation here: https://github.com/matklad/miniml/blob/master/ast/src/exprs.rs

Another representation, which is used in the context of IDEs, when you want to have a lossless representation of the source code, as opposed to the abstract syntax tree, is a two-layer one.

On the first layer you have an untyped syntax tree, which basically marks ranges in the source text:

struct NodeType(u32);

struct Node {
  type: NodeType,  
  text_range: (usize, usize),
  children: Vec<Node>
}

on top of that, you have a typed representation, which delegates to the syntax tree

struct BinaryExpression {
    node: Node
}

impl BinaryExpression {
    fn lhs(&self) -> Expression { Expression { node:  self.children[0] } }
}
6 Likes