I'm trying to build a parser for a language where files can include other files. I won't go into details about the language, as the problem is fairly generic. Instead of copying bytes from the file/input it is parsing, my parser keeps references to byte slices, so we have a (tremendously simplified) struct like:
struct ASTNode<'src> {
    // actually a tree of nodes, but pretend we have only one thing right now.
    attr: &'src [u8],
}
// used like
let contents: Vec<u8> = std::fs::read(path)?; // read from a file (`path` is a placeholder)
let ast = Parser::parse(&contents); // ast now refers to bytes owned by contents
Now, I'm confused about how I would support these include statements. Every time the parser encounters one, it is going to create a sub-parser and acquire an AST that refers to bytes owned by a vector created in parse(). Apart from the problem of how to make that vector live long enough (which is more of an API design problem than a Rust question), I'm very confused about how to express that nesting in the AST. The straightforward, but invalid, way would be:
struct ASTNode<'src> {
    attr: &'src [u8],
    // Invalid: 'other_src is not declared anywhere, and it would actually be
    // different for each entry in the vec, as each include has a lifetime
    // bound to a separate bytes vector.
    includes: Vec<ASTNode<'other_src>>,
}
Is there a good way out of this?
One thing I can think of is having some kind of arena with a fixed lifetime: every buffer allocated to read a file lives in that arena (perhaps just a giant Vec to which I keep appending every file read), and all ASTNodes are bound by the lifetime of that arena. Is this the only reasonable approach?
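To make the arena idea concrete, here is a minimal sketch of what I have in mind, using only the standard library. Everything in it is made up for illustration: the names `SourceArena`, `Chunk`, `parse` and `load_demo_file`, and the toy format (first line is the attribute, later lines of the form `include <name>` pull in another file). I used an append-only linked list of `OnceCell`s rather than one giant Vec because, as far as I can tell, a plain Vec can't be appended to while slices into it are still borrowed.

```rust
use std::cell::OnceCell;

/// Hypothetical append-only store of file buffers. Buffers are never moved or
/// freed while the arena is alive, so slices into them can share one lifetime.
struct SourceArena {
    head: OnceCell<Box<Chunk>>,
}

struct Chunk {
    bytes: Vec<u8>,
    next: OnceCell<Box<Chunk>>,
}

impl SourceArena {
    fn new() -> Self {
        SourceArena { head: OnceCell::new() }
    }

    /// Takes ownership of a buffer and returns a slice that lives as long as
    /// the borrow of the arena. Only needs `&self`, so it can be called while
    /// earlier slices are still in use. Appending walks the list (O(n)),
    /// which is fine for a sketch.
    fn alloc(&self, bytes: Vec<u8>) -> &[u8] {
        let mut slot = &self.head;
        // Walk to the first empty slot at the end of the list.
        while let Some(chunk) = slot.get() {
            slot = &chunk.next;
        }
        let chunk = slot.get_or_init(|| Box::new(Chunk { bytes, next: OnceCell::new() }));
        &chunk.bytes
    }
}

/// Every node, including nodes from included files, borrows from the arena,
/// so a single lifetime parameter is enough.
#[derive(Debug)]
struct ASTNode<'src> {
    attr: &'src [u8],
    includes: Vec<ASTNode<'src>>,
}

/// Toy parser for the made-up format described above. `load` stands in for
/// reading a file from disk. There is no include-cycle detection here.
fn parse<'src>(
    arena: &'src SourceArena,
    name: &str,
    load: &dyn Fn(&str) -> Vec<u8>,
) -> ASTNode<'src> {
    let src: &'src [u8] = arena.alloc(load(name));
    let mut lines = src.split(|&b| b == b'\n');
    let attr = lines.next().unwrap_or_default();
    let mut includes = Vec::new();
    for line in lines {
        if let Some(target) = line.strip_prefix(b"include ") {
            let target = String::from_utf8_lossy(target);
            // The sub-parse stores its buffer in the same arena.
            includes.push(parse(arena, &target, load));
        }
    }
    ASTNode { attr, includes }
}

/// In-memory stand-in for the file system, for demonstration only.
fn load_demo_file(name: &str) -> Vec<u8> {
    match name {
        "main" => b"root\ninclude a\ninclude b".to_vec(),
        "a" => b"attr-a\ninclude c".to_vec(),
        "b" => b"attr-b".to_vec(),
        _ => b"leaf".to_vec(),
    }
}
```

Usage would be something like `let arena = SourceArena::new(); let ast = parse(&arena, "main", &load_demo_file);`, with the arena created by the caller so that it outlives the AST. That still leaves the API question of who owns the arena, and I don't know whether an existing arena crate would be a better fit than rolling my own.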
Thanks!