Code-sharing for two related kinds of structures

I'm working on some code to represent and reason about uninterpreted terms, and I'd appreciate help finding a representation that preserves important distinctions while avoiding code duplication.

A Term is a tree constructed of Operators and Variables. Operators take 0 or more arguments, while variables are complete terms in themselves. Operator arities and human-readable names for variables and operators are all tracked in a separate interning structure. The details of the interner aren't important here, but the other structures are defined as follows:

pub struct Variable(pub usize);

pub struct Operator(pub usize);

pub enum Term {
    Variable(Variable),
    Application { op: Operator, args: Vec<Term> },
}

It's also useful to be able to reason about a Context, which is an incomplete term, a term with holes in its tree structure. Pragmatically, these holes are empty places in a term that need to be filled in to get a complete term. Mathematically, the hole is just a special operator that's always defined and distinct from all other operators.

Right now, I've defined the Context as a separate structure:

pub enum Context {
    Hole
    Variable(Variable),
    Application { op: Operator, args: Vec<Context> },
}

While this provides a nice guard in the typesystem against there being holes in a Term, it doesn't prevent you from constructing a Context without holes in it. Moreover, this approach leads to lots of code duplication, as there are many kinds of operations that you need to define for both a Context and a Term (e.g. iterate over substructures, retrieve the substructure at a given location, ...). Removing the current Term definition and renaming Context to Term removes the duplication but also removes any sort of typesystem distinctions between a Term and a Context.

Is there an idiomatic way that I could represent a term without holes and a term with holes as separate types while also avoiding lots of code/method duplication? Thank you!

You could have

pub struct Application<T> {
    pub op: Operator,
    pub args: Vec<T>,
}

pub enum Term {
    Variable(Variable),
    Application(Application<Self>),
}

pub enum Context {
    Hole,
    Variable(Variable),
    Application(Application<Self>),
}

Or maybe even

pub enum VarOrApplication<T> {
    Variable(Variable),
    Application(Application<T>),
}

pub struct Term(VarOrApplication<Self>);

// This is basically Option<VarOrApplication<Self>> now...
pub enum Context {
    Hole,
    VarOrApplication(VarOrApplication<Self>),
}

And then implement generically or via traits on VarOrApplication<T> and/or Application<T> where possible.


Another idea: You could have

pub enum GenericContext<HoleType> {
    Hole(HoleType),
    Variable(Variable),
    Application(Application<Self>),
}

// Stand in for the unstable never type, `!`
pub enum NoHole {}
pub type Term = GenericContext<NoHole>;

pub type Context = GenericContext<()>;

And then you can't have Holes in your Terms.

2 Likes

I like this idea. Conceptually, a Term is a Context that just happens not to have a Term::Hole in it. This setup lets you represent that distinction in the typesystem, which is great, and it provides only a single type, GenericContext, which eliminates a lot of code duplication but also lets you implement methods separately for Context<()> and Context` if they might benefit from specialization.

Thank you!

For any future readers, if !/never is new to you, this reference provides some useful context and idioms.