Review of a Programming Language

So, I recently embarked on a journey to make a programming language in Rust and came out the other end with Rlux. It's a small language with a pretty standard feature set, and I really want to make it more modular.

Right now the pipeline is pretty standard, but my code is highly coupled, and I would really like to refactor it into smaller modules that can be swapped in and out. As an end goal, I would like each "unit" of my pipeline to be its own crate. The catch is that I have never refactored something this way or on this scale, so how to slice it so that it stays usable and runs effectively escapes me at the moment. If anyone could show me how to even start thinking about slicing up my code base, it would be highly appreciated.

Thanks in advance for any time or advice!

It looks really cute. Perhaps you need to find a niche where the language could be used. For example, I also wrote a language as my first Rust project. Now I use it instead of Cargo.

What is the name of your language?

I would love to see how other similar efforts are being handled!

Please only share if you feel comfortable. I don't want to pressure you, but I also really enjoy this kind of thing.

I called it RustBee. You can glance over the doc, and the sources are there as well: https://gitlab.com/tools6772135/rusthub/-/blob/master/doc/rustbee/README.md

Very nice project!

That kind of coupling is always hard to avoid in interpreters. You could start by isolating the parser, even though Statement and its related types will stay coupled between the parser and the interpreter unless you break that with another intermediate representation at some point.

Talking of which, the Interpreter type is in the parser module: that'd be the first item I'd move away. But I only had a quick, casual look. Are you parsing and interpreting simultaneously, or are you storing the AST and then interpreting it with a visitor (or is that your goal)?

If you can separate the two, it should be easier to insert a new stage between them if you decide one day to compile (to bytecode or something else) or to introduce an optimization phase.

You'll probably want to separate the bin from the lib, too. So binary, common lib, parser, and interpreter seems like a natural way to split.
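
A minimal sketch of that split, with each `mod` standing in for what could later become its own crate. The names (`common`, `parser`, `interpreter`, `Expr`, `Stmt`) and the toy `print` grammar are assumptions for illustration, not taken from Rlux.

```rust
// Each `mod` stands in for a future crate. All names here are hypothetical.
mod common {
    // Shared AST: the only types both the parser and the interpreter see.
    #[derive(Debug, Clone, PartialEq)]
    pub enum Expr {
        Number(f64),
        Add(Box<Expr>, Box<Expr>),
    }

    #[derive(Debug, Clone, PartialEq)]
    pub enum Stmt {
        Print(Expr),
    }
}

mod parser {
    use crate::common::{Expr, Stmt};

    // Toy parser: accepts only `print <num> + <num> + ...`.
    pub fn parse(source: &str) -> Result<Vec<Stmt>, String> {
        let rest = source
            .trim()
            .strip_prefix("print ")
            .ok_or_else(|| "expected `print`".to_string())?;
        let mut terms = rest.split('+').map(|t| {
            t.trim()
                .parse::<f64>()
                .map(Expr::Number)
                .map_err(|e| format!("bad number `{}`: {}", t.trim(), e))
        });
        let mut expr = terms
            .next()
            .ok_or_else(|| "empty expression".to_string())??;
        for term in terms {
            expr = Expr::Add(Box::new(expr), Box::new(term?));
        }
        Ok(vec![Stmt::Print(expr)])
    }
}

mod interpreter {
    use crate::common::{Expr, Stmt};

    fn eval(expr: &Expr) -> f64 {
        match expr {
            Expr::Number(n) => *n,
            Expr::Add(l, r) => eval(l) + eval(r),
        }
    }

    // Returns the printed lines instead of writing to stdout,
    // so the caller decides what to do with the output.
    pub fn run(program: &[Stmt]) -> Vec<String> {
        program
            .iter()
            .map(|stmt| match stmt {
                Stmt::Print(e) => eval(e).to_string(),
            })
            .collect()
    }
}

// The "binary" only wires the stages together.
fn main() {
    let program = parser::parse("print 1 + 2").expect("parse error");
    for line in interpreter::run(&program) {
        println!("{line}");
    }
}
```

The part that matters is the dependency direction: `parser` and `interpreter` both depend on `common` but never on each other, so either one can be replaced without touching the other.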

If one day the interpreter becomes more independent of the parser (taking code, stored bytecode, or any IR as input), it would make sense to work out what the minimal library for the interpreter is. But perhaps it's too early to decide.

EDIT: How do you like the book? I've only read a few tiny parts, as I'm currently more involved with compilers than interpreters.

So would making some kind of intermediate type that defines something similar to a socket be a good way to tackle Statement being highly coupled? I feel like that's what you're saying, but I don't 100% understand.

I absolutely LOVE the book. It gave me so much more insight than my college course on programming languages. I could always understand regexes, state machines, and Turing machines, but they always felt disconnected from the idea of either turning text A into text B via some intermediate form (compilers) or making text A do something on the machine directly (interpreters). The book lays out the what, the why, and especially the how in a really unique way, which helped me fall in love with languages even more than I already was. It's almost like getting to know someone you love even better, and now being able to explain why you love them.

This is super cool. I love your idea of having blocks default to defining scope, because it always feels like that's supposed to be their primary purpose from a user perspective.

Hi Chase,

If you're looking to continue your journey, consider developing a browser plugin for your language. It's an intriguing challenge that differs in many ways from traditional language compilers or interpreters. My framework, Lady Deirdre, could be helpful for this as well.

Best,
Ilya

Since you call it Lux, did you deviate from the Lox language Robert uses in Crafting Interpreters?

Not sure if you found it, but there are a bunch of Lox implementations in this repo.

I think Lox is a stripped-down version of Wren, also by Robert Nystrom.

When I read Crafting Interpreters I found the clox implementation (bytecode interpreter) very interesting. It's a great C program to study (NaN boxing, tagged unions, string interning, Pratt parsing, ...). Quite a bit is crammed into that not-too-long program.

Currently, the parser produces statements, and the interpreter processes statements, so any change in the language and/or the parser requires a parallel change in the interpreter.

Instead, the statements could be translated into something closer to what a processing unit executes. For example, instead of having different statement objects for "if", "while", and "for", you could translate them into jumps, with the other instructions being data processing. Similarly, instead of having expression graphs with several types (well, at least logical and arithmetic) and operations / function calls, you could translate them into a linear succession of simple operations and calls/returns. A little like assembly or bytecode, but it could be at a higher level.

It requires a fair amount of work, but it should be very interesting to do. Once you get there, you can modify or add features to your language and keep the interpreter unchanged (most of the time), and you can even add an optimization phase between the two if you fancy exploring that.
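
To make the idea concrete, here is a rough sketch of lowering a `while` loop into a flat list of stack-based instructions with two jump ops. Everything in it (the mini-AST, the `Op` set, `lower`, `run`) is a made-up illustration under those assumptions, not how Rlux or Lox is actually structured.

```rust
// Hypothetical mini-AST, just enough for `i = 0; while i < 3 { i = i + 1 }`.
enum Expr {
    Num(i64),
    Var(usize), // variable slot index
    Add(Box<Expr>, Box<Expr>),
    Less(Box<Expr>, Box<Expr>),
}

enum Stmt {
    Assign(usize, Expr),
    While(Expr, Vec<Stmt>),
}

// Linear instruction set: data processing plus two jumps.
#[derive(Debug, Clone, PartialEq)]
enum Op {
    Push(i64),
    Load(usize),
    Store(usize),
    Add,
    Less,
    Jump(usize),
    JumpIfFalse(usize),
}

fn lower_expr(expr: &Expr, out: &mut Vec<Op>) {
    match expr {
        Expr::Num(n) => out.push(Op::Push(*n)),
        Expr::Var(slot) => out.push(Op::Load(*slot)),
        Expr::Add(l, r) => {
            lower_expr(l, out);
            lower_expr(r, out);
            out.push(Op::Add);
        }
        Expr::Less(l, r) => {
            lower_expr(l, out);
            lower_expr(r, out);
            out.push(Op::Less);
        }
    }
}

fn lower_stmt(stmt: &Stmt, out: &mut Vec<Op>) {
    match stmt {
        Stmt::Assign(slot, expr) => {
            lower_expr(expr, out);
            out.push(Op::Store(*slot));
        }
        Stmt::While(cond, body) => {
            let start = out.len();
            lower_expr(cond, out);
            let exit_jump = out.len();
            out.push(Op::JumpIfFalse(usize::MAX)); // placeholder, patched below
            for s in body {
                lower_stmt(s, out);
            }
            out.push(Op::Jump(start)); // back edge to the condition
            let end = out.len();
            out[exit_jump] = Op::JumpIfFalse(end); // patch the forward jump
        }
    }
}

fn lower(program: &[Stmt]) -> Vec<Op> {
    let mut out = Vec::new();
    for stmt in program {
        lower_stmt(stmt, &mut out);
    }
    out
}

fn pop(stack: &mut Vec<i64>) -> i64 {
    stack.pop().expect("stack underflow")
}

// The interpreter only knows about `Op`; it never sees `Stmt` or `Expr`.
fn run(code: &[Op], slots: usize) -> Vec<i64> {
    let mut vars = vec![0i64; slots];
    let mut stack = Vec::new();
    let mut pc = 0;
    while pc < code.len() {
        let mut next = pc + 1;
        match &code[pc] {
            Op::Push(n) => stack.push(*n),
            Op::Load(s) => stack.push(vars[*s]),
            Op::Store(s) => vars[*s] = pop(&mut stack),
            Op::Add => { let r = pop(&mut stack); let l = pop(&mut stack); stack.push(l + r); }
            Op::Less => { let r = pop(&mut stack); let l = pop(&mut stack); stack.push((l < r) as i64); }
            Op::Jump(target) => next = *target,
            Op::JumpIfFalse(target) => if pop(&mut stack) == 0 { next = *target },
        }
        pc = next;
    }
    vars
}

fn counting_loop() -> Vec<Stmt> {
    let cond = Expr::Less(Box::new(Expr::Var(0)), Box::new(Expr::Num(3)));
    let step = Expr::Add(Box::new(Expr::Var(0)), Box::new(Expr::Num(1)));
    vec![Stmt::Assign(0, Expr::Num(0)), Stmt::While(cond, vec![Stmt::Assign(0, step)])]
}

fn main() {
    let ops = lower(&counting_loop());
    println!("{:?}", run(&ops, 1));
}
```

With this shape, adding a "for" loop to the language would only need a new case in `lower_stmt` that emits the same ops; `run` stays as it is.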

One nice alternative to jump ops for this sort of interpreter is explicit basic blocks; you end up with something like:

struct Code {
  blocks: Vec<Block>,
}

struct Block {
  ops: Vec<Op>,   // straight-line instructions, no control flow inside
  branch: Branch, // the single exit of the block
}

enum Branch {
  Return,
  Jump { block: usize },
  JumpIfLessThan {
    true_block: usize,
    false_block: usize,
  },
  // ...
}

Doesn't completely eliminate forward/back branching weirdness, but it does simplify it a bit by removing the need for patching in jump targets or a second pass or something.
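
For what it's worth, a small sketch of how an interpreter loop over such blocks might look. The `Op` variants, the register-style operands (`dst`, `lhs`, `rhs`), and the convention that block 0 is the entry point are all assumptions added for the example; the types above leave those details open.

```rust
struct Code {
    blocks: Vec<Block>,
}

struct Block {
    ops: Vec<Op>,
    branch: Branch,
}

// Assumed register-machine ops; the original sketch leaves `Op` undefined.
enum Op {
    Const { dst: usize, value: i64 },
    Add { dst: usize, lhs: usize, rhs: usize },
}

enum Branch {
    Return,
    Jump { block: usize },
    // `lhs` and `rhs` are assumed register operands for the comparison.
    JumpIfLessThan { lhs: usize, rhs: usize, true_block: usize, false_block: usize },
}

fn run(code: &Code, num_regs: usize) -> Vec<i64> {
    let mut regs = vec![0i64; num_regs];
    let mut current = 0; // assumption: block 0 is the entry block
    loop {
        let block = &code.blocks[current];
        // Straight-line part: no control flow inside a block.
        for op in &block.ops {
            match op {
                Op::Const { dst, value } => regs[*dst] = *value,
                Op::Add { dst, lhs, rhs } => regs[*dst] = regs[*lhs] + regs[*rhs],
            }
        }
        // Exactly one control-flow decision per block.
        current = match &block.branch {
            Branch::Return => return regs,
            Branch::Jump { block: target } => *target,
            Branch::JumpIfLessThan { lhs, rhs, true_block, false_block } => {
                if regs[*lhs] < regs[*rhs] { *true_block } else { *false_block }
            }
        };
    }
}

// Hand-built blocks for: r0 = 0; while r0 < 3 { r0 = r0 + 1 }
fn counting_loop() -> Code {
    Code {
        blocks: vec![
            // 0: setup (r1 holds the limit, r2 the step)
            Block {
                ops: vec![
                    Op::Const { dst: 0, value: 0 },
                    Op::Const { dst: 1, value: 3 },
                    Op::Const { dst: 2, value: 1 },
                ],
                branch: Branch::Jump { block: 1 },
            },
            // 1: loop condition
            Block {
                ops: vec![],
                branch: Branch::JumpIfLessThan { lhs: 0, rhs: 1, true_block: 2, false_block: 3 },
            },
            // 2: loop body, then back to the condition
            Block {
                ops: vec![Op::Add { dst: 0, lhs: 0, rhs: 2 }],
                branch: Branch::Jump { block: 1 },
            },
            // 3: exit
            Block { ops: vec![], branch: Branch::Return },
        ],
    }
}

fn main() {
    println!("{:?}", run(&counting_loop(), 3));
}
```

Because branch targets are block indices rather than instruction offsets, a code generator can reserve the index of the exit block before it emits the loop body, which is where the patching step goes away.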

I deviated somewhat. There are a couple of language features that I included, like ternary expressions, which were discussed in the challenges part of the book. There are also a couple of features that are currently unimplemented, like the class system, inheritance, and a couple of other OO concepts.