Learning about compilers with Rust

Hi everyone ^^

So @Vicho and I just entered a course on compilers, and since the implementation language is up to each group, we naturally decided to use Rust :smile:
The enums of Rust seem well-suited for lexing and parsing, and the speed of Rust itself is of course a plus <3
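To make the enum point concrete, here is a minimal sketch of what we have in mind for the lexer. The `Token` variants and the `lex` function are made-up names for a tiny expression language, not anything from the course or from an existing crate:

```rust
// Hypothetical token set for a tiny expression language (illustrative names).
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Num(i64),
    Ident(String),
    Plus,
    Star,
    LParen,
    RParen,
}

// Turn source text into tokens, or report the first problem found.
fn lex(src: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        match c {
            c if c.is_whitespace() => { chars.next(); }
            '+' => { chars.next(); tokens.push(Token::Plus); }
            '*' => { chars.next(); tokens.push(Token::Star); }
            '(' => { chars.next(); tokens.push(Token::LParen); }
            ')' => { chars.next(); tokens.push(Token::RParen); }
            c if c.is_ascii_digit() => {
                // Collect the digits, then let the standard parser handle overflow.
                let mut digits = String::new();
                while let Some(&d) = chars.peek() {
                    if d.is_ascii_digit() { digits.push(d); chars.next(); } else { break; }
                }
                let n = digits.parse::<i64>().map_err(|e| e.to_string())?;
                tokens.push(Token::Num(n));
            }
            c if c.is_alphabetic() || c == '_' => {
                let mut name = String::new();
                while let Some(&d) = chars.peek() {
                    if d.is_alphanumeric() || d == '_' { name.push(d); chars.next(); } else { break; }
                }
                tokens.push(Token::Ident(name));
            }
            other => return Err(format!("unexpected character {:?}", other)),
        }
    }
    Ok(tokens)
}
```

With these definitions, `lex("x + 42")` should give `[Ident("x"), Plus, Num(42)]`, and the parser can then `match` on the variants directly.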

However, we haven’t decided yet what language we’ll make a compiler for. We were thinking of a subset of Rust, without generics, traits and explicit lifetimes, but that itself might prove too complex (especially because of borrows and lifetimes).

What would you recommend? Will compiling Rust itself be that hard for beginners? What languages would you recommend from a learning standpoint?

Thanks, and have a good weekend :blush:

miniml might be a good choice: http://plzoo.andrej.com/language/miniml.html

One of the first “big” projects I like to try out when I’m learning a language is to write a compiler or virtual machine for a toy programming language, and so far I think Rust was the best suited for the task.
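For a sense of scale, the core of such a toy virtual machine can be very small. This is only a sketch under my own assumptions: the `Instr` set, `pop2` and `run` are hypothetical names, and a real VM would also need jumps, locals and so on:

```rust
// Hypothetical instruction set for a toy stack machine (illustrative names).
#[derive(Debug, Clone, Copy)]
enum Instr {
    Push(i64),
    Add,
    Mul,
}

// Pop the two topmost operands; the right-hand operand is on top.
fn pop2(stack: &mut Vec<i64>) -> Result<(i64, i64), String> {
    let rhs = stack.pop().ok_or("stack underflow")?;
    let lhs = stack.pop().ok_or("stack underflow")?;
    Ok((lhs, rhs))
}

// Execute the program and return whatever is left on top of the stack.
fn run(program: &[Instr]) -> Result<i64, String> {
    let mut stack: Vec<i64> = Vec::new();
    for instr in program {
        match *instr {
            Instr::Push(n) => stack.push(n),
            Instr::Add => {
                let (lhs, rhs) = pop2(&mut stack)?;
                // Wrapping arithmetic is an arbitrary choice made for this sketch.
                stack.push(lhs.wrapping_add(rhs));
            }
            Instr::Mul => {
                let (lhs, rhs) = pop2(&mut stack)?;
                stack.push(lhs.wrapping_mul(rhs));
            }
        }
    }
    stack.pop().ok_or_else(|| "program left nothing on the stack".to_string())
}
```

Running `[Push(2), Push(3), Add, Push(4), Mul]` through `run` should evaluate `(2 + 3) * 4` and return `Ok(20)`. The exhaustive `match` is a big part of why this feels pleasant in Rust: add an instruction and the compiler points at every place that needs updating.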

The language choice really depends on how much time you’ve got and how far down the rabbit hole you want to go. Rust will probably be difficult because we’ve got a lot of fancy features which all require quite a lot of work and knowledge about type systems.

You could always implement a subset of C. It’s very well known, has several high-quality implementations with formal specifications, and the language maps to machine code quite nicely.

Thanks y’all :slight_smile:

Vicho has kind of convinced me to compile a subset of Rust without generics and the heap, because that would simplify things quite a bit.

However, I feel like the borrow checker and lifetimes would still have to be implemented, would they not? Lifetimes in particular are the feature that I fear we might not be able to make work remotely well, given we’re beginners to the topic of compilers and to Rust itself (though we’re quite experienced at programming).

Is that true? Or does using the stack-only subset allow us to avoid implementing lifetimes and borrow-checking?

Also: C sounds like a decent plan B.

What we must have in our compiled language of choice is this:

  • Imperative primitives of flow control (if, else, and while at least).
  • Some kind of user-definable types (structs, enums, unions…).

Everything else is negotiated with the professor.
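To check that the two requirements above fit together, here is a rough sketch of how the AST might look, plus a tiny tree-walking evaluator for the flow-control part. All names are hypothetical, values are plain integers, and user-defined types are only declared and looked up, not type-checked:

```rust
use std::collections::HashMap;

// Hypothetical AST for a small imperative language (illustrative names).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Int,
    Bool,
    Named(String), // refers to a user-defined struct or enum
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum TypeDef {
    Struct { name: String, fields: Vec<(String, Type)> },
    Enum { name: String, variants: Vec<String> },
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Less(Box<Expr>, Box<Expr>), // evaluates to 1 or 0 in this sketch
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Stmt {
    Assign(String, Expr),
    If { cond: Expr, then_branch: Vec<Stmt>, else_branch: Vec<Stmt> },
    While { cond: Expr, body: Vec<Stmt> },
}

// Look up the declared type of a struct field; enums have no fields here.
fn field_type<'a>(def: &'a TypeDef, field: &str) -> Option<&'a Type> {
    match def {
        TypeDef::Struct { fields, .. } => fields
            .iter()
            .find(|(name, _)| name.as_str() == field)
            .map(|(_, ty)| ty),
        TypeDef::Enum { .. } => None,
    }
}

fn eval(e: &Expr, env: &HashMap<String, i64>) -> Result<i64, String> {
    Ok(match e {
        Expr::Num(n) => *n,
        Expr::Var(name) => *env
            .get(name)
            .ok_or_else(|| format!("unbound variable {}", name))?,
        Expr::Add(a, b) => eval(a, env)?.wrapping_add(eval(b, env)?),
        Expr::Less(a, b) => (eval(a, env)? < eval(b, env)?) as i64,
    })
}

fn exec(stmts: &[Stmt], env: &mut HashMap<String, i64>) -> Result<(), String> {
    for s in stmts {
        match s {
            Stmt::Assign(name, e) => {
                let v = eval(e, env)?;
                env.insert(name.clone(), v);
            }
            Stmt::If { cond, then_branch, else_branch } => {
                if eval(cond, env)? != 0 {
                    exec(then_branch, env)?
                } else {
                    exec(else_branch, env)?
                }
            }
            Stmt::While { cond, body } => {
                while eval(cond, env)? != 0 {
                    exec(body, env)?;
                }
            }
        }
    }
    Ok(())
}
```

Funnily enough, `field_type` needs an explicit lifetime annotation, which is exactly the kind of thing our source language would leave out.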

User-definable types are where miniml falls short. But C, however dirty it may allow your code to be, fits the bill. And the static analysis its specification asks for is quite weak (“Wanna make a pointer out of nothing? Sure!”, that kind of thing), which lets you focus on the other parts of the compiler instead. I think I prefer that. Depending on the answer to the lifetimes question above, we might end up just compiling C :3

Also: we’re both experts in C, more or less.