Help with compiler code generation in Rust


#1

Hello, I am writing a compiler, and I was wondering if anyone could suggest how to do code generation in Rust. I initially thought of using LLVM; however, from talking to people and reading examples, using LLVM from Rust seems to be enough of a pain that it does not seem worth it at the moment. Any and all suggestions welcome!


#2

I found this project pretty interesting, and you might find it useful: CensoredUsername/whitespace-rs/blob/master


#3

For a second I was wondering why you censored the username XD


#4

Not sure how helpful this is, but I’ve found that just writing out LLVM IR as a string can be quite powerful. It’s fairly easy to read and understand, and mostly platform-independent (although not in things like pointer size). If you pass it to clang (I mostly just invoked the binary as a subprocess), it simply compiles to machine code, including all the optimizations that clang would normally do.

I used this approach while hacking on a compiler (written in Python) for a Python-like systems programming language, called Runa. Find the source code at https://github.com/djc/runa if you’re interested.

I investigated a bunch of other options at the time, but the LLVM bindings situation did not seem especially well set up for this use case. The Python libraries for LLVM were geared towards other use cases as well, so I decided to do away with all the abstractions and just write out strings. Over time, I grew a small abstraction (contained in my codegen module). Maybe once your project gets larger you’d want the additional guarantees of using bindings to generate the IR, but I found this to be a great way to get going, and would do it again.
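
For what it’s worth, here is a minimal sketch of how the same string-based approach might look in Rust, using only the standard library. It assumes clang is on your PATH; the function names (emit_add_function, emit_module, compile_with_clang) and the file paths are made up for illustration and are not taken from Runa.

```rust
use std::fmt::Write as _;
use std::io;
use std::process::{Command, ExitStatus};

/// Build textual LLVM IR for a function `name` that adds two i32 arguments.
/// (Hypothetical helper, for illustration only.)
fn emit_add_function(name: &str) -> String {
    let mut ir = String::new();
    // Writing into a String cannot fail, so unwrap() is fine here.
    // `{{` and `}}` are escaped braces in Rust format strings.
    writeln!(ir, "define i32 @{}(i32 %a, i32 %b) {{", name).unwrap();
    writeln!(ir, "entry:").unwrap();
    writeln!(ir, "  %sum = add i32 %a, %b").unwrap();
    writeln!(ir, "  ret i32 %sum").unwrap();
    writeln!(ir, "}}").unwrap();
    ir
}

/// Build a tiny module: the add function plus a `main` that calls it
/// and returns the result as the process exit code.
fn emit_module() -> String {
    let mut ir = emit_add_function("add");
    ir.push_str("\ndefine i32 @main() {\n");
    ir.push_str("entry:\n");
    ir.push_str("  %r = call i32 @add(i32 40, i32 2)\n");
    ir.push_str("  ret i32 %r\n");
    ir.push_str("}\n");
    ir
}

/// Write the IR to `ll_path` and ask clang to compile it to `out_path`.
/// Assumes a `clang` binary is available on the PATH.
fn compile_with_clang(ir: &str, ll_path: &str, out_path: &str) -> io::Result<ExitStatus> {
    std::fs::write(ll_path, ir)?;
    Command::new("clang")
        .arg(ll_path)
        .arg("-o")
        .arg(out_path)
        .status()
}
```

Calling compile_with_clang(&emit_module(), "out.ll", "out") should leave you with an executable, provided clang accepts the IR. Nothing checks the IR before clang sees it, which is the trade-off against using bindings.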


#5

Interesting, would you have a good resource for LLVM IR?


#6

This is the canonical documentation: http://llvm.org/docs/LangRef.html

I also made heavy use of this shell script:

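#!/bin/sh
# Usage: script <source-file> <output-basename>
# Compile the source to unoptimized LLVM IR, print it, then clean up.
clang -emit-llvm -S "$1" -o "$2.ll" -O0
cat "$2.ll"
rm "$2.ll"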