A "big picture" image (like the OpenGL machine you linked) would be very helpful for people that want to start hacking the compiler.
I drew a thing by hand! (Apologies for my poor drawing skills :-). Hopefully it gives you an idea of what Cargo and rustc do.
Some notes:
- AST === Abstract Syntax Tree
- HIR === High-level Intermediate Representation
- MIR === Mid-level Intermediate Representation
- llvm-ir === LLVM Intermediate Representation
- obj === Object file (ELF on Linux)
- The rustc pipeline is a little off. The image in this blog post is more accurate.
- The parser will "follow" `mod foo` items and parse other/external files as needed. The parser returns the AST of the whole crate.
- As my drawing denotes, I'm not quite sure which phase the metadata comes from.
- The metadata is formatted as RBML (Really Bad Markup Language). "RBML was originally based on the Extensible Binary Markup Language" (according to the source code).
- You probably already know this, but LLVM is an external dependency: it's used as a library and it's written in C++, not Rust.
- The archiver used to be an external command (e.g. `ar`), but today we use an in-memory archiver that comes with LLVM. That's the new default; you can still use an external archiver if you want.
- Currently, the linker is always an external command. On most platforms we use a C compiler (e.g. `gcc`) as a linker. Speculation: in the future, we may use an in-memory lld instead of an external command.
- It's not explicitly shown, but if e.g. crate A depends on crate B, `rustc` will load `libB.rlib` and use its metadata to type check crate A. `rustc` may also take a generic function from the `libB.rlib` metadata, "monomorphize" it and include the result in `libA.rlib` (library) or `./A` (executable).
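To make the monomorphization note a bit more concrete, here is a rough single-file sketch. The modules `crate_b` and `crate_a` are hypothetical stand-ins for two separate crates (a real setup would have two packages and a `libB.rlib` on disk), and `largest` is an invented example function, not anything from the compiler. The point is only that the generic function is defined on the B side, while the concrete `i32` and `f64` instances are requested on the A side.

```rust
// Hypothetical stand-in for crate B: a library that exports a generic function.
// In a real build this would be its own crate, compiled to libB.rlib.
mod crate_b {
    /// Returns the largest element of `items`, or `None` if the slice is empty.
    pub fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
        let mut iter = items.iter().copied();
        let first = iter.next()?;
        Some(iter.fold(first, |best, x| if x > best { x } else { best }))
    }
}

// Hypothetical stand-in for crate A, which uses the generic function from B.
mod crate_a {
    use super::crate_b;

    // This call needs the instance `largest::<i32>`.
    pub fn largest_int() -> Option<i32> {
        crate_b::largest(&[3, 7, 2])
    }

    // This call needs a second, separate instance: `largest::<f64>`.
    pub fn largest_float() -> Option<f64> {
        crate_b::largest(&[1.5, 0.25])
    }
}
```

B cannot know in advance which `T` its users will pick, so the body of `largest` has to travel with the library in some form; the two concrete instances only come into existence once the A side asks for them.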
Oh, and if someone wants to digitize and improve my drawing, feel free to do so -- I give you permission/license to do so.