Some findings about the Rust compilation process


#1

I am new to Rust and come from a C/C++ background. Whenever I start learning a programming language, I want to know as much as possible about how it builds an executable from the source files. Knowing that makes me comfortable.

To do that for Rust, I wrote two programs that intercept rustc and the linker, respectively. I renamed “rustc.exe” to “rvstc.exe” and named my rustc interceptor “rustc.exe”. I renamed the original linker from Microsoft Visual Studio 2017 to “ljnk.exe” and named my link interceptor “link.exe”. Each interceptor sits in the same folder as the program it intercepts. When executed, my rustc interceptor first writes its command line to a log file, “rsclog.txt”, and then invokes “rvstc.exe”, the true rustc. Similarly, my link interceptor first writes its command line to a log file, “linklog.txt”, and then invokes “ljnk.exe”, the true linker.
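For anyone who wants to repeat the experiment, here is a minimal sketch of what such an interceptor could look like in Rust. It is not the program used for the logs below. The helper names `format_log_line` and `sibling` are made up for illustration, and writing the log file next to the interceptor and quoting every argument are assumptions of the sketch.

```rust
use std::env;
use std::fs::OpenOptions;
use std::io::Write;
use std::path::{Path, PathBuf};
use std::process::{self, Command};

// File names taken from the post; swap in "ljnk.exe" and "linklog.txt" for the linker.
const REAL_TOOL: &str = "rvstc.exe";
const LOG_FILE: &str = "rsclog.txt";

/// Render a command line as one log line, putting every argument in quotes.
fn format_log_line(args: &[String]) -> String {
    args.iter()
        .map(|a| format!("\"{}\"", a))
        .collect::<Vec<_>>()
        .join(" ")
}

/// Path of a file that lives in the same folder as the interceptor.
/// (Placing the log there is an assumption of this sketch.)
fn sibling(current_exe: &Path, name: &str) -> PathBuf {
    current_exe.with_file_name(name)
}

fn main() {
    let args: Vec<String> = env::args().collect();
    let me = env::current_exe().expect("cannot locate the interceptor executable");

    // 1. Append the full command line to the log file.
    let mut log = OpenOptions::new()
        .create(true)
        .append(true)
        .open(sibling(&me, LOG_FILE))
        .expect("cannot open the log file");
    writeln!(log, "{}", format_log_line(&args)).expect("cannot write to the log file");

    // 2. Forward every argument except argv[0] to the real tool
    //    and pass its exit code back to the caller (Cargo or rustc).
    let status = Command::new(sibling(&me, REAL_TOOL))
        .args(args.iter().skip(1))
        .status()
        .expect("cannot start the real tool");
    process::exit(status.code().unwrap_or(1));
}
```

Because the child process inherits stdin, stdout and stderr by default, the caller still sees the real tool's output, which matters for invocations such as “rustc -vV” whose output Cargo reads.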

Then, in a folder, I typed “cargo new hello --bin” in the console, and Cargo created a hello project. I know link-time optimization (LTO) helps reduce the size of the executable, so I added the following to “Cargo.toml”:

[profile.release]
lto = true

That enables link-time optimization.

Then I typed “cargo build --release” to build the executable and checked the content of the two log files, which are pasted below:

RSCLOG.TXT

“rustc” -vV

“D:\Programs\Rust\Cargo\bin\rustc.exe” - --crate-name ___ --print=file-names --target i686-pc-windows-msvc --crate-type bin --crate-type rlib --print=sysroot --print=cfg

“D:\Programs\Rust\Cargo\bin\rustc.exe” --crate-name hello src\main.rs --crate-type bin --emit=dep-info,link -C opt-level=3 -C lto -C metadata=349de1fcdb8623d1 -C extra-filename=-349de1fcdb8623d1 --out-dir D:\Computer\hello\target\release\deps -L dependency=D:\Computer\hello\target\release\deps

LINKLOG.TXT (formatted for better viewing)

“D:\Programs\VS2017\VC\Tools\MSVC\14.12.25827\bin\HostX86\x86\link.exe” /NOLOGO /NXCOMPAT /LARGEADDRESSAWARE /SAFESEH

/LIBPATH:D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib
D:\Computer\hello\target\release\deps\hello-349de1fcdb8623d1.hello0.rcgu.o

/OUT:D:\Computer\hello\target\release\deps\hello-349de1fcdb8623d1.exe
/OPT:REF,ICF /DEBUG
/NATVIS:D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\etc\intrinsic.natvis
/NATVIS:D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\etc\liballoc.natvis
/NATVIS:D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\etc\libcore.natvis
/LIBPATH:D:\Computer\hello\target\release\deps
/LIBPATH:D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib

D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib\libcompiler_builtins-c355ee975be6fc83.rlib advapi32.lib ws2_32.lib userenv.lib shell32.lib msvcrt.lib

Next I removed the LTO lines from “Cargo.toml” and rebuilt the executable. The content of “rsclog.txt” was nearly the same as before, except that “-C lto” was gone. The content of “linklog.txt” was as follows:

“D:\Programs\VS2017\VC\Tools\MSVC\14.12.25827\bin\HostX86\x86\link.exe” /NOLOGO /NXCOMPAT /LARGEADDRESSAWARE /SAFESEH
/LIBPATH:D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib

D:\Computer\hello\target\release\deps\hello-be0970e9a7678686.hello0-dd5b819bafdf9994a7d024311674fc5c.rs.rcgu.o
D:\Computer\hello\target\release\deps\hello-be0970e9a7678686.hello1-dd5b819bafdf9994a7d024311674fc5c.rs.rcgu.o

/OUT:D:\Computer\hello\target\release\deps\hello-be0970e9a7678686.exe

D:\Computer\hello\target\release\deps\hello-be0970e9a7678686.crate.allocator.rcgu.o

/OPT:REF,ICF /DEBUG

/NATVIS:D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\etc\intrinsic.natvis
/NATVIS:D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\etc\liballoc.natvis
/NATVIS:D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\etc\libcore.natvis

/LIBPATH:D:\Computer\hello\target\release\deps /LIBPATH:D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib

D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib\libstd-2eefd1bb8a414188.rlib
D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib\liballoc_system-a73097f8a15d6ce6.rlib
D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib\libunwind-6a15584ceb55e004.rlib
D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib\libpanic_unwind-48aa968b7232ee4d.rlib
D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib\liblibc-1dab299b452022f4.rlib
D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib\liballoc-b03f6a774b79bd71.rlib
D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib\libstd_unicode-11d6982f77622fda.rlib
D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib\libcore-56e48c7254dcabdd.rlib
D:\Programs\Rust\Rustup\toolchains\stable-i686-pc-windows-msvc\lib\rustlib\i686-pc-windows-msvc\lib\libcompiler_builtins-c355ee975be6fc83.rlib
advapi32.lib ws2_32.lib userenv.lib shell32.lib msvcrt.lib
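To compare the two linker logs more easily, the `.rlib` inputs can be pulled out of a logged command line with a few lines of Rust. This is only a sketch: the function name `rlib_inputs` is made up, and it assumes arguments are separated by whitespace and that paths contain no spaces, which holds for the logs above but not in general. The paths in `main` are shortened stand-ins for the long toolchain paths.

```rust
/// Return the file names of all `.rlib` inputs in a logged linker command line.
/// Assumes whitespace-separated arguments and paths without spaces.
fn rlib_inputs(command_line: &str) -> Vec<&str> {
    command_line
        .split_whitespace()
        .filter(|arg| arg.ends_with(".rlib"))
        // Keep only the part after the last path separator.
        .map(|arg| arg.rsplit(|c| c == '\\' || c == '/').next().unwrap_or(arg))
        .collect()
}

fn main() {
    // Shortened stand-ins for the two logged command lines.
    let with_lto = r"/OUT:hello.exe lib\libcompiler_builtins-c355ee975be6fc83.rlib msvcrt.lib";
    let without_lto = r"/OUT:hello.exe lib\libstd-2eefd1bb8a414188.rlib lib\libcore-56e48c7254dcabdd.rlib lib\libcompiler_builtins-c355ee975be6fc83.rlib msvcrt.lib";

    println!("with LTO:    {:?}", rlib_inputs(with_lto));
    println!("without LTO: {:?}", rlib_inputs(without_lto));
}
```

Run on the full logs, this makes the difference visible at a glance: one rlib is passed to the linker with LTO enabled, nine without it.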

I must admit that I have learned a lot from these command lines, but I still have some questions:

  1. What is NATVIS?
  2. It appears that an rlib can be read and linked by the Microsoft linker, but when I used DUMPBIN to dump the exports of an rlib, DUMPBIN crashed. Where can I find a description of the rlib format?
  3. If LTO is disabled, we can see that main.rs is compiled into two object files. Which parts of the source go into which object file?
  4. If LTO is enabled, we can see that many rlibs not used by main.rs are kept out of the link, resulting in a much smaller EXE. Is that all there is to Rust LTO? In other words, does the Microsoft linker itself support LTO? And if an rlib is included in the link, does the whole library end up in the final EXE?

#2

Regarding NATVIS, I believe it refers to Visual Studio’s debugger visualizer files: https://msdn.microsoft.com/en-us/library/jj620914.aspx


#3

The rlib format is technically unspecified, but in practice it’s an archive (much like a static library) of normal object files with a bit of extra metadata. You’ll probably need to dive into the source code to find out more though.
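If you want to poke at an rlib yourself, a quick first check is whether the file starts with the ar archive magic. The sketch below assumes, as is my understanding of current toolchains, that an rlib is an ar-style archive. The function name `starts_with_ar_magic` is made up for illustration, and the check says nothing about the Rust-specific metadata stored inside.

```rust
use std::env;
use std::fs::File;
use std::io::{self, Read};

/// Every ar archive begins with this 8-byte global header.
const AR_MAGIC: &[u8] = b"!<arch>\n";

/// True if the given bytes begin with the ar archive magic.
fn starts_with_ar_magic(bytes: &[u8]) -> bool {
    bytes.starts_with(AR_MAGIC)
}

fn main() -> io::Result<()> {
    // Expect the path to an .rlib file as the first argument.
    let path = match env::args().nth(1) {
        Some(p) => p,
        None => {
            eprintln!("usage: pass the path to an .rlib file");
            return Ok(());
        }
    };

    // Read at most the first 8 bytes of the file.
    let mut head = [0u8; 8];
    let n = File::open(&path)?.read(&mut head)?;

    if starts_with_ar_magic(&head[..n]) {
        println!("{} starts with the ar archive magic", path);
    } else {
        println!("{} does not look like an ar archive", path);
    }
    Ok(())
}
```

If the check succeeds, archive tools that understand the ar format should be able to list the members, which is a gentler starting point than asking DUMPBIN for exports.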

There’s also a work-in-progress document on rustc’s internals that walks through the processes happening under the hood, so that it’s easier for new contributors to get started. There isn’t a section on code generation yet, but you can always open an issue for it. Hopefully that link helps you in your quest to learn what rustc is doing when you type cargo build or rustc main.rs! :grinning:

