Soft question: Genera Lisp OS, Smalltalk, Turbo Pascal, lower bounds on Rust compilation speed?

I've never used Genera Lisp, Smalltalk, or Turbo Pascal, so this is all from internet reading. Supposedly they were all famous for blazingly fast compile times, despite running on far inferior hardware.

I'm curious: is there some fundamental lower bound on rustc compile times (i.e. does some step require solving an NP-complete problem), or is it just a matter of LLVM, or ...?

These languages don't have many of the analyses Rust does (like borrowck), and they don't have generics (the trait solver necessary to support them takes a decent amount of time). In addition, their backends barely did any optimization, which doesn't work for Rust's zero-cost abstractions: they aren't zero cost without optimizations. I don't think Rust will ever get as fast as those languages, but the compilation time can certainly be reduced.
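To make the zero-cost-abstraction point concrete, here's a small sketch (my own illustration, not from the thread): with optimizations enabled, an iterator chain typically compiles to the same tight loop as hand-written code, but at opt-level 0 each adapter and closure remains a real function call.

```rust
// Sketch: two ways to sum the squares of the even numbers in a slice.
// At opt-level 2/3 both usually compile to roughly the same machine code;
// at opt-level 0 the iterator version pays for every adapter/closure call.
fn sum_iter(data: &[u64]) -> u64 {
    data.iter().filter(|&&x| x % 2 == 0).map(|&x| x * x).sum()
}

fn sum_loop(data: &[u64]) -> u64 {
    let mut total = 0;
    for &x in data {
        if x % 2 == 0 {
            total += x * x;
        }
    }
    total
}

fn main() {
    let data: Vec<u64> = (1..=10).collect();
    assert_eq!(sum_iter(&data), sum_loop(&data));
    println!("{}", sum_iter(&data)); // 4 + 16 + 36 + 64 + 100 = 220
}
```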

(By the way, Smalltalk is not exactly comparable, as it uses a JIT and thus only compiles things when they get executed. I believe it also has an interpreter as the first compilation tier, which runs on code that was compiled to bytecode the moment you saved the method.)

5 Likes

It's easy to make a fast compiler if you've got a basic type system or you don't care about runtime performance.

A good example of this is Python, where "compiling" is just a case of parsing the source code and emitting bytecode instructions on startup, leaving the heavy lifting up to the runtime (e.g. implementing low level operations in C or treating things like field access as a dictionary lookup). You see a similar mentality in C# where they heavily rely on a JIT to get good performance.

Interesting. Maybe JIT is indeed what I should be looking for.

I'm currently working on a 30k-40k LOC codebase (all my code, not counting dependencies). On a 12-core, 24-thread system where target (the build dir) is on a ramdisk, the incremental compile time for a SINGLE LINE change is often on the order of 10-15 seconds.

This is not recompiling external crates (as they are untouched and we are measuring incremental compile time). target is on ramdisk, so this is all just CPU / waiting, no disk io.

30k loc, 1 line change, 24 threads, yet 15 seconds.

I (clearly) don't know Rust internals, but this seems difficult to accept as a lower bound.

Rustc is currently not parallelized except for codegen, so it is very much possible that most of the time is spent in the serial part of the compilation. You could try timing cargo check. (Make sure to disable rust-analyzer temporarily, as it invokes cargo check for you, which would make your own invocation complete almost instantly after the rust-analyzer-invoked one has finished.) You might also want to check whether a faster linker like lld or mold helps. I'm confident that 15s is not the lower bound on how long compiling your project needs to take, but significant improvements may take a while. I'm working on improving codegen performance for debug builds: GitHub - bjorn3/rustc_codegen_cranelift: Cranelift based backend for rustc

Yeah, something seems wrong there. I'm used to an incremental recompile taking maybe a second at most.

Using a ramdisk might actually be hurting you here. Say your target/ folder has 4 or 5 GB of stuff in it (not uncommon) and your whole machine has 8 GB of RAM available: rustc would almost certainly use the remaining RAM and start swapping to disk. I normally let my projects live on my computer's NVMe SSD while all the infrequently used stuff (videos, etc.) sits on a spinning-rust hard drive.

You might also want to look into different linkers. In the past, I've had massive improvements during linking by switching from the GNU linker to something like LLVM's LLD.
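For reference, one common way to opt into lld is a project-local .cargo/config.toml like the sketch below (my own example, not from the thread; it assumes lld is installed, an x86-64 Linux target, and that your toolchain's default linker driver accepts -fuse-ld; mold can be substituted the same way where supported):

```toml
# .cargo/config.toml (project-local, hypothetical example)
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=lld"]
```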

1 Like

@bjorn3 @Michael-F-Bryan : Wait, sorry, are you guys talking debug builds or release opt-level=3 builds?

I'm using release opt-level=3, because with anything slower, the png/jpeg decoders seem to be really, really slow at runtime in wasm.

Now that I think about it again, this issue is a bit silly on my part. By choosing release opt-level=3, I've traded compile-time speed for runtime speed, so complaining about slow compile times is a bit silly.

It's possible to optimise just one dependency without compiling your entire app in release mode. That's what insta suggests to speed up snapshot testing, even though you'll normally run tests in debug mode (cargo test).

# Cargo.toml
[profile.dev.package.insta]
opt-level = 3
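Relatedly (an extra sketch of the same mechanism, not something insta suggests): Cargo also accepts a "*" wildcard in the package override, which keeps your own crate compiling quickly in debug mode while optimizing all dependencies, e.g. slow image decoders:

```toml
# Cargo.toml (sketch: optimize every dependency, but not your own code)
[profile.dev.package."*"]
opt-level = 3
```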
2 Likes