Why does rustc require 7GB of memory?

I'm running into some problems because the rustc build of one of my crates (https://github.com/samuela/rustybox) requires ~7GB of memory. Why in the world does rustc require so much memory? Are there any particular Rust features that lead to this kind of outsized memory usage?

Would "crate shattering" the project into multiple crates help? In this particular case, I'm okay with doing that if it solves the problem, but it hardly seems like it should even be necessary in the first place.

Here's how I benchmarked it:

$ /usr/bin/time -v cargo build --jobs 1
...
Finished dev [unoptimized + debuginfo] target(s) in 3m 17s
	Command being timed: "cargo build --jobs 1"
	User time (seconds): 156.60
	System time (seconds): 17.12
	Percent of CPU this job got: 87%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 3:17.98
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 7053216
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 27
	Minor (reclaiming a frame) page faults: 768571
	Voluntary context switches: 38426
	Involuntary context switches: 304785
	Swaps: 0
	File system inputs: 334648
	File system outputs: 632816
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

A heap profiler like valgrind massif might help you figure out what's so big.

One guess is that it's just a ton of code for LLVM to process all at once, which doesn't happen in C's per-file compilation. If so, "crate shattering" might indeed help. You could also try cargo-llvm-lines to see if your code is generating anything surprising or pathological.
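For concreteness, here is roughly how those two tools could be invoked. This is a sketch, not something I've run against this crate: the function names (`massif_build`, `massif_report`, `top_llvm_lines`) are made up for illustration, and it assumes valgrind and cargo-llvm-lines are already installed.

```shell
# Run the whole build under Massif. --trace-children=yes matters because
# cargo spawns rustc as a child process. Expect the build to be far slower.
massif_build() {
    valgrind --tool=massif --trace-children=yes cargo build --jobs 1
}

# Render one of the resulting massif.out.<pid> files as a text report.
# Usage: massif_report massif.out.<pid>
massif_report() {
    ms_print "$1"
}

# Show the functions that contribute the most LLVM IR lines.
# Requires: cargo install cargo-llvm-lines
top_llvm_lines() {
    cargo llvm-lines | head -n 20
}
```

Massif writes one output file per traced process, so you'd have to pick out the one that belongs to the rustc invocation for the big crate. For a binary crate, `cargo llvm-lines` may also need a `--bin` argument.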

I know that incremental compilation and debug info lead to huge disk usage. I wonder how they affect the compiler's RAM usage though? If someone runs the above bench with env CARGO_INCREMENTAL=0 RUSTFLAGS="-C debuginfo=0", please share the results :wink:


For what it's worth: if I were writing a BusyBox replacement, I'd put each tool in a separate crate (or possibly groups of related ones that share lots of code) and rely on LTO to do the code sharing and whatnot that BusyBox is known for.

That's not me saying "you're doing it wrong"; what you've done is also reasonable. I bring this up because clusters of smaller crates are more typical of Rust programs, and are therefore what the tools have been optimized around. It may be that you've found a bug because you're doing something unusual (your crate appears to be massive). It probably won't be the last. If that becomes annoying to you, splitting up the crate might help you avoid it. (I bet it also helps your build times.)

Or, of course, you might be excited about helping us reduce the compiler's memory usage in cases like this, in which case :partying_face: ! Filing a bug might make sense.


Ok, here are the results!

$ cargo clean
$ CARGO_INCREMENTAL=0 RUSTFLAGS="-C debuginfo=0" /usr/bin/time -v cargo build --jobs 1
...
    Finished dev [unoptimized + debuginfo] target(s) in 2m 46s
	Command being timed: "cargo build --jobs 1"
	User time (seconds): 130.29
	System time (seconds): 18.22
	Percent of CPU this job got: 89%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 2:46.25
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 6101920
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 24
	Minor (reclaiming a frame) page faults: 3041267
	Voluntary context switches: 16999
	Involuntary context switches: 258476
	Swaps: 0
	File system inputs: 151088
	File system outputs: 210160
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

So it does reduce peak memory usage by ~1GB (maximum resident set size drops from 7,053,216 kB to 6,101,920 kB, about 13%), which is substantial, but nowhere near enough to bring us down from the currently stratospheric levels.
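A quick sketch of the arithmetic behind that comparison, using the two "Maximum resident set size" values from the runs above. The helper names `kb_to_gib` and `rss_savings_pct` are made up for illustration, and this assumes the "kbytes" that /usr/bin/time reports are units of 1024 bytes.

```shell
# Convert a "Maximum resident set size (kbytes)" value to GiB.
# Assumption: one kbyte here means 1024 bytes.
kb_to_gib() {
    awk -v kb="$1" 'BEGIN { printf "%.2f\n", kb / (1024 * 1024) }'
}

# Percentage saved going from peak $1 to peak $2 (both in kbytes).
rss_savings_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

kb_to_gib 7053216                  # default dev build: prints 6.73
kb_to_gib 6101920                  # incremental off, debuginfo=0: prints 5.82
rss_savings_pct 7053216 6101920    # prints 13.5
```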


Yeah, splitting into separate crates is actually on the roadmap for other reasons as well (installing individual utilities, more precise licensing, etc.). The current layout of the project is a result of c2rust outputting all of the code in a single crate.

I guess I didn't think of this as being a particularly large crate, but perhaps I need to recalibrate on that :stuck_out_tongue: Either way, this extreme memory usage seemed buggy/unusual. I'll open up a bug report! (Done: https://github.com/rust-lang/rust/issues/67183)


I'm not aware of any global analyses of crate source code sizes, but based on my personal experience, 469.4k lines across 563 modules is unusually large for a single crate. Most projects seem to carve up their crates on the order of 1-10kloc. (I still wouldn't expect 7GiB though -- I would still like the behavior you're describing to change.)
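If anyone wants comparable numbers for their own crate, something like the following gives a rough count. It is only a sketch and not necessarily how the figures above were obtained: the function names are made up, it treats every .rs file as one module, and it counts blank lines and comments too.

```shell
# Count .rs files under a directory (a rough proxy for the module count).
count_rs_files() {
    find "$1" -type f -name '*.rs' | wc -l | tr -d ' '
}

# Count total lines (including blanks and comments) across those files.
count_rs_lines() {
    find "$1" -type f -name '*.rs' -exec cat {} + | wc -l | tr -d ' '
}

# Usage, from the crate root:
#   count_rs_files src
#   count_rs_lines src
```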

I wonder if c2rust's flavor of Rust is contributing? (Probably not, but it certainly doesn't produce typical Rust code.)


Yeah, that does sound pretty big I guess! There's some more discussion here suggesting that the borrow checker may be at fault. I'm not able to reproduce those results however...