Making executables smaller

Thanks for this compiling info.

Again, I'd encourage putting this explicitly into the Rust Book, so that others can learn about this option from the documentation.

So here are the results using this compiling option.

hello_world.rs
---------------------------
fn main() {
   println!("Hello World!")
}
---------------------------

$ rustc -O -C prefer-dynamic hello_world.rs
$ strip hello_world

32-bit Linux   1.8         1.9        
rustc         10,561      10,573
strip          5,568       5,568
dif            4,993       5,005
% dif          47.3        47.3
rust/strip      1.9         1.9

64-bit Linux   1.8         1.9        
rustc         12,586      12,598
strip          6,304       6,304
dif            6,282       6,294
% dif          49.9        50.0
rust/strip      2.0         2.0
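
For anyone who wants to check the derived rows, here is a minimal sketch of the arithmetic. It assumes "dif" is the number of bytes removed by strip, "% dif" is that saving relative to the unstripped size, and "rust/strip" is the unstripped size divided by the stripped size. The `Reduction` struct and `reduction` function are names made up for this illustration.

```rust
/// Derived figures for one (unstripped, stripped) pair of file sizes in bytes.
struct Reduction {
    saved: u64,         // "dif": bytes removed by strip
    percent_saved: f64, // "% dif": saving relative to the unstripped size
    ratio: f64,         // "rust/strip": unstripped size divided by stripped size
}

/// Assumes `unstripped >= stripped`; otherwise the subtraction would overflow.
fn reduction(unstripped: u64, stripped: u64) -> Reduction {
    let saved = unstripped - stripped;
    Reduction {
        saved,
        percent_saved: 100.0 * saved as f64 / unstripped as f64,
        ratio: unstripped as f64 / stripped as f64,
    }
}

fn main() {
    // Sizes copied from the Rust 1.8 columns of the tables above.
    let rows = [("32-bit", 10_561u64, 5_568u64), ("64-bit", 12_586, 6_304)];
    for (label, unstripped, stripped) in rows {
        let r = reduction(unstripped, stripped);
        println!(
            "{}: saved {} bytes ({:.1}%), ratio {:.1}",
            label, r.saved, r.percent_saved, r.ratio
        );
    }
}
```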

Now these results are much more reasonable, and pleasing.

I'm wondering though, since you've acknowledged excessive binary sizes have been
brought up before, have you taken these issues raised by Users to heart?

This example demonstrates that Rust has a standard way to create binaries that are orders of magnitude
smaller than what the default method produces, so the question becomes: why isn't this
the default compiler method?

The answer to this seems to reside more in the realm of philosophy than technology.

But I'm pretty sure 999,999 out of a million users would prefer having binaries that are
50-100 times smaller than what the current default compiling process produces. Don't you?

I also can't help but think it would be a much better marketing feature, and attractive to
more Users in more domains of use, to demonstrate a 5-6KB Hello World! program than a 500-600KB one.

upx used to be very popular among freeware and shareware developers in the 90s and early 2000s. In many cases the size savings meant that your software could fit on a single floppy disk, or would look more attractive to people who had to download it over a slow dial-up connection. It actually bundles unpacker code together with your executable, and for small programs that can increase the file size.

The main downside is that, in order to execute a binary, the unpacker has to write the unpacked code to memory pages that are both writable and executable. This can be prohibited in some strict environments (like iOS), and can trigger security / antivirus software warnings in others. It turned out that a small executable size was also very desirable for most malware authors. As a result, if you used upx, any new release of your software could trigger a false positive with some antivirus. There was no easy way to whitelist your executable: there were no "recognized developer" signatures of the kind Apple and Microsoft offer these days, nor app stores. As time went on and file size became less of an issue, most software vendors stopped using upx to reduce the number of customer complaints.

Deployment. It's easier to deploy a self-contained binary without having to package a bunch of dynamic libraries with it.

Plus there's no stable Rust ABI, which makes deploying dynamic dependencies even harder. Everything has to be published together, exactly as it was built.

Keeping symbols in the binary is a feature of the implementation, inherited directly from LLVM. GCC does the same; MSVC instead places them in a separate .pdb file. It isn't a bug, and it is down to the developer (or release engineer) to decide whether or not to strip them. The output of cargo build --release is not necessarily intended for the end user; further packaging should typically be applied (plus a crash reporter). Static linking is also the current default choice of rustc.

Without symbols, debugging crashes that end users experience becomes near impossible.
objcopy is the main program to complement strip.

The question is very generic and misleading. Rust targets software developers who want to write efficient CPU applications (i.e. not JavaScript or GPU code such as OpenCL). Its core feature is making memory corruption, whether through access to unintended memory or through unrestricted multithreaded access, impossible when using only safe code. Such bugs can be very hard to fix.
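
As a small illustration of the multithreading point, here is a sketch in safe code only. The `count_in_parallel` function is a name invented for this example; `Arc`, `Mutex` and `thread::spawn` are from the standard library. The compiler would reject a version where the threads mutate a plain shared `usize` without synchronization, which is the guarantee being described.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increments one shared counter from several threads, using only safe code.
/// The counter has to be wrapped in `Arc<Mutex<_>>` to be shared and mutated;
/// unsynchronized shared mutation does not compile.
fn count_in_parallel(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each thread gets its own reference-counted handle to the counter.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    // Wait for every worker to finish before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("total = {}", count_in_parallel(4, 1_000));
}
```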

Rust is good for the end user, i.e. the end user does not need to know what language the software was written in (e.g. there is no runtime to keep updating; only the software itself has to be updated).

I'm afraid (or thankful) that efficient KB-level memory usage, even for embedded, is in the past. All commercially practical new embedded systems have so much more memory that even GC languages can be an option. AFAIK Rust is not designed to target safety-critical systems that require strict memory usage and static analysis.

There is nothing substantial that makes C inherently more memory efficient than Rust. Rust adds some overhead (probably similar to that of C++) due to having advanced functionality (e.g. dynamic dispatch). For non-trivial programs this overhead is small. Writing safe code can have a small impact but is typically worth it.
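
To make the dynamic dispatch example concrete, here is a minimal sketch contrasting it with static dispatch. The `Greet` trait, the `English` type and the two helper functions are made up for this illustration. It only shows where the extra indirection comes from (a trait-object reference carries a vtable pointer next to the data pointer); it does not measure any effect on binary size.

```rust
use std::mem::size_of;

trait Greet {
    fn greet(&self) -> String;
}

struct English;

impl Greet for English {
    fn greet(&self) -> String {
        "Hello World!".to_string()
    }
}

// Static dispatch: the compiler generates one copy of this function per
// concrete `T`, and the call to `greet` is resolved at compile time.
fn greet_static<T: Greet>(g: &T) -> String {
    g.greet()
}

// Dynamic dispatch: a single copy of this function exists, and the call to
// `greet` goes through the vtable stored alongside the data pointer.
fn greet_dynamic(g: &dyn Greet) -> String {
    g.greet()
}

fn main() {
    let e = English;
    assert_eq!(greet_static(&e), greet_dynamic(&e));

    // A plain reference is one pointer; a trait-object reference is two
    // (data pointer plus vtable pointer).
    println!(
        "&English: {} bytes, &dyn Greet: {} bytes",
        size_of::<&English>(),
        size_of::<&dyn Greet>()
    );
}
```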

To whoever wrote this: Great work! Rustlog: Why is a Rust executable large?

I believe that would be @lifthrasiir. I suggested on Twitter that maybe it should be linked from the FAQ or otherwise added to the documentation somewhere, possibly in the Rustonomicon, since it is more esoteric than "vanilla" Rust development.

I definitely agree with @jzakiya that putting something like that post somewhere in the docs would be a good step forward for those scenarios where you want a really slim binary, as long as you understand all the caveats around the approaches (which that post lays out very well).

Just a really great post @lifthrasiir, thanks!

This post is really nice. I had never thought that Rust binaries could even be smaller than C's :open_mouth:

wrong, this statement has been refuted, sorry, my bad :wink:

I did a little study with MIR and it turns out -Zorbit also makes your binaries visibly smaller :smiley:
Though this is gonna be a default soon anyway :blush:

Do you have some numbers? @nikomatsakis and I, at least, are curious.

Use of the println! macro increases the size of the binary by a decent amount. If you do it manually, you'll save a lot of space.

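use std::io::{self, Write};

fn main() {
    // Write the bytes straight to stdout, bypassing println! and its
    // formatting machinery.
    let mut stdout = io::stdout();
    // The result (number of bytes written, or an error) is deliberately ignored.
    let _ = stdout.write("Hello World!".as_bytes());
}

$ rustc -C opt-level=s -C prefer-dynamic main.rs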

9344 bytes
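
A slightly more defensive variant of the same idea, as a sketch only: `write` may write fewer bytes than requested, so `write_all` is the usual choice when the whole message must go out. The `write_greeting` helper is a name invented here, and I have not measured the binary size of this version, so the figure above does not apply to it.

```rust
use std::io::{self, Write};

/// Writes the greeting to any writer without going through println!.
/// `write_all` retries until every byte is written or an error occurs.
fn write_greeting<W: Write>(out: &mut W) -> io::Result<()> {
    out.write_all(b"Hello World!\n")
}

fn main() -> io::Result<()> {
    // Lock stdout once and report failures instead of discarding them.
    let stdout = io::stdout();
    let mut handle = stdout.lock();
    write_greeting(&mut handle)
}
```

Taking a generic writer also makes the function easy to check against an in-memory buffer such as a `Vec<u8>`.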

Let me reproduce my results...

working...
...done

nope, sorry, I must have mixed up some flags there, the results are identical after all

Strange, a regular hello.rs on ARM ends up at just 5815 bytes, whereas your version ends up at 9963 bytes. Did you forget to strip or something?

ARM isn't AMD64.

A regular hello.rs produces a 6304 byte binary on AMD64 so the relative differences probably hold.

Yup.

The ONLY case I could see this being used is in the context of a distribution package manager, where everything is built with a single distribution compiler.

For anything else, you need to link statically, or bundle your dynamic libraries with your executable (as is common on Windows).

Note that if you statically link a C program with libc, you will get similar binary sizes. You need dynamic linking, or direct use of syscalls (without libc), or whole-program dead code elimination that can actually find a lot to eliminate (it often can't) to get much less.

If you need debug info in your binary to keep stack traces, you can still improve things by passing -C link-args=-gz=zlib to rustc on Linux. That instructs ld to compress the debug sections.

For future reference, I've created a min-sized-rust repository that documents all of the ways to reduce the size of a Rust binary.
