Making executables smaller

There's an article in the May 2016 issue of Linux Format magazine on using Rust.

In it, the author used the [Li|U]nix command strip to reduce the 'Hello World' executable size.

Here's an example with a Rust executable of a program I created.

jzakiya@localhost ~ $ file prodfactors
prodfactors: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=5ec13d16cc5767604036150ee9e967f87d3b7cb7, not stripped
jzakiya@localhost ~ $ ls -al prodfactors
-rwxr-xr-x 1 jzakiya jzakiya 468748 May 23 22:07 prodfactors*
jzakiya@localhost ~ $ strip prodfactors
jzakiya@localhost ~ $ ls -al prodfactors
-rwxr-xr-x 1 jzakiya jzakiya 388592 May 23 22:08 prodfactors*
jzakiya@localhost ~ $ file prodfactors
prodfactors: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=5ec13d16cc5767604036150ee9e967f87d3b7cb7, stripped

  1. Are there any negative effects of stripping Rust executables (its documentation suggests no)?

  2. Are there Rust compiler flags/options to automatically strip out symbols from its executables?

  3. Using strip seems to appreciably reduce the size of executables. Was/is there any thought to make this the default behavior for the compiler?

I don't know anything about strip but if you're trying to get smaller executables (as the thread title suggests) the easiest thing might be to just use the system allocator instead of bundling jemalloc.
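
For anyone reading this on a newer toolchain, opting into the system allocator can be sketched roughly as below. This assumes a compiler recent enough to have the stable `#[global_allocator]` attribute and `std::alloc::System`, both of which arrived after this thread; `build_greeting` is a hypothetical helper that exists only to force a heap allocation. Newer compilers no longer bundle jemalloc by default, so there this mostly makes the choice explicit rather than saving space.

```rust
use std::alloc::System;

// Route every heap allocation in this program through the platform's
// allocator (malloc/free on Linux) instead of a bundled one.
#[global_allocator]
static GLOBAL: System = System;

// Hypothetical helper: building a String allocates on the heap,
// so the allocation goes through GLOBAL.
fn build_greeting(name: &str) -> String {
    format!("Hello, {}!", name)
}

fn main() {
    println!("{}", build_greeting("World"));
}
```

Comparing `ls -al` on the binary with and without the `#[global_allocator]` lines is the simplest way to see whether it makes a difference on your compiler version.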

You can also throw upx at your binary; in my case that turns a stripped 5.1MB bin (--release) into a 1.4MB one.

It appears stripping breaks RUST_BACKTRACE but you might not want it in a release binary anyway.
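
If you want to see the effect for yourself, a small sketch like the following may help. It assumes a toolchain that has the stable `std::backtrace` module (much newer than this thread), and `capture_trace` is a hypothetical helper name. Build it, run it, then run `strip` on the binary and run it again: on an unstripped build the frames usually resolve to function names, while after stripping they tend to show up as unknown.

```rust
use std::backtrace::Backtrace;

// Hypothetical helper: capture a backtrace at the call site, regardless of
// whether RUST_BACKTRACE is set, and render it as text.
fn capture_trace() -> String {
    Backtrace::force_capture().to_string()
}

fn main() {
    // Compare this output before and after running `strip` on the binary.
    println!("{}", capture_trace());
}
```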

hoodie/killercop, can you give specific examples of the techniques you suggested?

  1. Yes. Without debug info it's much harder to know where the code crashes/panics.

You can do something like upx --best target/release/yourbin and upx will create a compressed binary which behaves the same way as the original, only it decompresses itself at runtime.
That's quite cool, only I have no idea about the implications. Anybody else?

According to the UPX website (http://upx.sourceforge.net/) the last release was September 30, 2013, so it doesn't seem to be actively supported. Plus, it's not a part of the standard GNU tool chain, so I wouldn't choose to use it.

I was really hoping some developers would provide direct answers to the 3 specific questions I asked.

If stripping the symbols from executables can reduce their size by 30-40%, why isn't that the default behavior of the compiler? If you want/need the symbols then a compiler flag can be set to retain them.

If there is no performance hit (and documentation suggests it may improve performance to strip) it seems obvious that the smaller file size is _better_ (more marketable, easier to package; takes up less bandwidth to copy/upload/download over networks) to use.

It just doesn't look/seem right to have to create a 450+ KB file just to say "Hello World".

I linked to a chapter in the book above in case you missed it :slight_smile:

It's not gnu, but GPL, and it seems to be kept up to date on my distro at least. Thing is, it shrinks a --release build to about 30% its original size without cutting anything out, I find that very impressive.

1) Are there any negative effects of stripping Rust executables (its documentation suggests no)?

2) Are there Rust compiler flags/options to automatically strip out symbols from its executables?

I don't know, sorry, though strip is usually performed by packagers; you'll find it in Arch Linux's makepkg.conf (similar to upx, btw :smiley: )

3) Using strip seems to appreciably reduce the size of executables. Was/is there any thought to make this the default behavior for the compiler?

I think not, because of 1)

Thanks for the replies, but I still don't see the validity of the answer to 1).

As a programmer (not a developer) when I'm creating a program I will compile without doing --release.

$ cargo build

This will identify the errors (warnings) as I iterate development of the program until it works.
At this point I'm not too concerned with the executable size, or even its ultimate possible performance.

Now, as a programmer, when I've got my program working correctly (giving me the results I want) then I want to compile it using --release.

$ cargo build --release

At this point, as a programmer, I would expect the compiler to fully optimize the executable's performance, and also its file size. At this point, I don't care about the internals of symbol tables, back traces, etc. All I want is an optimally performing program. And, as a programmer, I also want an optimized file size because I'm not doing any more development on the code, because I now want to use/deploy it.

So, when I'm compiling with --release, I'm finished debugging my program, so I don't see how removing access to debugging information at that point can be considered a negative.

It would be nice to at least provide to programmers the option to compile with a flag that will strip the final executable, something like below.

$ cargo build some_project --release -strip

I don't see how this could hurt, and it provides programmers a built-in way to shrink the final executable size at their discretion/desire.

I hope you can see how this would be a benefit to programmers, especially if you are using multiple external crates that aren't stripped. Those extra useless KBs start to add up quickly.

As mentioned earlier, the standard practice is for package managers to run strip as one of the steps in the packaging process after building. In fact, they will frequently use strip in a couple of different modes to produce one file that is the executable and another that contains only the debuginfo, and link them together. That way, if you install the foo package you get only the executable and other necessary files, while you can also install foo-dbg to get symbols and better backtraces when debugging.

Since exactly how they strip the executables and where they put the debug info is fairly distro-specific, this step is generally best left up to the packaging process.

However, if you want to do this with Cargo, you should be able to build with something like the following (untested):

$ cargo rustc --release -- -C link-args=-Wl,-x

Of course, the argument passed to link-args will be platform-specific, but it looks like -Wl,-x is correct for at least GNU ld, gold, and OS X's ld.

And if you really do want to get a smaller "hello, world", take the advice from the first comment and switch to the system allocator. That plus stripping symbols will reduce your executable size considerably. And heck, you can even switch to no_std and just use libc to print if you really want:

#![feature(alloc_system)]
#![no_std]

extern crate libc;
extern crate alloc_system;

fn main() {
    unsafe {
        libc::puts(b"Hello, world\0" as *const u8 as *const libc::c_char);
    }
}

Heck, let's even add -C opt-level=z and -C lto, which gets an even smaller executable.

If you want to go even smaller still, you can use tricks like in this post, bypassing libc, libstd, libcore, and just making system calls directly, but that requires some more post-processing linker trickery to actually extract and package the resulting code into an executable. Also note that this post was written pre-1.0, so some things may have changed.

As you can see, there are many things you can do to get smaller executables. Rust isn't necessarily optimized for producing the smallest executable by default; it includes a separate allocator, as well as statically linking its own standard library. This will tend to produce a larger executable than a C or even C++ compiler would, since those generally rely on the already installed standard library.

Since a "hello, world" executable is already carrying the weight of a custom allocator and libstd around, and the standard for compilers is to not strip executables by default, it would seem odd to trim off just a little bit of size while surprising users who expect to still be able to get backtraces in a debugger or panic message.

Just since we're on the topic of things that make your binaries smaller:

-C panic=abort

do this at your own risk though :smiley:

I don't think aborting on panic is particularly risky. It means you can't do highly-available servers, but we're talking about "hello world" here.

That's something that would be okay for me, as a release binary doesn't need backtrace, at least in my case.

Talking about binary size: I would love to decrease the size of executables. Recently I've had the hassle, multiple times, of very long upload times to get my binary onto the test server because of my slow connection.
Doesn't a bigger binary also increase the "bloat" a CPU has to handle?
And as a second question: doesn't the catch/throw of C++ create just as much unwinding code, or do those cases simply end in an abort?

You get backtraces even with aborts, actually. It doesn't have to do with unwinding.

It would be, but if you disable exceptions, like many do, then it won't. Same thing as Rust.

Strange nobody gave the simplest answer: -C link-args=-s, so that the above command (if you don't want to make it permanent) looks like this:

RUSTFLAGS='-C link-args=-s' cargo build --release

I want to thank the responders who provided examples of compiling settings to create smaller executables. At this stage in my knowledge of Rust, I have no idea what they are doing, and why.

I would suggest (and urge) that this knowledge be included in the Rust Book (documentation), as an official guide to users on how to currently create smaller executables.

However, in raising this issue I had hoped the response would be more along the lines of:
"OK, I see your concerns. Yeh, maybe we need to look into seeing how we can make the executables inherently smaller."

Please, please, please, take this issue seriously.

Let's look at the example that started this.

hello_world.rs
---------------------------
fn main() {
   println!("Hello World!")
}
---------------------------

$ rustc hello_world.rs
$ strip hello_world


32-bit Linux    Rust 1.8    Rust 1.9    (sizes in bytes)
rustc            533,943     599,205
strip            309,400     354,504
diff             224,543     244,701
% diff              42.1        40.8
rustc/strip         1.73        1.69

64-bit Linux    Rust 1.8    Rust 1.9    (sizes in bytes)
rustc            590,007     654,559
strip            318,952     364,008
diff             271,055     290,551
% diff              45.9        44.4
rustc/strip         1.85        1.80
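
For anyone who wants to check the derived rows, the arithmetic behind the tables above can be sketched like this. The byte counts are copied from the rustc and strip rows; the function names are hypothetical and exist only for this illustration, and the subtraction assumes the stripped file is never larger than the unstripped one.

```rust
// Bytes removed by `strip` (assumes stripped <= unstripped).
fn bytes_saved(unstripped: u64, stripped: u64) -> u64 {
    unstripped - stripped
}

// Share of the original file that `strip` removed, in percent.
fn percent_saved(unstripped: u64, stripped: u64) -> f64 {
    bytes_saved(unstripped, stripped) as f64 / unstripped as f64 * 100.0
}

// How many times larger the unstripped file is than the stripped one.
fn size_ratio(unstripped: u64, stripped: u64) -> f64 {
    unstripped as f64 / stripped as f64
}

fn main() {
    // (label, unstripped bytes, stripped bytes), copied from the tables.
    let rows = [
        ("32-bit, 1.8", 533_943u64, 309_400u64),
        ("32-bit, 1.9", 599_205, 354_504),
        ("64-bit, 1.8", 590_007, 318_952),
        ("64-bit, 1.9", 654_559, 364_008),
    ];
    for &(label, full, stripped) in rows.iter() {
        println!(
            "{}: saved {} bytes ({:.1}%), ratio {:.2}",
            label,
            bytes_saved(full, stripped),
            percent_saved(full, stripped),
            size_ratio(full, stripped)
        );
    }
}
```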

Fundamental Questions

  1. Why should it take 500+ KBs to create an executable to merely display Hello World!?

It is apparent execution efficiency is not currently a design priority for Rust (developers).

And from version 1.8 to 1.9 the efficiency has gotten worse.

As this example shows, the executable increased ~65,000 bytes from version 1.8 to 1.9 on both 32- and 64-bit Linux systems, with 1.9 now taking ~600+ KBs.

We also see the number of unnecessary (stripped off) bytes for the executable increased by ~20 KBs.

I see this as a major design flaw, which practically eliminates using Rust on small SBC systems like Raspberry Pis, and their like.

  2. What has to change in Rust's design to produce (much) smaller executables?

You're not going to like what I think, but I think a complete fundamental redesign of the language would need to occur to make producing small executables a design goal.

It seems there needs to be a different partitioning of resources (libraries, etc) that contain the elements that perform the called-for tasks, so that the machine code produced is the minimum necessary to do just the desired work.

I understand that being a young language still in early development, the focus is on increasing features and fixing bugs, but creating 'bloatware' needs to be acknowledged, guarded against, and reversed.

  3. Who are the ultimate users Rust is designed for?

C was created to write Unix in, instead of assembly, to port it to each new machine.
It's still the language Linux, Ruby, and other projects use to get close to bare metal.
Because of its age and heritage, it's much more memory efficient than Rust is now.

I know that part of the reason most of these new 'system' languages (Rust, D, Go, etc) are so inefficient is that their designers have grown up in an era of memory abundance. I grew up programming in the days of kilobytes of RAM and megabytes of disk storage. You read Byte Magazine and Dr. Dobb's Journal to learn how to pinch and scrimp on memory usage. Now you start out on the lowliest of systems with gigabytes of RAM and near terabytes of disk storage. And this affects your view on the need to be memory efficient.

But the world is increasingly becoming less about servers and desktops, and more about mobile and small SBCs (again Raspberry Pis, BeagleBoards, etc), and IoT (Internet of Things) devices. In these domains, memory use efficiency (RAM and disk) is still a necessity and priority.

Please, take these as constructive criticisms.
Rust has a lot of great design goals, and potential.
Make producing much smaller/leaner executables a fundamental design goal too.

  1. We're statically linking our own allocator, instead of using libc's. For this program, that's clearly not disk space well spent, but for programs that allocate a lot it improves execution speed.

  2. Stack trace information. That's why strip removes so much stuff. The sidecar debug info use case is why it's there in release mode. Stuff crashes in production sometimes.

  3. Unwinding tables. Those run during panics; combined with panic recovery, they can be used to create highly-available services.

Those are the big sources of executable size.
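
To make item 3 concrete, here is a minimal sketch of panic recovery, assuming a toolchain with the stable `std::panic::catch_unwind`. The `handle_request` and `serve` functions are hypothetical stand-ins for a real service. Recovery like this only works when panics unwind; built with -C panic=abort, the process would abort at the first panic instead.

```rust
use std::panic;

// Hypothetical request handler that fails on one particular input.
fn handle_request(id: u32) -> u32 {
    if id == 2 {
        panic!("request {} hit a bug", id);
    }
    id * 10
}

// Run one request, turning a panic into an Err instead of letting it take
// down the whole process. This relies on the unwinding tables being present.
fn serve(id: u32) -> Result<u32, String> {
    panic::catch_unwind(|| handle_request(id))
        .map_err(|_| format!("request {} failed", id))
}

fn main() {
    for id in 1..=3 {
        match serve(id) {
            Ok(value) => println!("request {} -> {}", id, value),
            Err(msg) => println!("{}, still serving", msg),
        }
    }
}
```

The panic message is still printed to stderr by the default panic hook; the point is that the loop keeps going afterwards.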

Besides the fact that 1 improves execution speed and 2 & 3 have no effect on it, this isn't even usually true. Look up "space-time trade-off," and then, if you want to know how Rust or anything else performs in terms of execution speed, measure the execution speed of it, not its disk footprint.
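
As a rough illustration that these are two separate measurements, the sketch below times a piece of work and, independently, reports the size of its own executable on disk. The `workload` function is a hypothetical stand-in for whatever you actually care about, and a serious benchmark would need repeated runs and care that the optimizer does not fold the work away.

```rust
use std::env;
use std::fs;
use std::io;
use std::time::{Duration, Instant};

// Hypothetical workload: sum of squares from 1 to n (wrapping on overflow).
fn workload(n: u64) -> u64 {
    (1..=n).fold(0u64, |acc, x| acc.wrapping_add(x.wrapping_mul(x)))
}

// Wall-clock time of one run of the workload, plus its result.
fn time_workload(n: u64) -> (u64, Duration) {
    let start = Instant::now();
    let result = workload(n);
    (result, start.elapsed())
}

// Size on disk of the currently running executable, in bytes.
fn own_size_on_disk() -> io::Result<u64> {
    let path = env::current_exe()?;
    Ok(fs::metadata(path)?.len())
}

fn main() -> io::Result<()> {
    let (result, elapsed) = time_workload(10_000_000);
    println!("result: {}", result);
    println!("elapsed: {:?}", elapsed);
    println!("size on disk: {} bytes", own_size_on_disk()?);
    Ok(())
}
```

Running this before and after `strip` should change the last line while leaving the timing essentially alone, which is the distinction being made here.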

I admit that Rust executables are bloated by default, but most of that code is either very rarely run (the unwinding code only runs when part or all of the application crashes, for example) or is run instead of code that C programs usually get from external libraries.

@jzakiya This aspect of Rust binaries' size has been brought up many times before - the largest contribution comes from the statically linked rustlibs, libstd, etc. On ARM, without jemalloc:

rustc -O hello.rs -> 112k
rustc -O -C prefer-dynamic hello.rs -> 6k

Hope it's clear now that you can get binary sizes similar to what you're accustomed to in C/C++, but dynamic linking only makes sense if you stick to the same rustc version (no stable ABI!)