Size of the executable binary file of an application

I am new to Rust, and so far I have been amazed by its design. But I encountered something that makes me wary of using it in commercial projects: the executable binary of a "Hello world" application is 3.2 MB.

-rwxr-xr-x 2 kos kos 3,2M Jul 10 15:44 experiment_app_size

That's huge!

The version of rustc is 1.53.0
The toolchain is stable-x86_64-unknown-linux-gnu.
The build profile is release.

Is there a plan to fix this problem in the future?
Is there a technique I can use to decrease the size of the executable binary file?
Does the same problem apply to the WASM toolchain?

Rust uses monomorphization and static linking, which tends to make binaries larger than in languages that do neither. However, the benefits of these techniques are generally seen as outweighing the file size impact.
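
To illustrate what monomorphization means for code size, here is a minimal sketch. The `largest` helper is a hypothetical name made up for this example; the point is only that each concrete type used at a call site gets its own compiled copy of the generic function.

```rust
// A generic function: the compiler emits one specialized copy for each
// concrete type it is called with (monomorphization).
fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let first = iter.next()?;
    Some(iter.fold(first, |max, x| if x > max { x } else { max }))
}

fn main() {
    // Two call sites with different element types, so the binary ends up
    // containing two versions of `largest`: one for i32 and one for f64.
    let ints = [3, 7, 2];
    let floats = [1.5, 0.25, 9.0];
    println!("{:?}", largest(&ints));
    println!("{:?}", largest(&floats));
}
```

Each extra instantiation is usually small, but a program that uses many generic types from std and from its dependencies accumulates many such copies.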

There is some information here on how to generate much smaller binaries: GitHub - johnthagen/min-sized-rust: 🦀 How to minimize Rust binary size 📦

Following that guide, I was able to get a hello world binary around 93k.

The main reason is that C "cheats" on binary size by having its standard library pre-distributed with the system. Rust cannot assume that, so it statically links its stdlib into the binary. In my environment, an "uncheated" C binary compiled with gcc -static main.c takes 872 kB.

But 3.2 MB is still larger than 872 kB! Well, that's because your fn main() does more than a single printf call. For example, it registers the default panic hook and prints a panic message (and, on request, a stack trace) if the hello world output fails, a case that is silently ignored in most C code.
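
As a rough sketch of the difference: `println!` panics when writing to stdout fails, whereas the version below returns the error to the caller, who can then ignore it the way unchecked C code does. The `say_hello` function is a hypothetical name for this example only.

```rust
use std::io::{self, Write};

// Writes the greeting to any `Write` sink and reports failure to the caller
// instead of panicking, which is what `println!` does when stdout is broken.
fn say_hello<W: Write>(mut out: W) -> io::Result<()> {
    out.write_all(b"Hello, world!\n")?;
    out.flush()
}

fn main() {
    // Discard the error explicitly, mimicking what most C programs do
    // implicitly when they do not check the return value of printf.
    let _ = say_hello(io::stdout().lock());
}
```

Writing raw bytes with `write_all` also sidesteps the formatting machinery in this function's own code, although the standard library linked into the binary still contains it.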

My basic understanding of compiler theory suggests that unused code could be optimized out of the release build, but that does not seem to be the case.

Any good reading on monomorphization?
Is the problem relevant to the WASM toolchain too?

I don't think we need to be scared of this.

Certainly a 3.2 MB executable is a lot for what is effectively one line of functionality in "hello world". But that does not mean that a 10-line program will compile to 32 MB, a 100-line program to 320 MB, and so on. It does not scale linearly like that. Once you have the std library and some crates linked in, your code will only grow by the small amount generated for each line you write.

I'm just tinkering with a 3000-line program that pulls in 123 crates and compiles to about 4 MB.

I notice rustc is only 8.3 MB on my machine.

Given the huge amount of storage and RAM we generally have, even on low end machines, I'm not about to worry about this.

On the other hand, when people want to squeeze Rust code into the tiny space available on micro-controllers it can be done: Rewriting m4vgalib in Rust - Cliffle

Similarly web assembly: Making really tiny WebAssembly graphics demos - Cliffle

Wow! That's a very useful article, thank you @ZiCog !
Why aren't all those techniques and tools incorporated into the official dev tools?

I have no idea really.

I guess because it's an extra complication that is not normally needed and people are not clamouring for it.

The release profile should have that built in. Not having it increases the cost of developing userland applications.

I disagree. Static linking means Rust applications are less brittle than is typical for C/C++. For example, a Rust application can often be run on a wide array of Linux distros without worrying about mismatched library versions or pre-deploying the runtime.

If you really need the space (and by far most users and usages don't), then there are options to help with that such as optional dynamic linking and debug symbol stripping.

Yes, size is essential for the web.

For the wasm target I think it's worth considering. But for the x86_64-unknown-linux-gnu target I doubt the majority of users will ever need it.

A very large part of the size is debug symbols. In many cases these should just be removed by running strip the_binary. This reduces the 3.2 MB hello world to about 271 kB. The same is true for C, but as the standard library is usually dynamically linked there, the difference is less drastic: from 25 kB to 15 kB. But for a real C program, where more symbols are declared in the program itself, stripping matters as much as it does for a Rust program when size is a concern.

As for wasm: I wrote a Julia set renderer in Rust that compiles to 907 bytes of wasm (Julia fractals in Rust wasm - Rasmus.krats.se), and Cliffle made things even smaller in the pages linked to by @ZiCog.

As to why "those tricks" are not included in the official dev tools: first, strip is a standard tool that does the same thing for a Rust binary as for a C binary, so there is no need for a Rust-specific version of it. Second, much of the size that remains in a stripped hello world, but not in those small wasm demos, is panic handling and its ability to write debug output and stack traces. That is considered important enough to include by default (and I certainly agree), but it can be removed using the standard toolchain (by setting some options in Cargo.toml and providing your own panic handler in Rust code).

Nevertheless we are getting a rustc flag for stripping debug info and symbols, and you can already use it on nightly:

I feel your pain, OP. I myself am trying to deploy Rust code on an embedded target with 12MB of space. I have literally run out of room to continue with Rust, and am back to C for the remaining control systems.

However, it is very much something you have to consider on a case-by-case basis. I agree 100% with the gist of what others are saying here. Are you deploying on a 12MB embedded platform, or on a more typical desktop or server target? If the latter, don't stress about binary size. The equation is not 300kB × number_of_functions, it's more like 300kB + something_tiny × number_of_functions (including what you call from deps, ballpark only, your mileage may vary, etc. etc.). Honestly, if I had even 32MB instead of 12MB, there would be zero question.
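
The two equations can be compared with a small sketch. The 300 kB base is the ballpark figure from the post; the per-function cost of 1 kB is purely a made-up placeholder for "something tiny", not a measurement, and both function names are hypothetical.

```rust
// Ballpark fixed cost quoted in the post, in kB.
const BASE_KB: u64 = 300;
// Hypothetical placeholder for "something tiny" per function, in kB.
const PER_FN_KB: u64 = 1;

// Additive model: fixed base plus a small cost per function.
fn estimated_size_kb(functions: u64) -> u64 {
    BASE_KB + PER_FN_KB * functions
}

// The pessimistic (and wrong) multiplicative model, for comparison.
fn naive_size_kb(functions: u64) -> u64 {
    BASE_KB * functions
}

fn main() {
    for n in [1, 100, 1000] {
        println!(
            "{} functions: additive {} kB, multiplicative {} kB",
            n,
            estimated_size_kb(n),
            naive_size_kb(n)
        );
    }
}
```

Under these assumed constants the additive model stays in the hundreds of kB to low MB range, while the multiplicative one quickly reaches hundreds of MB.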

There are a couple of oft-repeated claims here I disagree with though. Firstly, C does not "cheat" by having a shared stdlib. C has achieved that by having a stable ABI for close on 35 years now. It also does not have to worry about generics or monomorphisation. Developers OTOH get to worry about void pointers resolving to the correct type. I do get that no one was using "cheat" in a truly derogatory way, just to emphasise that it's not a trivial comparison; however it is helpful to ask what would make it possible for Rust?

If I could find a way to generate a shared stdlib for my binaries, my space pressure might ease by about 200-500kB per binary.

Secondly, advice like "just use no-std" or "abort on panic" has the opposite effect to what you might intend. My goal is not to use Rust at any cost. It's to deploy systems that meet a need. The choice is between Rust with stdlib and some useful deps, and C with its stdlib and some useful deps. Rust with no deps and no RAII is not in the running here.

Best of luck!

C doesn't have a stable ABI. It's just that every popular operating system since UNIX has spoken C and ships a first-party C standard library and compiler. That is what I called cheating, since this advantage comes from market dominance, not only from the technical properties of the language.

So what is the "C ABI" if C itself doesn't have an ABI? The behavior of the C language is fully specified, and C support is nearly universal, so "what C does on this platform" becomes a pretty reliable interface between languages like C++, Rust, Python, Fortran, etc.
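
On the Rust side, opting into "what C does on this platform" looks roughly like the sketch below. `Point` and `point_sum` are hypothetical names invented for this example, and the function is called from Rust itself so the sketch stays self-contained.

```rust
// `#[repr(C)]` lays the struct out the way the platform's C compiler would.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

// `extern "C"` selects the platform's C calling convention for this function.
// Exporting it under an unmangled symbol name for a real C caller needs an
// additional attribute, which is omitted here.
pub extern "C" fn point_sum(p: Point) -> i32 {
    p.x + p.y
}

fn main() {
    // Call through a C-ABI function pointer, as foreign code would.
    let f: extern "C" fn(Point) -> i32 = point_sum;
    println!("{}", f(Point { x: 2, y: 3 }));
}
```

Rust's own default ABI and type layout are left unspecified, which is why the C convention is the usual choice at language boundaries.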

This is a fair point, but there's still an ecosystem that we call "C", and the availability of a dynamically linkable, ABI-stable stdlib is largely driven by open source distributions and toolchains. It would be entirely possible for the Rust community to ensure it has a dynamically linkable stdlib in those distributions too, but it (a) has to want that first and (b) has to overcome the technical limitations in the way. So I still don't think it's really "cheating".

I actually think that a non-C ABI would be a perfectly valid solution to this (not precluding Rust from still using a C ABI when required).

The relevant command is size(1).
When I do

cargo new testhelloworldsize
cd testhelloworldsize
cargo build --release
size ./target/release/testhelloworldsize

I see:

   text	   data	    bss	    dec	    hex	filename
 256023	  10984	    616	 267623	  41567	./target/release/testhelloworldsize

This is the size you can get your binary down to if you strip(1) it.

The resulting binary does not link the C library statically; it's dynamically linked:

$ ldd ./target/release/testhelloworldsize 
	linux-vdso.so.1 (0x00007fff6cb7d000)
	libgtk3-nocsd.so.0 => /usr/lib/x86_64-linux-gnu/libgtk3-nocsd.so.0 (0x00007fcdddc51000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fcddda39000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fcddd831000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fcddd612000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fcddd40e000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fcddd01d000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fcdde09b000)

The 261KB size comes from the Rust standard library. You can use the command:

nm -S ./target/release/testhelloworldsize | c++filt

to get an idea of what's taking up the space, mostly core::, std::, alloc:: and the like.
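
For reference, the 261KB figure follows from the size(1) columns shown earlier. A small sketch of the arithmetic, with the three values copied from that output:

```rust
// Column values copied from the size(1) output in this post.
const TEXT: u64 = 256_023;
const DATA: u64 = 10_984;
const BSS: u64 = 616;

// The "dec" column is simply the sum of text, data and bss.
fn total_bytes() -> u64 {
    TEXT + DATA + BSS
}

fn main() {
    let total = total_bytes();
    // Integer division by 1024 gives the size in whole KiB.
    println!("{} bytes = {:#x} = {} KiB", total, total, total / 1024);
}
```

The sum matches the dec and hex columns of the size(1) output, and almost all of it is the text (code) segment.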

Interesting.

Thing is 261KB is still huge for a "hello world" program.

Presumably there is a lot of stuff in std that is not being used in any particular program. How can that be removed?

How can we get a very small std (both C and Rust)?

Typically an embedded system has no need of things like printf, often not any file system interface or pthreads etc, etc.

I can compile C code, that uses pthreads, down to the 32K bytes available on the 8 core Propeller micro-controller from Parallax Inc.

I'd like to think the same is possible with Rust.

Most cloud applications are distributed as containers nowadays, and typical Docker images are measured in gigabytes. Unless you are among the 1% of Rust devs working in a severely constrained environment, I don't think a 3.2 MB executable is anything to be concerned about.

That said, if you are one of those developers then there are a bunch of tricks you can use to get your binary size down. In addition to some of the tips that have already been shared, you can also aggressively cut down on the number of external dependencies, prefer dynamic dispatch (trait objects) over static dispatch ("normal" generics), and avoid chunky functionality like the core::fmt machinery (i.e. write!() and println!()).
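
A minimal sketch of the static versus dynamic dispatch trade-off, using a hypothetical `Shape` trait and helper functions made up for this example:

```rust
trait Shape {
    fn area(&self) -> u32;
}

struct Square(u32);
struct Rect(u32, u32);

impl Shape for Square {
    fn area(&self) -> u32 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> u32 {
        self.0 * self.1
    }
}

// Static dispatch: one copy of this function is compiled per concrete `S`.
fn total_area_static<S: Shape>(shapes: &[S]) -> u32 {
    shapes.iter().map(|s| s.area()).sum()
}

// Dynamic dispatch: a single compiled copy serves every `Shape`,
// at the cost of a vtable lookup per call.
fn total_area_dyn(shapes: &[&dyn Shape]) -> u32 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let squares = [Square(2), Square(3)];
    let rects = [Rect(1, 2)];
    let mixed: [&dyn Shape; 2] = [&squares[0], &rects[0]];
    println!("{}", total_area_static(&squares));
    println!("{}", total_area_static(&rects));
    println!("{}", total_area_dyn(&mixed));
}
```

Here the generic version is instantiated twice (for `Square` and for `Rect`), while the trait-object version exists once. Whether that saves a meaningful amount depends on how large and how widely used the generic code is.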

I'm pretty sure stuff from the standard library that's really not used by the program is removed (optimized out) from the binary. The problem (or one of them) is that the panic handler needs to be able to print a report of any panic that could possibly happen, and that involves a lot of stuff when it comes to I/O (even for such a "simple" example as a hello world).
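
To make the role of that reporting code concrete, here is a sketch that swaps the default panic hook for one that prints nothing. Note the assumptions: this uses the std-level `std::panic::set_hook`, not a no_std panic handler, and on its own it is not expected to shrink the binary, since the precompiled standard library still carries the default reporting code. `install_quiet_hook` and `PANICKED` are hypothetical names.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Records that a panic happened without formatting or printing anything.
static PANICKED: AtomicBool = AtomicBool::new(false);

// Replaces the default hook, which writes the panic message (and
// optionally a backtrace) to stderr, with one that only sets a flag.
fn install_quiet_hook() {
    panic::set_hook(Box::new(|_info| {
        PANICKED.store(true, Ordering::SeqCst);
    }));
}

fn main() {
    install_quiet_hook();
    // With the default unwinding strategy, the panic is caught here
    // and nothing is written to stderr.
    let result = panic::catch_unwind(|| {
        panic!("never printed");
    });
    assert!(result.is_err());
    assert!(PANICKED.load(Ordering::SeqCst));
}
```

This shows that the default report is produced by a replaceable hook; actually removing the formatting code from the binary takes the further steps discussed earlier in the thread.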

There are people thinking about how to make that code (mainly the code for formatting all kinds of stuff, but also the panic handling itself) smaller, but that is a hard problem to solve.
