Size of the executable binary file of an application

Here's something I don't understand. The ufmt crate has a formatting implementation without any of the complex machinery that core::fmt has. Why can't core::fmt implement ufmt and then add what it needs following the same principle? Why must it be so complicated?

5 Likes

Two reasons:

  1. It pays in compile times.
  2. It is a breaking change: std::fmt::Formatter currently uses &mut dyn Write, whereas this would require it to take a generic parameter, like ufmt::Formatter.
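
To make the second point concrete, here is a minimal sketch of the two shapes. DynFormatter and GenericFormatter are hypothetical stand-ins written for this illustration, not the real std::fmt::Formatter or ufmt::Formatter types. The generic version is monomorphized once per writer type, which is where the extra compile time comes from, and switching a public type from one shape to the other changes its signature.

```rust
use std::fmt::Write;

// Hypothetical stand-in for the std::fmt::Formatter shape:
// the writer is a trait object, so calls go through a vtable.
struct DynFormatter<'a> {
    out: &'a mut dyn Write,
}

impl<'a> DynFormatter<'a> {
    fn write_str(&mut self, s: &str) -> std::fmt::Result {
        self.out.write_str(s)
    }
}

// Hypothetical stand-in for the ufmt::Formatter shape:
// the writer type is a generic parameter, so the compiler
// emits one copy of this code per concrete writer type.
struct GenericFormatter<'a, W: Write> {
    out: &'a mut W,
}

impl<'a, W: Write> GenericFormatter<'a, W> {
    fn write_str(&mut self, s: &str) -> std::fmt::Result {
        self.out.write_str(s)
    }
}

// Drives both formatters with a String as the underlying writer.
fn demo() -> (String, String) {
    let mut a = String::new();
    let mut b = String::new();
    DynFormatter { out: &mut a }.write_str("dyn").unwrap();
    GenericFormatter { out: &mut b }.write_str("generic").unwrap();
    (a, b)
}

fn main() {
    let (a, b) = demo();
    println!("{a} {b}");
}
```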
1 Like

I thought about it. It's easy. Just remove all that panic handling stuff. Perhaps give me a hook to insert an endless loop, light up a fault LED, or whatever on panic!.

In the embedded and small systems I have worked on in languages like C there is no such panic handler, only the raw code I compile to get my job done. If anything untoward happens it had to be debugged, with in-circuit emulators or JTAG debuggers.

This might not sound like a serious suggestion but I recall a presentation by the embedded Rust working group (Sorry if that is not the actual name) where it was stated that "parity with C" was one of the goals. Well, this is one aspect of that goal.

Alternatively I think it was Ferrous Systems that have been working on removing all that stuff, a lot of formatting as you say, off the chip by using some binary protocol to report panics, which is then analysed and formatted on the host computer. Perhaps somewhere here: Ferrous Systems · GitHub

2 Likes

The trouble with this is that "embedded" can include anything on the spectrum from IP cores on ASICs or SoCs, to microprocessors using freestanding C, to microprocessors running FreeRTOS, to SBCs running embedded Linux. I am currently working with the latter. Here, "parity with C" means "I can write code with a Linux kernel, an event loop, IPC, logging, etc. with binary size under 100kB". In C that means C stdlib, libuv, ZeroMQ, which adds up to 60-70kB per binary + 500kB one-off for libc (I checked). In Rust it means stdlib, Calloop, IPC, some sort of logger, which adds up to 900kB per binary at least and no shared one-off costs.

But counter to this is that while I "want" parity with C in terms of size and performance, I want better than C in terms of RAII, correctness and stability. Which isn't really fair! But the word "want" is in quotes up there because I don't really want 100% down-to-the-byte parity with C, I'd happily pay a marginal cost for Rust... but I want it to be marginal. I don't want to pay the same 300kB per binary for the same(ish) stdlib.

Rust is worth it because I save massive amounts of development and maintenance time and effort and produce better quality systems. But it's frustrating when I'm forced to stop using it because it simply doesn't fit.

1 Like

Indeed. I have worked on embedded systems across that entire spectrum. Heck I would argue that a lot of what goes on in the cloud are embedded systems.

I'm sure there is some tension in the requirements here.

My interpretation of "parity with C" here is that if I don't need it I don't have to carry it around in my binary. In the extreme that means using no_std as Cliff Biffle demonstrates in the example I linked to above.

But then:

If I want to dynamically allocate things on the heap I need a malloc. That need not be very big. Let me add my own, or use the services of whatever tiny operating system I am using.

If I need to print formatted text I need a printf. Let me provide my own, that has the minimal features I need. Need not be very big. Let me skip all that Unicode and internationalisation support.

If I need threads let me add that myself. As I said, I have built C code using pthreads on MCUs with 32K RAM. Or use that RTOS I have available.

Same with any file system if I need one. Or IP stack.

And so on. But in Rust.
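
As a rough illustration of the "provide my own printf" point above, here is a minimal sketch of a decimal formatter for u32 that avoids core::fmt entirely. The function name format_u32 and the fixed 10-byte buffer are choices made for this example, not an existing API; it uses only core items, so it should also work in a no_std crate.

```rust
/// Writes `n` in decimal into the tail of `buf` and returns that tail as a &str.
/// A u32 has at most 10 decimal digits, so a 10-byte buffer always suffices.
fn format_u32(mut n: u32, buf: &mut [u8; 10]) -> &str {
    let mut i = buf.len();
    loop {
        i -= 1;
        buf[i] = b'0' + (n % 10) as u8; // least significant digit first, filling backwards
        n /= 10;
        if n == 0 {
            break;
        }
    }
    // Only ASCII digits were written, so the conversion cannot fail;
    // unwrap_or avoids pulling in a panic path for the error case.
    core::str::from_utf8(&buf[i..]).unwrap_or("")
}

fn main() {
    let mut buf = [0u8; 10];
    // On a microcontroller the bytes would go to a UART instead of stdout.
    println!("{}", format_u32(1234, &mut buf));
}
```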

Don't forget, back in the day we could compile C code on 8 bit machines with only 64K bytes of RAM. The floppy disks had less capacity than a current "hello world" in Rust.

Something has gone horribly wrong since then :slight_smile:

Yes, exactly. "parity with C" meaning we pay for the features we need, and not anything more.

2 Likes

You can tell Rust to abort on panic instead of unwinding. You can also install a completely custom panic handler.
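
A minimal sketch of the hook idea, assuming a target that has std. Here std::panic::set_hook is used; on a bare-metal no_std target the corresponding mechanism is a function marked #[panic_handler], which cannot be shown in an ordinary runnable program. FAULT_LED is a hypothetical stand-in for a real GPIO write.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical stand-in for "light up a fault LED";
// real firmware would write to a GPIO register instead.
static FAULT_LED: AtomicBool = AtomicBool::new(false);

// Replaces the default hook, which would otherwise print the panic message.
fn install_fault_hook() {
    panic::set_hook(Box::new(|_info| {
        FAULT_LED.store(true, Ordering::SeqCst);
    }));
}

// Panics on purpose and reports whether the hook ran.
// Note: catch_unwind only catches the panic when panics unwind;
// with panic = "abort" the process would terminate after the hook runs.
fn trigger_and_check() -> bool {
    install_fault_hook();
    let result: Result<(), _> = panic::catch_unwind(|| {
        panic!("something untoward happened");
    });
    result.is_err() && FAULT_LED.load(Ordering::SeqCst)
}

fn main() {
    let lit = trigger_and_check();
    println!("fault LED lit: {lit}");
}
```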

You can also replace the global allocator and then use Rust's allocating APIs, like Rc and Vec, without std.
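
A minimal sketch of replacing the global allocator. To keep it runnable on a desktop, the hypothetical CountingAlloc below just forwards to std's System allocator and counts requested bytes; in a no_std project the same #[global_allocator] attribute would point at your own heap implementation or your RTOS's allocator instead.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical allocator for illustration: forwards to the system
// allocator and keeps a running total of bytes requested.
struct CountingAlloc;

static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every Box, Vec, Rc, String, ... in the program now goes through CountingAlloc.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

// Returns how many bytes were requested while creating a 1024-byte Vec.
fn bytes_requested_by_vec() -> usize {
    let before = ALLOCATED.load(Ordering::Relaxed);
    let v: Vec<u8> = Vec::with_capacity(1024);
    // black_box keeps the optimizer from eliding the unused allocation.
    std::hint::black_box(&v);
    let after = ALLOCATED.load(Ordering::Relaxed);
    drop(v);
    after - before
}

fn main() {
    println!("Vec::with_capacity(1024) requested {} bytes", bytes_requested_by_vec());
}
```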

There's nothing stopping you from doing these things but they won't be compatible with the ones provided in std, so any code that needs to interact with them will need to be custom-written as well.


In summary, it sounds like you really want to be targeting libcore + alloc instead of libstd, which still lets you use the algorithmic parts of Rust without std's OS integrations.
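
A small sketch of what "core + alloc only" code looks like. The function largest_first is a made-up example; it names nothing from std, so it should move into a #![no_std] crate (with an allocator configured) unchanged. Only main uses std here, to print the result.

```rust
// `alloc` is a separate crate from `std`; it provides Vec, String, Rc, etc.
extern crate alloc;

use alloc::vec::Vec;

// Uses only `alloc` (Vec, to_vec) and `core` (sort_unstable_by, Ord::cmp).
fn largest_first(sizes: &[u32]) -> Vec<u32> {
    let mut v: Vec<u32> = sizes.to_vec();
    v.sort_unstable_by(|a, b| b.cmp(a));
    v
}

fn main() {
    // println! is the only std dependency in this file.
    println!("{:?}", largest_first(&[109_327, 136_109, 6_256]));
}
```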

2 Likes

The biggest part of the 256KB code size is for debug symbol parsing. If showing backtraces is not a requirement this can be reduced to about 136KB using Rust nightly (Linux x86-64) with -Z build-std and disabling backtraces:

$ cargo +nightly build -Z build-std --target x86_64-unknown-linux-gnu -Zbuild-std-features=panic-unwind,default --release
[...]
$ size target/x86_64-unknown-linux-gnu/release/testhelloworldsize2
   text    data     bss     dec     hex filename
 136109    8024     480  144613   234e5 target/x86_64-unknown-linux-gnu/release/testhelloworldsize2

Enabling LTO gets that down to 109KB:

$ size target/x86_64-unknown-linux-gnu/release/testhelloworldsize2
   text    data     bss     dec     hex filename
 109327    6256     424  116007   1c527 target/x86_64-unknown-linux-gnu/release/testhelloworldsize2
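
For reading the two size tables: the dec column is the sum of text, data and bss, and the saving from LTO is the difference between the two text columns. A short sketch that checks this arithmetic against the figures above (the SizeRow type and constant names are invented for this example):

```rust
/// One row of `size` output, in bytes.
struct SizeRow {
    text: u32,
    data: u32,
    bss: u32,
}

impl SizeRow {
    /// The `dec` column: text + data + bss.
    fn total(&self) -> u32 {
        self.text + self.data + self.bss
    }
}

// Figures copied from the two `size` outputs above.
const WITHOUT_LTO: SizeRow = SizeRow { text: 136_109, data: 8_024, bss: 480 };
const WITH_LTO: SizeRow = SizeRow { text: 109_327, data: 6_256, bss: 424 };

/// Bytes of code (text section) saved between two builds.
fn text_saved(before: &SizeRow, after: &SizeRow) -> u32 {
    before.text - after.text
}

fn main() {
    println!("total without LTO: {}", WITHOUT_LTO.total());
    println!("total with LTO:    {}", WITH_LTO.total());
    println!("text saved by LTO: {}", text_saved(&WITHOUT_LTO, &WITH_LTO));
}
```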

Stack unwinding does still happen in this build configuration.

P.S. To see what takes up space I use

nm -S -td --size-sort target/x86_64-unknown-linux-gnu/release/testhelloworldsize2|rustfilt|tac|cut -d' ' -f 2-|less

as cargo-bloat does not work with -Z ... options.

(Edit: stable -> nightly in shell command)

5 Likes
cargo +stable build -Z build-std --target x86_64-unknown-linux-gnu -Zbuild-std-features=panic-unwind,default --release

This fails for me with:

error: the `-Z` flag is only accepted on the nightly channel of Cargo, but this is the `stable` channel
See https://doc.rust-lang.org/book/appendix-07-nightly-rust.html for more information about Rust release channels.

Should it be cargo +nightly? (When I do that, it builds, but there's no reduction in size.)

1 Like

Yes, +nightly of course; I apparently copy-and-pasted the wrong command ( :sweat_smile:). You should be able to reproduce my results. Did you check the binary in target/x86_64-unknown-linux-gnu/release/? (Using -Zbuild-std places it there.)

Actual output from my shell this time:

$ cargo clean; cargo +nightly run -q -Z build-std --target x86_64-unknown-linux-gnu -Zbuild-std-features=panic-unwind,default --release; size target/x86_64-unknown-linux-gnu/release/testhelloworldsize2
Hello, world!
   text    data     bss     dec     hex filename
 109327    6256     424  116007   1c527 target/x86_64-unknown-linux-gnu/release/testhelloworldsize2
2 Likes

Rust has a bug that adds more than 2MB of debug information, even to release executables. You have to call strip on them if you care about binary size.

3 Likes

Huh? Do you consider the presence of the symbol names (debug info) a bug, or is there a more specific bug that makes it larger than it should be?

1 Like

The bug is that on Linux the debug = false setting disables debug info for all rlibs except libcore/libstd.

4 Likes

To add another data point: if you're interested in how much physical memory is actually needed to run a "hello, world" application, it's useful to look at the number of page faults as a first approximation (via perf stat ./target/debug/testhelloworldsize; may need sudo).

I'm getting about 50 page faults for a dynamically linked C "Hello World" (23 if statically linked) vs. 70 for the one linked with kryps's cargo command, vs. 80ish for the out-of-the-box cargo run executable. I'm not yet set up to link Rust binaries statically (requires MUSL?). So Rust's overhead in terms of actual memory may be about 80-120KB on a Linux system (Ubuntu 18.04).
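
The 80-120KB estimate follows from the page-fault counts if each extra fault is taken to touch one new 4 KiB page. Both of those are assumptions made for this sketch (4 KiB is the usual page size on x86-64 Linux); the function name extra_kib is likewise made up.

```rust
// Assumption: 4 KiB pages, the usual default on x86-64 Linux.
const PAGE_KIB: u32 = 4;

/// Rough extra memory in KiB, assuming every page fault beyond the
/// baseline corresponds to one additional resident page.
fn extra_kib(faults: u32, baseline_faults: u32) -> u32 {
    faults.saturating_sub(baseline_faults) * PAGE_KIB
}

fn main() {
    let c_dynamic = 50; // dynamically linked C "Hello World"
    println!("build-std binary:  ~{} KiB extra", extra_kib(70, c_dynamic));
    println!("default cargo run: ~{} KiB extra", extra_kib(80, c_dynamic));
}
```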

1 Like

You can actually dynamically link to std using the -C prefer-dynamic rustc flag. Though then you'll probably also have to figure out how to build std in a way optimized for size rather than using the version shipped with the compiler, which is optimized for speed (it's 4.2 MB stripped on my machine, which is going to be tight on such a small flash).

1 Like
