Why is Hello World 4MB?

I just installed Rust and I'm dipping my toe in. (I'm going to try Advent of Code 2017 in Rust.)

I doubt this is relevant, but I'm on Ubuntu 14.04.5 LTS and my cc is clang

mason@mason-gaming-pc:~/dev/rust/advent2017$ cc --version
Ubuntu clang version 3.4-1ubuntu3 (tags/RELEASE_34/final) (based on LLVM 3.4)
Target: x86_64-pc-linux-gnu
Thread model: posix

I compiled my first Hello World program which consists of

fn main() {
    println!("Hello, world!");
}

The compiled binary weighs in at 4020381B.

Is this a fully statically linked binary? What's in the Rust runtime?

Mind you, I'm not complaining, just curious.


This article covers your question in detail:
https://lifthrasiir.github.io/rustlog/why-is-a-rust-executable-large.html

The short answer is that the binary is statically linked and includes:

  • jemalloc (can be replaced with the system allocator)
  • debug and backtrace functionality (can be stripped, and panic can be replaced with abort)
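
As a rough sketch of the first bullet: assuming a toolchain where `std::alloc::System` and the `#[global_allocator]` attribute are available, something like the following should route allocations through the platform allocator instead of a bundled one. The `greeting` helper is made up for illustration only.

```rust
use std::alloc::System;

// Route all heap allocations through the platform allocator (e.g. malloc)
// instead of an allocator that is bundled into the binary.
#[global_allocator]
static GLOBAL: System = System;

// Hypothetical helper, used only to show a heap allocation going through GLOBAL.
fn greeting() -> String {
    // Building a String forces at least one heap allocation.
    String::from("Hello, world!")
}

fn main() {
    println!("{}", greeting());
}
```

How much this saves depends on the compiler version and target, so it is worth measuring the binary before and after.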

Fantastic link, thanks!

Rust doesn't really have a "runtime". Other than depending on the C standard library for interacting with the OS, Rust is fully compiled down to machine code.

That's in comparison with languages like Go and Java where you've got a garbage collector and some sort of scheduler or VM which runs everything.


IIRC, Go also uses a static linking approach for the sake of easier deployment, and in this sense would compare a little better to Rust than Java, whose idiomatic compilation model requires separate installation of the JVM. But indeed, it does have a lot more runtime black magic under the hood.


The article linked above is a little old. If you want to remove jemalloc on the latest nightly, see this post: How to tell which allocator I am using? - #6 by johnthagen


Another way to look at it is that a "Hello World" from C is 10MB in size, but 9.99MB of that is shipped with your operating system.

For example on macOS a C hello world program links with all these libraries:

otool -L a.out
a.out:
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.0.0)
/usr/lib/libSystem.B.dylib:
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.0.0)
	/usr/lib/system/libcache.dylib (compatibility version 1.0.0, current version 80.0.0)
	/usr/lib/system/libcommonCrypto.dylib (compatibility version 1.0.0, current version 60118.30.2)
	/usr/lib/system/libcompiler_rt.dylib (compatibility version 1.0.0, current version 62.0.0)
	/usr/lib/system/libcopyfile.dylib (compatibility version 1.0.0, current version 1.0.0)
	/usr/lib/system/libcorecrypto.dylib (compatibility version 1.0.0, current version 562.30.10)
	/usr/lib/system/libdispatch.dylib (compatibility version 1.0.0, current version 913.30.4)
	/usr/lib/system/libdyld.dylib (compatibility version 1.0.0, current version 519.2.2)
	/usr/lib/system/libkeymgr.dylib (compatibility version 1.0.0, current version 28.0.0)
	/usr/lib/system/liblaunch.dylib (compatibility version 1.0.0, current version 1205.30.29)
	/usr/lib/system/libmacho.dylib (compatibility version 1.0.0, current version 900.0.1)
	/usr/lib/system/libquarantine.dylib (compatibility version 1.0.0, current version 86.0.0)
	/usr/lib/system/libremovefile.dylib (compatibility version 1.0.0, current version 45.0.0)
	/usr/lib/system/libsystem_asl.dylib (compatibility version 1.0.0, current version 356.1.1)
	/usr/lib/system/libsystem_blocks.dylib (compatibility version 1.0.0, current version 67.0.0)
	/usr/lib/system/libsystem_c.dylib (compatibility version 1.0.0, current version 1244.30.3)
	/usr/lib/system/libsystem_configuration.dylib (compatibility version 1.0.0, current version 963.30.1)
	/usr/lib/system/libsystem_coreservices.dylib (compatibility version 1.0.0, current version 51.0.0)
	/usr/lib/system/libsystem_darwin.dylib (compatibility version 1.0.0, current version 1.0.0)
	/usr/lib/system/libsystem_dnssd.dylib (compatibility version 1.0.0, current version 878.30.4)
	/usr/lib/system/libsystem_info.dylib (compatibility version 1.0.0, current version 1.0.0)
	/usr/lib/system/libsystem_m.dylib (compatibility version 1.0.0, current version 3146.0.0)
	/usr/lib/system/libsystem_malloc.dylib (compatibility version 1.0.0, current version 140.1.1)
	/usr/lib/system/libsystem_network.dylib (compatibility version 1.0.0, current version 1.0.0)
	/usr/lib/system/libsystem_networkextension.dylib (compatibility version 1.0.0, current version 1.0.0)
	/usr/lib/system/libsystem_notify.dylib (compatibility version 1.0.0, current version 172.0.0)
	/usr/lib/system/libsystem_sandbox.dylib (compatibility version 1.0.0, current version 765.30.4)
	/usr/lib/system/libsystem_secinit.dylib (compatibility version 1.0.0, current version 30.0.0)
	/usr/lib/system/libsystem_kernel.dylib (compatibility version 1.0.0, current version 4570.31.3)
	/usr/lib/system/libsystem_platform.dylib (compatibility version 1.0.0, current version 161.20.1)
	/usr/lib/system/libsystem_pthread.dylib (compatibility version 1.0.0, current version 301.30.1)
	/usr/lib/system/libsystem_symptoms.dylib (compatibility version 1.0.0, current version 1.0.0)
	/usr/lib/system/libsystem_trace.dylib (compatibility version 1.0.0, current version 829.30.14)
	/usr/lib/system/libunwind.dylib (compatibility version 1.0.0, current version 35.3.0)
	/usr/lib/system/libxpc.dylib (compatibility version 1.0.0, current version 1205.30.29)

I wonder why Rust's hello world is 4 MB while Go's is only 1.6 MB... I would actually expect the opposite result!


Also, the trend is not really encouraging :(

A --release hello world is 640 KB on 1.11, 2.1 MB on 1.14 and 4 MB on 1.22. I think the change from 1.11 to 1.14 is the addition of debuginfo to the stdlib: https://github.com/rust-lang/rust/issues/36452. I'm not sure what the reason for the further blow-up is.

EDIT: existing issue https://github.com/rust-lang/rust/issues/46034


You also have to remember that Go wrote their own toolchain (compiler + assembler + linker) and they don't link to any kind of libc (they use the underlying syscalls directly). I'm assuming that the extra control over the toolchain and fewer system dependencies mean they can skip a lot of the extra baggage that a statically linked Rust or C program would have.


A stripped Rust binary is leaner than a Go binary: 400 KB vs 1 MB. The Rust Hello World carries a massive amount of debuginfo.
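
To check numbers like these on your own machine, here is a minimal sketch that reports the size of the running executable. `file_size` is a hypothetical helper name, not something from this thread or from any library. Build it, run it, then strip a copy of the binary and run that copy to compare.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::Path;

/// Returns the size of the file at `path` in bytes.
/// (`file_size` is an illustrative helper, not part of any library.)
fn file_size(path: &Path) -> io::Result<u64> {
    Ok(fs::metadata(path)?.len())
}

fn main() -> io::Result<()> {
    // Path of the executable that is currently running.
    let exe = env::current_exe()?;
    let bytes = file_size(&exe)?;
    println!(
        "{}: {} bytes ({:.2} MB)",
        exe.display(),
        bytes,
        bytes as f64 / 1_000_000.0
    );
    Ok(())
}
```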


I guess this is one of the very few areas where Rust build toolchains could learn from CMake: there are three important build configurations, not two, and these are Debug, Release, and RelWithDebInfo.


This is not a build configuration problem, I think. By default, --release does not include debug info for your code; you'll need to tweak the Cargo profile for that.

The problem is specific to the standard library: in Rust, the stdlib is distributed with the compiler in compiled form, as an rlib, so the end application's compilation flags do not affect it. Moreover, the same rlib is used for both the release and debug profiles, and it is compiled with precisely the release-with-debuginfo configuration :)


In my dream world, cargo build --release would strip the output binary (including the stdlib), and cargo build --release-with-debug-info would ensure that all dependencies of the output binary, including the stdlib, are built with debug info. On its side, cargo build --debug would link with a debug version of the stdlib which makes life maximally easy for debuggers and similar tools.

That would certainly be more elegant than jumping through RUSTFLAGS='-g' contortions whenever I want to profile a Rust program.


Hm, you don't have to use RUSTFLAGS to pass -g, you can add this to the Cargo.toml:

[profile.release]
debug = true

Having a command-line switch for this would be useful as well! However, the whole profile system should probably be redesigned first...

I may have gotten it wrong, but IIRC this Cargo.toml switch only enables debugging symbols for the active crate, and not for its dependencies, which is most often not what I want.

No, the profile section is supposed to affect everything except the stdlib. If -g is not applied to deps, then it's a bug in Cargo!


Just saw a link to this thread and noticed nobody mentioned -C opt-level=s (or z), which tells LLVM to optimize for size. (Only available in nightly though, IIRC.)
