Why is a Rust executable smaller than a Kotlin/Native executable?

Considering that both Kotlin/Native and Rust are based on LLVM, I expected their output and performance to be close!

So I created two simple Hello World programs, one in Kotlin and one in Rust.

main.kt:

fun main() {
    println("Hello, world!")
}

main.rs:

fn main() {
    println!("Hello, world!");
}

Then I generated the executables for both using:
kotlinc-native main.kt for Kotlin and cargo build for Rust

Then I checked the resulting binaries using:

ls -S -lh | awk '{print $5, $9}'

and found that the file generated by Kotlin/Native is 1.48× the size of the file generated by Rust.

Any idea why this difference?


Kotlin is still garbage collected, right?
My guess would be it's because of the bigger runtime.

4 Likes

As an aside: file sizes on trivial programs aren't terribly interesting, unless you're talking about multiple orders of magnitude difference, or you're trying to target a pathologically space-constrained system.

That they both use LLVM is also irrelevant.

13 Likes

Correct, Kotlin is still garbage collected.

BTW, even for a "hello world" executable there's still a lot of unused code in the binary. If you care about executable size, you should (see the sketch after the list):

  • build with --release
  • enable LTO
  • switch to system allocator
  • strip the executable (due to some bugs, even in release mode, Rust may include megabytes of debug symbols)
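
For reference, a minimal sketch of how those points map onto commands and Cargo.toml settings (the binary name hello_world is only an example; the allocator switch is shown in full a few posts down):

# Cargo.toml
[profile.release]
lto = true            # link-time optimization across the whole binary

$ cargo build --release               # optimized build
$ strip target/release/hello_world    # drop debug symbols from the binary
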
9 Likes

Thanks.
So the executable can be even smaller than what I reported!
Do you have reference links for what you mentioned: LTO, the system allocator, and executable stripping?

I already used --release but forgot to mention it in my post.

I think the expectation about performance here is probably unfounded. They are both compiled with LLVM, yes, but how the runtime and object model work matters a lot, probably more than which particular codegen backend is used.

For purely number-crunching applications, where you don't do any allocation and just do a lot of math (not explicitly vectorized), the performance should indeed be similar.

However, as soon as you start using allocation, "objects", and the standard library, the differences in runtime, memory management, and allocation patterns should make a big difference, and Kotlin and Rust are very different in these respects.

Also, I would naively expect Kotlin/JVM to be faster than Kotlin/Native for typical workloads: Kotlin's object model is basically the Java object model, and the JVM is heavily optimized for dealing with it.

5 Likes

Initial setup:

$ cargo new hello_world

Build with:
$ cargo build

→ 589,004 bytes

Optimization Step 1:

Build with:
$ cargo build --release

→ 586,028 bytes

Optimization Step 2:

Change contents of main.rs to:

use std::alloc::System;

// Opt out of the default allocator (jemalloc on toolchains of that era)
// in favor of the lighter system allocator.
#[global_allocator]
static A: System = System;

fn main() {
    println!("Hello, world!");
}

→ 335,232 bytes

Optimization Step 3:

Add …

[profile.release]
lto = true

to Cargo.toml.

→ 253,752 bytes

Optimization Step 4:

Strip executable via …
$ strip target/release/hello_world

→ 177,608 bytes

(Note: each Optimization Step N here builds on Optimization Steps 1 through N - 1 as well, i.e. each reported size includes all of the previous optimizations.)

Savings: 589,004 → 586,028 → 335,232 → 253,752 → 177,608 bytes

24 Likes

Step 5:

You should also add opt-level = "z" to the [profile.release] section of Cargo.toml.
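
For concreteness, a sketch of a release profile combining this with the earlier LTO setting ("z" optimizes purely for size, "s" is a slightly milder alternative):

[profile.release]
lto = true
opt-level = "z"   # optimize for size rather than speed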

Step 6:

To drop the size even further, you can use Xargo to build the Rust stdlib from source when you build the application, which allows you to apply lto and opt-level to the stdlib build, stripping out a lot of the stdlib you don't need.
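
A rough sketch of that workflow, assuming a nightly toolchain and an explicit target triple (x86_64-unknown-linux-gnu is only an example, and the exact Xargo.toml options are best taken from the Xargo README):

$ rustup override set nightly      # Xargo needs nightly and the rust-src component
$ rustup component add rust-src
$ cargo install xargo
$ xargo build --release --target x86_64-unknown-linux-gnu

As I understand it, the [profile.release] settings then apply to the rebuilt stdlib as well.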

13 Likes

I've been meaning to try this for some of my performance-critical crates. I would think that it might speed up cases of many Vec calls.

What's the easiest way to do this? Do I need to list every stdlib crate that I want recompiled in my Cargo.toml, or is there a single option to Xargo that will do it? (Target CPU is just my host CPU.)

1 Like

I did not understand this statement, my friend.

By changing main.rs and Cargo.toml as you recommended, and using cargo build --release, the size I got is 238K, which is less than the 253K you mentioned. Any idea what could be the reason for this difference in numbers?

After using strip in addition to the above, the final file size is 169K.

1 Like

Different architecture and/or different versions of Rust and/or different versions of LLVM and/or …

6 Likes

He means every optimization step depends on preceding steps, unless I misunderstood.

1 Like

And is that true?

In his figures, every step includes the previous steps recursively.

I'm not sure if there are any technical dependencies between those optimizations.

This is what I meant, yes.

Neither do I.

I also tried the new size optimization option of rustc, but the size was the same as with strip.

You'll probably need a more realistic example than just printing Hello World to see a difference. In this example, most of the binary is probably the statically linked stdlib. You'd need Xargo to help make that smaller by building it from source.

What is the format/code for this option?

See "Compiler" in the 1.28.0 release notes:
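
For reference, on the command line this is rustc's -C opt-level codegen flag, which accepts "s" and "z" for size optimization; in Cargo it corresponds to the opt-level = "z" profile setting shown earlier:

$ rustc -C opt-level=z main.rs   # optimize purely for size
$ rustc -C opt-level=s main.rs   # size, slightly less aggressive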

2 Likes