Optimizing code

farbodpm · April 6, 2022, 10:10am

I am working on a project that uses many peripherals, including GPS, an accelerometer, CAN, and so on. Recently I have been hitting code-size errors. I started optimizing my code, and when I checked the size of each function, I found that Rust's float conversion code costs me a lot. How can I reduce this size?

Yandros · April 6, 2022, 10:50am

1. Avoid generics / `impl Trait` in argument position, at least for big functions.

For instance, look at how the standard library does it:

github.com/rust-lang/rust/blob/ce1a8131c6cf6973ff5ce2be2b7500232efbdd5f/library/std/src/fs.rs#L305-L310

```
pub fn write<P: AsRef<Path>, C: AsRef<[u8]>>(path: P, contents: C) -> io::Result<()> {
    fn inner(path: &Path, contents: &[u8]) -> io::Result<()> {
        File::create(path)?.write_all(contents)
    }
    inner(path.as_ref(), contents.as_ref())
}
```

Notice how it features a generic function to improve call-site ergonomics, but how it avoids implementing it as:
```
fn write<P: AsRef<Path>, C: AsRef<[u8]>>(path: P, contents: C) -> io::Result<()> {
    File::create(path.as_ref())?.write_all(contents.as_ref())
}
```
Indeed, with that naive version, the non-generic part of the code (here, granted, rather small, but in other cases it can definitely become bigger), that is, the File::create(..)? call and the .write_all(..) call, would be copy-pasted for each choice of generics across all the callers of that function, leading to duplication that the non-generic inner function avoids.

Trick: use `dyn Trait` instead of `impl Trait` in argument position to more easily factor out code.
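As a rough sketch of that trick: the `Sensor` trait, the `Gps` and `Accelerometer` types, and both `report_*` functions below are hypothetical names made up for illustration, not anything from the original project. The generic version gets one compiled copy of its body per concrete sensor type, while the `dyn` version is compiled once and dispatches through the vtable.

```rust
use std::fmt::Write;

// Hypothetical trait and types, introduced only for this illustration.
trait Sensor {
    fn id(&self) -> u8;
    fn read_raw(&self) -> i32;
}

struct Gps;
struct Accelerometer;

impl Sensor for Gps {
    fn id(&self) -> u8 { 1 }
    fn read_raw(&self) -> i32 { 1234 }
}

impl Sensor for Accelerometer {
    fn id(&self) -> u8 { 2 }
    fn read_raw(&self) -> i32 { -56 }
}

// Generic version: the whole body is instantiated once per concrete `S`.
fn report_generic<S: Sensor>(sensor: &S, out: &mut String) {
    let _ = write!(out, "[{}] raw={}", sensor.id(), sensor.read_raw());
}

// `dyn` version: a single compiled body; trait methods are called via the vtable.
fn report_dyn(sensor: &dyn Sensor, out: &mut String) {
    let _ = write!(out, "[{}] raw={}", sensor.id(), sensor.read_raw());
}

fn main() {
    let mut out = String::new();
    report_generic(&Gps, &mut out);
    out.push('\n');
    report_dyn(&Gps, &mut out);
    out.push('\n');
    report_dyn(&Accelerometer, &mut out);
    println!("{out}");
}
```

Both versions produce the same text for the same sensor; whether the `dyn` version actually ends up smaller depends on how big the body is and how many types it would otherwise be instantiated for.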

2. Use `opt-level = "z"` in the profile settings (as well as aggressive LTO)

For the Cargo workspace generating the final binary, you'd add a [profile.dev] or [profile.release] section (depending on whether you are targeting cargo build or cargo build --release) with an opt-level = "z" line under it.

You may also add lto = "fat" for link-time optimization.

And I am not sure about it, but maybe codegen-units = 1 could help as well?
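Put together, the settings mentioned above might look like the following fragment, assuming you build with cargo build --release and that this goes in the Cargo.toml at the workspace root. Whether codegen-units = 1 helps in a given project is something to measure rather than assume.

```toml
# Cargo.toml (workspace root)
[profile.release]
opt-level = "z"     # optimize for size rather than speed
lto = "fat"         # link-time optimization across all crates
codegen-units = 1   # one codegen unit: slower builds, may shrink the output
```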

3. See johnthagen/min-sized-rust on GitHub ("How to minimize Rust binary size") for more advanced tweaks

PFaas · April 7, 2022, 12:14pm

This is a good tip, but it comes at the cost of preventing inlining of the trait methods, thus losing potential compiler optimizations. So avoid using dyn Trait for small and simple methods like struct field access (i.e. getters/setters), especially in performance-critical areas. But definitely use it to give callers a single, non-duplicated code path to big methods that are implemented in many different forms across a range of different types.

I found it interesting to learn that dyn Trait is implemented as a fat pointer made of two pointers: one to the underlying value and one to the vtable of trait methods. So, unless I'm missing something, there is probably no real benefit to using &dyn Trait vs. Box<dyn Trait>.

2e71828 · April 7, 2022, 12:36pm

The benefit is in not requiring a heap allocation, and letting the caller retain ownership. If you have a local variable var of some concrete type which implements Trait, you can pass &var as an &dyn Trait argument without allocating additional memory for it, and you will still be able to use var after the call is complete.

Strictly speaking, that's the implementation of a pointer to dyn Trait (*const dyn Trait), which is the underlying representation of both Box<dyn Trait> and &dyn Trait. The extra information (the vtable pointer) is stored beside the data pointer, rather than beside the referent. This lets a plain dyn Trait value have the same memory representation as whatever concrete type it was created from.
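A small sketch of both points, using a made-up `Shape` trait and `Rect` type (hypothetical names for illustration only). It borrows a stack value as `&dyn Shape` without any allocation, then checks that the reference and the `Box` are each two pointers wide on current compilers, while the value behind the pointer has the size of the concrete type.

```rust
use std::mem::{size_of, size_of_val};

// Hypothetical trait and type, used only for this illustration.
trait Shape {
    fn area(&self) -> u32;
}

struct Rect {
    w: u32,
    h: u32,
}

impl Shape for Rect {
    fn area(&self) -> u32 {
        self.w * self.h
    }
}

// Takes a borrowed trait object: no ownership transfer, no heap allocation.
fn area_of(shape: &dyn Shape) -> u32 {
    shape.area()
}

fn main() {
    // `rect` lives on the stack; `&rect` coerces to `&dyn Shape`.
    let rect = Rect { w: 3, h: 4 };
    let borrowed: &dyn Shape = &rect;
    assert_eq!(area_of(borrowed), 12);
    // The caller still owns `rect` after the call.
    assert_eq!(rect.w, 3);

    // Both pointer types are "fat": data pointer + vtable pointer.
    assert_eq!(size_of::<&dyn Shape>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<Box<dyn Shape>>(), 2 * size_of::<usize>());

    // The value behind the pointer has the layout of the concrete type.
    assert_eq!(size_of_val(borrowed), size_of::<Rect>());

    // The owned version needs a heap allocation to hold the value.
    let boxed: Box<dyn Shape> = Box::new(Rect { w: 1, h: 2 });
    assert_eq!(area_of(&*boxed), 2);
}
```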

PFaas · April 7, 2022, 12:41pm

Ahh, I was thinking entirely in terms of potential benefits on the receiving method side, but passing by reference rather than value is definitely a benefit for the caller. Good point!

Also helpful info, thanks! Good to know that the heap allocations are only for the owned version. It does make sense that a reference could bypass that. And that's helping me get past some of my disillusionment about the type erasure preventing inlining.

farbodpm · April 7, 2022, 4:27pm

Thanks for your response. The link helped me very much. I also found some links about "xargo". This feature looks very good and I found it useful, but I don't know how stable it is. Do you have any idea?

farbodpm · April 7, 2022, 4:30pm

My biggest problem is decimal-to-float conversion. There is a 10-kilobyte table named "power five".

gkcjones · April 7, 2022, 4:47pm

I think we probably need more information to be able to help you with that specific issue. What do you mean by “decimal”? (Numbers as strings? Some other non-float numeric type?) And how are you converting them? (f64::from(), as, etc.) Is the table in the source code or generated during compilation?

Cerber-Ursi · April 7, 2022, 5:23pm

I guess they might be referring to this table (table.rs at 1.60.0 in rust-lang/rust on GitHub), which is somehow not optimized out.

farbodpm · April 8, 2022, 6:37am

These are the sizes of some data tables and functions. They are the float-to-decimal conversions and the reverse. They cost me about 30 percent of my code, and that is really high.

00002552 t core::num::flt2dec::strategy::grisu::format_shortest
00010416 r core::num::dec2flt::table::POWER_OF_FIVE_128
00001296 r core::num::flt2dec::strategy::grisu::CACHED_POW10
00000616 t core::num::dec2flt::decimal::Decimal::left_shift
00000616 t core::num::dec2flt::decimal::Decimal::right_shift
00000576 t core::num::flt2dec::strategy::grisu::format_shortest_opt::round_and_weed
00000576 t core::fmt::float::float_to_exponential_common_shortest
00000556 t core::num::dec2flt::lemire::compute_float
00000548 T compiler_builtins::float::add::__addsf3
00000092 t core::num::dec2flt::decimal::Decimal::trim
00000200 t core::num::dec2flt::lemire::compute_product_approx
00000240 t core::num::dec2flt::decimal::Decimal::round
00000556 t core::num::dec2flt::lemire::compute_float
00000600 t core::num::dec2flt::lemire::compute_float
00000976 t core::num::dec2flt::decimal::parse_decimal
00001340 t core::num::dec2flt::dec2flt
00001392 t core::num::dec2flt::dec2flt
00001628 t core::num::dec2flt::parse::parse_number

I usually convert f64 to u64 using "as", but I have found out that "into" is another option. What is the difference between them?

farbodpm · April 8, 2022, 4:08pm

Exactly. This table is 10 kilobytes in size, and that is really big.
