Optimizing code

I am working on a project that has many peripherals, including GPS, an accelerometer, CAN and others. Recently I have been running into code size errors. I started to optimize my code, and when I checked the size of each function I found that Rust's float conversions cost me a lot. How can I reduce this size?

You may be interested in reading:

But the main rules of thumb are:

1. Avoid generics / impl Trait in argument position.

At least for big functions.

For instance, look at how the standard library does it:

  • Notice how it features a generic function to improve call-site ergonomics, but how it avoids implementing it as:

    fn write<P: AsRef<Path>, C: AsRef<[u8]>>(path: P, contents: C) -> io::Result<()> {
        File::create(path.as_ref())?.write_all(contents.as_ref())
    }
    

    Indeed, the non-generic part of that code (here, granted, rather small, but in other cases it can definitely become bigger), that is, the File::create()? call and the .write_all() call, would otherwise be copy-pasted for each choice of generics across all the possible callers of that function, leading to duplication that can be avoided.
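A minimal sketch of the shape that avoids this duplication, assuming a hypothetical function name `write_file` (chosen here for illustration): the generic wrapper only converts its arguments, then hands off to a non-generic inner function that is compiled once.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Generic front door: kept tiny, so each instantiation costs very little.
pub fn write_file<P: AsRef<Path>, C: AsRef<[u8]>>(path: P, contents: C) -> io::Result<()> {
    // Non-generic worker: compiled a single time, whatever `P` and `C` are.
    fn inner(path: &Path, contents: &[u8]) -> io::Result<()> {
        File::create(path)?.write_all(contents)
    }
    inner(path.as_ref(), contents.as_ref())
}
```

Callers can still pass a `&str`, a `String`, a `PathBuf`, a `Vec<u8>` and so on; only the two `as_ref()` calls are repeated per type combination.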

Trick: use dyn Trait instead of impl Trait in argument position to more easily factor out code.
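To illustrate the trick, here is a small sketch with two hypothetical functions, `describe_generic` and `describe_dyn` (both names are made up for this example). The first is instantiated once per argument type; the second exists as a single copy and dispatches through the vtable.

```rust
use std::fmt::Display;

/// Generic version: the compiler emits one copy per distinct `T` used by callers.
pub fn describe_generic<T: Display>(value: T) -> String {
    format!("value = {}", value)
}

/// `dyn` version: one copy in the binary; the `Display` impl is found at run time.
pub fn describe_dyn(value: &dyn Display) -> String {
    format!("value = {}", value)
}
```

Both produce the same text for the same input, so switching a large function from the first style to the second is mostly a matter of adding `&` at the call sites.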

2. Use opt-level = "z" in the profile settings (as well as aggressive LTO)

For the Cargo workspace generating the final binary, you'd add a [profile.{dev/release}] section (depending on whether you are targeting cargo build or cargo build --release) with an opt-level = "z" line under it.

You may also add lto = "fat" for link-time optimization.

And I am not sure about it, but maybe codegen-units = 1 could help as well? :person_shrugging:
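Put together, the settings above could look like this in Cargo.toml. This is only a sketch for the release profile; use [profile.dev] instead if you build without --release.

```toml
# Cargo.toml of the workspace that produces the final binary
[profile.release]
opt-level = "z"     # optimize for size rather than speed
lto = "fat"         # link-time optimization across all crates
codegen-units = 1   # may help as well; measure before relying on it
```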

3. See GitHub - johnthagen/min-sized-rust: 🦀 How to minimize Rust binary size 📦 for more advanced tweaks


This is a good tip, but it comes at the cost of preventing the trait methods from being inlined, thus losing potential compiler optimizations. So avoid using dyn Trait for small and simple methods like struct field access (i.e. getters/setters), especially in performance-critical areas. But definitely use it to share a single compiled copy of big functions that would otherwise be instantiated separately for a range of different types.

I found it interesting to learn that dyn Trait is implemented as a fat pointer made up of two pointers: one to the base struct and one to the vtable of trait methods. So, unless I'm missing something, there is probably no real benefit to using &dyn Trait vs. Box<dyn Trait>.

The benefit is in not requiring a heap allocation, and letting the caller retain ownership. If you have a local variable var of some concrete type which implements Trait, you can pass &var as an &dyn Trait argument without allocating additional memory for it, and you will still be able to use var after the call is complete.
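A small sketch of that difference, using a hypothetical `Sensor` trait and `Accelerometer` type invented for this example:

```rust
trait Sensor {
    fn read(&self) -> i32;
}

// Hypothetical concrete type, for illustration only.
struct Accelerometer {
    raw: i32,
}

impl Sensor for Accelerometer {
    fn read(&self) -> i32 {
        self.raw * 2
    }
}

/// Borrows the sensor: no allocation, the caller keeps ownership.
fn sample(sensor: &dyn Sensor) -> i32 {
    sensor.read()
}

/// Takes ownership: the caller has to box (heap-allocate) the value first.
fn sample_boxed(sensor: Box<dyn Sensor>) -> i32 {
    sensor.read()
}

pub fn demo() -> (i32, i32, i32) {
    let var = Accelerometer { raw: 21 };
    let borrowed = sample(&var); // `&var` coerces to `&dyn Sensor`
    let still_usable = var.read(); // `var` is still ours after the call
    let boxed = sample_boxed(Box::new(var)); // moves `var` onto the heap
    (borrowed, still_usable, boxed)
}
```

On a target without a heap allocator, only the `&dyn Sensor` variant is available at all.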


Strictly speaking, that's the implementation of *dyn Trait, which is the underlying representation of both Box<dyn Trait> and &dyn Trait. The extra information is stored beside the data pointer, rather than beside the referent. This lets a plain dyn Trait have the same memory representation as whatever concrete type it was created from.
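The layout can be checked with `size_of`. The sketch below uses a hypothetical `Shape` trait and `Rect` type; it assumes the usual two-word layout of trait object pointers that current compilers use.

```rust
use std::mem::{size_of, size_of_val};

trait Shape {
    fn area(&self) -> u32;
}

// Hypothetical concrete type, for illustration only.
struct Rect {
    w: u32,
    h: u32,
}

impl Shape for Rect {
    fn area(&self) -> u32 {
        self.w * self.h
    }
}

/// Returns (thin pointer size, `&dyn` size, `Box<dyn>` size, size of the erased value).
pub fn layout_facts() -> (usize, usize, usize, usize) {
    let rect = Rect { w: 3, h: 4 };
    let erased: &dyn Shape = &rect;
    debug_assert_eq!(erased.area(), 12);
    (
        size_of::<&Rect>(),          // one pointer
        size_of::<&dyn Shape>(),     // data pointer + vtable pointer
        size_of::<Box<dyn Shape>>(), // same two words, but owning
        size_of_val(erased),         // the referent is just a `Rect`
    )
}
```

The vtable pointer travels with the reference or the box, so the value behind it has exactly the size of the concrete `Rect`.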


Ahh, I was thinking entirely in terms of potential benefits on the receiving method side, but passing by reference rather than value is definitely a benefit for the caller. Good point!

Also helpful info, thanks! Good to know that the heap allocations are only for the owned version. It does make sense that a reference could bypass that. And that's helping me get past some of my disillusionment about the type erasure preventing inlining.

Thanks for your response. The link helped me very much. I also found some links about "xargo". This feature looks very good and I found it useful, but I don't know how stable it is. Do you have any idea?

My biggest problem is decimal to float conversion. There is a 10-kilobyte :dotted_line_face: table named "power five".

I think we probably need more information to be able to help you with that specific issue. What do you mean by "decimal"? (Numbers as strings? Some other non-float numeric type?) And how are you converting them? (f64::from(), as etc.) Is the table in the source code or generated during compilation?

I guess they might be referring to this table - rust/table.rs at 1.60.0 · rust-lang/rust · GitHub - which is somehow not optimized out.

These are the sizes of some data and functions. They are the float-to-decimal conversions and the reverse. They cost me about 30 percent of my code size, and that is really high.

00002552 t core::num::flt2dec::strategy::grisu::format_shortest
00010416 r core::num::dec2flt::table::POWER_OF_FIVE_128
00001296 r core::num::flt2dec::strategy::grisu::CACHED_POW10
00000616 t core::num::dec2flt::decimal::Decimal::left_shift
00000616 t core::num::dec2flt::decimal::Decimal::right_shift
00000576 t core::num::flt2dec::strategy::grisu::format_shortest_opt::round_and_weed
00000576 t core::fmt::float::float_to_exponential_common_shortest
00000556 t core::num::dec2flt::lemire::compute_float
00000548 T compiler_builtins::float::add::__addsf3
00000092 t core::num::dec2flt::decimal::Decimal::trim
00000200 t core::num::dec2flt::lemire::compute_product_approx
00000240 t core::num::dec2flt::decimal::Decimal::round
00000556 t core::num::dec2flt::lemire::compute_float
00000600 t core::num::dec2flt::lemire::compute_float
00000976 t core::num::dec2flt::decimal::parse_decimal
00001340 t core::num::dec2flt::dec2flt
00001392 t core::num::dec2flt::dec2flt
00001628 t core::num::dec2flt::parse::parse_number

I usually convert f64 to u64 using "as", but I have found out that "into" is also an option. What is the difference between them?
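As a short illustration of the difference (a sketch, with a hypothetical function name `casts`): `as` works between all numeric primitives and silently truncates or saturates, while `From`/`Into` are only implemented for conversions that cannot lose information. There is therefore no `Into<u64>` for `f64`, but there is a `From<u32>` for `f64`.

```rust
/// Returns a few conversions side by side.
pub fn casts() -> (u64, u64, u64, f64) {
    let truncated = 3.9_f64 as u64; // fractional part is dropped
    let negative = -1.0_f64 as u64; // out of range: saturates to 0
    let not_a_number = f64::NAN as u64; // NaN becomes 0
    let widened = f64::from(7_u32); // lossless, so `From` exists
    // let bad: u64 = 3.9_f64.into(); // does not compile: no `From<f64> for u64`
    (truncated, negative, not_a_number, widened)
}
```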

Exactly. This table is 10 KB in size, and that is really big.