Performance impact of Result vs Option


#1

I have a HashMap<String, HashMap<String, String>> of pairs, with roughly 50,000 items in the outer hashmap and between 1 and 20 values in each inner hashmap.

I call the following method 300 times

pub enum PdlEntry {
    COMMENT,
    NUM(f64),
    STR(String),
    VNUM(Vec<f64>),
    VSTR(Vec<String>),
}
pub fn get_field<'a, T: 'a>(name: &str, entries: &'a HashMap<String, PdlEntry>) -> result::Result<T, Error>
where
    &'a HashMap<String, PdlEntry>: GetPdlData<T>,
{
    entries
        .get_pdl_data(name)
        .ok_or(err_msg(format!("Could not find {} in entity {:?}", name, entries)))
}

And then repeat the exercise with this method

pub fn get_field2<'a, T: 'a>(name: &str, entries: &'a HashMap<String, PdlEntry>) -> Option<T>
where
    &'a HashMap<String, PdlEntry>: GetPdlData<T>,
{
    entries.get_pdl_data(name)
}

The overhead is quite dramatic. With the Result type the timing is ~5s; with the Option type it's closer to 0.01s.

Compiled in release mode with timings tested by

        let now = Instant::now();
        let cpdm: PowerDual = PowerDual::try_from(&pdl).unwrap();
        println!(
            "Getting fixings end {}.{:03}",
            now.elapsed().as_secs(),
            now.elapsed().subsec_millis()
        );

Should there be such an overhead when using Result<>? If so, I'm thinking my code should generally be constructed with Option return types, using Result only at a higher level, and especially not in inner loops.


#2

You’re formatting and creating a String each time here, even on success - you want ok_or_else(|| ...).
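To illustrate the difference with a minimal sketch (using a toy map rather than the original PdlEntry type): the argument to `ok_or` is built eagerly on every call, while the closure passed to `ok_or_else` only runs on the error path.

```rust
use std::collections::HashMap;

fn main() {
    let mut entries: HashMap<String, f64> = HashMap::new();
    entries.insert("rate".to_string(), 0.05);

    // Eager: format! allocates and Debug-formats the whole map
    // on every call, even though the lookup succeeds.
    let eager: Result<&f64, String> = entries
        .get("rate")
        .ok_or(format!("Could not find rate in {:?}", entries));

    // Lazy: the closure is only invoked when get() returns None,
    // so the success path does no allocation at all.
    let lazy: Result<&f64, String> = entries
        .get("rate")
        .ok_or_else(|| format!("Could not find rate in {:?}", entries));

    assert_eq!(eager, Ok(&0.05));
    assert_eq!(lazy, Ok(&0.05));
    println!("both succeed; only ok_or paid for the error string");
}
```

With ~50,000 entries and 300 calls, paying that formatting cost on every successful lookup adds up quickly.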


#3

That sped up everything in the code. I had ok_or() everywhere. Is there a way to profile allocations in Rust?


#4

ok_or is a perf footgun because of this. Even rustc devs got bit: https://github.com/rust-lang/rust/pull/50051. And its docs do mention:

> Arguments passed to `ok_or` are eagerly evaluated; if you are passing the result of a function call, it is recommended to use `ok_or_else`, which is lazily evaluated.

But it's somewhat easy to miss while slinging code around.

As for profiling, not sure there’s a great story right now. I think most people use the standard C/C++ tools, like valgrind, for alloc profiling. Take a look at https://internals.rust-lang.org/t/improve-the-heap-and-cpu-profiling-story/5996/3 and see if that helps.


#5

On Linux, perf has served me very well over the years. It has guided a large amount of the optimization work I’ve done in Rust.


#6

Use clippy; it warns you about uses of ok_or (and similar methods) whose argument performs a computation, and suggests switching to ok_or_else.
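As a sketch, this is the kind of pattern the relevant clippy lint (`clippy::or_fun_call`) flags; the function and names here are made up for illustration:

```rust
// clippy's `or_fun_call` lint warns here: the format! call runs
// (allocating a String) even when `opt` is Some, and suggests
// `opt.ok_or_else(|| format!("{} is missing", name))` instead.
fn describe(opt: Option<i32>, name: &str) -> Result<i32, String> {
    opt.ok_or(format!("{} is missing", name))
}

fn main() {
    assert_eq!(describe(Some(7), "x"), Ok(7));
    assert_eq!(describe(None, "y"), Err("y is missing".to_string()));
    println!("ok");
}
```

Running `cargo clippy` over a crate containing code like this points directly at the eager call site.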


#7

How well does perf work specifically for allocation tracking? I’ve mostly played with it for CPU perf counters purpose.


#8

I don’t use it for allocation tracking. I use it to tell me where time is being spent in my program. Allocations tend to show up there. :slight_smile:


#9

That’s fair :slight_smile:. Alloc tracking is slightly different but for @rusty_ron’s case it sounds like it’s one and the same in the end.


#10

5 seconds is still a suspiciously long time, even with extra allocations. Are you measuring with the --release flag? In generic code it can make Rust literally a thousand times faster.


#11

He mentioned it's release mode. Note that the error string includes Debug formatting of the entire HashMap that's passed in, so this is a combination of allocation and formatting cost.
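To see how large that formatting cost is on its own, here is a rough sketch (a toy map of f64s standing in for the real entries; the ~50,000 size is taken from the original post). Debug-formatting the whole map produces hundreds of kilobytes of text per call:

```rust
use std::collections::HashMap;
use std::time::Instant;

fn main() {
    // Stand-in for the ~50,000-entry map from the original post.
    let mut entries: HashMap<String, f64> = HashMap::new();
    for i in 0..50_000 {
        entries.insert(format!("key{}", i), i as f64);
    }

    let now = Instant::now();
    // This is what the `format!("... {:?}", entries)` inside ok_or does:
    // it walks and prints every entry, allocating as it goes.
    let s = format!("{:?}", entries);
    println!("formatted {} bytes in {:?}", s.len(), now.elapsed());

    assert!(s.len() > 100_000);
}
```

Doing this on every one of the 300 calls, success or failure, easily accounts for multi-second timings.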


#12

Ah, I’ve missed that. That makes sense.


#13

Absolutely using --release:

C:\>cargo test test_coupon_construction --release -- --nocapture

Sadly I'm on Windows, not Linux, so I'll have to stick to timings rather than a specific tool for now.


#14

AFAIK, the closest thing to perf on Windows is a tool called Windows Performance Analyzer (WPA). You may want to check it out. I have never tried WPA itself, but its predecessor xperfview helped me a lot back when I was doing performance work on this OS.


#15

You could try a tool like PerfView.