Performance impact of Result vs Option


I have a HashMap<String, HashMap<String, String>> with roughly 50,000 entries in the outer map and between 1 and 20 values in each inner map.

I call the following method 300 times:

pub enum PdlEntry {
pub fn get_field<'a, T: 'a>(name: &str, entries: &'a HashMap<String, PdlEntry>) -> result::Result<T, Error>
    &'a HashMap<String, PdlEntry>: GetPdlData<T>,
        .ok_or(err_msg(format!("Could not find {} in entity {:?}", name, entries)))

And then repeat the exercise with this method

pub fn get_field2<'a, T: 'a>(name: &str, entries: &'a HashMap<String, PdlEntry>) -> Option<T>
    &'a HashMap<String, PdlEntry>: GetPdlData<T>,

The overhead is quite dramatic. When using Result type the timings are
when using the Option object it’s closer to

Compiled in release mode with timings tested by

        let now = Instant::now();
        let cpdm: PowerDual = PowerDual::try_from(&pdl).unwrap();
            "Getting fixings end {}.{:03}",

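For anyone wanting to reproduce this kind of measurement, here is a minimal, self-contained sketch of the timing pattern. The helpers `time_it` and `format_elapsed` are hypothetical names introduced for illustration, and the summing closure is only a stand-in for the real `PowerDual::try_from(&pdl)` call; the seconds-plus-milliseconds layout is an assumption based on the `{}.{:03}` format string above.

```rust
use std::time::{Duration, Instant};

/// Runs `work` once and returns its result together with the elapsed wall-clock time.
/// (Hypothetical helper, not part of the original code.)
fn time_it<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let now = Instant::now();
    let value = work();
    (value, now.elapsed())
}

/// Formats a duration as whole seconds plus zero-padded milliseconds.
fn format_elapsed(label: &str, elapsed: Duration) -> String {
    format!("{} {}.{:03}", label, elapsed.as_secs(), elapsed.subsec_millis())
}

fn main() {
    // Stand-in workload; replace with the code under test.
    let (sum, elapsed) = time_it(|| (0..1_000_000u64).sum::<u64>());
    println!("{}", format_elapsed("Getting fixings end", elapsed));
    println!("sum = {}", sum);
}
```

Build with `--release` when measuring, since debug builds can distort the numbers considerably.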
Should there be such an overhead when using Result<>? If so, I’m thinking that my code should generally be built around Option return types, with Result used only at the higher levels and especially not in inner loops.


You’re formatting and creating a String each time here, even on success - you want ok_or_else(|| ...).
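To make the difference concrete, here is a small sketch that counts how often the error message gets built when every lookup succeeds. It uses a plain `HashMap<String, String>` and a `String` error instead of the `PdlEntry` and `Error` types from the question, and the helper names (`build_msg`, `lookup_eager`, `lookup_lazy`, `count_message_builds`) are made up for illustration.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Builds the error message and records that it was called.
fn build_msg(calls: &Cell<u32>, name: &str, entries: &HashMap<String, String>) -> String {
    calls.set(calls.get() + 1);
    format!("Could not find {} in entity {:?}", name, entries)
}

/// Eager: the argument to `ok_or` is evaluated before the call,
/// so the message is built even when the key is present.
fn lookup_eager<'a>(
    calls: &Cell<u32>,
    name: &str,
    entries: &'a HashMap<String, String>,
) -> Result<&'a String, String> {
    entries.get(name).ok_or(build_msg(calls, name, entries))
}

/// Lazy: the closure passed to `ok_or_else` only runs when the lookup returns `None`.
fn lookup_lazy<'a>(
    calls: &Cell<u32>,
    name: &str,
    entries: &'a HashMap<String, String>,
) -> Result<&'a String, String> {
    entries.get(name).ok_or_else(|| build_msg(calls, name, entries))
}

/// Performs `n` successful lookups each way and returns (eager builds, lazy builds).
fn count_message_builds(n: u32) -> (u32, u32) {
    let mut entries = HashMap::new();
    entries.insert("rate".to_string(), "0.05".to_string());
    let eager = Cell::new(0);
    let lazy = Cell::new(0);
    for _ in 0..n {
        assert!(lookup_eager(&eager, "rate", &entries).is_ok());
        assert!(lookup_lazy(&lazy, "rate", &entries).is_ok());
    }
    (eager.get(), lazy.get())
}

fn main() {
    let (eager, lazy) = count_message_builds(300);
    println!("ok_or built the message {} times, ok_or_else {} times", eager, lazy);
}
```

With 300 successful lookups the eager version builds (and throws away) 300 messages, while the lazy version builds none.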


That sped up everything in the code. I had ok_or() everywhere. Is there a way to profile for allocations in Rust?


ok_or is a perf footgun because of this. Even rustc devs got bit: And its docs do mention:

But it’s somewhat easy to miss while slinging code around.

As for profiling, not sure there’s a great story right now. I think most people use the standard C/C++ tools, like valgrind, for alloc profiling. Take a look at and see if that helps.


On Linux, perf has served me very well over the years. It has guided a large amount of the optimization work I’ve done in Rust.


Use clippy: it warns you about uses of ok_or (and similar methods) whose argument performs a computation, and suggests switching to ok_or_else.


How well does perf work specifically for allocation tracking? I’ve mostly played with it for CPU performance counters.


I don’t use it for allocation tracking. I use it to tell me where time is being spent in my program. Allocations tend to show up there. :)


That’s fair :). Alloc tracking is slightly different, but for @rusty_ron’s case it sounds like it’s one and the same in the end.


5 seconds is still a suspiciously long time, even with extra allocations. Are you measuring with the --release flag? In generic code it can make Rust literally a thousand times faster.


He mentioned it’s release mode. Note the string includes debug formatting of the entire HashMap that’s passed in. So this is a combo of allocation + formatting cost.
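As a rough illustration of why the formatting matters, the sketch below compares the size of a message that debug-prints a whole map against one that only names the missing field. The 20-entry map mirrors the upper bound mentioned in the question; the helper names and the `fieldN`/`valueN` contents are invented for the example.

```rust
use std::collections::HashMap;

/// Message in the style of the original code: debug-formats the entire map.
fn full_message(name: &str, entries: &HashMap<String, String>) -> String {
    format!("Could not find {} in entity {:?}", name, entries)
}

/// Cheaper alternative that only mentions the missing field.
fn short_message(name: &str) -> String {
    format!("Could not find {}", name)
}

/// Builds a map with `n` made-up entries for the comparison.
fn sample_entries(n: usize) -> HashMap<String, String> {
    (0..n)
        .map(|i| (format!("field{}", i), format!("value{}", i)))
        .collect()
}

fn main() {
    let entries = sample_entries(20);
    println!("full:  {} bytes", full_message("rate", &entries).len());
    println!("short: {} bytes", short_message("rate").len());
}
```

Every eager call pays for walking and formatting all the entries, on top of the allocation for the resulting String.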


Ah, I’ve missed that. That makes sense.


Absolutely using --release:

C:\>cargo test test_coupon_construction --release -- --nocapture

Sadly I’m on Windows, not Linux, so I’ll have to stick to timings rather than a specific tool for now.


AFAIK, the closest thing to perf on Windows is a tool called Windows Performance Analyzer (WPA). You may want to check it out. I have never tried WPA itself, but its predecessor xperfview helped me a lot back when I was doing performance work on this OS.


You could try a tool like PerfView.