Why join 7% faster than reduce?

I use to use reduce to concatenate string parts with some separator. However I recently found that join works faster.

use std::error::Error;
use std::time::Instant;
use std::collections::HashMap;

fn use_join(arr: &[&str]) -> String {
    arr.into_iter().map(|curr| "\"".to_owned() +
            &json_encode(&curr) + "\"").collect::<Vec<_>>().join(", ")
}

fn use_reduce(arr: &[&str]) -> String {
    arr.into_iter().map(|curr| "\"".to_owned() +
            &json_encode(&curr) + "\"").reduce(|a,e| a + ", " + &e).unwrap_or_else(String::new)
}

fn main() -> Result<(), Box<dyn Error>> {
    let start = Instant::now();
    let mut sum = 0;
    for _ in 0..10_000_000 {
        let res = use_reduce(&opts_val);
        sum += res.len()
    }
    let duration = start.elapsed();
    println!("Time elapsed: {:?} :{sum}", duration);
    let start = Instant::now();
    let mut sum = 0;
    for _ in 0..10_000_000 {
        let res = use_join(&opts_val);
        sum += res.len()
    }
    let duration = start.elapsed();
    println!("Time elapsed for join: {:?} :{sum}", duration);

    Ok(())
}

My theory is that a closure gives overhead. What could be other reasons? Which technique do you use?

in the reduce() version, you repeated use the string + operator, which is inefficient because it needs to grow/reallocate every iteration. the join() version can calculate the final length before hand and do a single allocation.

relevant implementation:

I presume that json_encode returns String, so this approach is in general inefficient. If you had json_encode that writes to &mut String, or writes to impl std::io::Write, or returns something that implements Display, you could stream the output without allocating temporary strings.

Note that this is also because of the linear structure of the reductions done by reduce. If you instead use Itertools::tree_reduce that cost will be less because you'll be combining smaller strings on average -- the documentation for the method even shows an example of that.

Of course, it's better to use https://doc.rust-lang.org/std/primitive.slice.html#method.concat/https://doc.rust-lang.org/std/primitive.slice.html#method.join if you can, but tree_reduce is a nice trick to keep in your back pocket for times where the basic things aren't enough.

Is tree_reduce 3rd party code? Unfortunately it isn't allowed in my company unless it passes a long process of code review. So I stick with join for now. It's good knowledge the Rust performance depends on memory reallocation requests, so I will try to reduce such requests.

Not just Rust, it's one of common bottlenecks.

It's hard to generalize yet, because I am working on some language where no reallocation happens at all. So, we can say better: it's the bottleneck of most modern languages.

Hmmm, that code is suspicious in how it handles result sizes isize::MAX < Size < usize::MAX

It's why some companies switch to Ada. You clearly define ranges there.