Question regarding strings: format!("{} {}", s1, s2) vs. s1 + &s2


#1

Hi, I am new to Rust and currently reading the second-edition book.

I have a question from the Strings chapter (https://doc.rust-lang.org/nightly/book/second-edition/ch08-02-strings.html#concatenation-with-the--operator-or-the-format-macro)

The text on concatenation highlights that s1 + &s2 is efficient because it does only 1 copy. But as a side effect, s1 is consumed and cannot be used later.
Whereas format!("{} {}", s1, s2) doesn’t take ownership of s1 or s2. But does that mean there are now 2 copies, making it less efficient than s1 + &s2?
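
To make the ownership difference concrete, here is a minimal sketch. The helper names join_with_plus and join_with_format are made up for illustration and are not from the book:

```rust
/// Hypothetical helper: concatenates with `+`, which takes `s1` by value
/// and appends into its existing buffer.
fn join_with_plus(s1: String, s2: &str) -> String {
    s1 + s2
}

/// Hypothetical helper: concatenates with `format!`, which only borrows
/// both inputs and builds a new `String`.
fn join_with_format(s1: &str, s2: &str) -> String {
    format!("{}{}", s1, s2)
}

fn main() {
    let s1 = String::from("Hello, ");
    let s2 = String::from("world!");

    // `format!` borrows, so `s1` and `s2` stay usable afterwards.
    let formatted = join_with_format(&s1, &s2);
    println!("{} (s1 is still {:?})", formatted, s1);

    // `+` moves `s1` into the call; only `s2` is usable afterwards.
    let added = join_with_plus(s1, &s2);
    println!("{} (s2 is still {:?})", added, s2);
    // println!("{}", s1); // would not compile: `s1` was moved
}
```

Uncommenting the last line should give a "use of moved value" style compile error, which is the side effect described above.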


#2

Technically speaking, yes. However, realloc (which is called when extending a Vec or String) is usually implemented by allocating new space, copying the old contents into it, and freeing the old space, so s1 + &s2 often ends up copying s1 as well, which reduces the difference. jemalloc does have some optimizations, but they require pretty specific conditions to trigger. Also, for about 4 months now the format! macro has allocated with capacity, so it isn’t as slow as it used to be.
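
To illustrate what "allocating with capacity" buys you, here is a rough sketch that sizes the buffer up front by hand. This is not how format! is implemented internally; concat_preallocated is a hypothetical helper that only shows the general idea of one allocation followed by two copies:

```rust
/// Hypothetical helper: builds `a` followed by `b` in a buffer that is
/// sized up front, so neither `push_str` call needs to grow it.
fn concat_preallocated(a: &str, b: &str) -> String {
    // One allocation, large enough for both inputs.
    let mut out = String::with_capacity(a.len() + b.len());
    out.push_str(a); // first copy
    out.push_str(b); // second copy
    out
}

fn main() {
    let joined = concat_preallocated("Hello, ", "world!");
    // The capacity requested up front covers the final length.
    assert!(joined.capacity() >= joined.len());
    println!("{} (len {}, capacity {})", joined, joined.len(), joined.capacity());
}
```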

Essentially, don’t be too concerned about which one you use. Use whatever is better for your use case, and don’t micro-optimise (unless profiling shows that you need to optimize that part of the code).

Micro-benchmarks

#![feature(test)]

extern crate test;

use test::Bencher;

// Each `*` after the literal doubles the length of the resulting string.
macro_rules! a_long_string {
    ($q:tt * $($rest:tt)*) => {
        concat!(a_long_string!($q $($rest)*), a_long_string!($q $($rest)*))
    };
    ($q:tt) => {
        $q
    };
}

const LONG_STR: &str = a_long_string!("a" ****************);

#[bench]
fn format(b: &mut Bencher) {
    b.iter(|| {
        let a_long_string = String::from(LONG_STR);
        let b_long_string = String::from(LONG_STR);
        format!("{}{}", a_long_string, b_long_string)
    })
}

#[bench]
fn concat(b: &mut Bencher) {
    b.iter(|| {
        let a_long_string = String::from(LONG_STR);
        let b_long_string = String::from(LONG_STR);
        a_long_string + &b_long_string
    })
}

Benchmark results on my computer (AMD FX-8320E):

test concat ... bench:      15,345 ns/iter (+/- 782)
test format ... bench:      15,581 ns/iter (+/- 1,564)

The benchmarks in this case are inconclusive: the difference between the two is within the measurement noise, so they don’t really say which one is faster.

(The string tested was 65536 bytes long, so 15000 ns is about what I expected from simply copying the data four times: once for each String::from, then twice more to build the result. I think there was barely any overhead beyond copying what needed to be copied.)


#3

Thanks @xfix.
append would typically use realloc, and I missed that point. Both implementations would do one allocation and 2 copies unless there is extra capacity.
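
For anyone who wants to see the "unless there is extra capacity" case, here is a small sketch. append_and_check is a name I made up; it compares the buffer address before and after s1 + &s2. When s1 has no spare room the buffer may or may not move, depending on the allocator, so the sketch only asserts the spare-capacity case:

```rust
/// Hypothetical helper: appends `tail` to `head` with `+` and reports
/// whether the underlying buffer moved to a new address.
fn append_and_check(head: String, tail: &str) -> (String, bool) {
    let before = head.as_ptr();
    let joined = head + tail;
    let moved = joined.as_ptr() != before;
    (joined, moved)
}

fn main() {
    // 32 bytes of capacity is plenty for the 13 bytes written below,
    // so `+` can append in place without reallocating.
    let mut roomy = String::with_capacity(32);
    roomy.push_str("Hello, ");
    let (joined, moved) = append_and_check(roomy, "world!");
    assert!(!moved);
    println!("{} (buffer moved: {})", joined, moved);
}
```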