Fast String Concatenation?


#1

Hi all,

I’m currently benching some ideas on building up a url with params. This is something that’ll be called often in my library, but isn’t really cachable.

So far I’ve got these two examples:

#[bench]
fn format_url_macro(b: &mut Bencher) {
	let index = "test_idx".to_string();
	let name = "test_alias".to_string();

	b.iter(|| {
		let _ = format!("/{}/_alias/{}", index, name);
	});
}

#[bench]
fn format_url_concat(b: &mut Bencher) {
	let index = "test_idx".to_string();
	let name = "test_alias".to_string();

	b.iter(|| {
		let mut url = "/".to_string();
		url = url + &index[..] + "/_alias/" + &name[..];
	});
}

With these results on my machine:

test format_url_concat ... bench:         180 ns/iter (+/- 56)
test format_url_macro  ... bench:         192 ns/iter (+/- 10)

Does anyone have any other ideas I could try to build a string efficiently? (Also not sure why the concat one has such crazy variance?)

Because this stuff is codegenned, and is internal, it doesn’t matter if it’s ugly.


#2

First of all, your tests are just throwing away the result of the formatting. This means the compiler is free to totally ignore your code, because the result is never needed.

Having fixed that, I tried the only thing I could think of: pre-allocate the string to the correct length and don’t do any spurious intermediate allocations. I also tried pre-allocating and using formatted write!. First, the code:

#![feature(test)]
extern crate test;
use test::Bencher;

#[bench]
fn format_url_macro(b: &mut Bencher) {
    let index = "test_idx".to_string();
    let name = "test_alias".to_string();

    b.iter(|| {
        format!("/{}/_alias/{}", index, name)
    });
}

#[bench]
fn format_url_concat(b: &mut Bencher) {
    let index = "test_idx".to_string();
    let name = "test_alias".to_string();

    b.iter(|| {
        let mut url = "/".to_string();
        url = url + &index[..] + "/_alias/" + &name[..];
        url
    });
}

#[bench]
fn format_url_push(b: &mut Bencher) {
    let index = "test_idx".to_string();
    let name = "test_alias".to_string();

    b.iter(|| {
        let mut url = String::with_capacity(1 + "/_alias/".len()
            + index.len() + name.len());
        url.push_str("/");
        url.push_str(&index);
        url.push_str("/_alias/");
        url.push_str(&name);
        url
    });
}

#[bench]
fn format_url_write(b: &mut Bencher) {
    let index = "test_idx".to_string();
    let name = "test_alias".to_string();

    b.iter(|| {
        use std::fmt::Write;
        let mut url = String::with_capacity(1 + "/_alias/".len()
            + index.len() + name.len());
        write!(url, "/{}/_alias/{}", index, name).unwrap();
        url
    });
}

Then, the results:

test format_url_concat ... bench:         403 ns/iter (+/- 12)
test format_url_macro  ... bench:         411 ns/iter (+/- 11)
test format_url_push   ... bench:         108 ns/iter (+/- 3)
test format_url_write  ... bench:         156 ns/iter (+/- 4)

Joining str: many ways
#3

Thanks for your response. I hadn’t considered that the compiler would happily optimise my code away, but it does make sense.

I’m a big fan of the format_url_push implementation, it’d be really easy to codegen. So as a general optimisation when working with collections of things I’m guessing pre-allocation is a big win?


#4

Preallocation is a big win but so is avoiding the formatting subsystem. Both the write! and format! macros invoke the formatting subsystem which is quite slow.

Also note, there’s an RFC issue about allowing format to preallocate: https://github.com/rust-lang/rfcs/issues/1145


#5

Why is there no use of .concat() / .join() in the benchmarks? It was created for this…


#6

Good point! I’ve added the following two benches using .concat(). One with a vec and the other with an array:

#[bench]
fn format_url_vec_concat(b: &mut Bencher) {
    let index = "test_idx".to_string();
    let name = "test_alias".to_string();

    b.iter(|| {
        let url = vec![ "/", &index[..], "/_alias/", &name[..] ].concat();
        url
    });
}

#[bench]
fn format_url_array_concat(b: &mut Bencher) {
    let index = "test_idx".to_string();
    let name = "test_alias".to_string();

    b.iter(|| {
        let url = [ "/", &index[..], "/_alias/", &name[..] ].concat();
        url
    });
}

Results on my machine look like:

test format_url_array_concat ... bench:          69 ns/iter (+/- 16)
test format_url_concat       ... bench:         280 ns/iter (+/- 59)
test format_url_macro        ... bench:         287 ns/iter (+/- 36)
test format_url_push         ... bench:          51 ns/iter (+/- 7)
test format_url_vec_concat   ... bench:         109 ns/iter (+/- 4)
test format_url_write        ... bench:          83 ns/iter (+/- 30)

So the array concat is only a fraction slower than the push_str method and the vec method is a bit slower than the write one.

(Note, I’m also getting crazy variance in the benches on Windows… Some runs show the figures above, some look more like this:

test format_url_array_concat ... bench:         181 ns/iter (+/- 7)
test format_url_concat       ... bench:         708 ns/iter (+/- 21)
test format_url_macro        ... bench:         734 ns/iter (+/- 40)
test format_url_push         ... bench:         129 ns/iter (+/- 4)
test format_url_vec_concat   ... bench:         110 ns/iter (+/- 148)
test format_url_write        ... bench:         197 ns/iter (+/- 4)

Not sure why, but taking the variance into account the url_push still looks the fastest.


Looking for a code review on a KDTree project of mine before I go present it at a conference. I am unsure of the idiomatic-ness of it
#7

That was interesting, disappointed concat() didn’t win :smile: . Maybe your windows benches are having trouble with cpu frequency scaling?


#8

If I wasn’t codegenning the url construction in my library I would probably go for the array concat method, it’s almost as fast but much more compact.