I'm about to add a fill_inplace to my textwrap crate. In short, the goal is to turn some spaces into '\n' without reallocating the input String. Like this, where the break_points are already computed:
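fn fill_inplace(text: &mut String, break_points: &[usize]) {
    // `text` is only borrowed, so swap an empty String into its place;
    // that lets us consume the original by value without copying it.
    let mut bytes = std::mem::replace(text, String::new()).into_bytes();
    for &idx in break_points {
        bytes[idx] = b'\n';
    }
    // Re-validate the bytes as UTF-8 and move the buffer back into `text`.
    *text = String::from_utf8(bytes).unwrap();
}

pub fn main() {
    let mut text = String::from("foo bar baz");
    println!("before: {:?}", text);
    fill_inplace(&mut text, &[3, 7]);
    println!("after: {:?}", text);
}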
You can use let mut bytes = std::mem::take(text).into_bytes().
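For reference, here is a minimal sketch of the whole function with that change applied. The name fill_inplace_take is only used here to tell it apart from the original; everything else is as in the question.

```rust
/// Replace the bytes at `break_points` with '\n', reusing the String's buffer.
/// (`fill_inplace_take` is an illustrative name, not part of textwrap.)
fn fill_inplace_take(text: &mut String, break_points: &[usize]) {
    // `take` leaves `String::default()` (an empty String) behind and hands
    // us the original by value, so `into_bytes` can consume it.
    let mut bytes = std::mem::take(text).into_bytes();
    for &idx in break_points {
        bytes[idx] = b'\n';
    }
    // `from_utf8` validates the bytes and wraps the same Vec again.
    *text = String::from_utf8(bytes).unwrap();
}
```

Calling fill_inplace_take(&mut text, &[3, 7]) on "foo bar baz" turns the two spaces into newlines.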
But the swapping is essential (unless you want to venture into unsafe) -- this is what ensures basic exception safety if the code panics in the middle of fill_inplace.
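To make the panic case concrete, here is a small sketch. The helper panics_when_filling is hypothetical and only exists to catch the panic for the demonstration; it assumes the default panic=unwind setting.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

fn fill_inplace(text: &mut String, break_points: &[usize]) {
    // After this line `text` holds an empty (but valid) String.
    let mut bytes = std::mem::take(text).into_bytes();
    for &idx in break_points {
        bytes[idx] = b'\n'; // panics if `idx` is out of bounds
    }
    // Also panics if the edited bytes are no longer valid UTF-8.
    *text = String::from_utf8(bytes).unwrap();
}

/// Hypothetical helper: returns true if `fill_inplace` panicked.
fn panics_when_filling(text: &mut String, break_points: &[usize]) -> bool {
    catch_unwind(AssertUnwindSafe(|| fill_inplace(text, break_points))).is_err()
}
```

With an out-of-range index such as 99 for "foo bar baz", the call panics and the caller is left with an empty String: the contents are lost, but the value is still a valid String, which is all that basic exception safety promises.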
Mutating the string's bytes directly is unsafe because you may end up with invalid UTF-8 if you overwrite, e.g., half of a two-byte code point. Your swapping version (or @matklad's suggestion) is safe, as from_utf8 verifies that your updated bytes are still valid UTF-8.
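As a rough sketch of what the unsafe route could look like, the version below uses str::as_bytes_mut and guards each write with a check that the byte being replaced is an ASCII space. The function name and the assertion are assumptions made for illustration, not code from the thread or from textwrap.

```rust
/// Illustrative only: writes '\n' directly into the String's bytes.
fn fill_inplace_unsafe(text: &mut String, break_points: &[usize]) {
    // SAFETY: the only write below replaces one ASCII byte (b' ') with
    // another ASCII byte (b'\n'), which cannot break UTF-8 validity.
    let bytes = unsafe { text.as_bytes_mut() };
    for &idx in break_points {
        // Refuse to touch anything that is not a space, e.g. a byte
        // in the middle of a multi-byte code point.
        assert_eq!(bytes[idx], b' ', "break point {} is not a space", idx);
        bytes[idx] = b'\n';
    }
}
```

Without the assertion, a break point of 1 in "é b" would overwrite the second byte of é and leave the String holding invalid UTF-8, which is exactly the failure described above.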
Oh, thanks! I didn't know about this function. Though it does the same thing, it looks simpler somehow.
Thanks, that's definitely also a nice option. The version with the UTF-8 check is already plenty fast, so I'll probably just stick with safe code for now.
I benchmarked both versions and could not measure any difference in the timings. Wrapping 1,600- and 3,200-character strings:
String lengths/fill_inplace/1600
time: [11.549 us 11.663 us 11.855 us]
change: [-1.7676% -0.8406% +0.2085%] (p = 0.07 > 0.05)
No change in performance detected.
String lengths/fill_inplace/3200
time: [23.666 us 23.796 us 23.964 us]
change: [-1.4300% -0.8371% -0.1698%] (p = 0.01 < 0.05)
Change within noise threshold.
My conclusion so far is that the computation of the break points (which I removed for simplicity in the example code) is dominating the computation time.