Modify String in-place

mgeisler · November 10, 2020, 10:26pm

Dear Rust experts,

I'm about to add a fill_inplace to my textwrap crate. In short, the goal is to turn some spaces into '\n' without reallocating the input String. Like this, where the break_points are already computed:

fn fill_inplace(text: &mut String, break_points: &[usize]) {
    let mut bytes = text.into_bytes();
    for &idx in break_points {
        bytes[idx] = b'\n';
    }
    *text = String::from_utf8(bytes).unwrap();
}

pub fn main() {
    let mut text = String::from("foo bar baz");

    println!("before: {:?}", text);
    fill_inplace(&mut text, &[3, 7]);
    println!("after:  {:?}", text);
}


However, this doesn't work as-is. It fails with this error:

error[E0507]: cannot move out of `*text` which is behind a mutable reference

The offending line is let mut bytes = text.into_bytes() which consumes text. I can fix this by using a temporary like this:

fn fill_inplace(text: &mut String, break_points: &[usize]) {
    let mut tmp = String::new();
    std::mem::swap(&mut tmp, text);
    let mut bytes = tmp.into_bytes();
    for &idx in break_points {
        bytes[idx] = b'\n';
    }
    *text = String::from_utf8(bytes).unwrap();
}

I believe this still avoids reallocating the original string, but it seems a bit weird to me to swap things back and forth like that.

Does anybody have an idea for how I can write this better?

Thanks for any help!

matklad · November 10, 2020, 10:34pm

You can use let mut bytes = std::mem::take(text).into_bytes().

But the swapping is essential (unless you want to venture into unsafe) -- this is what ensures basic exception safety if the code panics in the middle of fill_inplace.
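
A minimal sketch of the std::mem::take variant, keeping the fill_inplace signature from the question. The pointer comparison in main is only an illustrative check (not part of the original post) that the heap buffer is reused rather than reallocated.

```rust
use std::mem;

/// Overwrites the byte at each index in `break_points` with `\n`.
/// Panics if an index is out of bounds or the result is not valid UTF-8.
fn fill_inplace(text: &mut String, break_points: &[usize]) {
    // `mem::take` moves the String out and leaves `String::new()` behind.
    // An empty String does not allocate, so the placeholder is cheap.
    let mut bytes = mem::take(text).into_bytes();
    for &idx in break_points {
        bytes[idx] = b'\n';
    }
    // `from_utf8` validates the bytes and wraps the same Vec without copying.
    *text = String::from_utf8(bytes).unwrap();
}

fn main() {
    let mut text = String::from("foo bar baz");
    let ptr_before = text.as_ptr();

    fill_inplace(&mut text, &[3, 7]);

    assert_eq!(text, "foo\nbar\nbaz");
    // Same heap buffer before and after the call.
    assert_eq!(text.as_ptr(), ptr_before);
    println!("after: {:?}", text);
}
```

If the indexing or the unwrap panics, text is left holding the empty placeholder, which is still a valid String.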

quinedot · November 10, 2020, 10:37pm

You can use as_mut_vec: playground.

It's unsafe because you may end up with invalid UTF-8 if you overwrite, e.g., half of a two-byte code point. Your swapping version (or @matklad's suggestion) is safe because from_utf8 verifies that the updated bytes are still valid UTF-8.
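
The linked playground is not reproduced here, so the following is only a sketch of what an as_mut_vec version could look like, not necessarily the code behind the link. It assumes every break point is an ASCII space (as in the question) and checks that up front, so the unsafe block only swaps one ASCII byte for another. The second half of main shows the failure mode with a two-byte code point.

```rust
/// Overwrites the space at each index in `break_points` with `\n`.
/// Panics if an index is out of bounds or does not point at an ASCII space.
fn fill_inplace(text: &mut String, break_points: &[usize]) {
    // Assumption for this sketch: break points must be ASCII spaces.
    for &idx in break_points {
        assert_eq!(text.as_bytes()[idx], b' ', "byte {} is not an ASCII space", idx);
    }
    // SAFETY: every byte overwritten below was just checked to be an ASCII
    // space, and b'\n' is also a single-byte code point, so the contents
    // remain valid UTF-8.
    let bytes = unsafe { text.as_mut_vec() };
    for &idx in break_points {
        bytes[idx] = b'\n';
    }
}

fn main() {
    let mut text = String::from("foo bar baz");
    fill_inplace(&mut text, &[3, 7]);
    assert_eq!(text, "foo\nbar\nbaz");

    // Why unchecked writes are dangerous: 'é' is encoded as 0xC3 0xA9.
    // Overwriting its second byte leaves a dangling lead byte, which the
    // safe `from_utf8` route rejects.
    let mut bytes = String::from("é b").into_bytes();
    bytes[1] = b'\n';
    assert!(String::from_utf8(bytes).is_err());
}
```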

mgeisler · November 10, 2020, 11:09pm

Oh, thanks! I didn't know about this function. Though it does the same thing, it looks simpler somehow.

Thanks, that's definitely also a nice option. The version with the UTF-8 check is already plenty fast, so I'll probably just stick with safe code for now.

mgeisler · November 10, 2020, 11:17pm

I benchmarked both versions and could not measure any difference in the timings. Wrapping 1,600- and 3,200-character strings:

String lengths/fill_inplace/1600
            time:   [11.549 us 11.663 us 11.855 us]
            change: [-1.7676% -0.8406% +0.2085%] (p = 0.07 > 0.05)
            No change in performance detected.
String lengths/fill_inplace/3200
            time:   [23.666 us 23.796 us 23.964 us]
            change: [-1.4300% -0.8371% -0.1698%] (p = 0.01 < 0.05)
            Change within noise threshold.

My conclusion so far is that the computation of the break points (which I removed for simplicity in the example code) is dominating the computation time.

steffahn · November 10, 2020, 11:38pm

Using the ascii crate:

use ascii::{AsMutAsciiStr, AsciiChar};
fn fill_inplace(text: &mut str, break_points: &[usize]) {
    for &idx in break_points {
        text.slice_ascii_mut(idx..=idx).unwrap()[0] = AsciiChar::LineFeed;
    }
}
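
For comparison, here is a std-only sketch of the same &mut str idea that reports a bad index instead of panicking. The NotASpace error type and the fill_inplace_str name are hypothetical, introduced only for this illustration, and the version requires each break point to be an ASCII space, which is stricter than the ascii-crate code above.

```rust
/// Hypothetical error type: the byte at this index is missing or not a space.
#[derive(Debug, PartialEq)]
struct NotASpace(usize);

/// Replaces the space at each break point with `\n`, working on a `&mut str`.
fn fill_inplace_str(text: &mut str, break_points: &[usize]) -> Result<(), NotASpace> {
    // Validate every index first, so a failure leaves `text` untouched.
    for &idx in break_points {
        if text.as_bytes().get(idx) != Some(&b' ') {
            return Err(NotASpace(idx));
        }
    }
    // SAFETY: only ASCII spaces are replaced, each by the ASCII byte b'\n',
    // so the string stays valid UTF-8.
    let bytes = unsafe { text.as_bytes_mut() };
    for &idx in break_points {
        bytes[idx] = b'\n';
    }
    Ok(())
}

fn main() {
    // 'é' and 'ö' take two bytes each, so the spaces sit at bytes 6 and 13.
    let mut text = String::from("héllo wörld again");
    assert_eq!(fill_inplace_str(&mut text, &[6, 13]), Ok(()));
    assert_eq!(text, "héllo\nwörld\nagain");

    // Index 1 is the first byte of 'é', so the call is rejected.
    assert_eq!(fill_inplace_str(&mut text, &[1]), Err(NotASpace(1)));
    assert_eq!(text, "héllo\nwörld\nagain");
}
```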
