Trim String in place?

I wonder why no trim operation for String, that modify it by it self,
for example:

impl String {
  fn trim_right_mut(&mut self) { }
}

?

I'm asking in context of this:

let mut s: String = some_calculations();//allocation one
let s = s.trim().to_string();//allocation two

For me the second heap allocation is strange.
Why not reuse existing storage of s.
May be I should use some magic with iterators, like drain to reuse "allocation one" space?

3 Likes

trim_right_mut can be fairly easily implemented using truncate

1 Like

One of the problems I can see is that trimming on the left in-place is going to be almost as expensive as calling s.trim().to_string(). Allocating memory is actually quite fast, and regardless you're going to be copying the entire string (minus the leading whitespace) around.

1 Like

So malloc + memcpy is cheaper then memmove?

5 Likes

I do benchmark, and looks like memove is faster then malloc + memcpy:

static SRC: &'static str = "  AAA BBB CCC DD       ";

#[bench]
fn bench_alloc_trim(b: &mut Bencher) {
    let src = SRC.to_string();
    b.iter(|| {
        test::black_box(src.trim().to_string());
    });
}

#[bench]
fn bench_inplace_trim(b: &mut Bencher) {
    let src = SRC.to_string();
    let trim = src.trim();
    let start_off = trim.as_ptr() as usize - src.as_str().as_ptr() as usize;
    let trim_len = trim.len();
    let mut src2 = src.clone();
    let mem = unsafe { src2.as_mut_vec() };
    b.iter(|| {
        test::black_box(src.trim());
        test::black_box(unsafe {
            ptr::copy(
                ((mem.as_ptr() as usize) + start_off) as *const u8,
                mem.as_mut_ptr(),
                trim_len,
            )
        });
    });
}

Results:

running 2 tests
test bench_alloc_trim   ... bench:          51 ns/iter (+/- 2)
test bench_inplace_trim ... bench:          32 ns/iter (+/- 1)
7 Likes

I feel we should have additional trim*() methods that avoid reallocation in cases where it's possible do so (which will be when the trimming is happening at the trailing end of the buffer).

1 Like

You can write s.truncate(s.trim_end().len()) when you want to truncate the original string in place.

12 Likes

This is how in-place trimming could be implemented:

In-place (allocation-less) String trimming

trait TrimInPlace { fn trim_in_place (self: &'_ mut Self); }
impl TrimInPlace for String {
    fn trim_in_place (self: &'_ mut Self)
    {
        let (start, len): (*const u8, usize) = {
            let self_trimmed: &str = self.trim();
            (self_trimmed.as_ptr(), self_trimmed.len())
        };
        unsafe {
            core::ptr::copy(
                start,
                self.as_bytes_mut().as_mut_ptr(), // no str::as_mut_ptr() in std ...
                len,
            );
        }
        // EDIT: this is bugged since it does not behave as `set_len`, see posts below.
        self.truncate(len); // no String::set_len() in std ...
    }
}
  • EDIT: :warning: usage of .truncate() is bugged, see below
Playground

You can check it works as intended with the Playground

Benchmarks

running 4 tests
test bench_alloc_trim       ... bench:          37 ns/iter (+/- 1)
test bench_inplace_trim     ... bench:           9 ns/iter (+/- 0)

test bench_alloc_trim_big   ... bench:         147 ns/iter (+/- 4)
test bench_inplace_trim_big ... bench:          67 ns/iter (+/- 2)
Benchmarking code
#![feature(test)] extern crate test; #[cfg(test)] use test::Bencher;

use ::iter_python::{iter, Join}; // iter-python = "0.9.2"

const SRC: fn() -> String = || {
    test::black_box("  AAA BBB CCC DD       ".into())
};

const SRC_BIG: fn() -> String = || {
    test::black_box(format!(
        "       {}         ",
        " ".join(iter!("AAAA" for _ in 0 .. 1000))

    ))
};

#[bench]
fn bench_alloc_trim (b: &'_ mut Bencher)
{
    let mut src = SRC();
    b.iter(|| {
        src = src.trim().to_owned();
        test::black_box(&src);
    });
}

#[bench]
fn bench_inplace_trim (b: &'_ mut Bencher)
{
    let mut src = SRC();
    b.iter(|| {
        src.trim_in_place();
        test::black_box(&src);
    });
}


#[bench]
fn bench_alloc_trim_big (b: &'_ mut Bencher)
{
    let mut src = SRC_BIG();
    b.iter(|| {
        src = src.trim().to_owned();
        test::black_box(&src);
    });
}

#[bench]
fn bench_inplace_trim_big (b: &'_ mut Bencher)
{
    let mut src = SRC_BIG();
    b.iter(|| {
        src.trim_in_place();
        test::black_box(&src);
    });
}

Conclusion

  • The feature is definitely justified given the performance gain and the fact that the implementation requires unsafe-ly memmove-ing bytes around;

  • The same implementation logic could be applied to other (potentially-)left-trimming functions in str;

    • (since for right-only-trimming functions truncate suffices (i.e., no memmove required), the *_in_place variants are less needed)
  • This would require marking str's trimming functions with #[must_use] (not the case in current Rust!), in order to lint against calling the wrong function;

12 Likes

This benchmark looks like it is just measuring the performance overhead of cloning the String,since src is mutated in every iteration it will be trimmed only on the first iteration.

Here is the source of test::iter.

Also,here is a version of trim_in_place that does not use any unsafe code:

use std::iter;
use itertools::Itertools;

    
fn trim_in_place(this:&mut String){
    let trimmed: &str = this.trim();
    let trim_start=trimmed.as_ptr()as usize-this.as_ptr()as usize;
    let trim_len  =trimmed.len();
    if trim_start!=0{
        this.drain(..trim_start);
    }
    this.truncate(trim_len);
}

fn main(){
    let a=" hello\n\n";
    let b="    hello   world";
    let c="hello   world    ";
    let strings=vec![a,b,c];
    
    for string in strings {
        let mut owned=string.to_string();
        trim_in_place(&mut owned);
        assert_eq!( &*owned,string.trim() );
    }
}

This approach can be generalized to any function which returns a slice of the original str:

use std::cmp::min;

fn sliced<F>(this:&mut String,f:F)
where F:FnOnce(&str)->&str
{
    let original_len=this.len();
    let new_slice: &str = f(this);
    let start=(new_slice.as_ptr()as usize).wrapping_sub(this.as_ptr()as usize);
    if original_len < start {
        this.clear();
        return;
    }
    
    let len  =new_slice.len();
    if start!=0{
        this.drain(..start);
    }
    this.truncate(len);
}

fn main(){

    {
        let mut a=" hello\n\n".to_string();
        sliced(&mut a,|x| x.split("e").nth(1).unwrap().trim() );
        assert_eq!( &*a,"llo" );
    }
    
    {
        let mut b="    hello   world".to_string();
        sliced(&mut b,|x| x.split_whitespace().next().unwrap().trim() );
        assert_eq!( &*b,"hello" );
    }
    
    {
        let mut c="hello   world    ".to_string();
        sliced(&mut c,|x| x.split_whitespace().nth(1).unwrap().trim() );
        assert_eq!( &*c,"world" );
    }

    {
        let mut buffer="hello   world    ".to_string();
        sliced(&mut buffer,|_| "hello" );
        assert_eq!( &*buffer,"" );
    }

    {
        let mut buffer="hello   world    ".to_string();
        sliced(&mut buffer,|_| "world" );
        assert_eq!( &*buffer,"" );
    }
}
4 Likes

Well spotted, I will adjust the test to do the actual trimming each time.

Regarding the neat .drain(.. start) usage, do you know if it compiles to the same code in release?

2 Likes

They don't compile to the same assembly,my version produces more assembly code,with a few more jumps.

2 Likes

Good to know, then maybe for a std function and given the simple semantics of the op (one memmove), I think we can afford using unsafe (although I will keep the .drain trick in mind for future inlined uses of similar patterns).

I love the sliced generalisation, since by renaming it to in_place and using self instead of this, we can write:

let mut s = String::from("  Hello, World! \n");
s.in_place(str::trim); // so functional, much elegance
assert_eq!(s, "Hello, World!");

Note that both using memmove and a generic in_place would require guarding the unsafe ptr::copying with a assert!(new_slice.len() <= self.len()), in order to be able for in_place() to be sound (we could thus also consider adding an unsafe _unchecked variant).

3 Likes

Asymptotically equally expensive even (i.e. O(n)), reducing any performance delta to a constant factor.
However, I disagree with your "memory allocation is fast" assessment (though I suppose it depends on your frame of reference).
If it were fast, we wouldn't need a stack as it exists now, we could just simulate one with the heap.
I my experience, memory (de)allocation is one of the most expensive parts of real world programs.

2 Likes

:thread: Unlocked thread to add important note: the version using core::ptr::copy + truncate has a bug and will sometimes panic on non-ASCII text.

The bug occurs because self.truncate(len) checks if you're cutting on a UTF-8 character boundary. It does this by checking the byte at [len], which is outside the range you just copied, and so could fall in the middle of a multi-byte UTF-8 sequence. (The self.drain(..trim_start) solution doesn't have this issue, drain handles this stuff.)

For example, try running it on the string " aé" (playground), which is the byte sequence [20, 61, c3, a9]. The [c3, a9] is the 2-byte code point U+00E9 (é). The trimmed length is 3 bytes, core::ptr::copy makes it [61, c3, a9, a9]. The second 0xa9 at [3] is not a valid UTF-8 character on its own, it's a continuation. So truncate(3) will see it and panic.

Using std::string::String

Firstly, the drain(..trim_start) solution is excellent, and does not have this problem. And is has no unsafe. It will be slightly slower by a few nanoseconds, and if you are trimming many, many tiny strings, that may add up since 2ns is 20% of 10ns. For big strings it will be negligible, because drain uses core::ptr::copy AKA libc::memmove and the bulk of either approach will be in that highly optimised routine.

The absolute fastest way is to avoid truncate's UTF-8 boundary check entirely, using String::as_mut_vec + Vec::set_len. That's what the trim-in-place crate does, so you can just use that.

Here (playground) is a tiny version if you don't like dependencies. It is slightly different in that it returns orig_len - trimmed_len instead of self.as_str(). This enables easy checking of whether the trimming actually did anything (string.trim_start_in_place() != 0), and is a better choice than the original IMO because you already have very easy access to the trimmed string slice if you want it.

What if I'm not using std::string::String?

if you are working with a possibly-inline string, like the smartstring crate or any of the similar ones out there, then you may not have String::as_mut_vec available. The reason std's String has as_mut_vec is because String is just a newtype wrapper for Vec that checks UTF-8 in some places. Some inline or sometimes-inline strings cannot give you such an API, and a short survey (smartstring no, inlinable_string no, smallstr yes, tinyvec_string yes and no, ascii no, bstr no) reveals most don't. All of them have a truncate API.

So the goal is to trick truncate into not panicking. In a technical sense, there could be up to 3 bytes of invalid UTF-8 adjacent to the final trimmed area. One way to definitively correct the UTF-8 for sure is to blast zeroes up until the end of the string, but that's a bit of a waste. A better way is to write a single valid UTF-8 character after the copied bytes, exactly at the spot truncate() is about to check. This is a bit of hack, but truncate does assume the whole string is valid UTF-8, so it will only ever need to check if you're on a char boundary, not whether the whole string is UTF-8. One byte is therefore sufficient.

The result looks like this (playground).

7 Likes

If you're going to copy-paste something, I'd suggest using a simple safe solution instead of messing about with ptr::copy.

For example, one can stay with the normal String API to not have to do anything complex at all:

pub fn trim_end_in_place(s: &mut String) {
    let trimmed = s.trim_end();
    s.truncate(trimmed.len());
}

pub fn trim_start_in_place(s: &mut String) {
    let trimmed = s.trim_start();
    s.replace_range(..(s.len() - trimmed.len()), "");
}

pub fn trim_in_place(s: &mut String) {
    trim_end_in_place(s);
    trim_start_in_place(s);
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f38633568291502b397dc74406f21409

After all, trim_end, trim_start, truncate, and replace_range are all char_boundary-aware.

And replace_range like that will do the memmove for you (https://rust.godbolt.org/z/fzbvrdEMz), so you get simpler code and safe code, for the cost of maybe one extra is_char_boundary check. I'd take that.

(Though I wouldn't mind seeing an ACP to see whether libs-api would be willing to just add in-place versions to alloc.)

7 Likes

Yes, I agree. The drain example from above doesn't show a memmove in Godbolt, but it does call alloc::string::Drain's Drop implementation, which is just alloc::vec::Drain's Drop implementation, which in turn is just memmove. Maybe rustc can see through replace_range a bit better.

The main issue for a change to std would be trying to implement trim_matches_in_place & friends. It's easy to implement on str because there's no mutable borrowing going on. The lifetimes are just not at all forgiving once you introduce &mut self and start trying to borrow it to run the pattern, AND modify the string in the same function. The pattern trait has a lifetime Pattern<'a> that on str::trim_matches is unified with the input lifetime &'a str. Here is a starting point using for<'p> to avoid having the pattern lifetime related to the lifetime of the string, which will limit the kinds of borrowed patterns you can put in there.

1 Like

Good catch @cormacrelf! :pray:


Heh, got bitten by:

So, ironically, the fault in that code was not in its usage of unsafe, but in its lack of usage of some raw unsafe set_len function[1], and having delegated to the safe truncate. In general, this is indeed the important rule of thumb here: when writing unsafe code, try not to mix raw operations with high-level operations (and obviously if non-unsafe alternatives that have not been seen to be slower (benchmark required), then go for that).

In this case, for instance, I shouldn't have been using any str/String methods after the ptr::copy and until manual truncation, since the ptr::copy can break the UTF-8 well-formedness safety invariant of String, an invariant which these methods can perfectly legitimately rely on.


  1. for instance, I could have gone through the underlying Vec<u8> if I had wanted to operate on the raw bytes ↩︎

This topic was automatically closed after 5 days. We invite you to open a new topic if you have further questions or comments.