A little benchmark Strings vs SmartString with LTO

joaocarvalhoopen · November 10, 2021, 8:05pm

Hello,

I have made a simple benchmark that creates 1.000.000 small strings (“Mary had a little lamb!”) with the standard library String vs. the SmartString crate, and I applied the LTO (Link-Time Optimization) compilation flag to it. SmartString wins big: the CPU time dropped from 81 ms to 33 ms (2.5x faster).

The really nice thing is that, if you use LTO, a SmartString longer than 23 bytes, which lives on the heap, performs the same as a std String, and I consider that a really nice feature.
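For readers who want to try the idea without opening the repository, here is a minimal sketch of the std String half of such a benchmark. It is not the code from the repository: the helper name `build_and_search` is hypothetical, the timing uses plain `std::time::Instant` rather than a benchmarking harness, and `std::hint::black_box` is only there to discourage the compiler from removing the allocation.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Builds `n` owned copies of `text` and counts how many contain `needle`.
/// (`build_and_search` is a hypothetical helper, not taken from the repository.)
fn build_and_search(n: usize, text: &str, needle: &str) -> usize {
    let mut hits = 0;
    for _ in 0..n {
        // A non-empty std String needs one heap allocation per iteration.
        let s = black_box(String::from(text));
        if s.contains(needle) {
            hits += 1;
        }
    }
    hits
}

fn main() {
    let start = Instant::now();
    let hits = build_and_search(1_000_000, "Mary had a little lamb!", "lamb!");
    let elapsed = start.elapsed();
    println!("{hits} hits in {elapsed:?}");
}
```

To compare against SmartString, one would swap the `String::from` line for the crate's constructor. LTO itself is a Cargo setting (`lto = true` under `[profile.release]` in Cargo.toml) and needs a `--release` build.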

I have put the code on:

Rust benchmark String vs SmartString and LTO
https://github.com/joaocarvalhoopen/Rust_benchmark_String_vs_SmartString_and_LTO

Thank you,

Best regards,
João

joaocarvalhoopen · November 11, 2021, 8:48am

I would like to ask the forum the following question:

As I benchmarked in the previous post, creating 1.000.000 strings of 23 bytes (23 ASCII characters) and searching inside each string for the word “lamb!” is 2.5 times faster with SmartString (stack allocation instead of heap allocation), and for longer strings (heap allocation) it has exactly the same performance as the current default String. So why isn’t SmartString used as the default String algorithm instead of the current one?

The numbers are in the README.md file at the previous link, together with the code.

Thank you.

Best regards,
João

Michael-F-Bryan · November 11, 2021, 3:02pm

I'm not one of String's original authors, but I can think of some possible reasons for a String to always be a heap-allocated string.

Simplicity - this can't be stated enough. You want your core types to be maintainable and bug free, especially when they contain a lot of unsafe code
Unlike languages like C++, Rust has good support for dependencies so it's quite easy to use a more specialised string type if you need something better or different guarantees
Specifying that String will always use a small-string optimisation locks the standard library in and forces them to make guarantees about String's implementation/layout that they may not want to provide. You can't just say "this is an implementation detail that shouldn't be relied upon" because all abstractions are leaky

This thread on Hacker News might point you in the right direction. This comment stands out in particular...

IIRC, there are a variety of factors:

Conventions and prevailing idioms. In C++ copying strings is common, both unintentionally due to the implicit copy constructor and intentionally as a means of defensive programming. In Rust, copying strings is relatively uncommon: Rust has no copy constructors (copying must be done explicitly with the .clone() method), and the borrow checker obviates the need for defensive copies and therefore passing string slices is overwhelmingly preferred (and recall that string slices in Rust (&str) are 16 bytes, in comparison to 24 bytes for C++'s std::string).

Alternative optimizations. For example, in Rust, initializing a Vec (and by extension String) does not perform a heap allocation for zero-length values. IOW, you can call String::new() (or even String::from("")) in a hot loop without worrying about incurring allocator traffic. Any hypothesis that short strings are more common than long strings will likely also hypothesize that the most common short string length is zero (and this appears to be borne out in practice; this optimization is really important for Vec, so important that, for Servo, it often makes Vec faster in practice than their own custom SmallVec type which strictly exists only on the stack), so this optimization goes a fair ways towards satisfying the "I actually do have many short strings and I'm not just copying around the same string a lot" use case.

Weighing trade-offs. The pros of SSO are better locality and less memory usage, with the cons of larger code size and additional branches on various operations.

Putting it all together, this gives the Rust devs reasonable incentive to take the conservative path of not implementing SSO for the default string type. That's not to say that we'd be incapable of finding Rust code that would benefit from SSO, or that C++ chose their default incorrectly (different contexts allow different conclusions), or that the Rust devs will always have this stance (if performance benefits of SSO were to be clearly demonstrated on Rust code in the wild in such a definitive way as to justify changing the default behavior, then I don't think they couldn't find a way to make it happen).
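Two of the points in the quoted comment are easy to check on your own machine. The sketch below prints the size of `&str` and `String` (16 and 24 bytes would be expected on a typical 64-bit target, but that is an assumption about the platform) and shows that an empty String reports zero capacity until the first push. The helper names are hypothetical.

```rust
use std::mem::size_of;

/// Capacity of a freshly created, empty `String` (hypothetical helper).
fn empty_string_capacity() -> usize {
    String::new().capacity()
}

/// Capacity after pushing one character, which forces the first allocation.
fn capacity_after_first_push() -> usize {
    let mut s = String::new();
    s.push('a');
    s.capacity()
}

fn main() {
    // &str is a pointer plus a length; String additionally stores a capacity.
    println!("size_of::<&str>()   = {} bytes", size_of::<&str>());
    println!("size_of::<String>() = {} bytes", size_of::<String>());

    // No buffer is reserved for an empty String.
    println!("empty capacity      = {}", empty_string_capacity());
    println!("capacity after push = {}", capacity_after_first_push());
}
```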

joaocarvalhoopen · November 11, 2021, 3:44pm

@Michael-F-Bryan

Thank you so much for your explanation, for citing the in-depth Hacker News comment, and for linking to the original thread.

Best regards,
João

Dushistov · November 11, 2021, 9:34pm

As far as I know, part of the problem is the `String::as_mut_vec` method (see String in std::string - Rust). The interface of the standard library should not be changed, and this method forces String to have a Vec<u8> inside.
Details here: `String::as_mut_vec` prevents small string optimization · Issue #20198 · rust-lang/rust · GitHub
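To make the constraint concrete, here is a small sketch of what `as_mut_vec` hands out: a mutable reference to the `Vec<u8>` behind the String, which only exists if String really is backed by a Vec. The helper `push_ascii_via_vec` is a hypothetical name used for illustration.

```rust
/// Appends one ASCII byte through the `Vec<u8>` that backs the `String`.
/// (`push_ascii_via_vec` is a hypothetical helper.)
fn push_ascii_via_vec(s: &mut String, byte: u8) {
    assert!(byte.is_ascii(), "only ASCII keeps this example trivially valid UTF-8");
    // SAFETY: appending a single ASCII byte leaves the buffer valid UTF-8.
    let v: &mut Vec<u8> = unsafe { s.as_mut_vec() };
    v.push(byte);
}

fn main() {
    let mut s = String::from("lamb");
    push_ascii_via_vec(&mut s, b'!');
    println!("{s}");
}
```

Because the return type is `&mut Vec<u8>`, an inline representation would have no Vec to return a reference to, which is the concern raised in the linked issue.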

H2CO3 · November 11, 2021, 9:56pm

Another crucial point is that with SSO, pointers to the backing buffer are changing all the time. That is very hard to account for correctly in unsafe code. People know that allocating a String is expensive and they will avoid it statically if possible.
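One way to see the pointer issue is to compare what happens to the buffer address when a value is moved. This is only a sketch: `InlineStr` is a hypothetical toy type, not SmartString's real layout, and moving into a `Box` is used simply because it guarantees the value ends up at a different address.

```rust
/// Toy inline string: the bytes live inside the value itself (hypothetical type).
struct InlineStr {
    buf: [u8; 23],
    len: usize,
}

impl InlineStr {
    fn new(text: &str) -> Self {
        assert!(text.len() <= 23, "toy type only stores up to 23 bytes");
        let mut buf = [0u8; 23];
        buf[..text.len()].copy_from_slice(text.as_bytes());
        InlineStr { buf, len: text.len() }
    }

    fn as_bytes(&self) -> &[u8] {
        &self.buf[..self.len]
    }
}

/// True if the buffer address of a std String is unchanged after a move.
fn string_ptr_survives_move(text: &str) -> bool {
    let s = String::from(text);
    let before = s.as_ptr();
    let boxed = Box::new(s); // moves pointer, capacity and length; the buffer stays put
    before == boxed.as_ptr()
}

/// True if the buffer address of the toy inline string is unchanged after a move.
fn inline_ptr_survives_move(text: &str) -> bool {
    let s = InlineStr::new(text);
    let before = s.as_bytes().as_ptr();
    let boxed = Box::new(s); // moves the bytes themselves into the new allocation
    before == boxed.as_bytes().as_ptr()
}

fn main() {
    let text = "Mary had a little lamb!";
    println!("String buffer stable across move: {}", string_ptr_survives_move(text));
    println!("inline buffer stable across move: {}", inline_ptr_survives_move(text));
}
```

Unsafe code that caches a raw pointer into a String's buffer can rely on it across moves of the String; with an inline representation that assumption no longer holds.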

joaocarvalhoopen · November 11, 2021, 10:23pm

@Dushistov

Thank you. The method you pointed out and the discussion in that issue thread, which started in Dec 2014, are really interesting, long and in depth. I will read more of it, but from what I have read so far, it is as you said: exposing the internal machinery of String (the Vec) to the outside as a mutable reference means the implementation detail does not merely leak, it is out in the open, and anyone can make external changes to the String's state that depend on its inner workings. As was discussed in the thread back in 2014, before the method was stabilized, this decision would make any future change to the String implementation impossible.

@H2CO3
I haven't looked at the internal code of SmartString, but a pointer should only change back to a stack allocation if you call clear(), or if you store a smaller string and call something like a shrink_to_size() kind of method, because otherwise the capacity is already allocated.

Best regards,
João

chrefr · November 12, 2021, 12:01am

I would say this benchmark is a very atypical usage of strings: you usually don't create X strings of the same length and the same content and perform the same operation on them. For this usage, SmallString really shines. It will not be worse than a normal String in the case of a heap allocation (big string), and it will be much faster with stack-allocated strings.

In the general case, if you have a string that is usually short, and doesn't change after creation, SmallString is very good. Even if strings were immutable, though, I wouldn't want std to perform a small-string optimization since there is a perf penalty, and I just don't want to have it in case my strings are anyway going to be long.

But it is much worse when you change the string. The storage may change, which will hurt branch prediction, sometimes rendering it completely useless or even harmful. Copying the string twice also has some performance penalty.
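A toy version of a small-string type makes both costs visible. The sketch below is an illustration under simplifying assumptions, not how SmartString or SmallString are actually implemented: `ToySso` and `INLINE_CAP` are hypothetical names, the inline capacity of 23 bytes just mirrors the number discussed in this thread, and the type never moves back from heap to inline storage.

```rust
const INLINE_CAP: usize = 23;

/// Toy small-string type with two storage modes (hypothetical, for illustration).
enum ToySso {
    Inline { buf: [u8; INLINE_CAP], len: usize },
    Heap(String),
}

impl ToySso {
    fn new() -> Self {
        ToySso::Inline { buf: [0; INLINE_CAP], len: 0 }
    }

    fn is_inline(&self) -> bool {
        matches!(self, ToySso::Inline { .. })
    }

    /// Every read has to branch on the current storage mode.
    fn as_str(&self) -> &str {
        match self {
            ToySso::Inline { buf, len } => {
                std::str::from_utf8(&buf[..*len]).expect("only whole &str values are appended")
            }
            ToySso::Heap(s) => s.as_str(),
        }
    }

    /// Appending may switch the storage from inline to heap.
    fn push_str(&mut self, extra: &str) {
        match self {
            ToySso::Heap(s) => s.push_str(extra),
            ToySso::Inline { buf, len } => {
                let new_len = *len + extra.len();
                if new_len <= INLINE_CAP {
                    buf[*len..new_len].copy_from_slice(extra.as_bytes());
                    *len = new_len;
                } else {
                    // Spill: copy the inline bytes into a new heap buffer.
                    let mut s = String::with_capacity(new_len);
                    s.push_str(
                        std::str::from_utf8(&buf[..*len])
                            .expect("only whole &str values are appended"),
                    );
                    s.push_str(extra);
                    *self = ToySso::Heap(s);
                }
            }
        }
    }
}

fn main() {
    let mut s = ToySso::new();
    s.push_str("Mary had a little lamb!");
    println!("{:?} inline: {}", s.as_str(), s.is_inline());
    s.push_str("!");
    println!("{:?} inline: {}", s.as_str(), s.is_inline());
}
```

The `match` in `as_str` is the per-operation branch, and the spill in `push_str` is the point where the storage changes and the existing bytes are copied an extra time.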

And BTW, I'm not sure we can compare C++'s std::string to SmallString. I haven't tested that, but since C++ supports self-referencing classes (with move constructors), you can have a pointer that references the active buffer - eliminating the cost of branching for each operation (at the price of +8 bytes).

joaocarvalhoopen · November 12, 2021, 10:19am

Hello,

Thank you all for your in-depth input.

To satisfy my curiosity, I have now added tests for SmallString and SmallStr to the repository linked in the first post.

In my small tests, for small strings, SmartString and SmallStr come out as clear winners with LTO, although I cannot say whether it is a representative benchmark.

Best regards,
João

system · February 10, 2022, 10:20am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
