I recently watched https://www.youtube.com/watch?v=A4cKi7PTJSs and I keep thinking about it. It seems 75% of my String use should have been Arc<str> all along. I think I remember seeing people discuss this in the past, but I can't find it now, and life goes on; I just keep using String out of habit.
Am I missing something? It seems that if you're not going to mutate the string, it's generally (not always, but generally) preferable to go with Arc<str>.
Would it be a good idea to have Arc<str> aliased/newtyped in the stdlib? It seems the biggest problems with Arc<str> are that it is not well known, doesn't have a semantic name, and is harder to type.
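To make the question concrete, here is a minimal sketch of what cloning an Arc<str> does. The helper names (shares_allocation, fan_out) are made up for illustration and are not from the video.

```rust
use std::sync::Arc;

/// Returns true when both handles point at the same heap allocation.
pub fn shares_allocation(a: &Arc<str>, b: &Arc<str>) -> bool {
    Arc::ptr_eq(a, b)
}

/// Clones an `Arc<str>` `n` times. Each clone only bumps the atomic
/// reference count; the string bytes are never copied.
pub fn fan_out(s: &Arc<str>, n: usize) -> Vec<Arc<str>> {
    (0..n).map(|_| Arc::clone(s)).collect()
}
```

By contrast, String::clone allocates a new buffer and copies the bytes every time.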
For the most part, it doesn't matter. String is a well-supported, easy-to-use type without the shenanigans of multiple ownership or needless atomicity. It's fine to just use String as the default owned string type.
The performance difference between String and Box<str> is tiny enough that it doesn't need more discoverability.
Arc<str> would be the counterpart to Arc<String>, which should always replace it except maybe if you're unwrapping
The standard library has methods for converting between String and Box<str>, which is probably enough.
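For reference, a small sketch of those conversions. The function names (freeze, thaw, handle_sizes) are hypothetical wrappers; the std calls underneath are String::into_boxed_str and str::into_string.

```rust
use std::mem::size_of;

/// Freezes a `String` into a `Box<str>`, dropping any spare capacity.
pub fn freeze(s: String) -> Box<str> {
    s.into_boxed_str()
}

/// Turns a `Box<str>` back into a growable `String` without copying the bytes.
pub fn thaw(b: Box<str>) -> String {
    b.into_string()
}

/// Handle sizes in bytes, as (String, Box<str>).
pub fn handle_sizes() -> (usize, usize) {
    (size_of::<String>(), size_of::<Box<str>>())
}
```

On current implementations the only saving is one machine word per handle (pointer + length instead of pointer + length + capacity), which fits the "tiny difference" point above.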
Nope. Without benchmarks it's really hard to say whether String or compact_str is better.
Naïvely it feels as if Arc<str> should be able to easily beat the alternatives. And that was even true, many, many years ago, when CPUs had one core and the LOCK prefix was cheap.
But I know how Google jumped from gcc 2.95 to gcc 4.2 specifically to avoid Arc<str>. Well… what they actually wanted to avoid was the std::string provided by gcc 3 (and they switched to the gcc 5 std::string, which was called __gnu_cxx::__versa_string before gcc 5), but the fact remains: they saved quite a few million by using strings with cheap small-string optimization instead of expensive atomic refcounting.
Small-string-optimized strings win if you have lots of tiny strings which you copy around often, Arc<str> wins if your strings are long, while String is neither here nor there and makes your code a bit smaller instead.
Strings are hard; there is no silver bullet.
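To illustrate the small-string idea, here is a deliberately naive sketch. SmallStr and INLINE_CAP are made up for this post; real crates such as compact_str use a much more compact layout, so treat this only as a picture of the tradeoff.

```rust
/// Hypothetical inline capacity in bytes (an assumption, not a tuned value).
const INLINE_CAP: usize = 22;

/// Toy small-string type: short strings live inline, long ones on the heap.
#[derive(Clone)]
pub enum SmallStr {
    Inline { len: u8, buf: [u8; INLINE_CAP] },
    Heap(Box<str>),
}

impl SmallStr {
    pub fn new(s: &str) -> Self {
        if s.len() <= INLINE_CAP {
            let mut buf = [0u8; INLINE_CAP];
            buf[..s.len()].copy_from_slice(s.as_bytes());
            SmallStr::Inline { len: s.len() as u8, buf }
        } else {
            SmallStr::Heap(s.into())
        }
    }

    pub fn as_str(&self) -> &str {
        match self {
            // The bytes were copied from a valid `&str` in `new`,
            // so the UTF-8 check cannot fail here.
            SmallStr::Inline { len, buf } => std::str::from_utf8(&buf[..*len as usize])
                .expect("inline bytes are valid UTF-8"),
            SmallStr::Heap(b) => &**b,
        }
    }

    pub fn is_inline(&self) -> bool {
        matches!(self, SmallStr::Inline { .. })
    }
}
```

Cloning the inline variant is a plain byte copy with no allocation and no atomic operation, which is where the win for lots of tiny strings comes from.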
Have you considered using Cow<'a, str>? It's the more lenient option if you prefer borrows but may need to mutate it as a String. It's also still Send + Sync, if that matters for your use case.
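A minimal sketch of the Cow pattern, using a made-up function (expand_tabs) that only allocates when it actually has to change the input:

```rust
use std::borrow::Cow;

/// Replaces tabs with four spaces, allocating only when the input
/// actually contains a tab.
pub fn expand_tabs(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        Cow::Owned(input.replace('\t', "    "))
    } else {
        Cow::Borrowed(input)
    }
}
```

Callers can read the result through Deref as a &str, or call into_owned() / to_mut() if they need a String.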
After I got some responses I realized there are a lot of tradeoffs to consider.
One is small-string optimization: it's probably always good to have, but requires bringing in a custom implementation.
Another aspect is memory use. Arc<str> will reuse the string, which might be worthwhile if you have a lot of copies that you keep for a longish time, especially due to cache use.
Another is the cost of cloning and dropping: atomic counter vs. extra allocation. My bet would be that Arc<str> will win anyway, but it's just a guess.
Performance-wise I would think that "custom with small-string opt." is best, Arc<str> is going to be second, and
You can't make an Arc<str> from a String (or format!) without an extra copy, because Arc stores two counters in front of the data, and probably not without an extra allocation either, unless you're careful to format into an SSO or stack-based string type first.
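A sketch of the straightforward path being described, with a hypothetical helper (shared_label):

```rust
use std::sync::Arc;

/// Builds a shared label from formatted text.
pub fn shared_label(id: u32) -> Arc<str> {
    // `format!` allocates a `String` first.
    let s: String = format!("item-{id}");
    // `Arc::from` then allocates a second block that holds the reference
    // counts plus the bytes, copies the text into it, and frees the `String`.
    Arc::from(s)
}
```

So building the value costs two allocations and one copy; only the later clones are cheap.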
I'm trying to use Box<str> wherever I don't need to grow the string (e.g. Error fields), but it's a PITA to use, since Rust lacks things like &Box<str>, so it ends up with ugly syntax like &**s == "wtf".
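For anyone who hasn't hit this, a small sketch of the reborrowing involved. ParseError and both functions are made up for illustration.

```rust
/// Hypothetical error type with a non-growable message field.
pub struct ParseError {
    pub message: Box<str>,
}

pub fn is_eof(err: &ParseError) -> bool {
    // `err.message` is a `Box<str>`; `&*` reborrows it as a plain `&str`.
    &*err.message == "unexpected end of input"
}

pub fn matches_wtf(s: &Box<str>) -> bool {
    // One `*` strips the reference, the second strips the Box,
    // and `&` turns the resulting `str` back into a `&str`.
    &**s == "wtf"
}
```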
A String type with small-string optimization is neat if you need to return small formatted strings.
But if you need to use a lot of strings and copy them around, you probably want string interning. 4-byte Copy "strings" beat any other string type. There's also the special magic ustr crate for interning global identifiers and hashmap keys.
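A rough sketch of what such an interner can look like. Symbol and Interner are hypothetical names, and this is not how ustr is implemented; it just shows the 4-byte Copy handle idea.

```rust
use std::collections::HashMap;

/// A 4-byte, `Copy` handle standing in for an interned string.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Symbol(u32);

/// Naive interner: stores every distinct string twice (once as a map key,
/// once in the lookup table) to keep the sketch simple.
#[derive(Default)]
pub struct Interner {
    ids: HashMap<Box<str>, Symbol>,
    names: Vec<Box<str>>,
}

impl Interner {
    /// Returns the existing symbol for `s`, or registers a new one.
    pub fn intern(&mut self, s: &str) -> Symbol {
        if let Some(&sym) = self.ids.get(s) {
            return sym;
        }
        let sym = Symbol(self.names.len() as u32);
        self.names.push(s.into());
        self.ids.insert(s.into(), sym);
        sym
    }

    /// Looks the text back up; panics if `sym` came from another interner.
    pub fn resolve(&self, sym: Symbol) -> &str {
        &self.names[sym.0 as usize]
    }
}
```

Comparing and hashing symbols is then an integer operation, independent of string length.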
I have tried to use Box<[T]> in place of Vec<T> in similar contexts (when I don't need to change the length of the collection) and found myself hampered by the lack of IntoIterator<Item = T>.
Vec::from(box).into_iter() is zero-cost.
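A minimal sketch of that workaround, with a made-up function (shout_all) that consumes a boxed slice by value:

```rust
/// Consumes a boxed slice and yields its elements by value.
pub fn shout_all(items: Box<[String]>) -> Vec<String> {
    // `Vec::from` takes over the boxed slice's allocation, so no elements
    // are copied before `into_iter` hands them out as owned `String`s.
    Vec::from(items)
        .into_iter()
        .map(|mut s| {
            s.push('!');
            s
        })
        .collect()
}
```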
String::into_boxed_str().leak() gives you a cheap, Copy, ready-to-use &'static str.
Just don't use it in a loop...
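Spelled out as a sketch (leak_str is a hypothetical helper; note that Box::leak is an associated function, so it is called as Box::leak(b) rather than b.leak()):

```rust
/// Leaks a `String` and returns a `&'static str` that can be copied freely.
/// The memory is never freed, so this only suits values created a bounded
/// number of times (e.g. at startup).
pub fn leak_str(s: String) -> &'static str {
    // `Box::leak` returns `&'static mut str`, which coerces to `&'static str`.
    Box::leak(s.into_boxed_str())
}
```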
Just FYI, the video creator chose to use Arc<[T]> throughout the video because it always works, but specifically mentions that Rc<[T]> should be used whenever no sending between threads is needed.
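A short sketch of that distinction, with made-up function names (local_share, sum_on_thread):

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

/// Single-threaded sharing: `Rc` uses a plain (non-atomic) counter.
pub fn local_share(data: &[u32]) -> (Rc<[u32]>, Rc<[u32]>) {
    let a: Rc<[u32]> = Rc::from(data);
    let b = Rc::clone(&a);
    (a, b)
}

/// Cross-thread sharing: `Arc` is `Send`, so a clone can move into a thread.
pub fn sum_on_thread(data: Arc<[u32]>) -> u32 {
    let worker = Arc::clone(&data);
    thread::spawn(move || worker.iter().sum::<u32>())
        .join()
        .expect("worker thread panicked")
}
```

Swapping Arc for Rc in sum_on_thread would be rejected by the compiler, since Rc is not Send.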