In Rust, str
is the most basic type for strings, but it's a bit complicated because str
itself is just the string data, without any indirection (a so-called "unsized" type). Compared with e.g. Java, Rust's type called String
is actually most similar to something like Java's StringBuilder
: a growable object that contains some string data and that can have some additional unused capacity it can grow into without re-allocating its internal buffer. This means that you can append to it rather efficiently. The str
type on the other hand can be used flexibly with all kinds of pointer types, e.g. Box<str>
or Arc<str>
or &str
. Java doesn't have this intricacy about ownership and memory management. Judging by the garbage-collected, immutable nature of Java's strings, I suppose they're indeed most similar to Arc<str>
, but many use cases of Java's string type, namely those that fit the idiom of "borrowing", would use &str
in Rust.
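To make this a bit more concrete, here is a minimal sketch of the same str data living behind different pointer types. The function names (char_count, build_greeting, pointer_flavours) are made up for illustration.

```rust
use std::sync::Arc;

/// Takes any borrowed string data; the caller keeps ownership.
fn char_count(s: &str) -> usize {
    s.chars().count()
}

/// Builds a greeting by appending to a growable `String`.
fn build_greeting(name: &str) -> String {
    // Reserve room up front so the appends below need not re-allocate.
    let mut buf = String::with_capacity("Hello, ".len() + name.len());
    buf.push_str("Hello, ");
    buf.push_str(name);
    buf
}

/// Puts the same `str` data behind three different pointer types.
fn pointer_flavours() -> (usize, usize, usize) {
    let owned: String = build_greeting("Rust");
    // Unique owner, fixed length.
    let boxed: Box<str> = owned.clone().into_boxed_str();
    // Shared, reference-counted owner.
    let shared: Arc<str> = Arc::from(owned.as_str());
    // Plain borrow of the data inside `owned`.
    let borrowed: &str = &owned;
    (char_count(&boxed), char_count(&shared), char_count(borrowed))
}
```

All three calls to char_count go through the same &str parameter; only the owner of the data differs.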
Some rules: Owned strings are String
or Box<str>
or Arc<str>
/Rc<str>
. The last ones are shared ownership (with reference counting) and thus immutable. Box<str>
is owned and can be mutated, but since it's not a growable buffer with extra capacity, you cannot change its length without having to copy all the data every time. This is why String
is most commonly used for owned strings. There are also sometimes situations where you're pretty much only working with string literals, so it's useful to know the type of those. The type of string literals is &'static str
(but not every &'static str
is a string literal), so if you're planning on only storing string literals in some structs or wherever, working with &'static str
can be a good and efficient choice.
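A small sketch of these rules, with a hypothetical Command struct and helper names that are not from anything in particular. It shows a struct that only stores literals, a &'static str that is not a literal (obtained by deliberately leaking a heap allocation), and in-place mutation of a Box<str> that keeps its length.

```rust
/// Hypothetical struct that only ever stores string literals.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Command {
    name: &'static str,
    help: &'static str,
}

const QUIT: Command = Command {
    name: "quit",
    help: "exit the program",
};

/// Produces a `&'static str` that is not a literal: the heap buffer is
/// leaked on purpose and is never freed.
fn leaked(s: String) -> &'static str {
    Box::leak(s.into_boxed_str())
}

/// A `Box<str>` can be mutated in place as long as its length stays the same.
fn shout(mut s: Box<str>) -> Box<str> {
    s.make_ascii_uppercase();
    s
}
```

Leaking is only reasonable for data that really should live until the program exits; it is shown here just to illustrate that 'static says nothing about literals.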
If you're working only or mostly with String
, then you'll sometimes want to pass the string as a borrow, as you do with any other type, e.g. use the &String
type. Using this type is an antipattern because it can be cheaply converted to &str
. The type &str
has one less level of indirection, and if you use it in your function arguments, the function can be called not only by borrowing a String
, but also by borrowing a Box<str>
or Arc<str>
or using a string literal or re-borrowing a &str
you got from somewhere else, etc. So in a sense &str
is "more general" than &String
which is why you should prefer it if possible. Noteworthy is that the conversion from &String
to &str
can happen implicitly. If you have some variable x: String
and a function fn f(arg: &str)
you can just call f(&x)
and it works fine! Regarding the conversion cost, the internal buffer in the String
type already contains properly encoded data corresponding to the str
type; internally a String
is a triple consisting of a pointer to the buffer, an integer for the length of the str
data in that buffer and another integer indicating the total capacity of the buffer which can be larger than the length. A reference/borrow of type &str
is just a pointer and some length information, so all the conversion does is temporarily forget about the unused capacity. It's kind of zero-cost if you will; just calling it "cheap" makes it seem more expensive than it is.
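Here is a rough sketch of the points above: one function taking &str being called with several kinds of strings, the borrowed &str pointing into the very same buffer as the String, and the sizes of the two types measured in machine words. The helper names other than f are invented for this example.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Same shape as the `f` in the text: it only needs to read the string data.
fn f(arg: &str) -> usize {
    arg.len()
}

/// Calls `f` with several kinds of string; each call ends up passing a `&str`.
fn call_with_everything() -> [usize; 4] {
    let x: String = String::from("owned");
    let b: Box<str> = "boxed".into();
    let a: Arc<str> = "shared".into();
    // `&x`, `&b` and `&a` are converted to `&str` implicitly.
    [f(&x), f(&b), f(&a), f("literal")]
}

/// The `&str` borrowed from a `String` points at the same buffer and has the
/// same length; only the capacity is "forgotten".
fn same_buffer() -> bool {
    let mut x = String::with_capacity(64);
    x.push_str("abc");
    let s: &str = &x;
    s.as_ptr() == x.as_ptr() && s.len() == x.len()
}

/// Sizes in machine words: pointer + length + capacity versus pointer + length.
fn sizes_in_words() -> (usize, usize) {
    let word = size_of::<usize>();
    (size_of::<String>() / word, size_of::<&str>() / word)
}
```

Had f taken &String instead, only the first of the four calls would compile without first building a new String.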
Mutable borrows, so &mut String
are common though. Note the difference to &mut str
: You can grow the string buffer through a &mut String
reference, but the size of a &mut str
can't be changed... well, you can split it into multiple parts and in a sense make it smaller, but you can't grow it in place.
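As a sketch of that difference (function names are made up): the first function needs &mut String because it grows the buffer, while the other two get by with &mut str since they only touch existing bytes or hand back a smaller piece.

```rust
/// Needs `&mut String`: appending may grow (and re-allocate) the buffer.
fn append_suffix(s: &mut String) {
    s.push_str("!");
}

/// `&mut str` is enough here: the bytes change, the length does not.
fn uppercase_in_place(s: &mut str) {
    s.make_ascii_uppercase();
}

/// Splitting yields smaller `&mut str` pieces, but none of them can grow.
/// Assumption: the midpoint falls on a character boundary (always true for
/// ASCII input); otherwise `split_at_mut` panics.
fn first_half(s: &mut str) -> &mut str {
    let mid = s.len() / 2;
    let (head, _tail) = s.split_at_mut(mid);
    head
}
```

A &mut String can be passed to the latter two functions as well, since it converts to &mut str implicitly.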
As others already mentioned the situation is very much related to Vec<T>
and slices. Vec<T>
corresponds to String
and [T]
corresponds to str
, so behind an indirection, it's types like &[T]
, Box<[T]>
, Arc<[T]>
, etc. &mut Vec<T>
is growable, &Vec<T>
is an antipattern, etc.
In fact String
is just a wrapper around Vec<u8>
and str
is a wrapper around [u8]
but both of them make sure to always contain valid UTF-8 data.
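The correspondence can be sketched roughly like this. The function names are illustrative; the standard library calls (into_bytes, String::from_utf8) are the usual way to go between the two worlds.

```rust
/// Slice parameter: accepts `&Vec<i32>`, arrays, sub-slices, and so on,
/// just like `&str` accepts many kinds of strings.
fn sum(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

/// `String` -> `Vec<u8>` just hands over the buffer, because every `String`
/// is already valid UTF-8.
fn to_bytes(s: String) -> Vec<u8> {
    s.into_bytes()
}

/// `Vec<u8>` -> `String` has to check the bytes and can fail.
fn try_to_string(bytes: Vec<u8>) -> Option<String> {
    String::from_utf8(bytes).ok()
}
```

The validity check in the second direction is what upholds the "always valid UTF-8" guarantee.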
TL;DR Compared to Java: Rust's String
is like Java's StringBuilder
, Java's String
is like Arc<str>
in Rust if you're talking about strings that can be garbage-collected at runtime, or like &'static str
if you're talking about string literals (or other strings that will be valid for the duration of the whole program run), or they are like &str
if you're in a setting that fits borrowing in Rust so that you don't need any garbage collection at all. So the reason for the distinction of String
vs anything-involving str
is very similar to the distinction of StringBuilder
and String
in Java, and the different versions of types involving str
are due to Rust's memory management.
Unlike Java,
where calling a function expecting String
means you'll eventually have to convert your StringBuilder
into a String
(which has a cost of copying the data), in Rust, if the function expects &str
, you can leave your data in the buffer it was built in and just pass a pointer/reference into it. A nice win.
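A minimal sketch of that win, with hypothetical function names: the string is built up in a String, borrowed as &str for a read-only function, and then appended to again, all without copying the data.

```rust
/// Read-only consumer: only needs a `&str`.
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Builds a string, lends it out, then keeps appending to the same buffer.
fn build_and_inspect() -> (usize, usize) {
    let mut buf = String::new();
    for w in ["one", "two", "three"] {
        if !buf.is_empty() {
            buf.push(' ');
        }
        buf.push_str(w);
    }
    // Pass a reference into the buffer; no conversion or copy is needed.
    let before = word_count(&buf);
    // The buffer is still ours and still growable afterwards.
    buf.push_str(" four");
    (before, word_count(&buf))
}
```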
Side-note: Taking garbage-collection more seriously, StringBuilder
would be Rc<RefCell<String>>
and StringBuffer
would be more like Arc<Mutex<String>>
or maybe Arc<RwLock<String>>
.
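For illustration only, here is roughly what working with those two combinations looks like. This is a sketch of the Rust side, not a claim about how the Java classes are implemented, and the function names are made up.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Shared, mutable, single-threaded string buffer.
fn shared_single_threaded() -> String {
    let a = Rc::new(RefCell::new(String::from("a")));
    let b = Rc::clone(&a); // second handle to the same buffer
    b.borrow_mut().push_str("b");
    let result = a.borrow().clone();
    result
}

/// Shared, mutable string buffer that several threads append to.
fn shared_across_threads() -> usize {
    let buf = Arc::new(Mutex::new(String::new()));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let buf = Arc::clone(&buf);
            thread::spawn(move || {
                // The lock guards every access to the shared `String`.
                buf.lock().unwrap().push('x');
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let len = buf.lock().unwrap().len();
    len
}
```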
Edit: Another thought on Box<str>
: A Box<str>
can be cheaply converted into a String
. Converting the other way is only cheap if the String
had no unused extra capacity. In this sense a function expecting String
can be called with a Box<str>
argument by converting it first (though this has to be an explicit conversion, e.g. using .into()
), so String
is the "more general" and thus preferable type for such arguments. Really, the only disadvantage of using String
over Box<str>
is that you might end up having a string with lots of unused extra capacity. But if that ever is a problem, you can also modify the String
in-place with something like .shrink_to(capacity)
or .shrink_to_fit()
.
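A short sketch of these conversions, with invented helper names. The expectation that shrink_to_fit brings the capacity well below the original 1024 is what I'd assume from the standard allocator in practice; the documentation only promises that the capacity stays at least as large as the length.

```rust
/// A function that wants an owned, growable string.
fn takes_string(s: String) -> usize {
    s.len()
}

/// `Box<str>` -> `String`: explicit, but it just reuses the allocation.
fn call_with_box(b: Box<str>) -> usize {
    takes_string(b.into())
}

/// `String` -> `Box<str>`: drops any excess capacity first, which may
/// involve re-allocating.
fn string_to_box(s: String) -> Box<str> {
    s.into_boxed_str()
}

/// Shows getting rid of unused capacity in place.
fn trimmed_capacity() -> (bool, bool) {
    let mut s = String::with_capacity(1024);
    s.push_str("short");
    let roomy_before = s.capacity() >= 1024;
    s.shrink_to_fit();
    let smaller_after = s.capacity() < 1024;
    (roomy_before, smaller_after)
}
```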
While &String
is an antipattern, Arc<String>
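is something you might encounter. Using it avoids paying the conversion cost of String
to Arc<str>
, which is not "cheap": as far as I know, it always copies the string data into a new allocation that also holds the reference counts, whereas for Box<str> a copy is only needed if the String
had unused excess capacity.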
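To sketch the trade-off (helper names are made up): wrapping a String in an Arc keeps the existing text buffer as it is, at the price of one more indirection, while converting to Arc<str> produces a separate allocation for the text.

```rust
use std::sync::Arc;

/// Wraps the `String` as-is; the text buffer itself is not touched.
fn wrap(s: String) -> Arc<String> {
    Arc::new(s)
}

/// Converts to `Arc<str>`; the text ends up in the `Arc`'s own allocation.
fn convert(s: String) -> Arc<str> {
    Arc::from(s)
}

/// Checks that wrapping in `Arc<String>` leaves the text buffer in place.
fn wrap_keeps_buffer() -> bool {
    let s = String::from("shared text");
    let original_ptr = s.as_ptr();
    let shared = wrap(s);
    shared.as_str().as_ptr() == original_ptr
}

/// Either form can be passed to a function expecting `&str`.
fn len_of(s: &str) -> usize {
    s.len()
}
```

Both Arc<String> and Arc<str> clone cheaply afterwards, since cloning only bumps the reference count.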
While I'm at it throwing around string types, I'll also mention that Cow<'_, str>
is a thing, but I'm not going to explain what it does in this comment.