What's the difference between &String and &str?


#1

Hello, I’m a bit confused by why we need both String and &str.

I understand that String is an owned string while &str represents a borrowed string.

I also read that I can pass &String to a function that wants an &str and it will get coerced, but it’s not clear why they are different types. What’s the difference between the two?

Is str alone (without being a reference) used anywhere or is it something that’s only ever created and owned by the compiler?

Finally, if I write a function that takes a string reference, what should its type be? I feel that the answer is &str, but I’m not entirely sure why.


#2

Check out Chapter 4 of the second edition of the Rust book:

https://doc.rust-lang.org/nightly/book/second-edition/ch04-00-understanding-ownership.html

It goes into all of this in a lot of detail, including diagrams.

(This hasn’t hit stable yet, so it’s totally reasonable to have missed it!)

This is the one part that isn’t covered there. str on its own is an “unsized” or “dynamically sized” type; you can’t ever create str values directly. They must be created behind some kind of pointer: &str, Box<str>, Rc<str>, that kind of thing.
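
To make the “behind some kind of pointer” point concrete, here is a minimal sketch. The helper byte_len and the variable names are made up for illustration; the conversions are the standard From impls.

```rust
use std::rc::Rc;

// Hypothetical helper: works on any `str`, whatever pointer type holds it.
fn byte_len(s: &str) -> usize {
    s.len()
}

fn main() {
    let borrowed: &str = "hello";             // borrowed view of static data
    let boxed: Box<str> = Box::from("hello"); // owned, on the heap, not growable
    let shared: Rc<str> = Rc::from("hello");  // reference counted, on the heap

    // `&Box<str>` and `&Rc<str>` both coerce to `&str`.
    println!("{} {} {}", byte_len(borrowed), byte_len(&boxed), byte_len(&shared));

    // let bare: str = *borrowed; // rejected: the size of `str` is not known at compile time
}
```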


#3

Basically a String wraps and manages a dynamically allocated str as backing storage.
Since str cannot be resized, String will dynamically allocate/deallocate memory.

A &str is thus a reference directly into the backing storage of the String, while &String is a reference to the “wrapper” object.
Additionally, a &str can refer to a substring, i.e. it is a slice. A &String always references the whole string.
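
This is also the practical reason a function parameter is usually written as &str. A small sketch, where first_word is a made-up example function:

```rust
// Taking `&str` lets callers pass a literal, a whole String, or a substring.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

fn main() {
    let owned = String::from("hello world");

    let a = first_word(&owned);         // &String coerces to &str
    let b = first_word(&owned[6..]);    // a slice of part of the String
    let c = first_word("literal text"); // a &'static str

    println!("{} {} {}", a, b, c);
}
```

Had the parameter been &String, the second and third calls would not compile without first allocating a new String.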


#4

For C programmers:

struct str {
   char text[]; // unknown number of bytes, which is why `str` as such is not used
};

struct &str { // pseudo-C: `&str` is not a valid C identifier
   size_t length;
   const char *text; // not owned
};

struct String {
   size_t length;
   size_t capacity; // total allocated bytes, not only the unused part
   char *text; // malloc()'ed
};
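
The same picture can be checked from the Rust side. This is only a sketch (layout_demo is a made-up name): it shows that a String tracks a length and a capacity, while a &str borrowed from it points at the very same bytes. Note that String::capacity() reports the total allocation, not the remaining free space.

```rust
// Hypothetical helper returning (length, capacity, whether the &str shares the buffer).
fn layout_demo() -> (usize, usize, bool) {
    let mut owned = String::with_capacity(16);
    owned.push_str("hello");

    // Borrowing a &str does not copy anything; it points into the String's buffer.
    let view: &str = &owned;
    let same_buffer = view.as_ptr() == owned.as_ptr();

    (owned.len(), owned.capacity(), same_buffer)
}

fn main() {
    let (len, capacity, same_buffer) = layout_demo();
    println!("len = {}, capacity = {}, same buffer = {}", len, capacity, same_buffer);
}
```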

#5

This feels backwards to me, and contradicts

Which is more accurate.


#6

How so?
I did write a dynamically allocated str, not &str.
A String is essentially a Box<str> with additional bookkeeping to allow resizing.
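
A rough illustration of that relationship, using the standard conversions between the two (round_trip is a made-up name):

```rust
// Hypothetical helper: String -> Box<str> -> String.
fn round_trip(s: String) -> String {
    // Converting to Box<str> gives up the ability to grow (and any excess capacity).
    let boxed: Box<str> = s.into_boxed_str();

    // Converting back gives a String again, which can be resized.
    let mut again: String = boxed.into_string();
    again.push_str(" world");
    again
}

fn main() {
    println!("{}", round_trip(String::from("hello")));
}
```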


#7

Because a String doesn’t contain any strs; it’s a Vec<u8> internally. Just like a str is a [u8] internally.

“The owning type contains a borrowed type inside it” feels wrong; “a borrowed type refers to some owning type” feels right.


#8

str is not a borrowed type, &str is.

String containing a Vec<u8> internally is an implementation detail. Conceptually, it must contain a str, because otherwise how could you borrow a &str from it?

EDIT:
To elaborate a bit more, you cannot have a reference without something it references. So a &str must point to a str, and where should that str live if not in the String?


#9

String implements Deref<Target=str>, making it conceptually an owning pointer to a str, even though it doesn’t literally have a *mut str field. Similarly, a Vec<T> is a type of (smart) pointer to [T].
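
A small sketch of what that Deref impl buys you; shout is a made-up function that accepts anything that derefs to str:

```rust
use std::ops::Deref;

// Generic over any pointer-like type whose Deref target is `str`:
// String, Box<str>, Rc<str>, &str, ...
fn shout<S: Deref<Target = str>>(s: S) -> String {
    // `to_uppercase` is defined on `str` and reached through Deref.
    s.to_uppercase()
}

fn main() {
    let owned = String::from("hello");

    // Explicit form of the conversion that `&owned` performs implicitly.
    let slice: &str = owned.deref();
    println!("{}", slice.len());

    println!("{}", shout(owned));
    println!("{}", shout("already a slice"));
}
```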


#10

While we’re on the topic, I always hear the ‘resizable/heap vs fixed-size/stack’ explanations (and that’s helpful), but I’m more curious as to why Rust didn’t just go with, say, String and give the implementation a small-string optimization to avoid unnecessary heap allocations for the typical use cases?

I have my own theory, but I’m wondering if anyone knows the real reason(s)?


#11

See here https://internals.rust-lang.org/t/small-string-optimization-remove-as-mut-vec/1320


#12

I think of String as a specific allocation and grow policy for string data. Slice / owner separation is very powerful that way, though of course what we enjoy as power makes a hill that learners have to climb.


#13

@bluss, yes, it’s the slice / owner separation pattern that the Programming Rust book I’m reading called out in &[T]/Vec, Path/PathBuf and one other place that escapes me at the moment. Anyway, that enabled me to see S/OS as a pattern employed throughout Rust, not just as a one-off (er, two-off) thing for &str/String and &[T]/Vec.
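
For anyone following along, a sketch of the same pattern with the other pairs mentioned here. The function names are made up; each one borrows the slice type and so accepts the owner by reference.

```rust
use std::path::{Path, PathBuf};

// Borrows [i32]; callers can pass &Vec<i32>, an array, or a sub-slice.
fn total(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

// Borrows Path; callers can pass &PathBuf or a &Path.
fn has_rs_extension(p: &Path) -> bool {
    p.extension().and_then(|ext| ext.to_str()) == Some("rs")
}

fn main() {
    let numbers: Vec<i32> = vec![1, 2, 3];
    let file: PathBuf = PathBuf::from("src/main.rs");

    println!("{}", total(&numbers));         // &Vec<i32> coerces to &[i32]
    println!("{}", total(&numbers[1..]));    // part of the Vec
    println!("{}", has_rs_extension(&file)); // &PathBuf coerces to &Path
}
```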

You’ve captured both the motivation and the cost very succinctly–that’s helpful, thank you!


#14

This is helpful. What I am still confused about is what kind of struct str (not &str) would be. Thanks!


#15

what kind of struct str (not &str) would be

It would be a dynamically sized type looking something like this:

struct str {
  contents: [u8],
}

Notice how the contents contains a slice directly, not a reference to a slice. The nomicon explains this pretty well if you want to find out more.

Due to their lack of a statically known size, these types can only exist behind some kind of pointer. Any pointer to a DST consequently becomes a fat pointer consisting of the pointer and the information that “completes” it; for str, that is the length in bytes.
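
You can observe the fat pointer directly. A minimal sketch (str_bytes is a made-up helper):

```rust
use std::mem::{size_of, size_of_val};

// Hypothetical helper: the size of the `str` itself, read from the fat pointer at run time.
fn str_bytes(s: &str) -> usize {
    size_of_val(s)
}

fn main() {
    // A reference to a sized type is a single machine word.
    println!("&u8:   {} bytes", size_of::<&u8>());

    // References to DSTs carry the length as well, so they are two words.
    println!("&str:  {} bytes", size_of::<&str>());
    println!("&[u8]: {} bytes", size_of::<&[u8]>());

    // 'é' takes two bytes in UTF-8, so this str is 6 bytes long.
    println!("str:   {} bytes", str_bytes("héllo"));
}
```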


#16

It would be just the raw bytes of the string.

But because the length is held in the reference, not in str itself, such a type is mostly an unusable abstract concept in Rust.


#17

I’ve got a question related to this one: Why does a String slice have a special type &str, whereas a slice of a vec of integers doesn’t? (There the slice type is &[i32]).


#18

What are you looking for? Why doesn’t &[T] fit the bill?

[T] is as special as you can get. [T] and str are both primitive, unsized types you can’t construct. I mean, str is basically just [u8] with a UTF-8 invariant.

And one time, long, long ago, Vec<T> and String were known as ~[T] and ~str.


#19

If you think of str and [T] as the fundamental types, their owned forms are both different from these, namely String and Vec<T>, respectively. This seems consistent, unless I am misunderstanding your question?


#20

Because [u8] doesn’t have to be valid UTF-8, str does.
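
A quick sketch of that invariant in action, with as_text as a made-up wrapper around std::str::from_utf8:

```rust
// Hypothetical helper: view bytes as text only if they are valid UTF-8.
fn as_text(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

fn main() {
    let valid: &[u8] = &[0xE2, 0x82, 0xAC]; // the euro sign in UTF-8
    let invalid: &[u8] = &[0xFF, 0xFE];     // 0xFF never appears in valid UTF-8

    println!("{:?}", as_text(valid));
    println!("{:?}", as_text(invalid));

    // Going the other way always works: every str is a valid [u8].
    println!("{:?}", "abc".as_bytes());
}
```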