There are two string types in Rust: String (growable and owned) and the string slice str.
Why do I need to annotate the type as a reference when I use a string slice?
For example, I need to write
let str1 : &str = "Chennai" ;
Why not
let str1: str = "Chennai";
When I leave out the reference, the Rust compiler says:
error[E0308]: mismatched types
 --> string1.rs:4:22
  |
4 | let str1: str = "Chennai";
  |           ---   ^^^^^^^^^ expected `str`, found `&str`
  |           |
  |           expected due to this
error[E0277]: the size for values of type `str` cannot be known at compilation time
My question: the compiler knows the size of this string. "Chennai" is 7 characters,
all of them English (ASCII) letters, so the size can be known.
If it were plain str instead of &str, the two string types would look consistent.
I'd add that I think it's truer to say there is one type of string in Rust: a str.
[edited for accuracy: thanks trentj!]
A str is some UTF-8 bytes.
A reference to a str (&str) is a pointer to a str plus a length. Knowing the length is obviously important if you actually want to use the pointer.
A String is simply one way (albeit the main way) of managing dynamic memory for a str. It's an &str plus the capacity of a memory buffer that the &str can grow into. So in total it's a pointer to a str, the length of the str and the total capacity of the String's buffer.
The String can also relocate its buffer to increase its capacity.
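To make the pointer/length/capacity picture concrete, here is a minimal sketch. The helper names `sizes_in_words` and `grow` are made up for illustration, and the three-word size of String reflects the current standard library rather than a documented language guarantee.

```rust
use std::mem::size_of;

/// Returns (size of &str, size of String), measured in machine words.
fn sizes_in_words() -> (usize, usize) {
    let word = size_of::<usize>();
    (size_of::<&str>() / word, size_of::<String>() / word)
}

/// Starts with a small buffer, pushes `extra`, and returns
/// (length, capacity) afterwards.
fn grow(extra: &str) -> (usize, usize) {
    let mut s = String::with_capacity(4);
    s.push_str("Chen");
    s.push_str(extra); // may reallocate (and so relocate) the buffer
    (s.len(), s.capacity())
}

fn main() {
    let (ref_words, string_words) = sizes_in_words();
    println!("&str: {ref_words} words, String: {string_words} words");

    let (len, cap) = grow("nai");
    println!("len = {len}, capacity = {cap}");
}
```

The capacity is only ever guaranteed to be at least the length, so the sketch does not assume an exact value for it.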
That's misleading, IMO. There's nothing pointery about a bare str, it's just the type of the thing that &str and String both point to. I'd rather just say "A str is some UTF-8 bytes".
Other pointer types like Rc, Arc and Box can also point to str, just like they point to sized types.
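As a small illustration of that point, the sketch below builds a Box<str>, an Rc<str> and an Arc<str> from the same text using the standard From<&str> conversions. The function name `pointer_lengths` is just a placeholder for this example.

```rust
use std::rc::Rc;
use std::sync::Arc;

/// Builds three owning pointers to a copy of `text` and returns the
/// length each one reports for the str it points to.
fn pointer_lengths(text: &str) -> (usize, usize, usize) {
    let boxed: Box<str> = Box::from(text);
    let shared: Rc<str> = Rc::from(text);
    let atomic: Arc<str> = Arc::from(text);
    (boxed.len(), shared.len(), atomic.len())
}

fn main() {
    let (b, r, a) = pointer_lengths("Chennai");
    println!("Box<str>: {b}, Rc<str>: {r}, Arc<str>: {a}");
}
```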
In a fully consistent way, str is zero or more bytes of data which is a valid UTF8 string. It has a length, but that length is not stored inside it. Rather, it is "unsized" because the length is fully runtime-dependent and located elsewhere. You cannot store a str by itself because it does not have a constant size, and it does not store its own size.
When you put something unsized behind a reference or pointer, that reference becomes "fat" and stores the size. &str is, in memory, (ptr, size). ptr points to the start of the str, and size is how many bytes it is. Box<str> is exactly the same layout, except it owns the memory rather than borrowing from it.
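A rough way to see the (ptr, size) pair is to compare a slice with the string it was taken from. This is only a sketch: `describe` and `same_size_as_ref` are hypothetical helpers, and the pointer arithmetic is just for inspection.

```rust
use std::mem::size_of;

/// Returns how many bytes into `whole` the slice `part` starts,
/// together with the length stored in the `part` reference.
/// Assumes `part` really is a sub-slice of `whole`.
fn describe(whole: &str, part: &str) -> (usize, usize) {
    let offset = part.as_ptr() as usize - whole.as_ptr() as usize;
    (offset, part.len())
}

/// True if Box<str> occupies the same number of bytes as &str.
fn same_size_as_ref() -> bool {
    size_of::<Box<str>>() == size_of::<&str>()
}

fn main() {
    let city = "Chennai";
    let tail = &city[4..]; // a new (ptr, size) pair; no bytes are copied

    let (offset, len) = describe(city, tail);
    println!("tail starts {offset} bytes in and is {len} bytes long");
    println!("Box<str> same size as &str: {}", same_size_as_ref());

    // A reference to a sized type needs no length, so it is "thin".
    println!("&u8 is {} bytes, &str is {} bytes", size_of::<&u8>(), size_of::<&str>());
}
```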
I view String as something similar, but different. String also represents zero or more bytes of data which are valid UTF8, but it does not contain a str, and str does not contain a String. Instead, String allows creating an &str referring to itself, or to part of itself. String is owned: if you have one, you can modify it, and you can drop it. The difference between String and Box<str> is that you can change the length of a String after creating it, and str's length is fixed upon creation.
Box<str> is rarely used, but it is an important part of the story to understand exactly what str and String are.
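Here is a short sketch of that difference between String and Box<str>, assuming a made-up helper called `freeze`. It grows a String, then converts it into a Box<str>, which has no method for changing its length.

```rust
/// Appends `suffix` while the text is still a String, then gives up
/// the ability to grow by converting it into a Box<str>.
fn freeze(mut s: String, suffix: &str) -> Box<str> {
    s.push_str(suffix); // a String can change its length
    s.into_boxed_str()  // from here on the length is fixed
}

fn main() {
    let owned = String::from("Hello");

    // A String can hand out &str views of itself or of a part of itself.
    let part: &str = &owned[1..3];
    println!("view: {part}");

    let boxed: Box<str> = freeze(owned, ", world");
    println!("boxed: {boxed}");

    // Converting back to a String makes the text growable again.
    let mut again: String = boxed.into_string();
    again.push('!');
    println!("again: {again}");
}
```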
Does that make sense as a consistent abstraction representing both of these things?
Thanks. Based on your input, the other replies and other material,
I understand that a string slice (str) has two components: a pointer to a "stream of bytes" (which are valid UTF-8) and a size (or length).
It is also mysterious that it can only be referenced, not directly owned; that is, one cannot write
let s1 : str = "Hello";
It implements the Copy trait.
String is more natural, i.e. I can write
let s1 : String = String::from("Hello");
It is always implemented using heap storage and does not implement the Copy trait. The level of
abstraction from the programmer's perspective is consistent.
When to use &str and when to use String:
For smaller fixed strings, &str could be more useful because it is fast and, more importantly, the Copy trait
makes it programmer-friendly. Examples: fixed user messages and the like.
For large strings that are to be mutated, the String type is useful.
Be careful here. &str is Copy, but str itself is not.
It's worth noting that &ANYTHING is always Copy, no matter what ANYTHING is. You can always copy references. &String is Copy, for that matter.
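A quick way to check this yourself is a generic function with a Copy bound; it only compiles for arguments whose type is Copy. The name `is_copy` is invented for this sketch.

```rust
/// Accepts only values whose type implements Copy; the check happens
/// at compile time, so the return value is always true.
fn is_copy<T: Copy>(_value: T) -> bool {
    true
}

fn main() {
    let s: &str = "Hello";
    let t = s; // copies the reference; `s` stays usable
    println!("{s} {t}");

    let owned = String::from("Hello");
    let r1: &String = &owned;
    let r2 = r1; // copies the reference, not the String
    println!("{r1} {r2}");

    println!("&str is Copy: {}", is_copy(s));
    println!("&String is Copy: {}", is_copy(&owned));

    let moved = owned; // a move: `owned` cannot be used after this line
    // is_copy(moved); // would not compile, because String is not Copy
    println!("{moved}");
}
```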
It's worth noting that &str is extremely useful for large strings too, mainly because it can be passed around for free, and you can take slices for free. For instance, when parsing large documents, a common approach is to allocate the whole document as a String, immediately make an &str from it, and then inside the parser just make new &str views into that string. This way you don't have to allocate at all when parsing; you just make new views into the existing data.
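As a sketch of that parsing approach, the toy parser below splits "key = value" lines into &str views. The input format and the names `parse_pairs` and `is_view_into` are assumptions for the example. The Vec holding the views does allocate, but no string data is copied.

```rust
/// Splits each "key = value" line into two views borrowed from `doc`.
/// Lines without '=' are skipped.
fn parse_pairs(doc: &str) -> Vec<(&str, &str)> {
    doc.lines()
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| (key.trim(), value.trim()))
        .collect()
}

/// True if the bytes of `part` lie inside the buffer of `whole`.
fn is_view_into(whole: &str, part: &str) -> bool {
    let start = whole.as_ptr() as usize;
    let p = part.as_ptr() as usize;
    p >= start && p + part.len() <= start + whole.len()
}

fn main() {
    // The whole document is allocated once, as a String.
    let doc = String::from("city = Chennai\nlang = Rust\n");

    for (key, value) in parse_pairs(&doc) {
        println!(
            "{key} -> {value} (borrowed from doc: {})",
            is_view_into(&doc, value)
        );
    }
}
```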
Similarly, if you ever need to write to or create a new string at runtime, no matter what size, you need String. I think the difference is more what you want to do with the string, not what size it is. How does that seem?
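A tiny sketch of that split: the function below only reads its input, so it borrows a &str, but it creates new text at runtime, so it has to return a String. The name `shout` is made up for the example.

```rust
/// Reads `msg` through a borrowed view and returns newly created text.
fn shout(msg: &str) -> String {
    let mut out = msg.to_uppercase(); // allocates a new String
    out.push('!');
    out
}

fn main() {
    let fixed: &str = "hello";        // fixed text: a &str is enough
    let built: String = shout(fixed); // text created at runtime: a String
    println!("{fixed} -> {built}");
}
```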
In practice, I don't think any Rust programmers start out with a full understanding of &str and String. It's something that you get used to over time, as you read Rust code and write more of it. It's definitely worth trying to understand fully, but it's also possible to gloss over the full technicalities until you understand more of Rust in general, if you want.
Another way to look at this is that str is essentially [u8] under another name (strictly speaking it is a distinct primitive type with the same representation, not a literal type alias). It provides guarantees about its contents being valid UTF-8, but that is only because there are special methods for creating a str; those guarantees aren't particularly important for your question here.
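The relationship between str and [u8] can be seen with the standard conversions in both directions. This is a minimal sketch; `bytes_of` and `checked` are just illustrative wrappers around `str::as_bytes` and `std::str::from_utf8`.

```rust
use std::str;

/// Views the bytes behind a str without copying them.
fn bytes_of(s: &str) -> &[u8] {
    s.as_bytes()
}

/// Goes the other way; this can fail, because not every byte
/// sequence is valid UTF-8.
fn checked(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

fn main() {
    println!("{:?}", bytes_of("Chennai"));
    println!("{:?}", checked(b"Chennai"));
    println!("{:?}", checked(&[0xff, 0xfe]));

    // The length of a str counts bytes, not characters.
    let e_acute = "\u{e9}";
    println!("bytes: {}, chars: {}", e_acute.len(), e_acute.chars().count());
}
```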
Following from that, [u8] is a dynamically sized slice. Rust doesn't let you use dynamically sized things (whatever those things may be) on the stack. I don't know the official reasons behind that decision, but stack overflows are one really good reason to disallow dynamically sized data on the stack.
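For contrast, when the length is part of the type the value can live directly in a local variable. The sketch below copies the 7 bytes of the literal into a [u8; 7] and then views them as a str; `as_text` is a hypothetical helper, and the array length is fixed at 7 only to match the example string.

```rust
use std::mem::size_of_val;

/// Views a fixed-size byte array as text; panics if the bytes are
/// not valid UTF-8.
fn as_text(bytes: &[u8; 7]) -> &str {
    std::str::from_utf8(bytes).expect("bytes should be valid UTF-8")
}

fn main() {
    // [u8; 7] is sized: the length is in the type, so this is allowed.
    let fixed: [u8; 7] = *b"Chennai";
    println!("{} bytes stored in the local variable", size_of_val(&fixed));

    // str and [u8] carry no length in the type, so a local of either
    // type is rejected by the compiler:
    // let s: str = *"Chennai"; // error[E0277]

    println!("{}", as_text(&fixed));
}
```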
I've had success in the past using my browser's "save webpage" feature. In Firefox this is triggered by Ctrl+s on a webpage, and will allow you to save a local copy of the HTML and a directory with all the other needed files.
Besides that, I'm not sure. Doing a select-all, copy and paste into vim seems like a good idea. I get pretty readable text doing that.
It's worth noting that this forum has never deleted posts, and will run for the foreseeable future. As long as it keeps being a place for the Rust community, you'll always be able to access this conversation from the same URL.
FWIW, it was never "decided" that we shouldn't support unsized values on the stack. In fact, an RFC for allowing this was even accepted at some point, but it's in the big pile of unimplemented RFCs:
Although if the comments there are any indication, all of the "unresolved questions" together are easily enough material for an RFC of their own. Likely not something anyone will be able to push through until a lot of far higher priority projects settle down.