&str string literals

Continuing from Why can't you specify the data type within tuples?

That’s because they are hard-coded into your program, so you’re just referencing the data in your program’s memory.

@OptimisticPeach I am not quite getting it. I thought strings are stored on the heap, so to refer to one you would use the & symbol. But string literals are stored on the stack, so why do you need a reference to them? Sorry, I don’t really understand how it works.

String literals are not stored in the heap or the stack, they are stored directly in your program’s binary. Literally embedded in the binary, and the reference is a reference to the location in the binary.
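A minimal sketch of what that means in practice (the function name `greeting` is made up for illustration): a literal can be returned from a function as a `&'static str`, because the bytes it points to do not live in that function's stack frame.

```rust
// Hypothetical helper: returns a string literal.
// Nothing is allocated here; the returned reference points at bytes
// that are embedded in the executable itself.
fn greeting() -> &'static str {
    "Hello world"
}

fn main() {
    let s: &'static str = greeting();
    // The reference is still valid after `greeting` has returned.
    // The printed address is wherever the loader placed that part of the binary.
    println!("{} ({} bytes) at {:p}", s, s.len(), s.as_ptr());
}
```

The exact address printed will differ between builds and platforms; the point is only that it is neither a stack slot of `greeting` nor a heap allocation.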


They’re in a section of your program’s binary. That section is either loaded into memory by the OS together with your program’s code, or memory-mapped from the executable on disk. They are in RAM, but that section of RAM is neither stack nor heap.

In Rust, ownership and borrowing work the same way whether the data lives on the heap, on the stack, or in some section of the program. For example, Box::leak(string.into_boxed_str()) gives you a &'static mut str (which you can use as a &'static str) that lives on the heap and is not compiled into your program.
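Here is a small sketch of the leaking approach, assuming you are fine with the memory never being freed. The helper name `leak_string` is hypothetical; `String::into_boxed_str` and `Box::leak` are the standard library calls doing the work.

```rust
// Hypothetical helper: turns a String built at run time into a &'static str.
fn leak_string(s: String) -> &'static str {
    // into_boxed_str converts the String into a Box<str> (still on the heap).
    // Box::leak gives up ownership and returns a &'static mut str,
    // which coerces to &'static str at the return site.
    Box::leak(s.into_boxed_str())
}

fn main() {
    // This string is assembled at run time, so it cannot be in the binary.
    let owned = format!("{}-{}", "run", "time");
    let leaked: &'static str = leak_string(owned);
    assert_eq!(leaked, "run-time");
    println!("{}", leaked);
}
```

The leaked allocation stays around until the process exits, so this is only reasonable for data that really is needed for the rest of the program.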


@kornel’s post brought to mind that not only strings need to be in the program’s embedded data, there can be any of the following:

//functions 
fn foo() {}
//x is a *pointer* to something in the "other" parts of the program
let x: fn() -> () = foo; 

//byte string literals
let x: &'static [u8] = b"abcd";

//references to pre-built values
let x: &'static Option<Option<Result<usize, &'static str>>> = &Some(Some(Err("abc")));

//Not too sure, but statics too?
static X: [u8; 20] = [123; 20];

//chars
let x: &'static char = &'a';
let x: &'static u8 = &b'a';

And a few more I’m probably forgetting.

Like a good ol’ C programmer, you can store a str directly on the stack, using a fixed-size array as a buffer.

// The buffer must be mutable so the bytes can be copied into it.
let mut buf = [0_u8; 10];
buf[0..3].copy_from_slice("foo".as_bytes());
buf[3..6].copy_from_slice("bar".as_bytes());
let output = std::str::from_utf8(&buf[0..6]).unwrap();
assert_eq!(output, "foobar");

Of course, there’s a crate to conveniently do so. Check arrayvec::ArrayString.

You might have heard of Sized before. A sized type is one that always has the same size in bytes (and this size is known to the compiler). This is very important to the compiler, because if some_variable is 4 bytes large, it knows the variable after some_variable is stored 4 bytes later, which means it can hardcode that four in the compiled program. You can only store a variable on the stack if it is sized.

Now, how many bytes is a string? Well, it depends on the length. “abc” is three bytes, while “hello world” is 11. So str is not Sized, which means you cannot put it on the stack. Luckily a reference &str is always the same number of bytes: it holds a pointer plus a length, so it is usually 8 or 16 bytes depending on the computer. So storing a reference on the stack is fine.
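If you want to check these sizes yourself, something like the following should do it. The helper `byte_len` is made up for this post; `size_of` and `size_of_val` come from `std::mem`.

```rust
use std::mem::{size_of, size_of_val};

// Hypothetical helper: size in bytes of the str that `s` points to.
fn byte_len(s: &str) -> usize {
    // size_of_val looks through the reference at the unsized str itself.
    size_of_val(s)
}

fn main() {
    // The str values have different sizes...
    assert_eq!(byte_len("abc"), 3);
    assert_eq!(byte_len("hello world"), 11);

    // ...but every &str has the same size: a pointer plus a length.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());
    println!("a &str takes {} bytes on this machine", size_of::<&str>());
}
```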

When you type

let s = "Hello world";

the compiler will place the bytes H, e, l, … somewhere in the executable, and in your function it will hardcode a reference to that location. This is why s has the type &str. You cannot put a str in a variable at all, because variables are stored on the stack.

The situation is the same as [T] vs &[T], where [T] is just some number of Ts after each other, thus having variable length, while &[T] is a pointer plus a length (8 or 16 bytes in total) that points to somewhere with the Ts. Do note that there is a [T; n] type, where you hardcode the number of elements. As an example, an [i32; 8] has a known size and can be stored on the stack.

let a: [i32; 8] = [0, 1, 2, 3, 4, 5, 6, 7];
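To make the array vs. slice difference concrete, here is a rough sketch. The function `sum` is just an illustration and not from anyone's post above.

```rust
use std::mem::size_of;

// Hypothetical helper: takes a slice, so it accepts any number of i32s.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let a: [i32; 8] = [0, 1, 2, 3, 4, 5, 6, 7];

    // The array's size is fixed by its type: 8 elements of 4 bytes each.
    assert_eq!(size_of::<[i32; 8]>(), 32);
    // A slice reference is a pointer plus a length, whatever it points to.
    assert_eq!(size_of::<&[i32]>(), 2 * size_of::<usize>());

    // The same function works on the whole array and on part of it.
    assert_eq!(sum(&a), 28);
    assert_eq!(sum(&a[2..5]), 9);
    println!("ok");
}
```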

However, you will not be able to resize an [i32; 8], because the 8 is part of the type. In Hyeonu’s example you can see that you could store a string in a [u8; 10], although you should note this means you can only have strings of at most 10 bytes, while a &str can point to a string of any length.
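The capacity limit of such a buffer can be sketched like this. The helper `store` is hypothetical and only meant to show where the fixed size bites; it is not how arrayvec does it.

```rust
// Hypothetical helper: copies `s` into a 10-byte stack buffer.
// Returns the number of bytes written, or None if `s` does not fit.
fn store(buf: &mut [u8; 10], s: &str) -> Option<usize> {
    let bytes = s.as_bytes();
    if bytes.len() > buf.len() {
        return None; // the capacity is fixed by the array type
    }
    buf[..bytes.len()].copy_from_slice(bytes);
    Some(bytes.len())
}

fn main() {
    let mut buf = [0_u8; 10];

    // "foobar" is 6 bytes, so it fits.
    let len = store(&mut buf, "foobar").expect("fits in 10 bytes");
    let output = std::str::from_utf8(&buf[..len]).unwrap();
    assert_eq!(output, "foobar");

    // "hello world" is 11 bytes, one more than the buffer can hold.
    assert!(store(&mut buf, "hello world").is_none());
}
```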
