The Copy trait - what does it actually copy?

Ah! Yes, thank you! That makes sense! The fat pointer is on the stack, but where are the actual characters that make up the string stored?

The actual “Hello, World!” data for a string literal lives in the program binary, e.g. in the .rodata section of an ELF file. For an arbitrary &str, though, you don’t know where the underlying memory came from. Whoever gave you the reference knows; all you know is the lifetime for which it is valid.
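
To make that concrete, here is a minimal sketch (the helper name byte_len is made up for illustration) in which the same function receives a &str backed by the binary, by the heap, and by the stack, and cannot tell them apart:

```rust
// Hypothetical helper: accepts any &str and has no idea where the bytes live.
fn byte_len(s: &str) -> usize {
    s.len()
}

fn main() {
    // Literal: the bytes are baked into the binary, so the reference is 'static.
    let literal: &'static str = "Hello, World!";

    // Heap: a String owns a heap buffer, and &owned borrows it as a &str.
    let owned = String::from("Hello, World!");
    let from_heap: &str = &owned;

    // Stack: a byte array held in a local variable, viewed as UTF-8.
    let bytes = *b"Hello, World!";
    let from_stack: &str = std::str::from_utf8(&bytes).expect("valid UTF-8");

    for s in [literal, from_heap, from_stack] {
        println!("{} ({} bytes)", s, byte_len(s));
    }
}
```

Only the first reference is 'static; the other two are valid just as long as owned and bytes are alive.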

I’d say most of the confusion is specifically due to “strings”, since people come from languages that don’t draw the distinction that Rust does. In addition, Rust has built-in syntax for string literals, so you never see the &str type written out for them.

If one were to use, say, Vec<i32> vs &[i32] instead, I think the difference would be a bit more apparent, at least visually.
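
A rough sketch of that analogy, assuming a made-up helper slice_parts that exposes what a slice reference holds:

```rust
use std::mem::size_of;

// Hypothetical helper: the (address, element count) pair stored in a slice reference.
fn slice_parts(s: &[i32]) -> (*const i32, usize) {
    (s.as_ptr(), s.len())
}

fn main() {
    let owned: Vec<i32> = vec![1, 2, 3, 4]; // owns a heap buffer
    let borrowed: &[i32] = &owned;          // pointer + length into that buffer
    let copy = borrowed;                    // &[i32] is Copy: only pointer + length are duplicated

    assert_eq!(slice_parts(borrowed), slice_parts(copy));
    println!("{:?} {:?}", borrowed, copy); // borrowed is still usable after the copy

    // A slice reference is two machine words, the same size as a &str.
    assert_eq!(size_of::<&[i32]>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<&[i32]>(), size_of::<&str>());
}
```

Here Vec<i32> plays the role of String and &[i32] plays the role of &str.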

So cool.

Thank you! I’m going to update those blog posts ASAP!

Does an image like this better express what's going on? When we Copy &str, are hello and hello1 grabbing onto the same fat pointer, or is the fat pointer copied and placed on the stack?

The fat pointer is copied. hello and hello1 each have their own location on the stack, but they hold the same value. It’s like having two ints on the stack with the same value: each has its own address, but the value is the same.
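
If you want to check this yourself, here is a small sketch (the helper parts is a name I made up):

```rust
// Hypothetical helper: the (data address, byte length) pair held inside a &str.
fn parts(s: &str) -> (*const u8, usize) {
    (s.as_ptr(), s.len())
}

fn main() {
    let hello = "Hello, World!";
    let hello1 = hello; // copies the pointer and the length, not the characters

    // Same value: both fat pointers refer to the same bytes.
    assert_eq!(parts(hello), parts(hello1));

    // Different locations: each variable has its own slot on the stack.
    let slot: *const &str = &hello;
    let slot1: *const &str = &hello1;
    assert_ne!(slot, slot1);

    println!("data at {:p}, slots at {:p} and {:p}", hello.as_ptr(), slot, slot1);
}
```

The exact addresses printed will vary from run to run and from machine to machine.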

The fat pointer is on the stack, so yeah that’s not quite right either.

So if I grok the visual metaphor you’re going with, the pointer and the length go inside the green shape, and the characters of the string go inside the purple shape.

If you want to get fancy showing the effects of fat pointers, you could try slicing:

let hw = "Hello, World!"; // the stack value is a pointer and length, e.g. (0x1234f00, 13)
let hello = &hw[..5]; // same pointer, different length: (0x1234f00, 5)
let world = &hw[7..12]; // (0x1234f07, 5)

(Note that str indexing is on byte offsets, not Unicode codepoints, but here it’s just ASCII anyway.)
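
A runnable version of the same idea, as a sketch. The helper offset_in is hypothetical and assumes its second argument really was sliced from the first:

```rust
// Hypothetical helper: byte offset of `part` within `whole`.
// Assumes `part` is a sub-slice of `whole`; otherwise the result is meaningless.
fn offset_in(whole: &str, part: &str) -> usize {
    part.as_ptr() as usize - whole.as_ptr() as usize
}

fn main() {
    let hw = "Hello, World!";
    let hello = &hw[..5];
    let world = &hw[7..12];

    assert_eq!(offset_in(hw, hello), 0); // same pointer, shorter length
    assert_eq!(offset_in(hw, world), 7); // pointer advanced by 7 bytes
    assert_eq!((hello, world), ("Hello", "World"));

    // Byte offsets, not code points: U+00E9 takes two bytes in UTF-8.
    let cafe = "caf\u{e9}!";
    assert_eq!(cafe.len(), 6);
    assert_eq!(cafe.chars().count(), 5);
    assert_eq!(&cafe[..3], "caf");
    assert!(!cafe.is_char_boundary(4)); // slicing at byte 4 would panic
}
```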

Another playground that may or may not help show how much is in the &str (on the stack) versus how much is in the str (on the heap or read-only memory): https://play.rust-lang.org/?gist=b74869d847cb760faad1c8e74d38594a&version=stable&mode=debug&edition=2015

fn main() {
    let hello = "Hello, world!";
    let empty = "";
    let long = "The quick brown fox jumps over the lazy dog.";

    // size_of_val(&T) returns the size of T, so...
    
    // size_of_val(&str) returns the size of the str being referred to,
    // i.e. the underlying character buffer (typically in read-only memory)
    println!("{}", std::mem::size_of_val(hello));
    println!("{}", std::mem::size_of_val(empty));
    println!("{}", std::mem::size_of_val(long));

    // size_of_val(&&str) returns the size of the &str being referred to,
    // i.e. the fat pointer (pointer and length) on the stack.
    println!("{}", std::mem::size_of_val(&hello));
    println!("{}", std::mem::size_of_val(&empty));
    println!("{}", std::mem::size_of_val(&long));
}

This prints the following on a 64-bit target, where a pointer and a usize are 8 bytes each:

13
0
44
16
16
16

This is great!! Thank you!

Does this image make sense?

[Image: Pointers]

The pointer to "Hello, World!" is stored on the stack. hello is stored on the stack with a fat pointer whose memory address points to the place in the stack that points to "Hello, World!". When we let hello1 = hello;, we copy the fat pointer (incl. the memory address), and pop that onto the stack as well.

That’s not quite it either. In your setup you’re imagining that hello and hello1 point to an intermediate address which then points to "hello, world!", but in reality there is no intermediary. hello and hello1 are the addresses that do the pointing.

Nope, there’s no pointers to pointers going on here. A “fat pointer” is not a pointer to another pointer. It’s just a pointer and some other value that the type system refuses to let you split apart (unless you go unsafe).

Lemme try this…

let hello = "Hello, world!";
let hello1 = hello;

does this:

Read-only memory:

Hello, world! // let’s say the ‘H’ is at address 0xABC

Stack:

0xABC // hello's pointer
13 // hello's length
0xABC // hello1's pointer
13 // hello1's length
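
For the curious, here is a sketch of what “going unsafe” to split the pair apart and glue it back together could look like. The function rebuild is a name I made up; std::slice::from_raw_parts and std::str::from_utf8_unchecked are the real standard-library calls it wraps:

```rust
// Hypothetical helper: rebuilds a &str from a raw address and a byte length.
// Safety: the caller must guarantee that the `len` bytes starting at `ptr`
// are valid, unmodified UTF-8 for the whole lifetime 'a.
unsafe fn rebuild<'a>(ptr: *const u8, len: usize) -> &'a str {
    unsafe { std::str::from_utf8_unchecked(std::slice::from_raw_parts(ptr, len)) }
}

fn main() {
    let hello = "Hello, world!";

    // Reading the two halves is safe...
    let (ptr, len) = (hello.as_ptr(), hello.len());

    // ...but putting them back together is not.
    // SAFETY: ptr and len come from a live &'static str.
    let again = unsafe { rebuild(ptr, len) };
    assert_eq!(again, hello);

    // SAFETY: the first 5 bytes are ASCII, so this is still valid UTF-8.
    let shorter = unsafe { rebuild(ptr, 5) };
    assert_eq!(shorter, "Hello");
}
```

There is no second pointer anywhere in this: the address of the ‘H’ and the length are all a &str contains.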

You may find this cheat sheet that someone created a while ago useful: https://i.redd.it/moxxoeir7iqz.png

Thank you all so much!

As I now understand it, &str and &i32 behave the same way (as does every shared reference &T, since they are all Copy), so when the Rust Book says:

let x = 5;
let y = x;

We can probably guess what this is doing: “bind the value 5 to x; then make a copy of the value in x and bind it to y.” We now have two variables, x and y, and both equal 5. This is indeed what is happening, because integers are simple values with a known, fixed size, and these two 5 values are pushed onto the stack.

The value being copied is the integer 5.


But if we were to instead create a reference, i.e. &T:

let x = &5;
let y = x;

The value being copied is the reference to a hardcoded 5 stored somewhere in read-only memory, i.e. neither on the stack nor on the heap.
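
A small sketch to check this (same_address is a made-up helper around the real std::ptr::eq):

```rust
// Hypothetical helper: true when two references hold the same address.
fn same_address(a: &i32, b: &i32) -> bool {
    std::ptr::eq(a, b)
}

fn main() {
    let x = &5;
    let y = x; // copies the reference (one pointer), not the integer

    assert!(same_address(x, y));
    assert_eq!(*x + *y, 10); // x is still usable after the copy

    // The literal 5 is promoted to static storage, so a 'static reference is allowed.
    let promoted: &'static i32 = &5;
    assert_eq!(*promoted, 5);

    // A reference to a stack local is copied in exactly the same way.
    let local = 5;
    let r1 = &local;
    let r2 = r1;
    assert!(same_address(r1, r2));
}
```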

This is a small rewrite. Here, I'm just specifying that it is a (fat) pointer which is being copied, not the actual string:

The string literal "Hello, World!" is stored in read-only memory (neither on the stack nor on the heap), and a fat pointer to that string (its address and its length) is stored on the stack. Because it's a string literal, it shows up as a reference: we use a pointer to string data embedded in the program binary (see Ownership in Rust, Part 2 for more on references), and that data is guaranteed to be valid for the entire duration of the program (it has a 'static lifetime).

Here, the fat pointers stored in hello and hello1 live on the stack. When we use the = operator, Rust pushes a new copy of the fat pointer stored in hello onto the stack and binds it to hello1. At the end of the scope, both pointers are simply popped off the stack along with the rest of the stack frame; no call to drop is involved, because Copy types cannot implement Drop. These pointers can be stored on the stack and copied cheaply because their size is known at compile time.
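
A sketch that contrasts this with an owned String, in case it is useful for the post. The helper copy_twice is hypothetical. std::mem::needs_drop is a real function, but its documentation only promises that a false result is reliable, so treat the String assertion as what current compilers report rather than a guarantee:

```rust
use std::mem::needs_drop;

// Hypothetical helper: only compiles for Copy types, and returns two copies of its argument.
fn copy_twice<T: Copy>(value: T) -> (T, T) {
    (value, value)
}

fn main() {
    let hello = "Hello, World!";
    let (a, b) = copy_twice(hello); // fine: &str is Copy
    assert_eq!(a, b);
    assert_eq!(hello, a); // hello is still usable

    // A &str has nothing to clean up at the end of the scope; a String does.
    assert!(!needs_drop::<&str>());
    assert!(needs_drop::<String>());

    let owned = String::from("Hello, World!");
    let moved = owned; // a move, not a copy
    // println!("{}", owned); // would not compile: owned was moved
    println!("{} {}", hello, moved);
}
```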

It might also be useful to make another block for the string “Hello, World”, which is in a separate region titled “static data”, and add dotted arrows between the pointers and that block.

That’s a good idea, thank you!

More to the point, the two intermediary fat pointer blocks should be relatively small in size, with just one large block for the actual string. So I would add two small yellow “fat pointer” rectangles that are pointed to from the stacks, with both rectangles pointing to the large purple blob that represents the actual string literal. If you wanted to put content in the “fat pointer” rectangles, they should each contain the same two usize items: the byte address of the start of the string literal and the byte length of the string literal.

The book also has a good explanation of ownership: https://doc.rust-lang.org/book/second-edition/ch04-01-what-is-ownership.html
