Question about borrowing and indirection

Hello, I’m new here, so apologies if I’m posting in the wrong place.

If I understand correctly, when you pass a value to a function, it is moved there unless its type implements Copy. If you pass a reference, it is borrowed. But doesn’t this add an unnecessary layer of indirection if you just want to borrow a value immutably, and you’re only working in one thread? For example, String doesn’t implement Copy, because it owns data allocated on the heap. The portion that lives on the stack contains a pointer to the data in the heap, a length, and a capacity. If you want to call a function with a String and keep using it afterwards, you must pass a reference:

fn main() {
    let s = String::from("Hello, world!");
    borrows_string(&s);
    println!("{}", s); // s was only borrowed, so is still valid
}

fn borrows_string(s: &String) {
    println!("Printing borrowed string: {}", s);
}

In main, s refers to the portion of the String that lives on the stack, which points to its data ("Hello, world!"), which lives in the heap. But when borrows_string is called, a reference to s is passed and pushed into the new stack frame, not s itself. This means that when println! is called in borrows_string, the CPU must follow the reference back to s, and then follow the pointer to the data in the heap.

But if s is being borrowed immutably, wouldn’t it be safe just to copy s itself (pointing to the same data in the heap) into borrows_string’s stack frame? Since borrows_string cannot modify s, it should be safe to read the copied value, and then pop it off of the stack when it returns to main, not calling drop() on s. This would eliminate one layer of indirection.

Maybe this is something the Rust compiler already does under the hood, but the way The Book makes it seem, borrowing always passes a reference to the data, and to me that means a reference to the data that lives on the stack.

Output:

Printing borrowed string: Hello, world!
Hello, world!

I won’t go for the thorough explanation, just note one particularly important part: usually you don’t want to require &String or a similar type.
Your function may as well accept &str:

fn main() {
    let s = String::from("Hello, world!");
    borrows_string(&s);
    println!("{}", s); // s was only borrowed, so is still valid
}

fn borrows_string(s: &str) {
    println!("Printing borrowed string: {}", s);
}

This works due to the magic of Deref coercions, which effectively let you treat a reference to the container (here, the String) as a reference to the contained value (here, the string slice, a.k.a. str).

In this case, the extra indirection is removed at the call site: you take a reference to the String only temporarily and immediately turn it into a reference to str, so the compiler won’t even bother really creating it.
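
To make the coercion concrete, here is a minimal sketch. The helper names byte_len and coerced_slice_shares_buffer are made up for illustration; the point is only that the &str you get from a &String points straight at the String’s heap buffer.

```rust
// Hypothetical helper: accepts any string slice, so callers can pass
// a &String, a literal, or a sub-slice.
fn byte_len(s: &str) -> usize {
    s.len()
}

// Hypothetical helper: returns true when the slice produced by Deref
// coercion points at the String's own heap buffer and has the same length.
fn coerced_slice_shares_buffer(owned: &String) -> bool {
    let slice: &str = owned; // Deref coercion: &String -> &str
    slice.as_ptr() == owned.as_ptr() && slice.len() == owned.len()
}

fn main() {
    let s = String::from("Hello, world!");
    println!("{}", byte_len(&s)); // &String coerces to &str at the call site
    println!("{}", byte_len("literal")); // a string literal is already a &str
    println!("{}", coerced_slice_shares_buffer(&s));
}
```

So a function taking &str receives a pointer to the text itself plus a length, not a pointer to the String header.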


Similarly, you usually don’t want to accept &Vec<T> or &Box<T> (or other smart pointers) as an argument; instead you want &[T] or &T.
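
A small sketch of the same idea for slices and boxes. The functions sum and describe are hypothetical, chosen only to show which argument types are accepted once the parameter is &[T] or &T.

```rust
// Hypothetical helper: takes a slice, so it accepts vectors, arrays,
// and sub-slices alike.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Hypothetical helper: takes a plain reference instead of &Box<i32>.
fn describe(n: &i32) -> String {
    format!("value = {}", n)
}

fn main() {
    let v = vec![1, 2, 3];
    let arr = [4, 5, 6];
    let boxed = Box::new(7);

    println!("{}", sum(&v)); // &Vec<i32> coerces to &[i32]
    println!("{}", sum(&arr)); // &[i32; 3] coerces to &[i32]
    println!("{}", sum(&v[1..])); // a sub-slice is already a &[i32]
    println!("{}", describe(&boxed)); // &Box<i32> coerces to &i32
}
```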

I did read that it was a bad idea to pass around &String, and figured it might come up, but I think my question still applies. I’ve modified my code to remove references to strings.

struct Person<'a> {
    name: &'a str,
    age: u8,
}

impl<'a> Person<'a> {
    fn new(name: &'a str, age: u8) -> Person<'a> {
        Person {
            name,
            age,
        }
    }
}

fn main() {
    let p = Person::new("Caillou", 4);
    borrows_person(&p);
    println!("{} is {}", p.name, p.age); // p was only borrowed, so is still valid
}

fn borrows_person(p: &Person) {
    println!("Printing borrowed person: {}, who is {}", p.name, p.age);
}

In this case, Person is not a smart pointer. Since the borrows_person function cannot modify the Person it receives, and because this code is running in only one thread, I don’t see the harm in copying the string reference and age value into borrows_person’s stack frame and popping it off upon returning, so long as drop isn’t called.


Ok, then make Person implement Copy and pass it by value. This is exactly what Copy is for.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=e0cd1c27513922a2d15951c0f370c9aa

#[derive(Clone, Copy)]
struct Person<'a> {
    name: &'a str,
    age: u8,
}

impl<'a> Person<'a> {
    fn new(name: &'a str, age: u8) -> Person<'a> {
        Person {
            name,
            age,
        }
    }
}

fn main() {
    let p = Person::new("Caillou", 4);
    copies_person(p);
    println!("{} is {}", p.name, p.age); // p was copied, so is still valid
}

fn copies_person(p: Person) {
    println!("Printing copied person: {}, who is {}", p.name, p.age);
}

To add: under the hood, &str is represented the same way as &[u8], a pointer to the bytes plus a length. Coercing a &String to a &str therefore yields a slice that points directly at the String’s heap data, which is what you were describing. Or, as @KrishnaSannasi mentioned, make it Copy.
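
If you want to see the representation for yourself, a rough sketch like the following compares the sizes of the reference types. The words helper is invented for this example; the sizes are measured in machine words so the result doesn’t depend on a 32-bit or 64-bit target.

```rust
use std::mem::size_of;

// Hypothetical helper: size of a type measured in machine words (usize).
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

fn main() {
    // A reference to a sized type such as String is a single pointer.
    println!("&String: {} word(s)", words::<&String>());
    // References to str and [u8] carry a pointer and a length.
    println!("&str:    {} word(s)", words::<&str>());
    println!("&[u8]:   {} word(s)", words::<&[u8]>());

    // The str view and the byte view of a String start at the same address.
    let s = String::from("Hello, world!");
    let bytes: &[u8] = s.as_bytes();
    let slice: &str = &s;
    println!("{}", bytes.as_ptr() == slice.as_ptr());
}
```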

Weird. I didn’t think you could derive Copy on a type that contains a field that implements Drop, which I thought &str could if it’s a String.

My point, and perhaps I’m not expressing it well, is that I do not want to copy around a bunch of data on the heap. I just want to copy by value the portion contained on the stack that points to it. If the borrow were mutable, this would be a problem, as the original version would not be updated. But in this case, it seems to make sense that eliminating one level of indirection would be better. As I said in my original post, maybe Rust already makes this optimization under the hood, but I’m nowhere near advanced enough to even begin trying to figure that out.

&str is Copy because &T is Copy for all T. This is because &T means a shared reference, and importantly it doesn’t own the data it points to. So it is safe to copy. Because it doesn’t own the data it points to, it also doesn’t drop anything when it goes out of scope. Btw, &T is represented as a pointer in memory.
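
Here is a minimal sketch of that. The names is_copy and same_target are hypothetical helpers: is_copy only compiles for types that implement Copy, and same_target checks that two references point at the same String.

```rust
// Hypothetical helper: this call only type-checks when T implements Copy.
fn is_copy<T: Copy>() -> bool {
    true
}

// Hypothetical helper: true when both references point at the same String.
fn same_target(a: &String, b: &String) -> bool {
    std::ptr::eq(a, b)
}

fn main() {
    let owner = String::from("Caillou");
    let r1 = &owner;
    let r2 = r1; // copies the pointer only; r1 stays usable, nothing is dropped

    println!("{} {}", r1, r2);
    println!("{}", same_target(r1, r2));

    println!("{}", is_copy::<&String>());
    println!("{}", is_copy::<&str>());
    // is_copy::<String>() and is_copy::<&mut String>() would be rejected
    // by the compiler, because neither type implements Copy.
}
```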

Also note that str != String, they are different types. str is a view into some string, while String is a growable owned string.


You can derive Copy on any type whose fields are all Copy. That applies here because &T is Copy, whereas &mut T and String are not. In most cases it’s better to pass a reference and accept the indirection, because copying a large value every time you pass it to a function wastes time and memory. When the data is only one or two pointers wide, copying it is fine and you don’t need to worry about the runtime cost.
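
As a rough illustration of the trade-off, this sketch compares the size of the Person struct from earlier in the thread with the size of a reference to it. The functions by_value and by_ref are made up for the example; the exact size of Person depends on the target and on padding, so the code prints it rather than assuming a number.

```rust
use std::mem::size_of;

// Same shape as the Person from the thread: every field is Copy,
// so the derive is accepted.
#[derive(Clone, Copy)]
struct Person<'a> {
    name: &'a str,
    age: u8,
}

// Hypothetical helpers: one takes a copy, the other a reference.
fn by_value(p: Person) -> usize {
    p.name.len() + p.age as usize
}

fn by_ref(p: &Person) -> usize {
    p.name.len() + p.age as usize
}

fn main() {
    let p = Person { name: "Caillou", age: 4 };

    // Passing by value copies the whole struct (a pointer, a length, a byte).
    println!("size_of Person:  {}", size_of::<Person<'static>>());
    // Passing by reference copies a single pointer.
    println!("size_of &Person: {}", size_of::<&Person<'static>>());

    // p is still usable after by_value because Person is Copy.
    println!("{} {}", by_value(p), by_ref(&p));
}
```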


So if you did have a struct that owned some data on the heap through a smart pointer, would you just pass that struct around by reference (assuming you didn’t want to move it), and accept the additional level of indirection (borrowed ref -> your struct -> data on the heap)?

If I just stick to references = borrow, values = move or copy, I do just fine. I’m just trying to understand how it all works in memory.

It makes sense now that Person doesn’t own the &str, because it’s a shared immutable reference.

By the way, sorry if my Person code was lifetime spammy. Still trying to wrap my mind around that one.

Yes, although in many cases Rust will elide the indirection (Rust is good at seeing through references).

That will work well. Another way to think about references is that they are temporary views into other memory.

No problems!
In practice, try to avoid putting references in types, because it leads to a whole tangle of lifetimes that becomes very unwieldy very fast. Instead, try to use the owned variants of the types: instead of &T use Box<T>, instead of &[T] use Vec<T>, and instead of &str use String.
In some cases this may not be possible (cyclic graphs); in that case you can use Rc and Arc instead of Box.
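
A sketch of what the owned version of Person could look like. Taking impl Into<String> in new is just one possible choice (an assumption of this example, not something required), and describe is a hypothetical function that borrows the struct.

```rust
// Owned variant: no lifetime parameter, because the struct owns its name.
struct Person {
    name: String,
    age: u8,
}

impl Person {
    // Assumption for this sketch: accept anything convertible into a String,
    // so callers can pass either a literal or an existing String.
    fn new(name: impl Into<String>, age: u8) -> Self {
        Person { name: name.into(), age }
    }
}

// Hypothetical helper: functions still borrow the struct rather than take it.
fn describe(p: &Person) -> String {
    format!("{} is {}", p.name, p.age)
}

fn main() {
    let p = Person::new("Caillou", 4);
    println!("{}", describe(&p));
    println!("{}", describe(&p)); // p was only borrowed, so it is still valid
}
```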


Yes, although in many cases Rust will elide the indirection (Rust is good at seeing through references).

That’s what I wanted to know. Is a reference in code always a pointer in the compiled version? It seems like I just need to trust Rust in this case to do the right thing, but in C, trying to eliminate too much indirection is something worth thinking about.

In practice, try to avoid putting references in types, because it leads to a whole tangle of lifetimes that becomes very unwieldy very fast. Instead, try to use the owned variants of the types: instead of &T use Box<T>, instead of &[T] use Vec<T>, and instead of &str use String.
In some cases this may not be possible (cyclic graphs); in that case you can use Rc and Arc instead of Box.

Do you try to make functions and methods accept references, and use the owned variants in your structs? I think that would mean, in the case of Person, that new would need to take a String, so it can take ownership of it. I’d need to clone it first if I wanted to continue using it in other places.

It’s interesting that &T derives Copy. It makes sense, because you can make multiple aliases of data so long as none of them can change it, but I was under the impression that you could only copy things that were totally on the stack (didn’t point to data in the heap).

I will probably try to use owned variants as much as possible in structs, to eliminate lifetime spam, but if I ever do use references, I’ll definitely derive Copy.

In debug mode, yes. In release mode, it is either a pointer or it doesn’t exist (because it was elided). Because references in Rust are more constrained than pointers in C, the compiler has much more freedom to elide them. Note that Rust won’t be as aggressive about eliding raw pointers (*const T and *mut T) because they have almost no guarantees about anything.

Exactly

Yes, you could also pass it by reference (&Person) to other functions, because cloning a String is almost always more expensive than passing a reference.
Note there is a special case for string literals: if you know that you will only be using string literals, then you can put &'static str in your types, which will be more efficient and easier to use. Similarly for slices, you can put &'static [T] if you can.
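
For the string-literal case, a minimal sketch might look like this. Label, WARN, and total are invented names; the struct needs no lifetime parameter because both fields are 'static, and it can be Copy because both fields are shared references.

```rust
// Hypothetical type holding only 'static data: no lifetime parameter needed.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Label {
    text: &'static str,
    codes: &'static [u8],
}

// Literals live for the whole program, so they can initialise a constant.
const WARN: Label = Label {
    text: "warning",
    codes: &[1, 2],
};

// Hypothetical helper: takes the label by value, which only copies
// two (pointer, length) pairs.
fn total(label: Label) -> usize {
    label.text.len() + label.codes.len()
}

fn main() {
    let copy = WARN; // cheap copy, no heap allocation involved
    println!("{:?} {}", copy, total(copy));
}
```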

One way to think of Copy is that the type doesn’t have any resources that the compiler doesn’t know about. So &T is Copy because it is just a view, with no attached resources. Whereas String isn’t Copy because it owns resources (namely on the heap) that the compiler doesn’t know about.

