Borrowing vs Transferring/Returning Ownership pattern

I had expected the borrowing pattern and the transferring/returning-ownership pattern to be equivalent. However, the fact that the latter results in different addresses for the input and result values suggests they are not. Is this a limitation of the current state of the compiler, or a more fundamental limitation? I would think they are fundamentally equivalent.

fn mutate(v: &mut String) {
    v.push_str("...Borrowed and Mutated");
}

fn replace(mut v: String) -> String { 
    v.push_str("...Taken and Replaced");
    v
}

fn main() {
    let mut v = String::from("Old Mut");
    println!("Original  {:p}, {}", &v, v);
    mutate(&mut v);
    println!("Mutated   {:p}, {}", &v, v);
    
    println!("\n");
    
    let v1 = String::from("Old Repl");
    println!("Original  {:p}, {}", &v1, v1);
    let v2 = replace(v1);
    println!("Replaced  {:p}, {}", &v2, v2);
}

Output:

Original  0x7ffe1c349320, Old Mut
Mutated   0x7ffe1c349320, Old Mut...Borrowed and Mutated


Original  0x7ffe1c349438, Old Repl
Replaced  0x7ffe1c3494b8, Old Repl...Taken and Replaced

Errors:

   Compiling playground v0.0.1 (/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 1.05s
     Running `target/debug/playground`

For a meaningful comparison, you should compile with the release profile, not the debug profile. The compiler generates a lot of "physical" moves (memcpys), relying on LLVM to optimize them away; but this optimization doesn't happen when you compile with the debug profile.

If you borrow a book you are obligated to return it. However, if I give you a book (i.e., transfer ownership), you can do whatever you want with it, including return it. The two are not equivalent, either ethically or morally or in any other way.

Yes, borrowing and transferring ownership are different; however, in both cases the item's identity remains the same: in this case, a pointer to a specific memory location. Additionally, since this exchange occurs within the same thread, there is also scope containment.

Perhaps the extra copying that apparently occurs is due to the potential transfer of cleanup responsibility, for example for error cases that may exist beyond the analysis scope, as would be the case if the item is further transferred outside the code base or is constructed from such a result. Still, a mutable pointer suffers the same.

Fundamentally, borrowing seems equivalent to the "Transfer and Return Ownership Pattern" (TROP) and therefore the more efficient implementation should be used for both.

TROP supports not only the functional-programming "everything is a constructor" style but also the OOP method-chaining (fluent) style.
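To make the comparison concrete, here is a minimal sketch of the same chained append written both ways. The helper names `append_borrowed`, `append_owned`, and `build_both` are hypothetical and do not come from the thread; the sketch only shows that both styles can be chained and yield the same value, not that they compile to the same code.

```rust
// Hypothetical helpers for illustration; none of these names come from the thread.

/// Borrowing style: mutates in place and hands the borrow back so calls can chain.
fn append_borrowed<'a>(v: &'a mut String, suffix: &str) -> &'a mut String {
    v.push_str(suffix);
    v
}

/// Transfer-and-return style: takes ownership and returns the value.
fn append_owned(mut v: String, suffix: &str) -> String {
    v.push_str(suffix);
    v
}

/// Builds the same string with both styles and returns (borrowed, owned).
fn build_both() -> (String, String) {
    let mut borrowed = String::from("Old");
    append_borrowed(append_borrowed(&mut borrowed, "...a"), "...b");

    let owned = append_owned(append_owned(String::from("Old"), "...a"), "...b");
    (borrowed, owned)
}

fn main() {
    let (borrowed, owned) = build_both();
    println!("borrowed: {}", borrowed);
    println!("owned:    {}", owned);
}
```

The resulting values are equal, which is the sense in which the two patterns look interchangeable at the source level.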

I subsequently compiled with the release profile, and the pointer values still differ. Clearly, even under optimization they are not considered equivalent.

A String is not "a pointer to a specific memory location", at least, not the memory location you are printing the address of in your code. What exactly do you mean by "item identity"? Moving is not guaranteed to preserve memory location; in fact, in many cases it is guaranteed not to (e.g. moving an object from the stack into a vector). Not every value even has a location in memory (although these Strings do, since you print their addresses).
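The stack-to-vector case mentioned above can be sketched as follows. The function name `move_into_vec` is hypothetical. The sketch records the address of the `String` value itself and the address of its heap buffer, before and after the move into a `Vec`.

```rust
/// Moves a `String` from a local variable into a `Vec` and returns
/// (value address before, value address after, heap address before, heap address after).
fn move_into_vec() -> (usize, usize, usize, usize) {
    let s = String::from("Old Repl");
    let value_before = &s as *const String as usize; // where the String value lives (stack)
    let heap_before = s.as_ptr() as usize; // where the bytes live (heap)

    let mut shelf = Vec::new();
    shelf.push(s); // the move: `s` is not usable after this line

    let value_after = &shelf[0] as *const String as usize; // now inside the Vec's buffer
    let heap_after = shelf[0].as_ptr() as usize;
    (value_before, value_after, heap_before, heap_after)
}

fn main() {
    let (vb, va, hb, ha) = move_into_vec();
    println!("String value: {:#x} -> {:#x}", vb, va);
    println!("heap buffer:  {:#x} -> {:#x}", hb, ha);
}
```

The address of the `String` value changes because it now lives in the vector's allocation, while the heap buffer holding the text stays where it was.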

So, to be clear, any "extra copying" we are talking about here that survives optimization is the copy of 3 usizes from one place on the stack to another. In fact, it's possible that if you don't try to print the location of the String with {:p}, it won't be stored on the stack at all, and will instead be kept in registers. The actual bytes of the string are stored on the heap at a location your code doesn't inspect. It's very unlikely that copying 24 bytes is a performance bottleneck of any kind. But we can talk about it anyway.
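A minimal sketch of that point, reusing `replace` from the original post. The helper `heap_buffer_is_reused` is hypothetical, and the capacity of 64 is an assumption chosen so that `push_str` does not have to reallocate. The three-`usize` size is what current compilers produce, not a documented guarantee.

```rust
use std::mem::size_of;

// Same function as in the original post.
fn replace(mut v: String) -> String {
    v.push_str("...Taken and Replaced");
    v
}

/// Returns true if the heap buffer kept its address across the
/// transfer-and-return call.
fn heap_buffer_is_reused() -> bool {
    // Reserve room up front so that `push_str` has no reason to reallocate.
    let mut v1 = String::with_capacity(64);
    v1.push_str("Old Repl");
    let heap_before = v1.as_ptr();
    let v2 = replace(v1);
    heap_before == v2.as_ptr()
}

fn main() {
    println!("size_of::<String>()    = {} bytes", size_of::<String>());
    println!("3 * size_of::<usize>() = {} bytes", 3 * size_of::<usize>());
    println!("heap buffer reused: {}", heap_buffer_is_reused());
}
```

Printing `v.as_ptr()` instead of `&v` with `{:p}` in the original program would show the address of the text itself, which is the part that is not copied by a move.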

What I want to focus on instead is the last sentence: "Still, a mutable pointer suffers the same." This is not true. Borrowing and ownership are different exactly because they have different semantics for what cleanup is required for a value (in unwinding from a panic as well as in normal control flow). I don't know if the reason LLVM puts the return value in a different stack location than the original is because of some logic to do with unwinding. It's possible, but it seems unlikely, so to be honest I don't think this is the culprit. However, it's completely wrong to say that &mut String would have the same problem because references have no drop glue, so in the &mut String case the cleanup code for the passed value is exactly nothing, whereas in the String case it's something.
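One way to see the drop-glue difference from within a program is `std::mem::needs_drop`. This is only a sketch: the standard library documents `needs_drop` as a conservative hint that is allowed to return `true` for types that need no cleanup, so the `false` result for the reference type is what current compilers report rather than something to rely on in general.

```rust
use std::mem::needs_drop;

fn main() {
    // An owned String must free its heap buffer when it is dropped.
    println!("String needs drop:      {}", needs_drop::<String>());
    // A mutable reference owns nothing, so dropping it runs no code.
    println!("&mut String needs drop: {}", needs_drop::<&mut String>());
}
```

A function that takes `String` by value is responsible for dropping it, including during unwinding, whereas a function that takes `&mut String` leaves that responsibility with the caller.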

It is not equivalent, at the very least because of unwinding, as I mentioned before. However, even if it is equivalent in this case, that doesn't mean there's a clear optimization to turn one into the other. Which version is "more efficient"? You don't know. You haven't measured it. So why second-guess the compiler?

This seems highly likely to me. In other words: The only reason why the copying wasn’t fully optimized away was because we were “looking” where each of the Strings was located. Optimizers have a tendency to be particularly good at optimizing whenever the changes they introduce don’t change behavior (one might go as far as saying that optimizations only happen when the changes they introduce are not observable), so it seems straightforward that inspecting the address of a pointer can preclude certain optimizations.

A particularly effective optimizer might have even optimized away the copy while still printing two different addresses in order to achieve maximal optimization with minimal behavioral change (again, this is exactly what optimizations are all about), or it could e.g. just keep the parts making up the String in registers while still fabricating some (otherwise unused) addresses where the Strings are located logically but not actually; i.e. the program might as well lie to you about where the Strings are located. (I’m not sure if this is something that can actually happen with rustc.)

To get a definitive answer on what happens without the print statements, one would need to look at and try to understand the generated assembly.

Thanks for the great responses all.

The issue was not really about the semantic distinction between borrowing and owning but the equivalence of lending X vs giving an X and expecting an X back.

In current Rust, while a borrowed X can be treated nearly equivalently to a given X including the complete replacement of its value, Rust seems to permanently associate variables with their stack segment. Therefore, a borrowed or given argument refers to a stack segment in the caller and a returned value is assigned to a new stack segment in the caller.

While lifetime semantics and syntactic guarantees of Rust make it seem as though stack segment reuse is plausible and even natural, doing so may be excessive or complicated relative to the benefit.

Note: identity refers to the memory address, whether on the stack or the heap, since it uniquely identifies an object in the context of the program code (as opposed to the problem domain).

It's not guaranteed, see for example UCG #15.
