What exactly does moving ownership mean in Rust?

struct Foo {}
let z1 = Foo {};
println!("z1 addr {:p}", &z1);
let z2 = z1;
println!("z2 addr {:p}", &z2);

The code above prints two different addresses for the same object referred to by z1 and z2. From this, my understanding is that moving ownership basically copies the value from z1's location to z2's location and destroys the value at z1. Is my understanding correct?

1 Like

Yes, moving is basically a destructive copy. The value is blindly, bitwise copied from its old place to a new place, and the old place is invalidated. ("Invalidation" is a purely compile-time concept, it means that the compiler won't let you access it afterwards, and doesn't run its destructor, so as to prevent using invalid values and avoid double-frees.)
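
One way to see that the copy is shallow and bitwise is to move a Vec: the small pointer/length/capacity header is copied to the new place, while the heap buffer it points to is not touched. A minimal sketch; the helper name `heap_buffer_survives_move` is made up for illustration:

```rust
/// Returns true if the heap buffer of a `Vec` has the same address
/// before and after the `Vec` itself is moved to a new variable.
fn heap_buffer_survives_move() -> bool {
    let v1 = vec![1u8, 2, 3];
    let before = v1.as_ptr(); // address of the heap buffer, not of `v1` itself
    let v2 = v1; // move: only the pointer/length/capacity header is copied
    // `v1` is invalidated here; reading it would be a compile error (E0382).
    let after = v2.as_ptr();
    before == after
}

fn main() {
    println!(
        "heap buffer unchanged by move: {}",
        heap_buffer_survives_move()
    );
}
```

The addresses of `v1` and `v2` themselves may or may not differ, but the buffer they refer to is the same one, which is why a move stays cheap regardless of how much data the value owns.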

10 Likes

I still have doubts about what exactly a move is.

    impl Drop for Foo {
        fn drop(&mut self) {
            println!("dropped");
        }
    }
    let z1 = Foo {};
    println!("z1 addr {:p}", &z1);
    let z2 = z1;
    println!("z2 addr {:p}", &z2);

Since both println!() calls print an address, there should be two Foo objects in memory, and both of them should be destroyed at the end. But all I can see is a single "dropped" line during execution. This raises more doubts about what exactly a move is.

1 Like

Can I guess that in many cases the compiler might optimise away that actual bitwise copy?

For example:

pub struct Foo {
    a: u32,
    b: u32
}

pub fn thing(f: Foo) -> Foo{
    let z1 = f;
    if z1.a == 1 {
        return z1;
    }

    let z2 = z1;
    if z2.a == 2 {
        return z2;
    }
    Foo{a:3, b:4}
}

Compiles to:

example::thing:
        mov     edx, esi
        mov     eax, edi
        cmp     edi, 1
        je      .LBB0_4
        cmp     eax, 2
        jne     .LBB0_3
        mov     eax, 2
        ret
.LBB0_3:
        mov     eax, 3
        mov     edx, 4
.LBB0_4:
        ret

At which point "move" becomes a purely compile-time concept as well.

There are two places, but only one of them (the one that was assigned last) has a valid value. The compiler tracks the validity of the places at compile time, and moved-from places will be essentially treated as uninitialized (i.e., does not need dropping).

The existence of any particular address doesn't imply that there should be anything meaningful at that exact address. You can make up addresses out of thin air, yet this doesn't mean that there's a value there:

println!("arbitrary address = {:p}", 0xcafe0100 as *const u64);

Yes, almost always. (Some notable exceptions are e.g. Box construction in specific cases, but it doesn't usually matter even then.)

4 Likes

You have to distinguish between "moving out of a location" and "destroying a location". They are not the same. Destructors only run when you actually destroy a value, and moving it does not run the destructor of the old location.

In your original example, z1 and z2 are two locations in memory. First, you create a Foo at z1, then you print its address. Then you move the value in z1 to z2, and you print the address of z2. The move operation will (without optimizations) compile down to this:

  1. Copy the bytes in z1 to z2.
  2. Don't touch the bytes in z1 anymore.

So, "destroying" the z1 location isn't actually an operation. It's the absence of an operation. You just stop using the data and say that, from now on, those bytes are random garbage (even though they would still be a valid Foo). Maybe you use the location for a different value later, and if so, then you're just going to overwrite the random garbage with the new value, without performing any cleanup of the old value.

Now, when z2 goes out of scope, the compiler runs the destructor of the value stored there. It runs no destructor for z1, because the compiler keeps track of move operations and knows that z1 no longer holds anything that we need to destroy.

Once the compiler has run the destructor on z2, the compiler will consider the bytes there as random garbage too. (Even though it doesn't actually do anything to clean it up. It probably still contains whatever bytes your Foo value corresponded to.)
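
To check the "one destructor call per value, no matter how many places it passed through" claim without reading printed output, the drops can be counted. This is only a sketch: the `drops` counter field and the function name `drops_after_one_move` are additions for illustration and are not part of the original Foo.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical variant of `Foo` that records each destructor call.
struct Foo {
    drops: Rc<Cell<u32>>,
}

impl Drop for Foo {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Creates one `Foo`, moves it once, and returns how many destructors ran.
fn drops_after_one_move() -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let z1 = Foo { drops: Rc::clone(&drops) };
        let z2 = z1; // move: z1 is now considered uninitialized
        let _still_usable = &z2; // z2 holds the only valid value
    } // destructor runs for z2 only; nothing runs for the moved-from z1
    drops.get()
}

fn main() {
    println!("destructor calls: {}", drops_after_one_move());
}
```

Two places were involved, but only one value ever existed, so the counter ends at 1.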


Let's add a third operation to distinguish: "deallocating memory". This is different from the other two operations. Deallocating memory makes the memory location unusable.

Once a memory location contains random garbage, you can either reuse it for some other value, or you can deallocate it. In the case of stack variables, deallocation does not happen until you return from the function, even if you ran the destructor earlier than that.

In the case of heap variables, this is the same. Cleaning up a Box<Foo> corresponds to the following sequence of operations:

  1. Run the destructor on the memory location.
  2. Deallocate the memory location.
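
The two steps can be spelled out by hand with the raw allocator API. This is a rough sketch of the sequence, not how Box is actually implemented; the counting `Foo` and the function name `manual_box_cleanup` are assumptions made for the example.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::cell::Cell;
use std::ptr;
use std::rc::Rc;

// Hypothetical `Foo` that counts destructor calls.
struct Foo {
    drops: Rc<Cell<u32>>,
}

impl Drop for Foo {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Allocates a heap place, puts a `Foo` in it, then cleans up in two
/// separate steps. Returns the number of destructor calls observed.
fn manual_box_cleanup() -> u32 {
    let drops = Rc::new(Cell::new(0));
    let layout = Layout::new::<Foo>(); // non-zero size: Foo holds a pointer
    unsafe {
        let place = alloc(layout) as *mut Foo;
        if place.is_null() {
            handle_alloc_error(layout);
        }
        // Initialize the place (moves the new value into it).
        ptr::write(place, Foo { drops: Rc::clone(&drops) });

        // 1. Run the destructor on the memory location.
        //    The bytes are still allocated, but now count as garbage.
        ptr::drop_in_place(place);

        // 2. Deallocate the memory location.
        dealloc(place as *mut u8, layout);
    }
    drops.get()
}

fn main() {
    println!("destructor calls: {}", manual_box_cleanup());
}
```

Skipping step 1 would leak whatever Foo owns; skipping step 2 would leak the allocation itself, which shows that the two operations are independent.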

Optimizations are also a thing. When compiling with optimizations, the compiler might decide to merge two memory locations. For example, both z1 and z2 might have the same address.

If you have two memory locations and you know that they never contain an actual value at the same time (that is, whenever one contains a value, the other is random garbage), then you can merge them without running into conflicts.

11 Likes

What exactly is uninitialized here? Does it mean the object is still there in memory and only the owner z1 is invalid? If the object is still there, when will it be destroyed? If it is already destroyed, why is drop not being called?

In case you are familiar with C++, note that Rust’s concept of “move” is different in that a variable that was moved out of will no longer run any destructors.

You can think of every local variable in Rust as implicitly also introducing a bool flag “is_valid”, which is set to true once the variable is initialized, and set back to false when the value is moved from the variable into another location. When the variable goes out of scope, the compiler will execute something like if is_valid { …call destructor(s)… }.

Such boolean flags can actually exist at run-time in situations where whether or not a variable will need to be dropped can no longer be tracked in a compile-time analysis.
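
A case where the flag has to exist at run time is a move that depends on a run-time condition. The sketch below records the order of events in a log instead of printing; the `log` field and the function name `conditional_move` are made up for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical `Foo` that appends "dropped" to a shared log when destroyed.
struct Foo {
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Foo {
    fn drop(&mut self) {
        self.log.borrow_mut().push("dropped");
    }
}

/// Moves `z1` into `drop` only if `move_it` is true, then returns the
/// order in which events happened.
fn conditional_move(move_it: bool) -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let z1 = Foo { log: Rc::clone(&log) };
        if move_it {
            drop(z1); // z1 is moved out here, but only on this path
        }
        log.borrow_mut().push("after if");
        // End of scope: whether z1 still needs dropping depends on
        // `move_it`, so the compiler has to consult a run-time flag.
    }
    let events = log.borrow().clone();
    events
}

fn main() {
    println!("{:?}", conditional_move(true));
    println!("{:?}", conditional_move(false));
}
```

Either way the destructor runs exactly once; only the point at which it runs differs between the two calls.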


There’s also a compile-time analysis that tracks where a variable may be invalid (i.e. is_valid == false, meaning uninitialized or moved-out-of). Wherever the variable may be invalid, the compiler will deny access to the value. E.g.

struct Foo {}
impl Drop for Foo {
    fn drop(&mut self) {
        println!("dropped");
    }
}
let z1 = Foo {};
if false {
    drop(z1);
}
// z1 is still not yet dropped, because the `if` was “false”
// but static analysis does not analyze the *value* of `if`-conditions,
// so `z1` is considered potentially invalid beyond this point
println!("after the if statement");

// will print "dropped" now

outputs

after the if statement
dropped

but

struct Foo {}
impl Drop for Foo {
    fn drop(&mut self) {
        println!("dropped");
    }
}
let z1 = Foo {};
if false {
    drop(z1);
}
// z1 is still not yet dropped, because the `if` was “false”
// but static analysis does not analyze the *value* of `if`-conditions,
// so `z1` is considered potentially invalid beyond this point
println!("after the if statement");

// since `z1` is considered potentially invalid, we cannot access it
let r = &z1;

// will print "dropped" now

is a compilation error

error[E0382]: borrow of moved value: `z1`
  --> src/main.rs:18:9
   |
8  | let z1 = Foo {};
   |     -- move occurs because `z1` has type `Foo`, which does not implement the `Copy` trait
9  | if false {
10 |     drop(z1);
   |          -- value moved here
...
18 | let r = &z1;
   |         ^^^ value borrowed here after move

However (if z1 is made mutable) assigning a new value is still allowed

struct Foo {}
impl Drop for Foo {
    fn drop(&mut self) {
        println!("dropped");
    }
}
let mut z1 = Foo {};
if false {
    drop(z1);
}
// z1 is still not yet dropped, because the `if` was “false”
// but static analysis does not analyze the *value* of `if`-conditions,
// so `z1` is considered potentially invalid beyond this point
println!("after the if statement");

// assigning a new value drops the old one
z1 = Foo {}; // prints "dropped"

println!("after the assignment");

// `z1` can also be accessed again here, by the way
let r = &z1;

// will print "dropped" now (again)

Output:

after the if statement
dropped
after the assignment
dropped

3 Likes

Steffahn suggests looking at this in terms of drop flags, and I think that's a good idea. A drop flag is a hidden boolean variable that the compiler keeps track of.

Consider the following low-level pseudocode, where all operations, including destructors, are explicit function calls:

let z1, z1_drop, z2, z2_drop;

// let z1 = Foo {};
z1 = Foo {};
z1_drop = true;

// let z2 = z1;
copy_bytes(z1, z2);
z1_drop = false;
z2_drop = true;

// z2 goes out of scope
if z2_drop {
    drop(z2);
}

// z1 goes out of scope
if z1_drop {
    drop(z1);
}

The reason that no destructor runs on z1 is that z1_drop is false.

Now, the compiler is clever and will easily optimize the above to this:

let z1, z2;

// let z1 = Foo {};
z1 = Foo {};

// let z2 = z1;
copy_bytes(z1, z2);

// z2 goes out of scope
if true {
    drop(z2);
}

// z1 goes out of scope
if false {
    drop(z1);
}

and then to this:

let z1, z2;

// let z1 = Foo {};
z1 = Foo {};

// let z2 = z1;
copy_bytes(z1, z2);

// z2 goes out of scope
drop(z2);

// z1 goes out of scope

In fact, I would expect it to optimize it even further to this:

let z1_z2_merged;

// let z1 = Foo {};
z1_z2_merged = Foo {};

// let z2 = z1;
/* do nothing */

// z2 goes out of scope
drop(z1_z2_merged);

// z1 goes out of scope
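
The unoptimized drop-flag pseudocode from the start of this post can be approximated in real Rust by using MaybeUninit for the two places, so that nothing is dropped automatically and the flags do all the work. This is only an illustration of the idea, not what the compiler literally emits; the counting `Foo` and the function name `explicit_drop_flags` are assumptions for the example.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::ptr;
use std::rc::Rc;

// Hypothetical `Foo` that counts destructor calls.
struct Foo {
    drops: Rc<Cell<u32>>,
}

impl Drop for Foo {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Emulates `let z1 = Foo {..}; let z2 = z1;` with hand-written drop flags
/// and returns the number of destructor calls.
#[allow(unused_assignments)]
fn explicit_drop_flags() -> u32 {
    let drops = Rc::new(Cell::new(0));

    // Two places; `MaybeUninit` never runs a destructor on its own.
    let mut z1 = MaybeUninit::<Foo>::uninit();
    let mut z2 = MaybeUninit::<Foo>::uninit();

    // let z1 = Foo {};
    z1.write(Foo { drops: Rc::clone(&drops) });
    let mut z1_drop = true;

    // let z2 = z1;
    unsafe { ptr::copy_nonoverlapping(z1.as_ptr(), z2.as_mut_ptr(), 1) };
    z1_drop = false;
    let z2_drop = true;

    // z2 goes out of scope
    if z2_drop {
        unsafe { z2.assume_init_drop() };
    }

    // z1 goes out of scope
    if z1_drop {
        unsafe { z1.assume_init_drop() };
    }

    drops.get()
}

fn main() {
    println!("destructor calls: {}", explicit_drop_flags());
}
```

Forgetting to clear `z1_drop` after the copy would run the destructor twice on the same value, which is exactly the double drop that the compiler's own bookkeeping prevents.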

2 Likes

I'm still trying to wrap my mind around these questions. They don't make any sense! You are talking as if an object were something physical, something that needs to be cooked and prepared and then, eventually, destroyed.

But we are dealing with code! Only zeros and ones exist there! Objects are not in the code! They only exist in our imagination!

Before the move, the object lives in one piece of memory; after the move, it's in another place. What's so hard and strange about that description?

2 Likes

You're obviously running it in debug build, because in release it prints the same address.

3 Likes

If one works with the reasonable (yet incorrect) assumption, which does hold true in some programming languages including C++, that an “object” inherently belongs to a single, unchanging memory location, then “move” may be understood (as e.g. in C++) as a euphemism for “create a new object, cheaply, whilst potentially stealing ‘resources’ from the old one”. In that case two objects really would be involved, and both would need to be destroyed at some point.

Hence, depending on the, possibly incorrect, assumptions that come with (imprecise) terminology such as “object” or “move”, it may be hard to understand an explanation that contradicts those assumptions; at least until one has identified the incorrect assumptions and fixed them.

2 Likes

The place, i.e., the "physical" memory region is still there, and the compiler considers it as not containing a valid value of type Foo anymore, because the value was moved over to some other place.

Technically, it's undecided. Perhaps the object is still there, but inaccessible. Perhaps the memory becomes literally uninitialized, so that trying to access the object is UB. Perhaps something else. If you're writing safe code, then it doesn't matter, because you can't access the moved-from location anyway, unless you move in something new. If you're writing unsafe code, you should still treat that memory as uninitialized, because that's the safest interpretation.
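
In unsafe code, a move can be written by hand with ptr::read, which makes a bitwise copy and leaves it up to the programmer to treat the source as uninitialized. A minimal sketch, assuming a counting `Foo` and a made-up function name `move_out_with_ptr_read`:

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;
use std::ptr;
use std::rc::Rc;

// Hypothetical `Foo` that counts destructor calls.
struct Foo {
    drops: Rc<Cell<u32>>,
}

impl Drop for Foo {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Moves a `Foo` out of a place by hand and returns the number of
/// destructor calls.
fn move_out_with_ptr_read() -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        // `ManuallyDrop` stops the compiler from dropping `source` itself.
        let source = ManuallyDrop::new(Foo { drops: Rc::clone(&drops) });

        // Bitwise copy out of `source`. From here on, the bytes in `source`
        // must be treated as uninitialized, even though they are unchanged.
        let moved: Foo = unsafe { ptr::read(&*source) };

        drop(moved); // the only destructor call for this value
    }
    drops.get()
}

fn main() {
    println!("destructor calls: {}", move_out_with_ptr_read());
}
```

Without the ManuallyDrop wrapper, both `source` and `moved` would be dropped at the end of the scope, so the same value would be destroyed twice.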
