Why do values need to be moved?

In Rust, values are often "moved" around in memory. For instance, in the following program:

fn sayHello(name: String) {
    println!("{}", name);
}

fn main() {
    let name = "Clément".to_string();
    sayHello(name);
}

The name variable is moved when used by sayHello.

My question is: why does the variable need to be moved in memory? Why can't it stay at its initial location, with the compiler turning the code under the hood into something like:

fn sayHello(name: &String) {
    println!("{}", name);
}

fn main() {
    let name = "Clément".to_string();
    sayHello(&name);
}

This would avoid having to move the data, which implies a 1:1 copy: we would (not always, but often) have to perform an allocation and then write the copied data to memory, which is very costly.

I didn't find any resource on this. I thought that "moving" might not be about "moving the values in memory", but many articles seem to say the opposite.

Could anyone help me understand this? :)

EDIT: OK, so by moving I did indeed mean memcpy(). And instead of String, which, as some pointed out, is essentially made of a pointer + length, let's say a [u8; 16] or anything that does not contain a pointer itself.

The String type consists of a pointer to an allocation and two integers indicating the allocation capacity and the used size. Moving a String only moves this pointer and the two integers. It doesn't move the actual string data.

It's important to know that a String has two parts:

  1. The struct itself, which contains a pointer and two integers (length and capacity).
  2. The allocation on the heap, pointed at by the pointer.

Moving a String only moves the first part. The allocation is not moved, and this makes it cheap to move a String.

Of course, in your example, it is better to not move it, but sometimes the function actually needs ownership of the value, and in that case, you must move it.
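As a rough illustration of the two parts, here is a small sketch. The function name heap_pointer_survives_move is made up for this example. It records the address of the heap allocation, moves the String to a new binding, and checks that the address is unchanged. The printed sizes are what current implementations happen to use, not something the language promises.

```rust
use std::mem::size_of;

// Hypothetical helper: moves `s` to a new binding and reports whether
// the heap buffer is still at the same address afterwards.
fn heap_pointer_survives_move(s: String) -> bool {
    let before = s.as_ptr(); // address of the heap allocation
    let moved = s;           // move: only the small struct changes hands
    before == moved.as_ptr()
}

fn main() {
    // The struct part is small, whatever the length of the text.
    println!("size_of::<String>() = {} bytes", size_of::<String>());
    println!("size_of::<usize>()  = {} bytes", size_of::<usize>());
    println!(
        "heap buffer unchanged: {}",
        heap_pointer_survives_move("Clément".to_string())
    );
}
```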

Is there actually a guarantee that the first part is going to move anywhere? In the end, after all optimizations, it might be that either a reference (pointer) is used instead of moving the struct, or the function is completely inlined.

And it is indeed not about moving in memory.
In fact, move and copy operations are compiled to exactly the same code: without optimisations it is equivalent to a memcpy, and after optimisations it may very well disappear entirely. The only difference is in the compile-time semantics.
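To make the compile-time difference concrete, here is a minimal sketch (the function names are invented for the example). Both assignments look the same in the source; what differs is whether the compiler still lets you use the original binding afterwards.

```rust
// [u8; 16] is Copy: after `let b = a;` both bindings are usable.
fn copy_then_reuse() -> ([u8; 16], [u8; 16]) {
    let a = [1u8; 16];
    let b = a;
    (a, b)
}

// String is not Copy: after `let b = a;` the binding `a` is moved-from.
fn move_then_reuse() -> String {
    let a = String::from("hello");
    let b = a;
    // println!("{}", a); // does not compile: `a` was moved
    b
}

fn main() {
    let (a, b) = copy_then_reuse();
    println!("{:?} {:?}", a, b);
    println!("{}", move_then_reuse());
}
```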

This notion of "move" vs "copy" vs "clone" confused me no end when I first discovered Rust. These words do not have their colloquial meaning in Rust.

You see, if one comes from the assembly language programming world, one would be familiar with instructions like MOV, which actually, physically, move bytes around between memory locations, or from memory to register and back. (Never mind that 'MOV' generally makes a copy; the original is still there!)

It is important to realise that "move" in Rust has nothing to do with physically moving bytes around, as one might naturally guess.

Rather it is about moving the ownership of data (values) from one place to another. For example moving the ownership of a value from a local variable into a function through a parameter. Or even just assigning a value to a different variable.

Further, "ownership" is all about who has responsibility for deallocating the value's memory (drop) when it is no longer required.

By analogy, it's like a bank 'moving' the gold you own from one account to another, or moving it to a different bank altogether. Typically the gold you own does not physically move anywhere; it stays put in the vaults at Fort Knox or the Bank of England or wherever, what with being heavy and expensive to move physically. Only the record of its ownership is moved around in the bank accounts.

Perhaps when one reads "move" in Rust one should think "transfer".

Someone correct me if I am saying this wrongly.

Move doesn't mean "memory move". It's easier to think about moves as "moving ownership". Then:

  • fn sayHello(name: String) -- you pass ownership of the value from the caller to the callee
  • fn sayHello(name: &String) -- you pass a shared reference, the caller keeps the ownership
  • fn sayHello(name: &mut String) -- you pass an exclusive reference, the caller keeps the ownership

Whether or not the value is actually moved is up to the compiler. It can be optimised out or it can be a register copy or a memcpy.

Another example would be returning a String from a function. It's also a move (transferring ownership from the callee to the caller) but it doesn't mean the value is moved in memory. It's up to the compiler to choose whatever it wants (and in modern compilers with return-value-optimisation it won't do a memcpy.)
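Here is a small sketch of those ownership transfers side by side. The function names consume, inspect, append and make are only illustrative. Whether any bytes are physically copied at each call is left to the compiler.

```rust
// Takes ownership: the String is dropped when `consume` returns.
fn consume(name: String) -> usize {
    name.len()
}

// Shared reference: the caller keeps ownership.
fn inspect(name: &String) -> usize {
    name.len()
}

// Exclusive reference: the caller keeps ownership, the callee may mutate.
fn append(name: &mut String) {
    name.push('!');
}

// Returning is also a move, from the callee to the caller.
fn make() -> String {
    String::from("Clément")
}

fn main() {
    let mut name = make();
    append(&mut name);
    println!("{} bytes", inspect(&name));
    println!("{} bytes", consume(name));
    // inspect(&name); // does not compile: `name` was moved into `consume`
}
```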

Ok so let me reformulate: why does the compiler need to copy the data in memory, instead of just passing a pointer around?

Seems like I didn't take the best example ^^
Let's then say, for instance, a [u8; 128].

Values need to be moved, because Rust doesn't have a garbage collector. With a GC everything can be passed by reference, and GC will figure out which references need to be valid and when to free things.

Without a garbage collector, there needs to be another way of keeping track of where objects are referenced from (the borrow checker), when they're not used any more, and most importantly, which part of the code is responsible for freeing objects (that is done by move semantics). Objects must be freed exactly once, so the distinction between referencing/sharing and moving is important: moving an object passes on the responsibility to free it, referencing it doesn't.

So "moving" is mainly a conceptual operation of passing ownership, i.e. specifying that some other part of the code will be responsible for freeing this object.
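A minimal sketch of that responsibility, using a made-up Tracked type that logs when it is dropped. Because the value is moved into take, the drop is recorded at the end of take, before control returns to the caller.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

// Hypothetical type that records its own destruction in a shared log.
struct Tracked {
    name: &'static str,
    log: Log,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

// Takes ownership, so this function is responsible for freeing `value`.
fn take(value: Tracked) {
    value.log.borrow_mut().push("inside take".to_string());
} // `value` is dropped here

fn run() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let a = Tracked { name: "a", log: Rc::clone(&log) };
    take(a); // ownership (and the duty to drop) moves to `take`
    log.borrow_mut().push("back in caller".to_string());
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in run() {
        println!("{}", event);
    }
}
```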

But why does the value need to be copied in the memory to be used elsewhere? Why can't it stay at its current location?

In your specific case it can. But values can be returned from a function, pushed to a vector, or transferred to another thread using a channel. If you don't copy the memory, you'll get a dangling reference to a popped stack frame.

In general: because this location can be in the wrong stack frame. In many real cases these moves will be optimised away, but this cannot always be guaranteed.
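A sketch of two such cases with a [u8; 128], assuming invented names make_buffer and sum_on_worker. In both, the value has to outlive the stack frame it was created in, so it cannot simply stay where it was.

```rust
use std::sync::mpsc;
use std::thread;

// The array is created in this frame, which is gone once we return,
// so the value must end up somewhere the caller can still reach.
fn make_buffer() -> [u8; 128] {
    let buf = [7u8; 128];
    buf
}

// Returning a reference to the local instead is rejected by the compiler:
// fn broken() -> &'static [u8; 128] { let buf = [7u8; 128]; &buf }

// The worker thread has its own stack; the array is moved into the
// channel and received on the other side.
fn sum_on_worker(buf: [u8; 128]) -> usize {
    let (tx, rx) = mpsc::channel::<[u8; 128]>();
    let worker = thread::spawn(move || {
        let received = rx.recv().unwrap();
        received.iter().map(|&b| b as usize).sum::<usize>()
    });
    tx.send(buf).unwrap();
    worker.join().unwrap()
}

fn main() {
    println!("{}", sum_on_worker(make_buffer()));
}
```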

Because the current location may be destroyed soon. It could be on the stack, which will unwind. It could be an item in a Vec that will be destroyed, or whose slot will be reused for another element.

Items can move far, in complicated ways (e.g. in loops with conditionals, via global variables, through many layers of function calls). Tracking where objects were originally from to avoid moving them could be quite complicated. The optimizer will avoid redundant moves when they're obvious.
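For the Vec case, a tiny sketch (take_last is a made-up name): the element is moved out of the vector, and the vector's allocation is freed when the function returns, so the element could not have stayed in its original slot.

```rust
// Moves the last element out of the vector. The vector itself, and the
// heap slot the element used to live in, are freed when `items` is dropped.
fn take_last(mut items: Vec<[u8; 16]>) -> Option<[u8; 16]> {
    items.pop()
} // `items` is dropped here; the popped value lives on in the caller

fn main() {
    let items = vec![[1u8; 16], [2u8; 16]];
    println!("{:?}", take_last(items));
}
```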

But if you give the value to a function, for instance, the callee's stack will be unwound before the caller's, right?

What do you mean by the "wrong" stack frame? And could you give an example where it couldn't be optimized away? I struggle to picture it :)

But the function that took ownership could try to save that String in a global variable (it can do whatever it wants with a value it owns) and return. And then it's gone!

You can pass by reference if you want a function to reference something higher on the stack, and then the borrow checker will prevent it from saving the borrowed data somewhere that is not appropriate.

But moving means the recipient is free to keep the value forever, wherever they want. So the value can't be bound to some temporary stack frame.

Note that Rust doesn't care what functions actually do. It enforces function interfaces. If the interface says it wants to keep the value forever, then it gets what it demands.

But that's precisely it: Rust doesn't allow assigning to global variables, so that behaviour cannot happen (unless using unsafe, which is a whole other story).

But moving means the recipient is free to keep the value forever, wherever they want. So the value can't be bound to some temporary stack frame.

This is where the problem lies, I don't see any non-unsafe case where the data could be "freed" from the caller before it is from the callee.

You can assign to global variables through a Mutex.
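A minimal sketch of that, with no unsafe. The names SAVED, say_hello_and_keep and greet_from_short_lived_frame are invented here, and a const Mutex::new in a static assumes a reasonably recent compiler (Rust 1.63 or later). The String is created in a frame that ends right away, yet it stays alive in the global.

```rust
use std::sync::Mutex;

// Global storage, reachable from safe code through the lock.
static SAVED: Mutex<Vec<String>> = Mutex::new(Vec::new());

// Owns `name`, so it is free to keep it for as long as it likes.
fn say_hello_and_keep(name: String) {
    println!("{}", name);
    SAVED.lock().unwrap().push(name);
}

fn saved_count() -> usize {
    SAVED.lock().unwrap().len()
}

fn greet_from_short_lived_frame() {
    let name = "Clément".to_string();
    say_hello_and_keep(name);
} // this frame is gone, but the String now lives in SAVED

fn main() {
    greet_from_short_lived_frame();
    println!("saved strings: {}", saved_count());
}
```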

fn sayHello(name: String) {
    drop(name);
}

^ That's a double free if the parent frame thinks it still has ownership.
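To see that the value is freed exactly once, here is a sketch with a made-up Buffer type that counts its drops. The caller cannot free it again, because the compiler considers buf moved after the call.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times a Buffer has been dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical type wrapping a plain array, with an observable Drop.
struct Buffer {
    _bytes: [u8; 128],
}

impl Drop for Buffer {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Owns the buffer and frees it explicitly.
fn consume(buf: Buffer) {
    drop(buf);
}

// Returns how many drops happened while creating and consuming one Buffer.
fn drops_during_consume() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    let buf = Buffer { _bytes: [0u8; 128] };
    consume(buf);
    // drop(buf); // does not compile: `buf` was moved into `consume`
    DROPS.load(Ordering::SeqCst) - before
}

fn main() {
    println!("drops: {}", drops_during_consume());
}
```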
