Why do values need to be moved?

This is just like I said: the value is dropped by the callee (drop) first, and only then by the caller (sayHello), which is safe.

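use lazy_static::lazy_static;
use std::sync::Mutex;

lazy_static! {
   // Global slot that outlives any single call to `hello`.
   static ref LAST_GREETING: Mutex<String> = Mutex::new(String::new());
}

fn hello(s: String) {
   // `s` is moved into the global here, so the caller must not drop it again.
   *LAST_GREETING.lock().unwrap() = s;
}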

Ok that's a perfectly fine example, indeed ^^
So the concept of moving values is only tailored for global-assignment cases?

No, it could be arbitrarily convoluted:

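use rand::random; // assumption: `random` is `rand::random`, from the `rand` crate

fn maybe_keep(s: String, out: &mut Vec<String>) {
   if random() {
      // Moved into `out`: the vector now owns `s` and will drop it later.
      out.push(s);
   }
   // Otherwise `s` is still owned here and is dropped when the function returns.
}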

OK, so I guess the reason the caller doesn't just "mark" the value as moved, without copying it, is to prevent a use-after-free: the caller may unwind its stack while the value is still "alive" elsewhere in memory (e.g. in a mutex). Is that right?

For complex flow within functions there are drop flags on the stack, so that unwinding knows which destructors to run.

But conceptually, the old locations of moved-out-of values are abandoned and ignored. This is different from C++'s move, which requires the old value to be left in some valid, non-crashing state.
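To make the drop-flag point concrete, here is a minimal sketch. The `Tracked` type and the `drops_after_call` helper are hypothetical names invented for this illustration; they only count destructor calls. Whether the compiler uses an actual runtime flag or resolves the question statically is up to it, so treat this as a demonstration of the observable behavior only.

```rust
use std::cell::Cell;

/// Hypothetical helper type: bumps a shared counter when it is dropped.
struct Tracked<'a>(&'a Cell<u32>);

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Moves `value` into `out` only when `keep` is true.
fn maybe_keep<'a>(keep: bool, value: Tracked<'a>, out: &mut Vec<Tracked<'a>>) {
    if keep {
        out.push(value); // ownership moves into the vector
    }
    // If `value` was not moved, it is dropped here. The generated code has to
    // know which case applies, which is what drop flags are for.
}

/// Counts the drops that have run right after the call, while `out` is still alive.
fn drops_after_call(keep: bool) -> u32 {
    let counter = Cell::new(0);
    let mut out = Vec::new();
    maybe_keep(keep, Tracked(&counter), &mut out);
    let seen = counter.get();
    seen
}

fn main() {
    println!("kept:     {} drop(s) so far", drops_after_call(true));
    println!("not kept: {} drop(s) so far", drops_after_call(false));
}
```

In both branches the value ends up dropped exactly once; the only difference is who does it and when.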

OK, thanks a lot for your explanation :)

I think what kornel is saying is "move" is a concept. The data does not necessarily "move" from one memory location to another. The "ownership" is what moves.

That is my thinking anyway, but I may be far from having a clue.

I already knew that from the beginning, but my question was more about why, in some cases, the value needs to be copied to another location in order to be used.

Perhaps one way to think of it is that if the code was completely unoptimized, the move would always be an actual copy of the data because this is the only behavior that is correct in every case. However, in a lot of cases, the compiler will be able to remove those moves automatically.

(You probably can't get to that level of unoptimized. I'm pretty sure that it would optimize out some moves even in debug mode.)
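As a small illustration of what a move can and cannot copy, the sketch below moves a `String` and compares the address of its heap buffer before and after. The function name `heap_addr_across_move` is made up for this example. At most the (pointer, length, capacity) triple is copied; the heap data itself stays where it is.

```rust
/// Returns the address of the heap buffer before and after moving a `String`.
fn heap_addr_across_move() -> (*const u8, *const u8) {
    let s = String::from("hello");
    let before = s.as_ptr();
    let t = s; // move: `s` is no longer usable from here on
    let after = t.as_ptr();
    // The raw pointers are only compared, never dereferenced.
    (before, after)
}

fn main() {
    let (before, after) = heap_addr_across_move();
    assert_eq!(before, after);
    println!("heap buffer stayed at {:?}", after);
}
```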

But here it's not the order of dropping that's wrong. It's the mere fact that there are two drops that is wrong. Every value must be dropped exactly once, no more (and ideally, no less, if you don't want a memory leak).
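A quick way to check the "exactly once" rule is to count destructor calls. This is only a sketch: `Greeting`, `say_hello`, and `total_drops` are hypothetical names for the illustration, not anything from the code discussed above.

```rust
use std::cell::Cell;

/// Hypothetical type that records each run of its destructor.
struct Greeting<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Greeting<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Takes the value in an owned fashion, so the destructor runs when this returns.
fn say_hello(_greeting: Greeting<'_>) {}

fn total_drops() -> u32 {
    let drops = Cell::new(0);
    {
        let g = Greeting { drops: &drops };
        say_hello(g);
        // `g` was moved out, so the caller does not drop it again at this brace.
    }
    let total = drops.get();
    total
}

fn main() {
    println!("destructor ran {} time(s)", total_drops());
}
```

If the caller also dropped `g`, the counter would read 2, which is exactly the double drop that moving rules out.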

My question was why: in which case could there be two drops? But that has been answered :)

First of all, @ClementNerma, you have to keep in mind that

In that example, a function taking ownership of a Vec (through a ManuallyDrop) and a function taking a Vec "by reference" compile down to the same machine code.

Ownership, in Rust, is only about one thing: who calls the destructor of a type, and where. It has nothing to do with "by-value" vs. "by-reference".

  • This already covers why Copy is something that will not really impact codegen: it's basically a bound that guarantees the annotated type has no attached destructor.

  • It just so happens that when a function takes a parameter behind a reference, it usually does not take ownership of that value. When we want to express that a function takes ownership of a value, syntax-wise we say that we pass the element "by value", but I think the phrasing "we give the value to the function in an owned fashion" is better suited.
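As a sketch of the `Copy` bullet above: a `Copy` type cannot also implement `Drop`, so passing it "by value" hands over no destructor responsibility, and the caller can keep using its own copy. `Point`, `sum`, and `use_twice` are hypothetical names for this example.

```rust
/// A plain-data type: `Copy` is allowed because there is no `Drop` impl.
#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

/// Takes the point "by value"; nothing needs to be cleaned up afterwards.
fn sum(p: Point) -> i32 {
    p.x + p.y
}

fn use_twice() -> (i32, i32) {
    let p = Point { x: 1, y: 2 };
    let first = sum(p);
    let second = sum(p); // still fine: `p` was copied, not moved out
    (first, second)
}

fn main() {
    println!("{:?}", use_twice());
}
```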

That being said, in practice, a function that takes a parameter in an owned fashion could potentially lead to a by-value ABI rather than a by-ref ABI; basically, that choice is left to the compiler. A function that takes a parameter in a borrowing fashion, on the other hand, will be able to use a by-ref ABI "for free".

But the main counterpoint to all these things is something that has been suggested as an addition to the language a bunch of times, and which is already doable using third-party libraries (so if you are really concerned about these "potential big memcpys", feel free to use that crate): &move / &own / owning "references":

fn sayHello(name: &'_ move String) {
    println!("{}", name);
 // drop(*name);
}
  • Using the library
    fn sayHello(name: StackBox<'_, String>) {
        println!("{}", *name);
     // drop(name); /* -> `drop_in_place(&mut *name)` */
    }
    

On the other hand, when passing parameters by-reference, be it in a

  • (shared) borrowing fashion (&'_ String),

  • (exclusive) borrowing fashion (&'_ mut String),

  • owning (and thus exclusive) fashion (&'_ move String / StackBox<'_, String>),

you will notice that we are tied to the caller's stack frame, represented by the lifetime parameter '_ infecting the type. This holds even in the &move case, where we have ownership of the String (i.e., the responsibility to dispose of that String or to relinquish ownership of it): precisely because we did not want to "copy" the data needed to use that String (the .ptr, .len, .cap fields), that data lives within the caller's frame, and so we are kind of borrowing that first layer.

This kind of "owning but lifetime-constrained" situation is not that common in Rust, and may break many people's simplified mental model (whereby ownership would be equivalent to (or at least, imply) being 'static) which may be one of the reasons behind the language not featuring that &'_ move abstraction: it adds a full layer of complexity to the language, just to get an unergonomic entity (those who have had to deal with Box<dyn … + '_> know what I am talking about), and all that just to be able to explicitly elide the occasional big memcpy that the compiler fails to elide.

For those rare instances, using a third-party library like the one I've mentioned (disclaimer: I'm its author) seems to do the job quite well. Another Option (heh) is to use &'_ mut Option< _ > to quickly hack a (runtime-checked) owning reference equivalent:

fn sayHello(name: &'_ mut Option<String>) {
    drop(name.take()); // we can drop the `String`, proof that we own it
}
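
Here is a runnable sketch of that `&mut Option<_>` trick from the caller's side. The names `say_hello` and `demo` are hypothetical, and returning the length is only there to make the behavior observable.

```rust
/// Takes the `String` out of the caller's slot, if it is still there.
fn say_hello(name: &mut Option<String>) -> Option<usize> {
    // `take()` leaves `None` behind: this is the runtime "moved-out" marker.
    let owned: String = name.take()?;
    println!("Hello, {}!", owned);
    Some(owned.len())
    // `owned` is dropped here, by the callee.
}

fn demo() -> (Option<usize>, bool, Option<usize>) {
    let mut slot = Some(String::from("Rust"));
    let first = say_hello(&mut slot);
    let emptied = slot.is_none();
    // A second call finds the slot empty, so there is no double drop.
    let second = say_hello(&mut slot);
    (first, emptied, second)
}

fn main() {
    println!("{:?}", demo());
}
```

The cost compared to a true owning reference is the runtime check: nothing stops a caller from passing an already-emptied slot.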

Ownership, in Rust, is only about one thing: who calls the destructor of a type, and where. It has nothing to do with "by-value" vs. "by-reference".

Well, mentally the two are implicitly linked, since a non-reference type such as [u8; 16] will need to be "moved", while a reference type such as &[u8; 16] won't.

Thanks for the rest of the explanation though :)

Another interesting "theoretical example" is the complete opposite of &'_ move: instead of having ownership over the contents / responsibility to call the associated cleanup logic, but having a by-reference access to data itself, one can imagine a "copying" reference, whereby we have copied the (first layer of) data but we don't have ownership over the data nevertheless:

mod lib {
  use ::core::{
      marker::{
          PhantomData as HasTheSameSemanticsAs,
      },
      mem::ManuallyDrop as /* House */ MD,
  };

  pub
  struct StupidRef<'lt, T> {
      data: MD<T>,
      _lifetime: HasTheSameSemanticsAs<&'lt T>,
  }

  impl<'lt, T> From<&'lt mut T> for StupidRef<'lt, T>
  where
      // EDIT: to avoid safety issues with interior mutability.
      // NOTE: `Freeze` cannot be named in bounds on stable Rust at the time of writing.
      T : ::core::marker::Freeze, // no interior mutability at the first layer
  {
      fn from (it: &'lt mut T)
        -> StupidRef<'lt, T>
      {
          StupidRef {
              // Perform a memcpy (bitwise copy of the first layer)
              data: MD::new(unsafe { <*const T>::read(it) }),
              _lifetime: Default::default(),
          }
      }
  }
  
  impl<'lt, T> ::core::ops::Deref for StupidRef<'lt, T> {
      type Target = T;

      fn deref (self: &'_ StupidRef<'lt, T>)
        -> &'_ T
      {
          &self.data
      }
  }
}
use lib::StupidRef;

fn example (s: StupidRef<'_, String>)
{
    println!("{}", *s);
}

I don't think that would work with anything that has interior mutability (too easy to create leaks, and possibly unsafety). It may also run into the issues discussed in "Aliasing rules for Box<T>" (issue #258 in rust-lang/unsafe-code-guidelines), for example with Mutex, which has something like Box<sys::Mutex> in it.

I believe that the &mut constructor circumvents the aliasing issue. I initially thought it would also circumvent the interior mutability issues, but you are right, it does not (a double drop is possible with T = Cell<Option<U>>). Thanks for pointing it out! Luckily, a Freeze bound does solve that :) (I edited the post to add it).

The fun thing here is that the reference is moved, conceptually, just not the thing referred to. A reference is semantically a pointer type, pointing to a target, and that reference is moved when you pass it, but the thing it points at is not moved.

And, of course, depending on your machine and on what the compiler does, [u8; 16] may not exist in memory at all - the compiler could choose to put it in 2 registers on a 64-bit system (or 4 on a 32-bit system) and never have a memory location for it to begin with.
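
To put rough numbers on this, the sketch below compares what is conceptually handed over in each case: the 16 bytes of the array itself versus a pointer-sized reference. `by_value` and `by_ref` are hypothetical functions for the illustration, and none of this says what the optimizer will actually emit (registers, inlining, and so on).

```rust
use std::mem::size_of;

/// Receives its own copy of the 16 bytes (the array type is `Copy`).
fn by_value(bytes: [u8; 16]) -> u8 {
    bytes[0]
}

/// Receives a reference; the reference is what gets passed, not the array.
fn by_ref(bytes: &[u8; 16]) -> u8 {
    bytes[0]
}

fn main() {
    let data = [7u8; 16];
    assert_eq!(size_of::<[u8; 16]>(), 16);
    assert_eq!(size_of::<&[u8; 16]>(), size_of::<usize>());
    assert_eq!(by_ref(&data), by_value(data));
    println!("array: {} bytes, reference: {} bytes",
             size_of::<[u8; 16]>(), size_of::<&[u8; 16]>());
}
```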

@yandros "Ownership, in Rust, is only about one thing: who calls the destructor of a type, and where"

Rust ownership rules can drive people new to Rust 'nuts'... essentially rustc is deciding which scope an object belongs to and when it should be destroyed, unlike many (if not all) other languages... I may be wrong, but it seems that many Rust folks use references to override rustc 'moving' stuff here and there, thus negating Rust's ownership decisions... I, for one, don't see anything wrong with calling a destructor to get rid of stuff no longer needed, or with the automatic destruction of all objects within a scope defined by {...} whenever the closing bracket is reached.

Indeed they can.

Most languages have the concept of "local variables": C, C++, JavaScript, Python, and any other language I know. Local variables include the parameters of a function and any declarations within it. When that function ends and the flow of execution exits its scope, all those local variables are "destroyed".

Now, admittedly languages like JS that have garbage collection can store references to such locals in places external to the function/scope and cause them to live as long as anything references them. Basically they wrap such things in reference counted smart pointers. Which one can do in Rust as well.
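
For the "one can do that in Rust as well" part, a minimal sketch with the standard library's `Rc` follows. The function name `counts` is made up for this example; it just reports the reference count while two handles exist and after one of them is dropped.

```rust
use std::rc::Rc;

/// Returns the strong count while shared, and again after one handle is gone.
fn counts() -> (usize, usize) {
    let a = Rc::new(String::from("hello"));
    let b = Rc::clone(&a); // second handle to the same allocation
    let while_shared = Rc::strong_count(&a);
    drop(a); // the String stays alive because `b` still refers to it
    let after_drop = Rc::strong_count(&b);
    (while_shared, after_drop)
}

fn main() {
    println!("{:?}", counts());
}
```

The String is freed only when the last handle goes away, which is the same "lives as long as anything references it" behavior, but opted into explicitly.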

But those "ownership rules" exist in all languages, even if the language has no means to describe them.

In languages like C and C++ those rules express themselves in terms of mysterious program crashes, erroneous results and security vulnerabilities that take a lot of time and expense to debug and fix.

In languages like Java, Python and so on the rules are well hidden, being taken care of by garbage collectors. But those rules express themselves through the need for a complex runtime environment and through unpredictable delays and jitter as garbage collection happens.

That is not correct. In Rust a reference, mutable or otherwise, does not move ownership. The thing with the reference does not get the responsibility to deallocate (drop) the referenced object. That responsibility still remains with whoever the reference came from.
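
A small sketch of the difference, with hypothetical function names (`shout`, `consume`, `demo`): borrowing leaves the caller as the owner, while passing in an owned fashion hands over the responsibility to drop.

```rust
/// Borrows the name: the caller stays the owner and will drop it.
fn shout(name: &String) -> String {
    name.to_uppercase()
}

/// Takes the name in an owned fashion: it is dropped when this returns.
fn consume(name: String) -> usize {
    name.len()
}

fn demo() -> (String, usize) {
    let name = String::from("ferris");
    let loud = shout(&name); // `name` is still owned (and usable) here
    let len = consume(name); // ownership moves; using `name` after this line would not compile
    (loud, len)
}

fn main() {
    println!("{:?}", demo());
}
```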

Yep. Works fine. Up to some point.

The problem is that in large, complex, especially multi-threaded programs it can become hard for programmers, especially in large teams, to keep track of who should destruct what. Sometimes things get destructed too soon, leading to "use after free" bugs and crashes later on. Sometimes things are forgotten about, leading to memory leaks.

That is exactly what Rust does!
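
A minimal sketch of that scope-based destruction: the hypothetical `Named` type logs its name when dropped, and `drop_order` returns the log after an inner `{ ... }` block ends. Locals are dropped at the closing brace, in reverse order of declaration.

```rust
use std::cell::RefCell;

/// Hypothetical type that records its name in a shared log when dropped.
struct Named<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Named<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _first = Named { name: "first", log: &log };
        let _second = Named { name: "second", log: &log };
    } // both values are destroyed here, `_second` before `_first`
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```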
