Why does a non-scoped thread need to move variables/data into it?

I am learning multi-threaded programming in Rust. Here is a code snippet that has been bugging me for a few days:

use std::thread;

fn main()
{
    let numbers = Vec::from_iter(0..=99);

    let t = thread::spawn( move || 
        {
            let len = numbers.len();
            let sum: i32 = numbers.into_iter().sum();

            sum as f32/len as f32
        });

    let average = t.join().unwrap();

    println!("The average is {}", average);
    println!("{:?}", numbers);
}

The code snippet fails to compile because we try to use the vector numbers in the println!("{:?}", numbers) line after numbers has been moved into the thread t.

In the above code snippet two threads exist: the main thread and the t thread.

In the let average = t.join().unwrap() line, the main thread waits for the t thread to finish so that we can get its return value. Hence we can safely say that the t thread does not outlive the main thread, so the t thread could have taken a reference to the vector numbers instead. I am aware of scoped threads in Rust, which can take references to local variables.

Since the t thread finishes execution before the main thread, why does numbers have to be moved into it? Or is it because thread::spawn always requires a 'static lifetime for its argument type?

Thanks,
Yousuf

Spawned threads can outlive their spawning thread. The compiler can't statically know/prove that they are joined (because you can ignore the join handle).
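
To make the "you can ignore the join handle" point concrete, here is a minimal sketch. The helper name spawn_detached and the use of a channel to get the result back are assumptions made for illustration; they are not part of the original question.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: spawns a thread and returns *without* joining it.
fn spawn_detached(numbers: Vec<i32>) -> mpsc::Receiver<i32> {
    let (tx, rx) = mpsc::channel();
    // The JoinHandle returned by `thread::spawn` is dropped immediately,
    // which detaches the thread. Nothing forces the caller to wait for it,
    // so the closure has to own `numbers` rather than borrow it.
    thread::spawn(move || {
        let sum: i32 = numbers.iter().sum();
        // Ignore the error that occurs if the receiver was already dropped.
        let _ = tx.send(sum);
    });
    rx
}

fn main() {
    let rx = spawn_detached((0..=99).collect());
    // `spawn_detached` has already returned here, yet the thread may still
    // be running; a borrow of one of its locals would be dangling by now.
    let sum = rx.recv().unwrap();
    println!("sum = {}", sum);
}
```

Because this compiles without any call to .join(), the compiler cannot assume that a spawned thread ends before the spawning function returns.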

Why don't you check the declaration of thread::spawn() yourself? It's clearly documented.


The general idea in Rust is that, when it comes to memory safety of users of some API, the compiler will allow or disallow things based on the function signatures in question. There’s no magical further reasoning beyond this. You, a human, may be able to make deductions about lifetimes of local variables in one thread and the duration of execution of another thread yourself, but there’s nothing in the type signature of .join() that explains this kind of information to the compiler, in a way the compiler could understand.

APIs that use unsafe internally are therefore often designed to cope with these limitations by imposing conservatively strict requirements. In this case, since it’s impossible to explain to the compiler, via type signatures, how long a thread runs before .join() is called, the conservative requirement placed into the type signature of thread::spawn is that threads spawned via this API cannot reference any short-lived data at all. This restriction is expressed through the closure, which is the only thing that determines the behavior of the spawned thread and gives it access to any non-global data: the closure type F must satisfy an F: 'static bound.
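
As a rough illustration of that bound, the sketch below wraps thread::spawn in a function that spells out the same bounds the standard library documents for it. The names spawn_like and average are hypothetical and exist only for this example.

```rust
use std::thread::{self, JoinHandle};

// Hypothetical wrapper that repeats the bounds of `std::thread::spawn`.
// The `'static` bounds are what forbid borrowing short-lived local data.
fn spawn_like<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::spawn(f)
}

// Hypothetical helper: computes the average on another thread.
fn average(numbers: Vec<i32>) -> f32 {
    // `move` makes the closure own `numbers`, so the closure type contains
    // no references to local variables and satisfies `F: 'static`.
    let handle = spawn_like(move || {
        let len = numbers.len();
        let sum: i32 = numbers.iter().sum();
        sum as f32 / len as f32
    });
    handle.join().unwrap()
}

fn main() {
    println!("The average is {}", average((0..=99).collect()));
}
```

Dropping the move keyword here should be rejected for the same reason as with thread::spawn itself, since the closure would then hold a reference to the local numbers.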

The scoped threads API is a clever API that allows a less restrictive spawn method. Its bound is not 'static but some 'scope lifetime, whose precise meaning is not entirely trivial to understand when you’re new to lifetime annotations, but which, most notably, is (typically) less restrictive than 'static. It manages to do this in a safe and sound manner by no longer relying on a .join() method (though one still exists), because the effects of such a call are impossibly hard to explain to the compiler via type signatures. Instead, it also waits for termination of all threads at the end of a so-called “scope”, and it’s this waiting that allows the less restrictive spawn function signature to still be safe.

The reason why the scope concept is something the compiler can understand better is not trivial at all… just look at the type signature, which is certainly a rather advanced usage of lifetime annotations… (that would take some explanation to fully break down, both the meaning and the practical design decisions)

pub fn scope<'env, F, T>(f: F) -> T
where
    F: for<'scope> FnOnce(&'scope Scope<'scope, 'env>) -> T,

but still, together with the F: 'scope bound on the spawn method, this is all the information the compiler needs to enforce the practically important restriction that no data referenced by a spawned thread gets dropped (or otherwise invalidated) at a point where the thread could still be running (i.e. at a point inside of the scope).
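
For comparison with the code from the question, here is a minimal sketch of the same computation using std::thread::scope (stable since Rust 1.63). The helper name average_scoped is an assumption made for this example.

```rust
use std::thread;

// Hypothetical helper: computes the average on a scoped thread that only
// borrows `numbers`.
fn average_scoped(numbers: &[i32]) -> f32 {
    thread::scope(|s| {
        // No `move` needed: the closure borrows `numbers`, which is allowed
        // because every thread spawned on `s` is joined before `scope` returns.
        let handle = s.spawn(|| {
            let len = numbers.len();
            let sum: i32 = numbers.iter().sum();
            sum as f32 / len as f32
        });
        handle.join().unwrap()
    })
}

fn main() {
    let numbers: Vec<i32> = (0..=99).collect();
    let average = average_scoped(&numbers);

    println!("The average is {}", average);
    // `numbers` was only borrowed, so it is still usable here.
    println!("{:?}", numbers);
}
```

The explicit handle.join() is only used to get the return value; even without it, the thread would be joined automatically at the end of the scope.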


I like this Rust introduction video on YouTube (aimed at people with some C++ knowledge) among other things for pointing out and emphasizing, on multiple examples, this idea that safety-relevant facts are annotated in function signatures, so the compiler can reject (potentially) dangerous code (or the other way around, only allows known-safe code) based on this. Understanding this gives a better understanding for how Rust operates in principle, and this idea is the foundation for how some of Rust’s strengths, memory safety and safe abstractions around unsafe code, can be achieved.


An addendum regarding the code in question and the simple question of why it doesn’t compile: the compiler gives some good first steps of explanation.

fn main()
{
    let numbers = Vec::from_iter(0..=99);

    let t = thread::spawn( move || 
        {
            let len = numbers.len();
            let sum: i32 = numbers.into_iter().sum();

            sum as f32/len as f32
        });

    let average = t.join().unwrap();

    println!("The average is {}", average);
    println!("{:?}", numbers);
}
Compiling playground v0.0.1 (/playground)
error[E0382]: borrow of moved value: `numbers`
  --> src/main.rs:16:22
   |
3  |     let numbers = Vec::from_iter(0..=99);
   |         ------- move occurs because `numbers` has type `Vec<i32>`, which does not implement the `Copy` trait
4  |
5  |     let t = thread::spawn( move || 
   |                            ------- value moved into closure here
6  |         {
7  |             let len = numbers.len();
   |                       ------- variable moved due to use in closure
...
16 |     println!("{:?}", numbers);
   |                      ^^^^^^^ value borrowed here after move
   |
   = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)

For more information about this error, try `rustc --explain E0382`.

So the error right now has nothing to do with the type signature of thread::spawn and its usage of F: 'static bounds yet. The message that’s relevant is

variable moved due to use in closure

So numbers is moved into the closure and thus into the other thread, because it’s used in the closure; the missing piece of information why a value used in this particular closure is always immediately moved (and not e.g. borrowed) then is that it’s a move-closure! It’s written right there in the code

move ||

the Rust code for “please move absolutely everything that this closure uses into the closure”.


If we removed the move keyword

…
    let t = thread::spawn( || 
…

the error message changes slightly

error[E0382]: borrow of moved value: `numbers`
  --> src/main.rs:16:22
   |
3  |     let numbers = Vec::from_iter(0..=99);
   |         ------- move occurs because `numbers` has type `Vec<i32>`, which does not implement the `Copy` trait
4  |
5  |     let t = thread::spawn( || 
   |                            -- value moved into closure here
...
8  |             let sum: i32 = numbers.into_iter().sum();
   |                            ------- variable moved due to use in closure
...
16 |     println!("{:?}", numbers);
   |                      ^^^^^^^ value borrowed here after move
   |
   = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)

For more information about this error, try `rustc --explain E0382`.

Now the explanation points to a different point inside the closure body and still says

variable moved due to use in closure

Honestly, it probably wouldn’t be a bad idea if this error case came with slightly different wording. At least now it points to not just any usage, but – and that’s the relevant detail – it points to a usage of numbers that moves the variable, a call to .into_iter(), a function that moves and consumes its argument. If a closure body moves a captured variable, then that variable is captured by-move (otherwise the closure would not have the necessary ownership to move the value as needed). If we change this place in the code to some by-reference access, e.g. .iter().sum() instead of .into_iter().sum(), the error message changes yet again:
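
The capture rule for closures without the move keyword can be seen without any threads at all. Here is a small sketch; the function name capture_demo and its return tuple are made up for illustration.

```rust
// Hypothetical demo function.
// Returns (sum via a borrowing closure, length afterwards, sum via a consuming closure).
fn capture_demo() -> (i32, usize, i32) {
    let numbers: Vec<i32> = (0..=99).collect();

    // `.iter()` only needs `&numbers`, so this non-`move` closure captures
    // `numbers` by reference.
    let by_ref = || numbers.iter().sum::<i32>();
    let borrowed_sum = by_ref();
    // `numbers` is still usable after the borrowing closure has run.
    let len = numbers.len();

    // `.into_iter()` consumes the vector, so this closure captures `numbers`
    // by value even though it is not marked `move`.
    let by_move = || numbers.into_iter().sum::<i32>();
    let moved_sum = by_move();
    // println!("{:?}", numbers); // would fail with E0382: borrow of moved value

    (borrowed_sum, len, moved_sum)
}

fn main() {
    let (borrowed_sum, len, moved_sum) = capture_demo();
    println!("{} {} {}", borrowed_sum, len, moved_sum);
}
```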

error[E0373]: closure may outlive the current function, but it borrows `numbers`, which is owned by the current function
  --> src/main.rs:5:28
   |
5  |     let t = thread::spawn( || 
   |                            ^^ may outlive borrowed value `numbers`
6  |         {
7  |             let len = numbers.len();
   |                       ------- `numbers` is borrowed here
   |
note: function requires argument type to outlive `'static`
  --> src/main.rs:5:13
   |
5  |       let t = thread::spawn( || 
   |  _____________^
6  | |         {
7  | |             let len = numbers.len();
8  | |             let sum: i32 = numbers.iter().sum();
9  | |
10 | |             sum as f32/len as f32
11 | |         });
   | |__________^
help: to force the closure to take ownership of `numbers` (and any other referenced variables), use the `move` keyword
   |
5  |     let t = thread::spawn( move || 
   |                            ++++

For more information about this error, try `rustc --explain E0373`.

Now, finally, we get the error that arises from the F: 'static requirement in the function signature of thread::spawn. As the error message points out, “function requires argument type to outlive 'static”, pointing to the (argument of the) call to thread::spawn. The rustc --explain information on this error code also contains some further elaboration on this very kind of example of using thread::spawn and potentially even .join(), see the online version of this information here.
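
If you want to stay with thread::spawn and still use the vector afterwards, one possible approach is to give the thread its own copy. This is only a sketch that assumes the extra clone is acceptable; the helper name average_and_keep is hypothetical.

```rust
use std::thread;

// Hypothetical helper: computes the average on a spawned thread and hands
// the original vector back to the caller.
fn average_and_keep(numbers: Vec<i32>) -> (f32, Vec<i32>) {
    // The clone is what gets moved into the closure, so the closure owns all
    // of its data (satisfying `F: 'static`) while `numbers` stays with us.
    let for_thread = numbers.clone();

    let t = thread::spawn(move || {
        let len = for_thread.len();
        let sum: i32 = for_thread.into_iter().sum();
        sum as f32 / len as f32
    });

    let average = t.join().unwrap();
    (average, numbers)
}

fn main() {
    let (average, numbers) = average_and_keep((0..=99).collect());

    println!("The average is {}", average);
    println!("{:?}", numbers);
}
```

For large data, sharing through std::sync::Arc or using scoped threads would avoid the copy.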
