Unintuitive behavior of the Rust borrow checker

use std::thread;
use std::sync::mpsc;
use std::sync::mpsc::{Sender,Receiver};
use std::sync::Arc;
use std::sync::Mutex;

type ArcReceiver = Arc<Mutex<Receiver<i32>>>;

fn aFunction(arc_sender: ArcReceiver)
{

}
fn nextFn()
{

}
fn main() 
{
    let (sender,receiver):(std::sync::mpsc::Sender<i32>,std::sync::mpsc::Receiver<i32>) =  mpsc::channel();
    let sb : ArcReceiver = Arc::new(Mutex::new(receiver));
    let mut th: Vec<std::thread::JoinHandle<()>> = Vec::with_capacity(10);
    for i in 0..10
    {
        let c:ArcReceiver = Arc::clone(&sb);
        th.push(std::thread::spawn(|| aFunction(c)));
        //th.push(std::thread::spawn(|| aFunction(Arc::clone(&sb))));//doesn't compile!!!

    }
    for iter in th.into_iter()
    {
        iter.join();
    }
    println!("Hello, world!");
}

Why do I have to create the c variable and then pass it, if Arc::clone does this anyway, that is, it creates and returns an object?
I'm really puzzled by this.

Everything inside the closure is run in the child thread. In your second example, you’re attempting to send a reference to the local variable sb to the other thread, which will then make the clone. This is blocked by the borrow checker because the compiler can’t prove the reference doesn’t outlive the stack frame in the main thread.

Neither case compiles. Please share the exact code you have so we can help with your real problem.

I've pasted the original code now. Please check it.

Hi and thanks for the reply.
But both cases are identical.
The first creates a named variable c, and this is passed to a clone.
The second creates an unnamed variable, and this is passed to a clone.

Your first closure,

|| aFunction(c)

is roughly equivalent to this code (which isn’t quite valid Rust, but should get the point across):

struct Closure {
    c: ArcReceiver,
}

impl FnOnce() for Closure {
    fn call_once(self) {
        aFunction(self.c)
    }
}

This is fine to send across a thread boundary because there are no references stored in the closure’s struct. Your second closure, on the other hand, is this:

struct Closure<'a> {
    sb: &'a ArcReceiver,
}

impl<'a> FnOnce() for Closure<'a> {
    fn call_once(self) {
        aFunction(Arc::clone(self.sb))
    }
}

Because this holds a reference to a variable in the stack of the main thread, the compiler doesn’t allow it to be sent to a child thread.
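
For anyone who wants to compile the idea, here is a minimal sketch of the two structs written as ordinary types with an ordinary method instead of a real FnOnce impl. The names OwningClosure, BorrowingClosure, run_owning and run_borrowing_locally are made up for illustration, and an i32 stands in for the Receiver to keep it short.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hand-written stand-in for `|| a_function(c)`: it owns what it captured.
struct OwningClosure {
    c: Arc<Mutex<i32>>,
}

impl OwningClosure {
    // Plays the role of `FnOnce::call_once`: consumes the struct.
    fn call_once(self) -> i32 {
        let value = *self.c.lock().unwrap();
        value
    }
}

// Hand-written stand-in for `|| a_function(Arc::clone(&sb))`: it only borrows.
struct BorrowingClosure<'a> {
    sb: &'a Arc<Mutex<i32>>,
}

impl<'a> BorrowingClosure<'a> {
    fn call_once(self) -> i32 {
        // The clone happens here, when the "closure" is finally called.
        let cloned = Arc::clone(self.sb);
        let value = *cloned.lock().unwrap();
        value
    }
}

fn run_owning() -> i32 {
    let sb = Arc::new(Mutex::new(7));
    let closure = OwningClosure { c: Arc::clone(&sb) };
    // `OwningClosure` holds no references, so it can be handed to a new thread.
    thread::spawn(move || closure.call_once()).join().unwrap()
}

fn run_borrowing_locally() -> i32 {
    let sb = Arc::new(Mutex::new(7));
    let closure = BorrowingClosure { sb: &sb };
    // Calling it on the same thread is fine, because `sb` is still alive.
    // Handing `closure` to `thread::spawn` instead is rejected by the compiler.
    closure.call_once()
}

fn main() {
    println!("{}", run_owning());
    println!("{}", run_borrowing_locally());
}
```

The only difference between the two types is the field: a value in one, a reference with a lifetime in the other. That lifetime is what thread::spawn refuses.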

It is incorrect behavior though, because both cases send a temporary object. One is a named variable and the other is unnamed. Everything else is the same.

If the new thread has ownership of the object, there's no issue. The dangerous part is if it is not the new thread that controls when the value is destroyed, since then it may be destroyed before the new thread is done using it.

Creating a cloned Arc to give to the thread allows the new thread to destroy its Arc when it's done with it. However, if the Arc isn't cloned until after the thread is spawned, the original Arc could be destroyed before the new thread has a chance to clone it.
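
A small sketch of that ownership point, using a Vec instead of the channel (the function name sum_in_worker is made up for illustration): the main thread can drop its own Arc right after spawning, and the data stays alive because the worker owns a clone.

```rust
use std::sync::Arc;
use std::thread;

fn sum_in_worker() -> i32 {
    let data = Arc::new(vec![1, 2, 3]);

    // Cloned on the spawning thread, before the new thread exists.
    let for_worker = Arc::clone(&data);
    let handle = thread::spawn(move || for_worker.iter().sum::<i32>());

    // The original handle goes away here; the worker's clone keeps the Vec alive.
    drop(data);

    handle.join().unwrap()
}

fn main() {
    println!("{}", sum_in_worker());
}
```

If the worker held only a `&data` and were supposed to clone it later, the `drop(data)` line could run first, which is exactly the situation the compiler rules out.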

A more functioning example would help, indeed. Here’s your code a bit polished up. (I wrote this before you updated your post.)

use std::sync::mpsc;
use std::sync::mpsc::Receiver;
use std::sync::Arc;
use std::sync::Mutex;
use std::thread;

type ArcReceiver = Arc<Mutex<Receiver<i32>>>;

fn a_function(_arc_receiver: ArcReceiver) {}

fn main() {
    let (_sender, receiver): (std::sync::mpsc::Sender<i32>, std::sync::mpsc::Receiver<i32>) =
        mpsc::channel();
    let sb: ArcReceiver = Arc::new(Mutex::new(receiver));
    let mut threads: Vec<std::thread::JoinHandle<()>> = Vec::with_capacity(10);
    for _ in 0..10 {
        /* this compiles */
        let c: ArcReceiver = Arc::clone(&sb); // WHY DO I HAVE TO CREATE the temporary c ???
        threads.push(thread::spawn(|| a_function(c)));

        /* this doesn't */
        // threads.push(thread::spawn(|| a_function(Arc::clone(&sb))));
    }
    for thread in threads {
        thread.join().unwrap();
    }
    println!("Hello, world!");
}

Including a link to the Rust playground is also a good idea.

It’s also always useful to include the actual error message, so if line 22 is uncommented, we get:

   Compiling playground v0.0.1 (/playground)
error[E0373]: closure may outlive the current function, but it borrows `sb`, which is owned by the current function
  --> src/main.rs:22:36
   |
22 |         threads.push(thread::spawn(|| a_function(Arc::clone(&sb))));
   |                                    ^^                        -- `sb` is borrowed here
   |                                    |
   |                                    may outlive borrowed value `sb`
   |
note: function requires argument type to outlive `'static`
  --> src/main.rs:22:22
   |
22 |         threads.push(thread::spawn(|| a_function(Arc::clone(&sb))));
   |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to force the closure to take ownership of `sb` (and any other referenced variables), use the `move` keyword
   |
22 |         threads.push(thread::spawn(move || a_function(Arc::clone(&sb))));
   |                                    ^^^^^^^

error: aborting due to previous error

For more information about this error, try `rustc --explain E0373`.
error: could not compile `playground`.

To learn more, run the command again with --verbose.

Note that the suggested fix of adding move does not work here either.

The problem, as @2e71828 tried to explain, is that the cloning happens on a different thread in each of the two versions. First, you need a bit of a mental model of what a closure, like your expression || a_function(Arc::clone(&sb)), actually is. It gets compiled into an anonymous data type that “captures” (i.e. contains the values of) all the local variables that the code you wrote for it needs in order to be executed. It also implements the appropriate Fn traits so that calling the closure runs the code you wrote there using those values for the local variables. There are two steps to this process: the first is packaging up all the local variables into the closure data type, and the second is executing the closure.

Crucially, the execution step can happen much later and in the case of thread::spawn the execution will happen in a different thread.

Accessing variables can be done in different ways. When you pass a variable directly to something like a function, its contents get moved. If a variable x is used in an expression &x or &mut x (this can also happen implicitly when a &self or &mut self method is called on x) then the value is only borrowed (immutably or mutably, respectively). When compiling a closure, the compiler will determine how a local variable is used. In the case of || a_function(c), the value of the variable c needs to be moved when executing the closure. In the case of || a_function(Arc::clone(&sb)), the value of the variable sb needs to be (immutably) referenced.
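
A minimal sketch of the two capture modes on a single thread (the function name capture_modes and the String value are made up for illustration):

```rust
fn capture_modes() -> (usize, String) {
    let name = String::from("sb");

    // The body only calls a `&self` method, so `name` is captured by reference.
    let by_ref = || name.len();
    let len = by_ref();

    // The body returns `name` by value, so `name` is moved into the closure.
    let by_move = || name;
    let moved = by_move();

    // `name` has been moved; uncommenting the next line is a compile error.
    // println!("{}", name);

    (len, moved)
}

fn main() {
    println!("{:?}", capture_modes());
}
```

Nothing in the closure syntax says which mode is used; the compiler picks it from how the variable is used in the body.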

In order to be able to move or reference the respective values at execution time, different things need to happen when a closure is built (i.e. when the captured local variables are packaged up into the anonymous data type). In order to support moving at execution time, the original variable needs to be moved into the closure at closure-building time as well. For a value that only needs to be referenced at execution time, you just need to have a reference created at closure-building time. This reference is stored in the closure and still references the original variable. This is really helpful for closures in functions like Iterator::map or Option::unwrap_or_else where you immediately execute the closure in the same scope.

But capturing by reference doesn’t support the closure being moved out of the scope of the referenced variable. Returning a closure from the function it is created in and sending it to another thread are both common examples where the closure (and all the values or references it contains) leaves the current scope. Trying to capture a local variable by reference in such a closure will trigger lifetime-related errors like the one above.

closure may outlive the current function, but it borrows `sb`, which is owned by the current function
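
The "returning a closure" case can be reproduced without any threads. A minimal sketch, with make_adder as a made-up example function:

```rust
fn make_adder(offset: i32) -> impl Fn(i32) -> i32 {
    // Without `move`, the closure would only borrow `offset`, which stops
    // existing when `make_adder` returns, so the compiler rejects it.
    move |x| x + offset
}

fn main() {
    let add_two = make_adder(2);
    println!("{}", add_two(3));
}
```

Here move is enough, because the variable is only needed by this one closure. The loop in the original question is different, since sb is needed again on the next iteration.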

Finally, there is the move keyword. It changes the closure-building/packaging process so that every captured variable is packaged up directly (and thus the original variable is moved), no matter what kind of usage the compiler detects for the closure’s execution time. In practice, you can actually always use std::thread::spawn(move || ...) with the move keyword, since it won’t make a difference in the cases that compiled without it, but it will make code work in other cases that didn’t work without it. Once you accept that thread::spawn needs to move the value out of every variable it captures, you’ll understand that you cannot possibly mention your variable sb directly inside a closure in a loop, since you’d need to move out of sb multiple times for that to work (and sb’s type is not Copy). Rust also does not have any special syntax for something like “capturing by clone”, so the only way is to use a local variable, as you did, to first clone the value and then move that clone into the closure.
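
One common way to write "clone, then move the clone" without a differently named variable is to do the clone in a block that evaluates to the closure. A minimal sketch under that assumption; drain_with_workers and the summing logic are made up for illustration, only the Arc<Mutex<Receiver<i32>>> type comes from the original code:

```rust
use std::sync::mpsc::{self, Receiver};
use std::sync::{Arc, Mutex};
use std::thread;

fn drain_with_workers(worker_count: usize) -> i32 {
    let (sender, receiver) = mpsc::channel::<i32>();
    let shared: Arc<Mutex<Receiver<i32>>> = Arc::new(Mutex::new(receiver));

    for value in 0..10 {
        sender.send(value).unwrap();
    }
    // Closing the channel lets `recv` return `Err` once it is empty.
    drop(sender);

    let mut handles = Vec::with_capacity(worker_count);
    for _ in 0..worker_count {
        handles.push(thread::spawn({
            // This line runs on the spawning thread, before the closure is built.
            let shared = Arc::clone(&shared);
            // `move` packs the inner clone, not the outer `shared`, into the closure.
            move || {
                let mut sum = 0;
                loop {
                    // The lock guard is a temporary, released at the end of this statement.
                    let next = shared.lock().unwrap().recv();
                    match next {
                        Ok(value) => sum += value,
                        Err(_) => break,
                    }
                }
                sum
            }
        }));
    }

    // Each worker returns its partial sum; together they cover every value sent.
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    println!("{}", drain_with_workers(4));
}
```

The inner `shared` shadows the outer one only inside the block, so the outer Arc is still available for the next loop iteration.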

I repeat: both cases are identical:
The first sends a named variable called c to a clone.
The second sends an unnamed variable to a clone.
Everything else is the same.

The difference between your two cases is in which thread clones the value:

  • In the working example, the value is cloned in the main thread and then the copy is sent to the worker thread.
  • In the non-compiling example, a reference is sent to the worker thread, which then makes a clone.

The root of the problem is that the compiler can’t guarantee the sent reference will be destroyed before the variable it refers to.

To further illustrate the kind of difference in action here, look at this example:

use rand::Rng;

fn do_twice(mut f: impl FnMut()) {
    f();
    f();
}

fn main() {
    let mut rng = rand::thread_rng();
    
    // compare this:
    let x = rng.gen::<u8>();
    do_twice(|| println!("{}", x));
    
    // with this:
    do_twice(|| println!("{}", rng.gen::<u8>()));
}

Example output:

42
42
253
115

(playground link)

The first version will print the same number twice, the second one can print two different numbers. Why is this so?
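
A deterministic variant of the same comparison, for anyone who wants to run it without the rand crate. The counter-based next_value helper is a made-up stand-in for rng.gen(), and call_twice returns the results instead of printing them:

```rust
// Calls the closure twice and collects both results.
fn call_twice(mut f: impl FnMut() -> u32) -> [u32; 2] {
    [f(), f()]
}

// Hypothetical stand-in for `rng.gen()`: returns a new value on every call.
fn next_value(counter: &mut u32) -> u32 {
    *counter += 1;
    *counter
}

fn evaluated_before() -> [u32; 2] {
    let mut counter = 0;
    // `next_value` runs once, here; the closure only captures the result.
    let x = next_value(&mut counter);
    call_twice(|| x)
}

fn evaluated_inside() -> [u32; 2] {
    let mut counter = 0;
    // `next_value` is part of the closure body, so it runs on every call.
    call_twice(|| next_value(&mut counter))
}

fn main() {
    println!("{:?}", evaluated_before()); // [1, 1]
    println!("{:?}", evaluated_inside()); // [1, 2]
}
```

Whatever is written inside the closure body runs when the closure is called, once per call, and not where the closure is written.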

Yep, I get it. Thanks. Somewhat unintuitive but I believe that I understand now.
So basically the cloning happens in the other thread, but this is weird, as the arguments passed to a thread should in theory be prepared before the thread starts execution.

You might also be interested in scoped threads from the crossbeam crate:

use std::sync::mpsc;
use std::sync::mpsc::Receiver;
use std::sync::Arc;
use std::sync::Mutex;

type ArcReceiver = Arc<Mutex<Receiver<i32>>>;

fn a_function(_arc_receiver: ArcReceiver) {}
fn another_function(_ref_receiver: &Mutex<Receiver<i32>>) {}
fn main() {
    let (_sender, receiver): (std::sync::mpsc::Sender<i32>, std::sync::mpsc::Receiver<i32>) =
        mpsc::channel();
    let sb: ArcReceiver = Arc::new(Mutex::new(receiver));
    
    crossbeam::scope(|s| {
        for _ in 0..10 {
            s.spawn(|_| a_function(Arc::clone(&sb)));
        }
    // the end of this "scope" will automatically join all the spawned
    // threads for you
    }).unwrap();
    
    // also works without any "Arc" at all:
    let sb: Mutex<Receiver<i32>> = Arc::try_unwrap(sb).unwrap();
    crossbeam::scope(|s| {
        for _ in 0..10 {
            s.spawn(|_| another_function(&sb));
        }
    }).unwrap();
    
    println!("Hello, world!");
}

(playground)
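
The standard library has since gained its own scoped threads, std::thread::scope (stable since Rust 1.63), which work along the same lines without an extra crate. A minimal sketch, with a made-up counter in place of the receiver:

```rust
use std::sync::Mutex;
use std::thread;

fn count_with_scoped_threads() -> u32 {
    let counter = Mutex::new(0u32);

    thread::scope(|s| {
        for _ in 0..10 {
            // Borrowing `counter` is allowed: the scope joins every thread
            // before `thread::scope` returns, so the reference cannot dangle.
            s.spawn(|| {
                *counter.lock().unwrap() += 1;
            });
        }
    });

    // All threads are finished here, so the Mutex can be taken apart again.
    counter.into_inner().unwrap()
}

fn main() {
    println!("{}", count_with_scoped_threads());
}
```

No Arc and no clone are needed here, because the compiler can see that the borrowed variable outlives every thread in the scope.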

Hi, thanks, appreciate it.

The important thing to realize is that a closure is defining a new function: None of the code in the closure body (between || and )) will be run until the closure gets called, and any variables mentioned there will be automatically stored in the closure object for later use.

In the case of spawn(), the closure gets shipped to the new thread before getting called, so you have to take care to only mention variables that you want sent to the new thread.

OK, but in which thread are those variables created?

Depending on context, they are either referenced or moved from the same-named variables in the parent scope when the closure object is created (which is on the main thread here).

In every example so far, the closure object has been a temporary value that’s immediately passed to spawn as an argument, but it doesn’t need to be: it can be stored and passed around like any other value.
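
A small sketch that keeps the closure in a variable first and records the order of events (the function name run_later and the log messages are made up for illustration):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn run_later() -> Vec<&'static str> {
    let log = Arc::new(Mutex::new(Vec::new()));

    // Building the closure moves the cloned Arc into it, but runs none of its body.
    let log_for_closure = Arc::clone(&log);
    let job = move || {
        log_for_closure.lock().unwrap().push("closure body ran");
    };

    log.lock().unwrap().push("closure created");

    // The stored closure is an ordinary value; only now is it shipped and called.
    thread::spawn(job).join().unwrap();

    let entries = log.lock().unwrap().clone();
    entries
}

fn main() {
    println!("{:?}", run_later());
}
```

The capture (the Arc moving into job) happens on the main thread when the closure is created; the push inside the body happens on the child thread when it is called.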


I’m trying to be careful about distinguishing between making/calling closures and making new threads because they’re really two separate things: lots of APIs in Rust use closures but have nothing to do with threading, so it’s important to understand them on their own.

In which thread will this unnamed variable be created: main or the one just spawned?

All of this, including the temporary result of Arc::clone, is run on the child thread:

aFunction(Arc::clone(&sb))

What the main thread does is see the mention of sb and pack it up into an object that implements the FnOnce trait (as I showed above). That entire object gets shipped to the child thread which then runs all of the code.
