Read-only access to shared variables from a thread

I have the following code. I can't enable the second print statement because the animals variable has been moved into the closure. If I remove the move keyword, Rust complains that the closure "may outlive borrowed value animals".

However, I still don't understand why. I call handle.join, which, as I understand it, is the point where the thread finishes, so I'd expect Rust to know that the thread will end before animals goes out of scope in main.

fn main() {
    let animals = Vec::from_iter(["mouse", "elephant", "cat", "dog", "girafa"].map(|animal| animal.to_string()));
    println!("{:?}", animals);
    let handle = std::thread::spawn(move || {
        list_animals(&animals);
    });
    handle.join().unwrap();
    //println!("{:?}", animals);
}

fn list_animals(animals: &Vec<String>) {
    for animal in animals {
        println!("{}", animal);
    }
}

Reading further, I found a solution using Arc:

use std::sync::Arc;

fn main() {
    let animals = Arc::new(Vec::from_iter(["mouse", "elephant", "cat", "dog", "girafa"].map(|animal| animal.to_string())));
    println!("{:?}", animals);
    {
        let animals = animals.clone();
        let handle = std::thread::spawn(move || {
            list_animals(&animals);
        });
        handle.join().unwrap();
    }
    println!("{:?}", animals);
}

fn list_animals(animals: &Vec<String>) {
    for animal in animals {
        println!("{}", animal);
    }
}
  • Do I understand correctly that the clone only clones the reference and not all the data?
  • Is there a better way to do this without the extra pair of curly braces?

Yes, calling clone on an Arc only increments its atomic reference count; the Vec and the strings inside it are not copied. It is very cheap.
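If you want to check this yourself, here is a minimal sketch. The function name arc_clone_demo is made up for illustration; Arc::ptr_eq and Arc::strong_count are the standard library calls that show both handles share one allocation and how the count changes.

```rust
use std::sync::Arc;
use std::thread;

/// Clones the Arc, hands the clone to a thread, and reports
/// (same allocation?, count while the clone exists, count after join).
fn arc_clone_demo() -> (bool, usize, usize) {
    let animals = Arc::new(vec!["mouse".to_string(), "cat".to_string()]);
    let for_thread = Arc::clone(&animals);

    // Both handles point at the same heap allocation; no strings were copied.
    let same_allocation = Arc::ptr_eq(&animals, &for_thread);
    let count_while_cloned = Arc::strong_count(&animals);

    // The clone is moved into the thread and dropped when the closure returns.
    let handle = thread::spawn(move || for_thread.len());
    handle.join().unwrap();

    let count_after_join = Arc::strong_count(&animals);
    (same_allocation, count_while_cloned, count_after_join)
}

fn main() {
    println!("{:?}", arc_clone_demo());
}
```

Writing Arc::clone(&animals) instead of animals.clone() does the same thing; some people prefer it because it makes clear that only the pointer is being cloned.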

One pattern that some people use is this:

let handle = std::thread::spawn({
    let animals = animals.clone();
    move || {
        list_animals(&animals);
    }
});
handle.join().unwrap();

Note also that you don't actually need an Arc. You can transfer ownership back when you join the thread:

fn main() {
    let animals = Vec::from_iter(["mouse", "elephant", "cat", "dog", "girafa"].map(|animal| animal.to_string()));
    println!("{:?}", animals);
    let handle = std::thread::spawn(move || {
        list_animals(&animals);
        animals
    });
    let animals = handle.join().unwrap();
    println!("{:?}", animals);
}

fn list_animals(animals: &Vec<String>) {
    for animal in animals {
        println!("{}", animal);
    }
}

The issue that the compiler is worried about is that main will exit (perhaps via a panic) before join gets called, causing a dangling reference in the spawned thread. You can use the scoped thread API instead, which gets around this issue by waiting for all child threads to terminate before returning from thread::scope:

use std::thread;

fn main() {
    let animals = Vec::from_iter(["mouse", "elephant", "cat", "dog", "girafa"].map(|animal| animal.to_string()));
    println!("{:?}", animals);
    // The closure borrows animals; thread::scope joins the thread before returning.
    thread::scope(|s| {
        s.spawn(|| list_animals(&animals));
    });
    println!("{:?}", animals);
}
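Scoped threads also work with several readers at once, and each one can hand a result back through its join handle. The following is only a sketch of that idea: summarize is a hypothetical helper, not something from the code above, and it assumes Rust 1.63 or later for thread::scope.

```rust
use std::thread;

/// Spawns two scoped threads that only read `animals` and returns
/// (total number of bytes in all names, length of the longest name).
fn summarize(animals: &[String]) -> (usize, usize) {
    thread::scope(|s| {
        // Neither closure uses `move`, so both share the same borrow of `animals`.
        let total = s.spawn(|| animals.iter().map(|a| a.len()).sum::<usize>());
        let longest = s.spawn(|| animals.iter().map(|a| a.len()).max().unwrap_or(0));
        (total.join().unwrap(), longest.join().unwrap())
    })
}

fn main() {
    let animals: Vec<String> = ["mouse", "elephant", "cat", "dog", "girafa"]
        .iter()
        .map(|a| a.to_string())
        .collect();
    let (total, longest) = summarize(&animals);
    println!("total = {total}, longest = {longest}");
    // Still usable here: the threads only borrowed the vector.
    println!("{:?}", animals);
}
```

Any scoped thread that is not joined explicitly is joined automatically when thread::scope returns, which is why the borrow is allowed.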

The compiler knows nothing about this; all it cares about are the signatures of std::thread::spawn and join. spawn requires its closure to be 'static, so the borrow checker complains, and whatever comes afterwards can't influence this.
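To make the signature point concrete, here is a small sketch. spawn_like is a hypothetical wrapper, not part of std; it just repeats the bounds that std::thread::spawn puts on the closure and its return value, so you can see that the check happens at the call and has nothing to do with the later join.

```rust
use std::thread::{self, JoinHandle};

// Hypothetical wrapper (not in std) with the same bounds as std::thread::spawn.
fn spawn_like<F, T>(f: F) -> JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::spawn(f)
}

fn count_animals() -> usize {
    let animals = vec!["mouse".to_string(), "cat".to_string()];

    // Rejected by the borrow checker: this closure would hold `&animals`,
    // which is not 'static, no matter that join() is called right after.
    // let handle = spawn_like(|| animals.len());

    // Accepted: `move` gives the closure ownership, so it borrows nothing local.
    let handle = spawn_like(move || animals.len());
    handle.join().unwrap()
}

fn main() {
    println!("{}", count_animals());
}
```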

What could be done is to reverse this: make spawn not require 'static and have the thread joined automatically, providing a separate detach that does require 'static. The problem is that you need a guarantee that the automatic join will actually happen, and this is trickier than it seems (the stdlib used to do this, but it was found to be unsound and was removed before 1.0; see Pre-Pooping Your Pants With Rust). It was eventually added back to the stdlib, in a fixed form, as the std::thread::scope API.

Thank you for all the responses!