Passing slice into a thread

Hello everyone,

I am practicing multithreading and want to pass an array to two threads, so I split the array and pass each slice to a thread, as proposed in this topic: Multithread matrix multiplication. But I get an error...

My code:

use std::thread;

fn main() {
    const N: usize = 1000;

    let mut a: [f64;N*N] = [0.0;N*N];

    let (a_1, a_2) = a.split_at(N/2);

    let t = thread::spawn(move || {
        println!("{}", a_1[0]);
    });

    t.join().unwrap();
}

And the error:

error[E0597]: `a` does not live long enough
  --> src/main.rs:8:22
   |
8  |     let (a_1, a_2) = a.split_at(N/2);
   |                      ^
   |                      |
   |                      borrowed value does not live long enough
   |                      cast requires that `a` is borrowed for `'static`
...
15 | }
   | - `a` dropped here while still borrowed

I understand the compiler is saying "impossible, because a could be dropped while the thread is still running", but how can I do this safely?

Thank you!

There are two classic solutions to this problem:

  1. Wrap the data in an Arc so that it is only dropped once all threads are done with it. With Arc, the parent thread no longer owns the shared data; every thread has equal access to it. However, this approach also introduces shared mutability complications. Therefore, it is often better to...
  2. ...use a scoped thread API (as provided by e.g. the crossbeam and rayon crates) to ensure that threads are done processing the data before said data is dropped by the parent thread. This allows you to use non-'static borrowed data in your worker threads, with much better ergonomics, but at the (usually acceptable) cost of limiting the threads' execution scope.
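As a rough illustration of the first option, here is a minimal sketch using only std. The function name sum_halves_with_arc is made up for this example, and a Vec is used instead of a large stack array. Each thread gets its own clone of the Arc, so the data stays alive until the last thread drops its handle.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: sums each half of `data` on its own thread,
/// sharing the (read-only) data through an `Arc`.
fn sum_halves_with_arc(data: Vec<f64>) -> (f64, f64) {
    let mid = data.len() / 2;
    // The allocation is freed only when the last `Arc` clone is dropped.
    let shared: Arc<[f64]> = Arc::from(data);

    let first = Arc::clone(&shared);
    let t1 = thread::spawn(move || first[..mid].iter().sum::<f64>());

    let second = Arc::clone(&shared);
    let t2 = thread::spawn(move || second[mid..].iter().sum::<f64>());

    (t1.join().unwrap(), t2.join().unwrap())
}

fn main() {
    let (left, right) = sum_halves_with_arc(vec![1.0; 1000]);
    println!("{} {}", left, right);
}
```

This only covers read-only sharing. Mutating the data from several threads would additionally need something like a Mutex inside the Arc, which is the complication mentioned above.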

The difference between crossbeam's and rayon's approaches to scoped threads is that crossbeam spawns actual OS threads and joins them at the end of the scope, whereas rayon schedules batch jobs on a thread pool. The rayon approach allows you to do fewer things in jobs, but scales better to many jobs.

rayon also has some very nice pre-built algorithms for data parallelism that are based on a parallel variant of std iterators. So once you understand what happens under the hood, you may want to use it whenever it fits instead of dispatching work across threads yourself.


Thank you for your answer.

However, doesn't the line

t.join().unwrap();

guarantee that a is only dropped once all threads are done?

In addition, when I use slices, does the compiler consider that I am accessing the same data even though it has been split?

The std::thread::spawn API has no notion of thread scope. It doesn't know that you are going to eventually join the threads (after all, it allows you to forget to do so), and therefore it requires all data used by a thread to be owned by said thread or live forever (i.e. be 'static).
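To make the "owned by said thread" case concrete, here is a small sketch in which each thread takes ownership of its own half, so nothing is borrowed from the parent. The function name first_of_each_half is invented for the example, and it assumes the data can live in a Vec that is cut in two with Vec::split_off.

```rust
use std::thread;

/// Hypothetical helper: gives each thread its own *owned* half of the data
/// and returns the first element each thread saw (None for an empty half).
fn first_of_each_half(mut a: Vec<f64>) -> (Option<f64>, Option<f64>) {
    let mid = a.len() / 2;
    // `split_off` moves the tail into a new Vec; `a` keeps indices [0, mid).
    let second_half = a.split_off(mid);
    let first_half = a;

    // Each closure owns its Vec, so the `'static` requirement is satisfied.
    let t1 = thread::spawn(move || first_half.first().copied());
    let t2 = thread::spawn(move || second_half.first().copied());

    (t1.join().unwrap(), t2.join().unwrap())
}

fn main() {
    let result = first_of_each_half(vec![0.0; 1000]);
    println!("{:?}", result);
}
```

The cost of this approach is that the original vector is consumed and the tail is copied into a new allocation, which is why borrowing through scoped threads is usually nicer.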

I'm not sure I fully understand your second question, but the very intent of slice splitting is to get two disjoint borrows of the same data, which can then be used by e.g. two different scoped threads.

However, these are borrows, not owned data. Therefore, you are not allowed to send them to std::thread::spawn because they may not live long enough, and this causes ergonomic pain. Which is why scoped thread abstractions are commonly used.

A long time ago, std had its own scoped thread abstraction. But it was later proven to be unsound (it allowed memory unsafety) and removed. For a long time afterwards, there was no good enough consensus on what a scoped thread API in std should look like, so the functionality lived in dedicated crates. Since Rust 1.63, std provides it again as std::thread::scope.

(If you are curious, scoped threads use unsafe internally to trick std::thread::spawn into thinking that the data which they are manipulating is 'static when it actually isn't. Their API and implementation carefully ensure that threads are always joined, so that this improper API usage never results in memory unsafety.)
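For readers on Rust 1.63 or newer, the standard library offers std::thread::scope, so the split-and-borrow pattern from the question can be sketched without any extra crate. The function name sum_halves_scoped is made up for the example, and a Vec replaces the stack array to keep the stack small.

```rust
use std::thread;

/// Hypothetical helper: sums the two halves of a borrowed slice on two
/// scoped threads. Assumes Rust 1.63+ for `std::thread::scope`.
fn sum_halves_scoped(a: &[f64]) -> (f64, f64) {
    // Two disjoint borrows of the same data.
    let (a_1, a_2) = a.split_at(a.len() / 2);

    // Every thread spawned on `s` is joined before `scope` returns,
    // so borrowing non-'static data is allowed here.
    thread::scope(|s| {
        let h1 = s.spawn(|| a_1.iter().sum::<f64>());
        let h2 = s.spawn(|| a_2.iter().sum::<f64>());
        (h1.join().unwrap(), h2.join().unwrap())
    })
}

fn main() {
    const N: usize = 1000;
    let a = vec![0.0_f64; N * N];
    let (left, right) = sum_halves_scoped(&a);
    println!("{} {}", left, right);
}
```

For mutation, split_at_mut can be used in the same way to hand each thread a disjoint &mut slice.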

Thank you for your answers,

I can indeed solve my problem using a static variable:

use std::thread;

fn main() {
    const N: usize = 1000;

    static mut a: [f64;N*N] = [0.0;N*N];

    unsafe {
        let (a_1, a_2) = a.split_at(N/2);

        let t = thread::spawn(move || {
            println!("{}", a_1[0]);
        });

        t.join().unwrap();
    }
}

The "unsafe" is only there because the variable is mutable.

However, I will check the APIs you suggested, because they are probably safer.

Thank you very much and goodbye.

Beware static mut. It is extremely hard to use correctly. Even if we ignore all the classic issues of global variables, such as how trivial they make it to create a data race between two threads, there are also Rust-specific issues such as the fact that it is way too easy to end up creating two &mut to the same data, which is Undefined Behavior in Rust.

In Rust, when a thread has an &mut to a variable, the compiler is allowed to assume that it is the only thread with access to said variable. If this assumption is broken by your code, then the compiler's optimizer may produce a binary which does not do what you want. Or worse, it may seem to do what you want, until the day it doesn't. You really don't want your programs to have UB.

The lang team has a longstanding desire to eventually deprecate and remove static mut for this reason, so it is best not to get used to it.
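If a global really is wanted, there are safe alternatives to static mut. The sketch below shows two of them under some assumptions: a plain immutable static for read-only data (its slices are already 'static, so no unsafe is needed), and a Mutex-protected static for data that must change (this relies on Mutex::new being usable in a static, which needs Rust 1.63 or newer). The names A, COUNTER, read_first_in_thread and bump_from_threads are invented for the example.

```rust
use std::sync::Mutex;
use std::thread;

const N: usize = 1000;

// Read-only global: no `mut`, so no `unsafe` is needed to borrow it.
static A: [f64; N] = [0.0; N];

// Mutable global: the Mutex serializes access instead of `static mut`.
static COUNTER: Mutex<u64> = Mutex::new(0);

/// Hypothetical helper: reads the first element of the first half on another thread.
fn read_first_in_thread() -> f64 {
    // Both halves are `&'static [f64]`, which `thread::spawn` accepts.
    let (a_1, _a_2) = A.split_at(N / 2);
    thread::spawn(move || a_1[0]).join().unwrap()
}

/// Hypothetical helper: increments COUNTER once per thread, returns its new value.
fn bump_from_threads(threads: u64) -> u64 {
    let handles: Vec<_> = (0..threads)
        .map(|_| thread::spawn(|| *COUNTER.lock().unwrap() += 1))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let value = *COUNTER.lock().unwrap();
    value
}

fn main() {
    println!("{}", read_first_in_thread());
    println!("{}", bump_from_threads(4));
}
```

Neither variant can produce two &mut to the same data, because the compiler and the Mutex enforce exclusive access.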