Rust scoped threads weird behaviour

use std::thread;

fn main() {
    let data = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    let mut sum = 0;

    thread::scope(|scope| {
        for e in data {
            scope.spawn(move || {
                sum += e;
                println!("Sum: {}", sum);
            });
        }
    });

    // Printing the final sum after all threads have finished
    println!("Final Sum: {}", sum);
}

The behavior of this code is quite unexpected to me.
First of all, I feel like multiple threads can access the sum variable, introducing a data race.
Second of all, at the end of the function the sum is 0. Why is that?

I am definitely missing something here.

The full output of the code is

Sum: 1
Sum: 3
Sum: 5
Sum: 7
Sum: 4
Sum: 6
Sum: 8
Sum: 10
Sum: 9
Sum: 2
Final Sum: 0

You moved, by copy, sum into the thread closures.

Each thread owns its own copy of sum and adds e to that, but the original sum is unchanged.
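
The same effect shows up without any threads at all; here is a minimal single-threaded sketch with a move closure over a Copy value:

fn main() {
    let mut sum = 0;

    let mut bump = move || {
        // this `sum` is the closure's own copy, because i32 is Copy
        sum += 1;
        println!("inside: {sum}");
    };

    bump(); // prints "inside: 1"
    bump(); // prints "inside: 2"

    // the original `sum` was never touched
    println!("outside: {sum}"); // prints "outside: 0"
}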

7 Likes

Oh, I see now. So basically move there means copy, because sum is an int type. I did not want to do that lol :slight_smile:

2 Likes

Yes. But nitpick, not because it's an integer type, but because that integer type implements Copy. Rust is pretty good at using the type system to represent properties of types.

(And I don't know whether the bigger numeric types, like 128-bit integers or arbitrary-precision naturals, implement Copy?)
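
The difference is easy to see by moving a Copy value and a non-Copy value into the same closure and then trying to use the originals afterwards (a small sketch; the String line is commented out because it would not compile):

fn main() {
    let n: i32 = 5;             // i32 is Copy
    let s = String::from("hi"); // String is not Copy

    let print_both = move || {
        // both are captured by value: `n` is copied, `s` is moved
        println!("{n} {s}");
    };
    print_both();

    println!("{n}");   // fine: the closure only took a copy of `n`
    // println!("{s}"); // error[E0382]: borrow of moved value: `s`
}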

2 Likes

And to address this point: if you remove the move, so that each thread gets a reference to sum, the borrow checker will complain. The correct way to do this is with atomics:

use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

fn main() {
    let data = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    let sum = AtomicI32::new(0);

    thread::scope(|scope| {
        for e in data {
            let sum = ∑
            scope.spawn(move || {
                sum.fetch_add(e, Ordering::Relaxed);
                println!("Sum: {}", sum.load(Ordering::Relaxed));
            });
        }
    });

    // Printing the final sum after all threads have finished
    println!("Final Sum: {}", sum.into_inner());
}

(playground)

5 Likes

Just out of curiosity: why did you expect the sum variable not to be moved, when you explicitly used a move closure? Perhaps understanding why you thought it would have no effect would allow us to help you with further, likely deeper, misunderstandings you may have.

2 Likes

I usually think of moving something in terms of heap-allocated objects. However, I forgot that for simple types that implement Copy, the "move" is just a memcpy.

I implemented a solution where I split the data into slices and each thread computes the sum of one slice. I do not use any atomics and it works.

use std::iter::Sum;
use std::ops::Add;
use std::thread;
use std::time::Instant;

fn compute_sum_in_parallel<T>(data: Vec<T>, nr_threads: usize) -> T
where
    T: Send + Sync + std::ops::AddAssign + Add<Output = T> + for<'a> Sum<&'a T> + Default,
{
    let chunk_size = data.len() / nr_threads;

    let mut total_sum = Default::default();

    thread::scope(|scope| {
        let mut handles: Vec<thread::ScopedJoinHandle<T>> = Vec::new();
        for i in 0..nr_threads {
            let start_index = i * chunk_size;
            let end_index = if i == nr_threads - 1 {
                data.len()
            } else {
                start_index + chunk_size
            };

            // get a slice of the shared Vec for this thread
            let data_slice = &data[start_index..end_index];

            // Spawning a thread to compute the sum of the current slice;
            // `move` copies the slice reference into the closure
            let handle = scope.spawn(move || data_slice.iter().sum());
            handles.push(handle);
        }

        for handle in handles {
            total_sum += handle.join().unwrap();
        }
    });

    total_sum
}

fn main() {
    let size = 100_000_001;
    let now = Instant::now();
    let sum: usize = compute_sum_in_parallel((1..size).collect::<Vec<_>>(), 16);
    println!("Sum: {} in {:?}", sum, now.elapsed());
}

Are there any performance benefits to using atomics in this case, compared to my solution?

1 Like

First off, moving and heap-allocation are completely orthogonal. It's an apples to oranges comparison. Every value can be moved. Whether it is on the heap or the stack, static memory, or anywhere else. Whether or not it allocates a heap buffer (or any other resource) itself. Moving just means bitwise copying a value from one place to another, transferring its ownership to the new place.

Second, moving is also orthogonal to Copy. Copy and non-Copy types can equally be moved.

And then Copy is also orthogonal to heap-allocation. It's not being on the stack that makes integers Copy! If you put a primitive integer on the heap or in a static item, it's still Copy – whether something can be bitwise copied is intrinsic to its type, and has nothing to do with any particular address it resides at.
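
A quick sketch of that point: a primitive behind a Box or in a static item is still Copy, and reading it out is still just a bitwise copy:

static ANSWER: i32 = 42;

fn main() {
    let boxed: Box<i32> = Box::new(7);

    // these i32s live on the heap and in static memory, yet reading them
    // out is still just a bitwise copy, because i32 is Copy
    let a: i32 = *boxed;
    let b: i32 = ANSWER;

    // the originals are still fully usable afterwards
    println!("{} {} {} {}", a, b, boxed, ANSWER);
}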

2 Likes

Atomics will be slower, since you'll have all the threads trying to mutate the same memory location.
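
As a sketch (not a benchmark) of how to avoid that contention while still using an atomic: each thread can sum its own chunk locally and touch the shared counter only once:

use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

fn main() {
    let data: Vec<u64> = (1..=1_000_000).collect();
    let total = AtomicU64::new(0);
    let nr_threads = 4;
    let chunk_size = data.len().div_ceil(nr_threads);

    thread::scope(|scope| {
        for chunk in data.chunks(chunk_size) {
            let total = &total;
            scope.spawn(move || {
                // sum locally; only one contended atomic operation per thread
                let local: u64 = chunk.iter().sum();
                total.fetch_add(local, Ordering::Relaxed);
            });
        }
    });

    println!("Total: {}", total.into_inner());
}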

2 Likes

In my experience of making this mistake, and seeing others make the mistake, the problem is that under non-Copy circumstances, if you move something you don't want to move, you would then get an error when you try to use it after the end of the scope — which is immediate feedback on the mistake. Here, because of the copy, there are now, silently, two mutable sum places. That’s usually not what you want, and it's the sort of thing that Rust usually makes you be more explicit about.

If I were trying to fix this problem, I'd try adding a warning whenever

  • a closure captures a binding foo by value,
  • foo is mutated within the closure,
  • the type of foo is Copy, and
  • foo is also used later outside the closure (would be an error if the type is not Copy),

and see how many false positives that got.
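
For instance, a minimal program that meets all four conditions and compiles without any warning today might look like this (hypothetical example):

fn main() {
    let mut count = 0; // i32, which is Copy

    let mut tick = move || {
        count += 1; // mutates the closure's private copy
        count
    };

    tick();
    tick();

    // `count` is used again here and is still 0; with a non-Copy type this
    // line would have been a compile error instead of a silent surprise
    println!("{count}");
}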

[Update: I posted a refined version of this idea as a comment on the issue.]

5 Likes

I agree with your statement. I was quite confused when I saw that there was a copy made even though I did not see anything like an assignment operator.

fn foo(_x: i32) {}

fn main() {
    let y = 42;
    foo(y); // the value of `y` is copied into the call
}

Here a copy of the value 42 is made, even though there is no assignment. Is that also confusing?

Moves are just a memcpy whether or not the type implements Copy. The difference is that if you don't implement Copy, the original variable/place becomes uninitialized afterwards.

The type system isn't heap-aware. Generally speaking, if you move something that manages a heap allocation, you're moving a pointer. E.g. a Vec<T> is a pointer, capacity, and length triple.
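
A small sketch of what that means for Vec: the move copies the three-word header, not the elements, and the moved-from binding simply becomes unusable:

fn main() {
    let v: Vec<u8> = vec![1, 2, 3];
    let heap_addr = v.as_ptr();

    // moving `v` memcpys its header (pointer, capacity, length) into `w`;
    // the heap buffer itself is neither touched nor copied
    let w = v;
    assert_eq!(w.as_ptr(), heap_addr);

    // println!("{:?}", v); // error[E0382]: borrow of moved value: `v`
    println!("{:?}", w);
}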

5 Likes

That is clear because I used other languages like C or C++ :slight_smile:
I understood now the idea, thank you all for your explanations

I think this is interesting.

I've previously thought about the data pointed to by the Vec as being moved conceptually, and as a result was slightly mystified by how exactly Rust does that without it being horrendously slow.

At some point I became more aware that it is the pointer that moves, while the heap-allocated data doesn't.

And only recently am I becoming aware of what moving actually is at the system level. I.e. a memcpy.

It might help to know more about what a memcpy is exactly. I gather it's a low-level instruction to... the hardware(?)... to copy memory, but the devil is in the details! :slight_smile:

Memcpy is just the C standard library function for… copying memory. Conceptually it’s just a loop that copies bytes, but of course implementations may optimize it however they want. But the archetypal implementation is simply

unsafe fn memcpy(dest: *mut u8, src: *const u8, n: usize) {
  for i in 0..n {
    *dest.add(i) = *src.add(i);
  }
}

When people say "it’s just a memcpy", they mean that the value’s bytes are literally blindly copied from A to B – specifically, no user code is executed in the copy, in contrast to C++, where a user-defined copy or move constructor or assignment operator may be invoked and you can’t be sure how much work is actually done without checking. Rust just does what C does, except the shooting-yourself-in-the-foot part.

1 Like

There's a caveat here: in Rust, it's UB to read uninitialized memory, and values may contain uninitialized memory (padding, MaybeUninit, etc.). So, you can't actually write that loop in Rust (unlike C which, if I remember correctly, has an exemption for doing this) and use it in all the ways Rust moves do. (If you do need to do it explicitly in Rust, you can call std::ptr::copy, or more often, one of its narrower relatives, like std::ptr::write or the methods on pointer types themselves.)
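
For example, a sketch using ptr::copy_nonoverlapping (the non-overlapping relative of std::ptr::copy), which does an untyped byte copy and so is allowed to copy uninitialized or padding bytes:

use std::ptr;

fn main() {
    let src = [1u8, 2, 3, 4];
    let mut dest = [0u8; 4];

    // the Rust spelling of memcpy: an untyped byte copy that never
    // produces intermediate `u8` values, so uninitialized bytes in the
    // source would simply be preserved, not "read"
    unsafe {
        ptr::copy_nonoverlapping(src.as_ptr(), dest.as_mut_ptr(), src.len());
    }

    assert_eq!(dest, [1, 2, 3, 4]);
}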

3 Likes

Reading uninit memory is not UB.

Rather, it is the act of producing an invalid value (one of the behaviours the Reference lists as undefined), since the *src.add(i) expression produces a value of type u8 containing uninitialized bytes. Had the pointers used the MaybeUninit<u8> type instead, it would still be reading uninit memory, but it would not be UB, because the value would no longer be invalid for its type.
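
A small sketch of that distinction using MaybeUninit<u8> (hypothetical buffer):

use std::mem::MaybeUninit;

fn main() {
    let buf: [MaybeUninit<u8>; 4] = [MaybeUninit::uninit(); 4];

    // fine: copying a MaybeUninit<u8> moves possibly-uninit bytes around
    // without ever producing a value of type u8
    let first = buf[0];

    // UB: this would claim the bytes are an initialized u8 while they
    // are not, producing an invalid value
    // let bad: u8 = unsafe { first.assume_init() };

    let _ = first;
}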

6 Likes