Rayon prevent wait

ManuelCostanzo · August 25, 2020, 2:52am

Hello! I want something like OpenMP's "nowait" clause in Rayon, for example when using par_iter_mut. Is that possible?

Thanks.

alice · August 25, 2020, 6:42am

Not directly, rayon is not a crazy macro. But you can move the work you want to happen in parallel into the closure that runs in parallel.

ManuelCostanzo · August 25, 2020, 1:24pm

Thank you for the reply!

But assuming I have two par_iter_mut calls, if I combine them and add more work to each task, I still have the same problem: some threads sit idle. That's why I want to simulate OpenMP's nowait, so that no threads are idle.

alice · August 25, 2020, 1:25pm

You need to show example code. I don't really understand exactly what the situation is.

ManuelCostanzo · August 25, 2020, 1:30pm

Yes, sorry.

I have these two parallel sections.

    graph
        .par_iter_mut()
        .map(|row| row.get_unchecked_mut(k))
        .zip(column_k.par_iter_mut())
        .enumerate()
        .filter(|(id, _)| *id != k)
        .for_each(|(_, (col, item_k))| {
            floyd_serial(col, *col, kk);
            *item_k = *col;
        });

    graph
        .get_unchecked_mut(k)
        .par_iter_mut()
        .zip(row_k.par_iter_mut())
        .enumerate()
        .filter(|(id, _)| *id != k)
        .for_each(|(_, (row, item_k))| {
            floyd_serial(row, kk, *row);
            *item_k = *row;
        });

The first par_iter_mut surely does not need all the threads, so some of them will be idle. If I could insert something like OpenMP's nowait, that would solve the problem.

alice · August 25, 2020, 2:20pm

Use join. This function lets you define two pieces of code that will run in parallel.

    join(
        || {
            graph
                .par_iter_mut()
                .map(|row| row.get_unchecked_mut(k))
                .zip(column_k.par_iter_mut())
                .enumerate()
                .filter(|(id, _)| *id != k)
                .for_each(|(_, (col, item_k))| {
                    floyd_serial(col, *col, kk);
                    *item_k = *col;
                });
        },
        || {
            graph
                .get_unchecked_mut(k)
                .par_iter_mut()
                .zip(row_k.par_iter_mut())
                .enumerate()
                .filter(|(id, _)| *id != k)
                .for_each(|(_, (row, item_k))| {
                    floyd_serial(row, kk, *row);
                    *item_k = *row;
                });
        }
    );

ManuelCostanzo · August 25, 2020, 2:23pm

Wow, thanks!!

I have the following error:

    error[E0524]: two closures require unique access to `graph` at the same time
       --> src/blocked.rs:141:13
        |
    128 |         rayon::join(
        |         ----------- first borrow later used by call
    129 |             || {
        |             -- first closure is constructed here
    130 |                 graph
        |                 ----- first borrow occurs due to use of `graph` in closure
    ...
    141 |             || {
        |             ^^ second closure is constructed here
    142 |                 graph
        |                 ----- second borrow occurs due to use of `graph` in closure

alice · August 25, 2020, 2:26pm

Mutable access in Rust implies exclusive access, and anything else is undefined behavior. I see that you're already using unsafe to violate that assumption in some places.

kornel · August 26, 2020, 6:17pm

rayon::spawn is used to fire-and-forget in Rayon. Or rayon::scope if you need to run work asynchronously, but then wait for it to finish.

If you have a mutable slice to distribute between tasks, you will need split_at_mut() (or chunks_mut()) to get two (or more) mutable, non-overlapping sub-slices.
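As a rough illustration of the split_at_mut() idea, here is a minimal sketch. It uses std::thread::scope (Rust 1.63+) as a stand-in for rayon::join so that it needs no external crate; the function name process_halves and the doubling/negating work are made up for the example.

```rust
use std::thread;

/// Hypothetical example: doubles the first half of `data` and negates the
/// second half, with the two halves processed on separate threads.
fn process_halves(data: &mut [i64]) {
    let mid = data.len() / 2;
    // split_at_mut returns two mutable sub-slices that cannot overlap,
    // so the borrow checker accepts handing one to each thread.
    let (left, right) = data.split_at_mut(mid);
    thread::scope(|s| {
        s.spawn(|| left.iter_mut().for_each(|x| *x *= 2));
        s.spawn(|| right.iter_mut().for_each(|x| *x = -*x));
    });
    // thread::scope joins both threads before returning.
}
```

With Rayon, the two closures would presumably be passed to rayon::join instead of being spawned in a scope; the splitting step stays the same.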

ManuelCostanzo · August 27, 2020, 2:30pm

Hi @kornel, yes, but I need column k and row k of the matrix. How can I get both slices to process in parallel?

drewkett · August 27, 2020, 2:39pm

You can't do that easily here, because you are mutating graph in both arms of join. It is not safe to do so because you might be trying to read from/write to the same memory address from two threads simultaneously.

If both of these operations take a decent amount of time you might as well just run them one after the other.
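That said, if the (k, k) entry is left out of both passes (as the filter on id != k in the posted code suggests), row k and column k do not share any element, and the split can be expressed in safe code. A minimal sketch, assuming a square Vec<Vec<u32>> matrix; the +1 and +10 updates are placeholders for the real floyd_serial work, update_row_and_column is a made-up name, and std::thread::scope stands in for rayon::join to avoid an external dependency:

```rust
use std::thread;

/// Hypothetical example: adds 1 to every entry of column `k` and 10 to every
/// entry of row `k`, skipping the shared entry (k, k). The two passes run
/// concurrently on disjoint parts of the matrix.
fn update_row_and_column(graph: &mut [Vec<u32>], k: usize) {
    // Separate row k from the rows above and below it.
    let (above, rest) = graph.split_at_mut(k);
    let (row_k, below) = rest
        .split_first_mut()
        .expect("k must be a valid row index");
    // Row k without its k-th entry.
    let (row_left, row_rest) = row_k.split_at_mut(k);
    let row_right = &mut row_rest[1..];

    thread::scope(|s| {
        // Column k: one entry from every row except row k.
        s.spawn(|| {
            for row in above.iter_mut().chain(below.iter_mut()) {
                row[k] += 1;
            }
        });
        // Row k: every entry except the one at column k.
        s.spawn(|| {
            for x in row_left.iter_mut().chain(row_right.iter_mut()) {
                *x += 10;
            }
        });
    });
}
```

Whether this is worth it depends on how much work each pass does; for short passes, running them one after the other may well be just as fast.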

kornel · August 27, 2020, 7:38pm

Are they even non-overlapping at all in your case? I'm not sure I follow what your code is doing. What if you write to the i-th element when processing the j-th row, while another thread processes the j-th element in the i-th column?

You could use AtomicU32 elements in your graph, and iter() instead of iter_mut().
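A minimal sketch of that idea, assuming the matrix is stored as Vec<Vec<AtomicU32>>. The fetch_add calls are placeholders for the real update, relax_shared is a made-up name, and std::thread::scope stands in for Rayon so the example needs no external crate:

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Hypothetical example: updates column `k` and row `k` concurrently through
/// shared references. No `&mut` is needed because every write goes through
/// an atomic operation.
fn relax_shared(graph: &[Vec<AtomicU32>], k: usize) {
    thread::scope(|s| {
        // Column k: entry k of every row except row k.
        s.spawn(|| {
            for (i, row) in graph.iter().enumerate() {
                if i != k {
                    row[k].fetch_add(1, Ordering::Relaxed);
                }
            }
        });
        // Row k: every entry except the one at column k.
        s.spawn(|| {
            for (j, cell) in graph[k].iter().enumerate() {
                if j != k {
                    cell.fetch_add(10, Ordering::Relaxed);
                }
            }
        });
    });
}
```

Relaxed ordering is enough here only because each cell is touched by a single thread and the scope joins both threads before returning; if threads had to observe each other's writes mid-pass, a stronger ordering would need to be considered.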

If you're sure threads never touch the same memory, then use unsafe to cast mut protections away (which is actually unsafe, sadly).

For your single graph it's likely too much work, but usually when complex multi-threaded access is needed, it's possible to build a safe abstraction that enables it. rav1e did that to divide images into tiles:

https://blog.rom1v.com/2019/04/implementing-tile-encoding-in-rav1e/

system · November 25, 2020, 7:39pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
