Rayon current_thread_index [unsafe]

Hi,

Does anyone know if current_thread_index is guaranteed to return an index (unique to the calling OS thread) in 0..NumberOfThreadsInPool?

Put another way, if I

  1. Create a rayon threadpool with 32 threads
  2. Create a Vec of 32 elements (each element larger than the cache-line size), call it data
  3. Run rayon::par_iter() on a set of jobs and in each closure:
    a) Get the current thread index: let i = current_thread_index()
    b) Write/update data[i] += ... many times in each closure (note that this requires unsafe / raw pointers to update the Vec, since it is mutated from multiple threads)
  4. After the par_iter is finished sum up data

Would this then be an ok/safe use of unsafe, i.e. would I be guaranteed to get the same result as if I had done it sequentially?

Provided that the thread indexes I get from rayon are consistent (they are documented as such) and in 0..32 (I couldn't find this in the docs), this feels like it should be safe to me - but I have been bitten by assuming things like that incorrectly before.
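Here is a minimal, self-contained sketch of the pattern I mean (names, sizes and the dummy workload are made up for illustration; the real code sits inside nested joins):

use rayon::prelude::*;

// Illustrative per-thread slot, padded to a 64-byte cache line.
#[repr(align(64))]
#[derive(Clone, Copy, Default)]
struct PaddedSum(f64);

fn main() {
    let n_threads = 32;
    let pool = rayon::ThreadPoolBuilder::new()
        .num_threads(n_threads)
        .build()
        .unwrap();

    let mut data = vec![PaddedSum::default(); n_threads];
    // Smuggle the base pointer into the closures (a usize is Send + Sync).
    let base = data.as_mut_ptr() as usize;

    pool.install(|| {
        (0..1_000_000usize).into_par_iter().for_each(|job| {
            let i = rayon::current_thread_index().expect("not on a pool thread");
            assert!(i < n_threads); // the property I am asking about
            // SAFETY (assuming the property holds): each pool thread gets its
            // own index, so no two threads ever write to the same element
            // concurrently, and the pool is done before data is read again.
            unsafe { (*(base as *mut PaddedSum).add(i)).0 += job as f64 };
        });
    });

    let total: f64 = data.iter().map(|p| p.0).sum();
    println!("{total}");
}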

Regards

Viktor

Why not use par_chunks? (Then you wouldn't have to touch unsafe at all.)

Because the actual use is buried deep inside several levels of join; the example was just simplified down to the current_thread_index concept.

If you're asking whether the current rayon version, 1.12.1, will always have this property, it looks like the answer is yes: the indices are created by enumerate over a collection built from a range over the thread count.

If you're asking whether rayon is guaranteed to do this in future versions, then all there is to go on is the documentation, and the documentation doesn't make such a guarantee. Luckily it's easy to just check that the index is within range and panic if it isn't.
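For example, something along these lines at the top of each closure (data is whatever per-thread buffer you index into):

let i = rayon::current_thread_index().expect("not called from a pool thread");
assert!(i < data.len(), "unexpected thread index {i}");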


True, checking and panicking is a good solution.


More specifically, does that mean that the following code is safe?

My understanding of the unsafe pointer rules is that it should be, but I'm a bit uncertain whether I need to Pin<> the Vec in MyData? However, the MyData struct ensures that it is never resized etc., so it should never be moved?

Or here is maybe a better version, where I introduce a PhantomData of a shared reference to ensure that nobody accidentally .take()s the Vec out while the Shared structure is still in use by other threads:
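Roughly, the writer wrapper looks something like this (simplified sketch with illustrative names; at hands out mutable references by index and _pd is the PhantomData mentioned above):

use std::marker::PhantomData;

pub struct DataWriter<'a, T> {
    ptr: *mut T,
    len: usize,
    // Pretend-borrow of the Vec so it cannot be moved or dropped while
    // writers derived from it are still alive.
    _pd: PhantomData<&'a [T]>,
}

// Shared across the pool's threads by reference.
unsafe impl<'a, T: Send> Sync for DataWriter<'a, T> {}

impl<'a, T> DataWriter<'a, T> {
    pub fn new(data: &'a mut Vec<T>) -> Self {
        Self { ptr: data.as_mut_ptr(), len: data.len(), _pd: PhantomData }
    }

    // Each thread calls this with its own current_thread_index().
    pub fn at(&self, idx: usize) -> &mut T {
        assert!(idx < self.len);
        unsafe { &mut *self.ptr.add(idx) }
    }
}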

To be a little more idiomatic, at should be marked unsafe.

Your code allows creating overlapping mutable references:

let ref_1 = data_writer.at(some_idx);
let ref_2 = data_writer.at(some_idx);
swap(ref_1, ref_2);

_pd should probably be PhantomData<&'a mut [T]> to ensure correct variance.

How often will the threads write to data and how big is T? It might be that you get severe performance problems due to false sharing.


Interesting,

Thanks, I will have to ponder your swap example :)

Regarding false sharing (yes, data will be written often): I thought I would completely eliminate that risk by ensuring that T has a size that is a multiple of the cache-line size (hence the _padding: [i32; 64 / 4 - 1] in the example). Am I missing something in assuming that solves the problem?
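For reference, this is the kind of padding I mean, assuming a 64-byte cache line (illustrative types):

#[allow(dead_code)]
struct PaddedManual {
    value: i32,
    // 4 payload bytes + 60 padding bytes = exactly one 64-byte line.
    _padding: [i32; 64 / 4 - 1],
}

// Alternative: let the compiler round the size up to 64 bytes; the align
// attribute also guarantees each element starts on a line boundary.
#[allow(dead_code)]
#[repr(align(64))]
struct PaddedAligned {
    value: i32,
}

fn main() {
    assert_eq!(std::mem::size_of::<PaddedManual>(), 64);
    assert_eq!(std::mem::size_of::<PaddedAligned>(), 64);
    assert_eq!(std::mem::align_of::<PaddedAligned>(), 64);
}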

/V

Here is how I would do it without unsafe: Rust Playground

The code offloads the unsafe code to spin::mutex::Mutex and uses try_lock().unwrap() to ensure unique access. Instead of manual padding I used a wrapper with align(256) which already includes the mutex.

I did not make spin part of the public interface, so the implementation detail doesn't leak. You can get rid of some boilerplate if you do let it leak.
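In outline (a simplified sketch of the playground, with illustrative names):

use rayon::prelude::*;
use spin::mutex::Mutex;

// One over-aligned slot per pool thread; the alignment also rules out false sharing.
#[repr(align(256))]
struct Slot<T>(Mutex<T>);

fn per_thread_sums(jobs: usize) -> f64 {
    let n_threads = rayon::current_num_threads();
    let slots: Vec<Slot<f64>> = (0..n_threads).map(|_| Slot(Mutex::new(0.0))).collect();

    (0..jobs).into_par_iter().for_each(|job| {
        let i = rayon::current_thread_index().expect("not on a pool thread");
        // try_lock() never blocks; unwrap() turns a violated one-thread-per-slot
        // assumption into a panic instead of a silent data race.
        *slots[i].0.try_lock().unwrap() += job as f64;
    });

    slots.iter().map(|s| *s.0.try_lock().unwrap()).sum()
}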


Thanks for the rewritten code,

However, that is not going to work for the case I'm looking at (inside a performance-critical simulation, e.g. ensuring interleaving of AVX FMA instructions to saturate the backend). Also, I operate on AVX-512 values, so the mutex would effectively double the memory requirement, since an AVX-512 value happens to completely fill a cache line by itself.

(The "usafe" playground code, with inlining and changing the assert to debug_assert will reduce down to just a memory access inside my loop, a memory access that will execute in parallel with subsequent numeric operations)

/V
