What is a Rust equivalent to Swift's parallel thread-setting?

I apologise if I ask an already-answered question. I looked through similar questions on the Forum, but couldn’t recognise any that really match my question; I’m still very new to Rust.

Working in Swift, I created a program which iterated over sections of an array in parallel. These sections are set by markers from a different array; the sections are different sizes. I create a DispatchQueue.concurrentPerform, which lets me set the number of iterators/threads, and run loops on each section chosen by the thread number.

In Swift, the code looks like this:

// variables:    
// array: a large, mutable array of u32 numbers
// threads: the number of threads/iterations
// sections: an array of integers marking the start/stop indexes of each section, as set by ‘threads’.

	DispatchQueue.concurrentPerform(iterations: threads) { k in    // k is each thread number
        let min = sections[k]
        let max = sections[k + 1]

        for i in min..<max {
              // perform operations on array[i]
        }
    }

This works well in Swift, but I’m trying to work out what the equivalent code in Rust would be. I know that Rayon can let me set the number of threads vis ThreadPoolBuilder, but I’m not yet clear as to how to set particular sections to each thread.

Why do you need to manually set the number of threads and partition the array for concurrent execution? Rayon chooses by default the number of threads by looking at the available cores and automatically splits up the work between threads evenly. All you have to do is use .into_par_iter().for_each(…) or .par_bridge().for_each(…) depending on what you iterate over.

1 Like

The algorithm in question is a two-pass shuffle for large datasets, as described here. The basic process is:

  • Randomly move array elements into N piles or sections (which aren't likely to be the same size);
  • Perform a Fisher-Yates shuffle on each section. My algorithm carries out this step.

So each thread needs to work on specific sections rather than split the job up evenly.

It sounds like you aren't actually looking for a way to set the number of iterations per thread, more a way of splitting your work into different size chunks so each chunk can be processed in parallel.

This sounds quite similar to the quicksort example in the docs for rayon::join(). Could you adapt that to suit your application?

let mut v = vec![5, 1, 8, 22, 0, 44];
quick_sort(&mut v);
assert_eq!(v, vec![0, 1, 5, 8, 22, 44]);

fn quick_sort<T:PartialOrd+Send>(v: &mut [T]) {
   if v.len() > 1 {
       let mid = partition(v);
       let (lo, hi) = v.split_at_mut(mid);
       rayon::join(|| quick_sort(lo),
                   || quick_sort(hi));
   }
}

// Partition rearranges all items `<=` to the pivot
// item (arbitrary selected to be the last item in the slice)
// to the first half of the slice. It then returns the
// "dividing point" where the pivot is placed.
fn partition<T:PartialOrd+Send>(v: &mut [T]) -> usize {
    let pivot = v.len() - 1;
    let mut i = 0;
    for j in 0..pivot {
        if v[j] <= v[pivot] {
            v.swap(i, j);
            i += 1;
        }
    }
    v.swap(i, pivot);
    i
}

Thinking about your suggestions, I tried a different approach. Instead of using the section array on its own, I made an array of ranges from it and iterated from that.

// sections is an array of indexes marking out the sections of the main array to shuffle

let mut ranges: Vec<Range<usize>> = vec![0..0];
for i in 0..SECTIONS {
    ranges.push(sections[i]..sections[i + 1]);
    }
ranges.remove(0);
// Yes, I know this is a clumsy way to set it up; I'm still learning

for i in ranges.iter() {
    let min = i.start as isize;
    let max = i.end as isize;

    for j in min..max {
        // shuffle that section of main array
    }
}

This produces the same result as my earlier method.

However, if I use Rayon and replace ranges.iter() with ranges.par_iter(), I get the error message:

error[E0277]: `rayon::slice::Iter<'_, std::ops::Range<usize>>` is not an iterator
  --> src/modulo_range.rs:17:18
   |
17 |         for i in ranges.par_iter() {
   |                  ^^^^^^^^^^^^^^^^^ `rayon::slice::Iter<'_, std::ops::Range<usize>>` is not an iterator
   |
   = help: the trait `std::iter::Iterator` is not implemented for `rayon::slice::Iter<'_, std::ops::Range<usize>>`
   = note: required by `std::iter::IntoIterator::into_iter`

So I'm still missing something. I'll experiment with your join idea, Michael, and see what I come up with.

You can't use parallel iterators in for loops. Use for_each

Thanks for that, Alice. I changed the code that way, but am now getting errors either about mutable borrowing or sharing between threads. I think I'll have to study the basics of Rust concurrency first to understand this more fundamentally.