Parallel Quick Sort in Rust

Hi, I am trying to optimize my Quick Sort by implementing concurrency/parallel. So far I can do the Recursive part in parallel (Sort the 1st and 2nd half after partition). But I am not able to implement the Partition function to use parallel.

Here is my current sequential partition method:

fn partition<T: PartialOrd>(v: &mut [T]) -> usize {
    let pivot = v.len() - 1;
    let mut i = 0;
    for j in 0..pivot {
        if v[j] <= v[pivot] {
            v.swap(i, j);
            i += 1;
        }
    }
    v.swap(i, pivot);
    i
} 

I have researched online and many sites found that the only way is to use Prefix Sum, but I am struggling to apply here in Rust.

The best solution should do Parallel recursive calls, and Parallel partition. Could you advise?

What is your goal?

If you want an efficient implementation of sorting, use rayon: par_sort_unstable is a parallel quick sort (intro sort actually).

If you are studying parallel algorithms and you want to implement yourself, please specify which part of the algorithm you stuck.
Did you try to implement a parallel prefix sum algorithm? The algorithm is described in Wikipedia https://en.wikipedia.org/wiki/Prefix_sum#Parallel_algorithms or any textbook of parallel algorithms. Even an incomplete code is useful for understanding your status of understanding.

If you want to learn more about what Rayon does under the hood, I can wholeheartedly suggest the wonderful blogpost Rayon: Data parallelism in Rust which uses QuickSort as its main example, and indeed uses parallel recursive calls that perform partitioning and sorting. (Note that this article talks about how Rayon's first versions were constructed. Since then it has received many updates with extra optimizations. At a conceptual level, it still works very similar to what is described there, though.)

6 Likes

Hi, thanks for your response. I am stuck at implementing Partition function using Prefix Sum. That's the part

Hi, my current solution is similar to that post. However, that post still implements the partition method in sequential manner. It doesn't use Prefix Sum, which is what I am stuck trying to do

Well, a thing is, while the time complexity of the quicksort with both parallel partition and parallel recursion is something like O(n log n / p + (log p)^2) *1, the quicksort with sequential partition and parallel recursion runs in O(n + n log n / p) time (exercise: show it) where p is parallelism.
Thus *2, unless your number of processors is ω(log n), parallel partition doesn't reduce asymptotic running time.
That is in theory, but in practice, overhead is there and you won't get much speed-up from parallel partition (or turns out to be slower) for normal CPUs. More specialized algorithms are used in practice.

  • *1 Average running time on uniform input or expected running time with randomization.
  • *2 Technically this implication is wrong because I only showed upper bound, not precise running time.
1 Like