How to get largest n values in rayon?

For sequential iterators, we can use itertools's k_smallest method to get max n values in an iterator (although this method yields k smallest, but the idea is the same).

For parallel iterators in rayon, we can use max_by to get largest value. However, I can't find a way to get largest n values. Is there any api I can use, or how to implement it if it is not too complicated?

You could have a look at itertools' implementation of k_smallest and implement it yourself. Here an unoptimized example with rayon's fold and reduce over BinaryHeaps:

use rayon::prelude::*;

use std::collections::BinaryHeap;

fn take_n(v: &[u8], n: usize) -> Vec<u8> {
    assert!(v.len() >= n);

    let mut heap = v
        .into_par_iter()
        .fold(
            || BinaryHeap::new(),
            |mut acc, x| {
                acc.push(x);
                acc
            },
        )
        .reduce(
            || BinaryHeap::new(),
            |mut h1, h2| {
                h1.extend(h2);
                h1
            },
        );

    let mut res = Vec::with_capacity(n);

    for _ in 0..n {
        res.push(*heap.pop().unwrap());
    }

    res
}

fn main() {
    let arr = vec![0, 3, 1, 4, 2, 5];

    let x = take_n(&arr, 3);

    assert_eq!(x, vec![5, 4, 3]);
}

Playground.

Here a more optimized version more in line with k_smallest from itertools. I use Reverse to make the max-heap a min-heap and then use the same logic as k_smallest, i.e. if the heap has not enough values, just push value x onto the heap. If the heap does contain n values, peek at the smallest one, if x is bigger than the smallest value in the heap, replace it with x. Keeping at most n elements in our heaps makes them more space and time efficient (lookup takes O(log n)), especially for cases where n << |arr|:

use rayon::prelude::*;

use std::cmp::Reverse;
use std::collections::BinaryHeap;

fn take_n(v: &[u8], n: usize) -> Vec<u8> {
    assert!(v.len() >= n);

    let mut heap = v
        .into_par_iter()
        .map(|x| Reverse(x))
        .fold(
            || BinaryHeap::with_capacity(n),
            |mut acc, x| {
                if acc.len() < n {
                    acc.push(x);
                } else if *acc.peek().unwrap() > x {
                    *acc.peek_mut().unwrap() = x;
                }
                acc
            },
        )
        .reduce(
            || BinaryHeap::new(),
            |mut h1, h2| {
                for x in h2 {
                    if h1.len() < n {
                        h1.push(x);
                    } else if *h1.peek().unwrap() > x {
                        *h1.peek_mut().unwrap() = x;
                    }
                }
                h1
            },
        );

    let mut res = Vec::with_capacity(n);

    for _ in 0..n {
        res.push(*heap.pop().unwrap().0);
    }

    res
}

fn main() {
    let arr = vec![0, 3, 1, 4, 2, 6, 8, 10, 5, 12, 9];

    let x = take_n(&arr, 3);

    assert_eq!(x, vec![9, 10, 12]);
}

Playground.

Note that the order of the values we return is now reversed compared to my first example. So the smallest value is now the first and the biggest the last in the resulting vector.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.