Parallelizing a for loop that writes into a vector?

Hi everyone,

Rust newcomer here, and so far I am loving it. At work we are looking at replacing a critical program, and planning the port is going well. However, the issue of parallelization has come up. The "old" program stores the data to be read and written in vectors. It goes element by element, applying quite a few functions to the data in those elements, including further loops (one loop handles the data rows and a nested loop handles the columns of each row). It was parallelized using OpenMP pragmas, and that worked well. The results of that processing are written into a different vector. That was never a problem there, but I understand that in Rust this would be a mutable vector, which cannot be shared between threads.

There is probably a way of doing this (rayon?) that I, as a beginner, cannot yet see. I crafted a very small program that has somewhat similar behavior and for the life of me, I cannot find how to parallelize it without using a map and even with a map I am not sure how I would be able to handle it. How would you do it?

extern crate rand;

use rand::prelude::*;


struct Data {
    x: Vec<f64>,
    y: Vec<f64>
}

struct Output {
    first: Vec<f64>,
    second: Vec<f64>
}

impl Output {
    fn new(size: usize) -> Output {
        Output {
            first: vec![0.0; size],
            second: vec![0.0; size]
        }
    }

    fn assign(&mut self, index: usize, x: f64, y: f64) {
        self.first[index] = x;
        self.second[index] = y;
    }
}

impl Data {
    fn new(size: usize) -> Data {
        let mut rng = rand::thread_rng();
        let mut data = Data {
            x: Vec::new(),
            y: Vec::new()
        };
        for _ in 0..size {
            data.x.push(rng.gen());
            data.y.push(rng.gen());
        }
        data
    }

    fn f_sum(&self, i: usize) -> f64 {
        self.x[i] + self.y[i]
    }

    fn f_mul(&self, i: usize) -> f64 {
        self.x[i] * self.y[i]
    }
}


fn main() {
    let size = 5;
    let data = Data::new(size);
    let mut output = Output::new(size);
    
    for i in 0..size {
        output.assign(i, data.f_sum(i), data.f_mul(i));
    }

    println!("{:?}", data.x);
    println!("{:?}", data.y);
    println!("=======================");
    println!("{:?}", output.first);
    println!("{:?}", output.second);
}

Thank you for helping this newbie.

I assume you mean your Data::new? With rayon, it could look like this:

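use rand::prelude::*;
use rayon::prelude::*;

impl Data {
    fn new(size: usize) -> Data {
        // map_init calls rand::thread_rng as needed for each rayon job,
        // so the non-Send ThreadRng never has to cross threads.
        let (x, y) = (0..size).into_par_iter()
            .map_init(rand::thread_rng, |rng, _i| (rng.gen::<f64>(), rng.gen::<f64>()))
            .unzip();
        Data { x, y }
    }
}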

You can use rayon like so (assuming you wanted to parallelize the loop in main). Note that for small data sets it is better not to use rayon, because its overhead outweighs any benefit it could give.

use rayon::prelude::*; // brings par_iter_mut and the parallel zip/for_each into scope

fn main() {
    let size = 5;
    let data = Data::new(size);
    let mut output = Output::new(size);
    
    for i in 0..size {
        output.assign(i, data.f_sum(i), data.f_mul(i));
    }

    println!("{:?}", data.x);
    println!("{:?}", data.y);
    println!("=======================");
    println!("{:?}", output.first);
    println!("{:?}", output.second);
    
    let mut output = Output::new(size);
    
    // first step, formulate the loop in terms of normal iterators
    output.first.iter_mut()
        .zip(&mut output.second)
        .zip(&data.x)
        .zip(&data.y)
        .for_each(|(((first, second), x), y)| {
            let sum = x + y;
            let mul = x * y;
            
            *first = sum;
            *second = mul;
        });

    println!("=======================");
    println!("{:?}", output.first);
    println!("{:?}", output.second);
    
    let mut output = Output::new(size);
    
    // second step, rename all iter* to par_iter*
    output.first.par_iter_mut()
        .zip(&mut output.second)
        .zip(&data.x)
        .zip(&data.y)
        .for_each(|(((first, second), x), y)| {
            let sum = x + y;
            let mul = x * y;
            
            *first = sum;
            *second = mul;
        });

    println!("=======================");
    println!("{:?}", output.first);
    println!("{:?}", output.second);
}
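
For intuition about why this is allowed even though the output vectors are mutable: each worker only ever receives a disjoint `&mut` slice, so nothing mutable is actually shared. Below is a minimal std-only sketch of that idea using scoped threads instead of rayon. The function name `fill_in_chunks` and the `n_threads` parameter are made up for illustration, and rayon splits work differently (it uses work stealing), so treat this as a rough mental model rather than a description of rayon's internals.

```rust
use std::thread;

/// Hypothetical helper: writes x + y into `first` and x * y into `second`,
/// splitting the work into roughly `n_threads` contiguous chunks.
fn fill_in_chunks(x: &[f64], y: &[f64], first: &mut [f64], second: &mut [f64], n_threads: usize) {
    assert!(x.len() == y.len() && x.len() == first.len() && x.len() == second.len());

    // Ceiling division; the max(1) calls guard against a chunk size of 0,
    // which chunks_mut would reject with a panic.
    let n = n_threads.max(1);
    let chunk = ((x.len() + n - 1) / n).max(1);

    thread::scope(|s| {
        // Each tuple holds non-overlapping pieces of the four slices.
        let work = first
            .chunks_mut(chunk)
            .zip(second.chunks_mut(chunk))
            .zip(x.chunks(chunk))
            .zip(y.chunks(chunk));

        for (((f, g), xs), ys) in work {
            // Every spawned thread owns its own disjoint &mut chunks.
            s.spawn(move || {
                for (((fi, gi), xi), yi) in f.iter_mut().zip(g.iter_mut()).zip(xs).zip(ys) {
                    *fi = xi + yi;
                    *gi = xi * yi;
                }
            });
        }
        // All threads are joined automatically when the scope ends.
    });
}
```

Calling it with `fill_in_chunks(&data.x, &data.y, &mut output.first, &mut output.second, 4)` would fill the output the same way the loops above do. In practice the rayon version is shorter and balances the load better.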

Ha, I didn't even scroll far enough to see the loop in main... :man_facepalming:

Haha no worries, my bad, I should have specified which one!

Note: the formulation shown above works in 99% of the cases I've seen. Some Iterator combinators have no exact analog in ParallelIterator, so you may have to make additional changes to get things working. The most notable differences are that Iterator::fold must be rewritten in terms of ParallelIterator::reduce, and that closures passed to ParallelIterator combinators cannot mutate their environment without some sort of interior mutability.


Thank you! This looks promising. I will take some time to internalize this and play around with it. Any way to call methods from inside the for_each? I already saw that I can call functions at least.

As long as the methods only use shared references (&T), you can.


Sorry if this is obvious but how would it enter into the for_each? If I wanted to call data.f_mul for example instead of doing x*y. For the function I simply created a new one that took 2 f64s, multiplied them and returned a f64 so that was easy but I don't know where to begin for the data method f_mul.

Good question. You can use the enumerate combinator to get the current index and then call the required method on data. This works because Data::f_sum and Data::f_mul only take data by shared reference.

output.first.par_iter_mut()
    .zip(&mut output.second)
    .enumerate()
    .for_each(|(i, (first, second))| {
        *first = data.f_sum(i);
        *second = data.f_mul(i);
    });

Thank you so much!
