Unconfident using Rayon with shared data, mutex or cell?

lewis421 · January 20, 2021, 3:48pm

The function below uses par_apply to set the value of mutable arrays using shared access to vec_of_closes, vec_of_lows, vec_of_highs. However, i am getting funny data back. This compiles, but clearly something is wrong.

Should those sections be put in a mutex? Any other suggestions?

Grateful in advance.

fn sampling_threaded_hlc(vec_of_closes: &[f64],vec_of_highs: &[f64],vec_of_lows: &[f64], vec_of_dates: &[&str], vec_of_times: &[&str]) -> DataFrame {
let steps = vec_of_times.len() - 1;
let mut lows = Array1::<f64>::zeros(steps);
let mut highs = Array1::<f64>::zeros(steps);
let mut closes = Array1::<f64>::zeros(steps);
let back_times = &vec_of_times[..vec_of_times.len()-1];
let front_times = &vec_of_times[1..vec_of_times.len()];
let first_back = back_times[0];
let last_front = front_times[front_times.len()-1];
Zip::from(&mut closes).and(&mut lows).and(&mut highs).and(&*back_times).and(&*front_times).par_apply(|closes,lows,highs,back_times,front_times| {
    // println!("{:?},{:?}",back_times,front_times);
    let first = if *back_times == first_back {
        0
    }
    else {
        vec_of_dates.par_iter().position_first(|x| NaiveDateTime::parse_from_str(&x,"%Y-%m-%d %H:%M:%S").unwrap() > NaiveDateTime::parse_from_str(&back_times,"%Y-%m-%d %H:%M:%S").unwrap()).unwrap()
    };
    let last = if *front_times == last_front {
        steps
    }
    else {
        vec_of_dates.par_iter().position_first(|x| NaiveDateTime::parse_from_str(&x,"%Y-%m-%d %H:%M:%S").unwrap() > NaiveDateTime::parse_from_str(&front_times,"%Y-%m-%d %H:%M:%S").unwrap()).unwrap()
    };
    *lows = vec_of_lows[first..last].iter().fold(f64::INFINITY, |a,&b| a.min(b));
    *highs = vec_of_highs[first..last].iter().fold(0_f64, |a,&b| a.max(b));
    *closes = vec_of_closes[first..last].last().copied().unwrap();
    println!("{:?},{:?},{:?},{:?},{:?},{:?},{:?}",back_times,front_times,last,first,lows,highs,closes);
});
let vec_dates = Series::new("date",&vec_of_times[1..]);
let vec_lows = Series::new("low",fill_forward(&lows.to_vec()));
let vec_highs = Series::new("high",fill_forward(&highs.to_vec()));
let vec_closes = Series::new("close",fill_forward(&closes.to_vec()));
let df = DataFrame::new(vec![vec_dates,vec_lows,vec_highs,vec_closes]).unwrap();
println!("{:?}",df.head(Some(10)));
df

}

H2CO3 · January 20, 2021, 4:58pm

I'm not sure using a mutex would help.

Mutexes prevent data races, which are considered memory unsafe by Rust. They don't enforce any particular order of operations – they enforce the mere existence of some unspecified order. Your code doesn't have any unsafe, so unless you are doing something funky or there's a bug in Rayon, your code should be memory-safe, and by implication, race-free. (Cells don't even do that, they enable a limited form of shared mutability.)

Without trying to dig deep into what your code is supposed to do, I suspect your problem is that you are relying on operations being in a specific order; however, race-freedom does not guarantee that.

Can you tell us more about what the code is expected to do, what algorithm you have in mind, and how the actual output differs from your expectations?

lewis421 · January 20, 2021, 5:30pm

The function samples price data and reduces it to fewer, longer price bars. So 1000 lines of one second price bars become x lines of 10 second price bars.

I do that by finding the datetime indices of bars contained in the one second dataset and then taking those indices and using them to slice the vecs of high,low and close. It is those high,low and close values that are coming back bad: initially they return good values, but after running for some amount they start to be just one number, as if something is stuck.

My idea is to lock with a mutex the steps where the low,high and close are set, and i understand that the code at that point becomes sequential for those lines. To your question is this something that needs to be done in order, it does not, meaning once you have the indices of the interval in question you can use those indices on the vecs in any order they come.

lewis421:

    *lows = vec_of_lows[first..last].iter().fold(f64::INFINITY, |a,&b| a.min(b));
    *highs = vec_of_highs[first..last].iter().fold(0_f64, |a,&b| a.max(b));
    *closes = vec_of_closes[first..last].last().copied().unwrap();

system · April 20, 2021, 5:30pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
Using Rayon in code locked by a Mutex help	4	1100	January 12, 2023
Shared mutable access to a list of values help	7	1056	January 12, 2023
Acquiring a per-thread mutex in rayon kills parallelism help	2	162	January 27, 2024
Mutex locking alternatives in hot data parallel loop	7	1321	November 29, 2022
Mutably iterating over a sparse subset of items from a Vec (via their indices) in parallel (using Rayon) help	15	1370	December 10, 2019

Unconfident using Rayon with shared data, mutex or cell?

Related Topics