I implemented quite easily a parallel simple algorithm with 2 threads on an immutable vector, using Arc<T> as recommended in my previous post.
Now, I'm a little bit lost trying to do the same with a mutable vector. It seems I need to use a Arc<Mutex<T>>, but maybe it's not the only solution.
More specifically, what I trying to achieve: if v length is n, then I'd like to modify the first half for [0,n/2[ and the second half for [n/2, n[ ranges, at the same time from 2 different threads.
The crossbeam crate seems to do the job, but I'd like to keep 2 threads alive to achieve this, not starting threads each time I want to loop (which is the way to go for crossbeam or thread_scoped ?).
In my case, a Mutex will prevent any simultaneous modification for v, but what can prevent me to modify v at the same time because both threads will not address the same range ? Compiler optimizations?
Any idea or pointer on something already implemented as an tutorial example ?
The usual borrowing rules! A mutable reference (&mut) has exclusive access, and this idea works both for single threaded situations as well as if more threads are involved. The crucial idea for the Vec there is to use the “view” type &mut [T] via split_at_mut (as already linked).
ndarray + rayon should be pretty powerful at this point as well. Its entry point for parallelization basically becomes the Zip type, or the standard rayon APIs.
@leonardo Yes somebody points me to this method but it only gives you 2 mutable refs on the Vec. Using those mut refs, can I process them into 2 dirrfents threads ?
@bluss Thanks for the pointer. I knew Rayon by name only. It seems overkill for me isn't ?
It's fine. Even if you won't use most of functionality, you will only pay for what you use. Don't worry about using crates for small things, cargo package manager is really good - no need to write yourself what was already written.
// Create a threadpool holding 4 threads
let mut pool = Pool::new(4);
let mut vec = vec![0, 1, 2, 3, 4, 5, 6, 7];
// Use the threads as scoped threads that can
// reference anything outside this closure
pool.scoped(|scope| {
// Create references to each element in the vector ...
for e in &mut vec {
// ... and add 1 to it in a seperate thread
scope.execute(move || {
*e += 1;
});
}
});
is it possible to keep the 4 threads alive and kicking and kill them whenever the main thread wants it?
That's because in my crate, I'm reading a file and need to run 2 threads (main thread and worker thread) in parallel for each line I read to speed up data processing. So I'd like to spawn the worker thread and keep it alive till the end of the file. Using this pool, I'm afraid the threads will be continuously spawned for each line and spawning a thread is causing an overhead.
The pool should keep the thread(s) alive until the pool itself is dropped. So if you keep submitting work to the pool, the workers will pick it up. It is a pool, afterall . Or did I misunderstand your concern?
@vitalyd I was wondering how could I implement this for each line of my so called iterator.
The process is the following: my reader is looping through all lines of the input file, converting this line into a dedicated structure, and then returning a reference on that struct. It's not really an iterator but rather a streaming iterator.
So, I'd like to convert each line through several threads, but keep those threads alive till the end of the file reading, inside my next() fn.
So your iterator can or cannot proceed while a worker thread is doing something on the item you just yielded? Sorry, it's not quite clear to me. Maybe you can sketch out some pseudocode of what you're attempting to do?
Most of rayon functionality is generic, which means that it won't be compiled in until it's actually needed (as there are infinity possible versions of a generic function, depending with just types). This works even without link time optimization.
// read file and loop through records
while let Some(rec) = reader.next() {
println!("{}", rec);
}
The next() fn from the Reader struct is reading a file line by line and sending back a reference on a Record struct by splitting the line into individual pieces. That's this process of splitting I'd like to make parallel.
After rereading the posts several times, one big question still remains in my mind about what you're trying to do.
About these two threads that you're trying to make: Do you want them both to be doing the same thing (just in different places)? Or are they supposed to be doing different things?
I cannot find where this is happening in the code you linked.
Actually the process is pretty simple. Splitting a line into individual pieces. Suppose you have a String with a length of 100 chars. For simplification, assume the process is just to create 10 strings, each of 10 chars: 0..9, 10..19, etc. (numbers represents indices).
So thread 1 will split the String to 5 strings (indices 0 to 49) and thread 2 will do the same for indices 50 to 99.