Hello! Last time, with the awesome community's help, I was able to set up multithreading in my application using crossbeam channels with scoped_threadpool.
At that point, the code looked like this:
let mut pool = Pool::new(cpus);
let (s, r) = unbounded();
let mut tiles: Vec<Tile> = Default::default();
let pixel_numbers = 0..(film.height * film.width);
pool.scoped(|scope| {
    for i in pixel_numbers.step_by(TILE_SIZE) {
        let sender = s.clone();
        scope.execute(move || {
            let tile: Tile = Default::default();
            sender.send(tile).unwrap();
        });
    }
});
drop(s); // Drop the original sender, which never sends anything, so the receiver is not left waiting on it
tiles.extend(r);
And it worked beautifully. To the best of my understanding, once the tiles.extend(r) line returns, all the worker threads are done and we have the full output. This ran in around 850 ms.
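For anyone who wants to try the collection pattern in isolation, here is a minimal sketch of it. It is not the renderer's actual code: it swaps in std::sync::mpsc and std::thread::scope for crossbeam and scoped_threadpool so that it has no dependencies, and the Tile fields and the render_all function are hypothetical stand-ins. The point it illustrates is that extend(r) only returns once every sender, including the original s, has been dropped.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for the renderer's tile type.
#[derive(Debug, Default, Clone, PartialEq)]
struct Tile {
    start_index: usize,
    num_pixels: usize,
}

/// Produces one tile per `tile_size` pixels on worker threads and collects
/// them through a channel. `tile_size` must be non-zero.
fn render_all(total_pixels: usize, tile_size: usize) -> Vec<Tile> {
    let (s, r) = mpsc::channel();
    let mut tiles: Vec<Tile> = Vec::new();

    // Assumption: unlike a thread pool, this spawns one thread per tile.
    thread::scope(|scope| {
        for start_index in (0..total_pixels).step_by(tile_size) {
            let sender = s.clone();
            scope.spawn(move || {
                let num_pixels = tile_size.min(total_pixels - start_index);
                sender.send(Tile { start_index, num_pixels }).unwrap();
                // The cloned sender is dropped when this closure returns.
            });
        }
    });

    // Without this drop the iteration below would never end, because the
    // channel only closes once every sender is gone.
    drop(s);
    tiles.extend(r);
    tiles
}

fn main() {
    let tiles = render_all(10, 4);
    println!("collected {} tiles: {:?}", tiles.len(), tiles);
}
```

The tiles arrive in whatever order the workers finish, so anything that needs them ordered should sort by start_index afterwards.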
Now, I am trying to push the Tiles into a framebuffer to display on the screen as they are done. The minifb crate seems extremely easy to use, so I went with that, and instead of collecting the output in tiles at the end, I handle each tile as and when it is done:
//tiles.extend(r2);
while window.is_open() && !window.is_key_down(Key::Escape) && !r.is_empty() {
    // Receive tile from renderer without blocking
    let finished_tile_result = r.try_recv();
    match finished_tile_result {
        Ok(finished_tile) => {
            // Now that we have the tile from the renderer, push it into tiles for the image,
            // then tone map it and push it to the display buffer to show on the screen
            tiles.push(finished_tile.clone());
            frame_buffer.splice(
                finished_tile.start_index as usize
                    ..(finished_tile.start_index as usize + finished_tile.num_pixels),
                do_tonemapping(finished_tile.pixels),
            );
            //dbg!(finished_tile.num_pixels);
            // We unwrap here as we want this code to exit if it fails.
            // Real applications may want to handle this in a different way
            window
                .update_with_buffer(&frame_buffer, film.width as usize, film.height as usize)
                .unwrap();
        }
        Err(_) => {}
    }
}
However, this makes the program an order of magnitude slower, and the CPU is never fully utilized. What's worse, the total time depends on how fast the screen can update, so if I limit the update rate to, say, 60 fps, it gets even slower.
So it seems I end up synchronized to the while loop, even though, from what I understand, the loop should run in the main thread while the worker threads work on their respective Tiles. Once a tile is done, try_recv() would return it, and it would then be written into both the display frame_buffer (a Vec<u32>) and tiles. This should use at most one thread and should not block, yet there seems to be something weird going on.
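To spell out the behaviour expected from try_recv() here, this is a small sketch of a non-blocking poll loop. It uses std::sync::mpsc rather than crossbeam so that it is self-contained, and spawn_workers and poll_until_closed are hypothetical helpers, not part of the renderer. It assumes the loop should keep going while the channel is merely empty and stop only once every sender has been dropped.

```rust
use std::sync::mpsc::{self, Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

/// Hypothetical helper: starts `n` workers that each send their index after
/// a short delay, and returns the receiving end of the channel.
fn spawn_workers(n: u32) -> Receiver<u32> {
    let (s, r) = mpsc::channel();
    for i in 0..n {
        let sender = s.clone();
        thread::spawn(move || {
            thread::sleep(Duration::from_millis(5 * u64::from(i)));
            sender.send(i).unwrap();
        });
    }
    // The original sender `s` is dropped here, at the end of the function.
    r
}

/// Polls the channel without blocking until every sender has been dropped.
/// Returns the received values and how many polls found the channel empty.
fn poll_until_closed(r: Receiver<u32>) -> (Vec<u32>, usize) {
    let mut received = Vec::new();
    let mut empty_polls = 0;
    loop {
        match r.try_recv() {
            Ok(value) => received.push(value),
            // Nothing ready yet, but workers are still alive: keep polling.
            Err(TryRecvError::Empty) => {
                empty_polls += 1;
                thread::sleep(Duration::from_millis(1));
            }
            // Channel is empty and all senders are gone: the work is done.
            Err(TryRecvError::Disconnected) => break,
        }
    }
    (received, empty_polls)
}

fn main() {
    let (mut received, empty_polls) = poll_until_closed(spawn_workers(4));
    received.sort();
    println!("received {:?} after {} empty polls", received, empty_polls);
}
```

In this sketch an empty channel and a closed channel are different outcomes, which is the distinction a check like is_empty() on its own cannot make.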
Can anyone please advise on what's going horribly wrong here, and maybe ways to fix it?
Also, I had another idea: instead of updating the frame_buffer whenever a tile arrives, we keep the display totally independent of the channels and draw to the window at, say, 60 fps by reading from tiles directly instead of writing to a separate frame_buffer. However, I am still not completely sure how to implement that in the current channels framework... Is it something that can be implemented easily?
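A rough sketch of a close variant of that idea is below. It still keeps a frame_buffer, but the display loop runs on a fixed frame interval: each frame it drains whatever tiles are already waiting in the channel and then presents once, so the number of window updates no longer depends on the number of tiles. It uses std::sync::mpsc instead of crossbeam to stay dependency-free; the Tile fields, spawn_tile_workers, drain_ready_tiles and run_display_loop are all hypothetical names, and the call to window.update_with_buffer is only marked by a comment.

```rust
use std::sync::mpsc::{self, Receiver, TryRecvError};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical tile: already tone-mapped pixels plus their buffer offset.
struct Tile {
    start_index: usize,
    pixels: Vec<u32>,
}

/// Hypothetical stand-in for the renderer: one thread per tile, each tile
/// filled with the value `start_index + 1`. `tile_size` must be non-zero.
fn spawn_tile_workers(total_pixels: usize, tile_size: usize) -> Receiver<Tile> {
    let (s, r) = mpsc::channel();
    for start_index in (0..total_pixels).step_by(tile_size) {
        let sender = s.clone();
        thread::spawn(move || {
            let len = tile_size.min(total_pixels - start_index);
            let pixels = vec![start_index as u32 + 1; len];
            sender.send(Tile { start_index, pixels }).unwrap();
        });
    }
    r
}

/// Copies every tile that is already waiting into the frame buffer without
/// blocking. Returns true once the channel is empty and all senders are gone.
fn drain_ready_tiles(r: &Receiver<Tile>, frame_buffer: &mut [u32]) -> bool {
    loop {
        match r.try_recv() {
            Ok(tile) => {
                let end = tile.start_index + tile.pixels.len();
                frame_buffer[tile.start_index..end].copy_from_slice(&tile.pixels);
            }
            Err(TryRecvError::Empty) => return false,
            Err(TryRecvError::Disconnected) => return true,
        }
    }
}

/// Runs one drain-and-present step per `frame_time` until rendering is done.
/// Returns the number of frames presented.
fn run_display_loop(r: &Receiver<Tile>, frame_buffer: &mut [u32], frame_time: Duration) -> usize {
    let mut frames = 0;
    loop {
        let frame_start = Instant::now();
        let finished = drain_ready_tiles(r, frame_buffer);
        // A real application would call window.update_with_buffer(...) here,
        // once per frame rather than once per tile.
        frames += 1;
        if finished {
            return frames;
        }
        // Sleep for whatever is left of this frame's time budget.
        if let Some(remaining) = frame_time.checked_sub(frame_start.elapsed()) {
            thread::sleep(remaining);
        }
    }
}

fn main() {
    let r = spawn_tile_workers(16, 4);
    let mut frame_buffer = vec![0u32; 16];
    let frames = run_display_loop(&r, &mut frame_buffer, Duration::from_millis(16));
    println!("presented {} frames, buffer = {:?}", frames, frame_buffer);
}
```

The design choice here is that the workers never wait on the display: they push into an unbounded channel and move on, while the main thread decides on its own schedule when to look at the results.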
Any advice would be appreciated! Thanks!