OK, updated numbers on the i7 machine:
Running `target/release/mandel --num_threads 8`
Configuration: re1: -2.00, re2: 1.00, img1: -1.50, img2: 1.50, max_iter: 2048, img_size: 1024, num_threads: 8
Time taken for this run (serial_mandel): 1190.79262 ms
Time taken for this run (parallel_mandel): 242.11304 ms
Time taken for this run (simple_parallel_mandel): 229.32596 ms
Time taken for this run (rayon_mandel): 166.80914 ms
Now, I've managed to add the Linux port of libdispatch to rust-mandel, patching the existing dispatch crate so I could quickly put together a test for the libdispatch port. With that added, it ends up like this on the i7 machine:
Running `target/release/mandel --num_threads 8`
Configuration: re1: -2.00, re2: 1.00, img1: -1.50, img2: 1.50, max_iter: 2048, img_size: 1024, num_threads: 8
Time taken for this run (serial_mandel): 1190.36614 ms
Time taken for this run (parallel_mandel): 233.26369 ms
Time taken for this run (simple_parallel_mandel): 229.82853 ms
Time taken for this run (rayon_mandel): 161.29813 ms
Time taken for this run (dispatch_serial_mandel): 1209.53670 ms
Time taken for this run (dispatch_async_mandel): 862.91629 ms
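To make the comparison easier to read, here is a small sketch that turns the second run into speedups relative to serial_mandel. The `speedup` and `report` helpers and the `RUN` table are hypothetical names for illustration only, not part of rust-mandel; the numbers are copied from the output above.

```rust
/// Timings from the second run above: (variant name, milliseconds).
const RUN: [(&str, f64); 6] = [
    ("serial_mandel", 1190.36614),
    ("parallel_mandel", 233.26369),
    ("simple_parallel_mandel", 229.82853),
    ("rayon_mandel", 161.29813),
    ("dispatch_serial_mandel", 1209.53670),
    ("dispatch_async_mandel", 862.91629),
];

/// Speedup of a variant relative to the serial baseline (both in ms).
fn speedup(serial_ms: f64, variant_ms: f64) -> f64 {
    serial_ms / variant_ms
}

/// Speedup of every variant, using the first entry as the baseline.
fn report() -> Vec<(&'static str, f64)> {
    let serial_ms = RUN[0].1;
    RUN.iter()
        .map(|&(name, ms)| (name, speedup(serial_ms, ms)))
        .collect()
}
```

With these figures rayon_mandel comes out at roughly 7.4x over serial, while dispatch_async_mandel is only about 1.4x and dispatch_serial_mandel is slightly slower than the plain serial loop.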
This is the implementation of the dispatch version, which has notable limitations, some of them imposed by my knowledge of Rust, others by the crate:
#[cfg(feature = "with_dispatch")]
fn dispatch_serial_mandel(mandel_config: &MandelConfig, image: &mut [u32]) {
    use dispatch::{Queue, QueueAttribute};

    let queue = Queue::create("com.rust.mandel", QueueAttribute::Serial);
    for y in 0..mandel_config.img_size {
        for x in 0..mandel_config.img_size {
            queue.sync(|| image[((y * mandel_config.img_size) + x) as usize] =
                mandel_iter(mandel_config.max_iter,
                            Complex64{re: mandel_config.re1 + ((x as f64) * mandel_config.x_step),
                                      im: mandel_config.img1 + ((y as f64) * mandel_config.y_step)})
            );
        }
    }
}
#[cfg(feature = "with_dispatch")]
fn dispatch_async_mandel(mandel_config: &MandelConfig, image: &mut [u32]) {
    use dispatch::{Queue, QueueAttribute, Group};
    use std::sync::{Arc, Mutex};

    let queue = Queue::create("com.rust.mandel", QueueAttribute::Concurrent);
    let group = Group::create();
    // The closures are moved onto the queue, so they write into a shared copy of the image.
    let shared = Arc::new(Mutex::new(image.to_vec()));
    for y in 0..mandel_config.img_size {
        for x in 0..mandel_config.img_size {
            let shared = shared.clone();
            let index = ((y * mandel_config.img_size) + x) as usize;
            let re = mandel_config.re1 + ((x as f64) * mandel_config.x_step);
            let im = mandel_config.img1 + ((y as f64) * mandel_config.y_step);
            let max_iter = mandel_config.max_iter;
            let c = Complex64{re: re, im: im};
            group.async(&queue, move || {
                // Compute outside the lock; hold it only for the store.
                let data = mandel_iter(max_iter, c);
                let mut pixels = shared.lock().unwrap();
                pixels[index] = data;
            });
        }
    }
    // Wait for all tasks in the queue group to finish
    group.wait();
    // Copy the results back; otherwise the caller's buffer is never updated.
    image.copy_from_slice(&shared.lock().unwrap()[..]);
}
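One possible contributor to the gap between dispatch_async_mandel and the other parallel variants is granularity: every pixel becomes its own queued task and takes the mutex once. This has not been measured here, so treat it as a guess. A minimal sketch of the coarser alternative, one task per band of rows, is below. It uses only `std::thread::scope` rather than the dispatch crate, and `Config`, the f64-based `mandel_iter`, and `rows_parallel_mandel` are hypothetical stand-ins, not the rust-mandel code.

```rust
use std::thread;

/// Hypothetical stand-in for MandelConfig, with only the fields used above.
struct Config {
    re1: f64,
    img1: f64,
    x_step: f64,
    y_step: f64,
    max_iter: u32,
    img_size: usize,
}

/// Stand-in escape-time iteration on plain f64 pairs (assumption: the real
/// mandel_iter takes a Complex64 and may differ in detail).
fn mandel_iter(max_iter: u32, re: f64, im: f64) -> u32 {
    let (mut zr, mut zi) = (0.0f64, 0.0f64);
    for i in 0..max_iter {
        if zr * zr + zi * zi > 4.0 {
            return i;
        }
        let next_zr = zr * zr - zi * zi + re;
        zi = 2.0 * zr * zi + im;
        zr = next_zr;
    }
    max_iter
}

/// One task per band of rows. Each worker owns a disjoint `&mut` slice of
/// the image, so no Arc, Mutex, or copy of the buffer is needed.
fn rows_parallel_mandel(cfg: &Config, image: &mut [u32], num_threads: usize) {
    let n = num_threads.max(1);
    let rows_per_band = ((cfg.img_size + n - 1) / n).max(1);
    let band_len = (rows_per_band * cfg.img_size).max(1);
    thread::scope(|s| {
        for (band, chunk) in image.chunks_mut(band_len).enumerate() {
            s.spawn(move || {
                let first_row = band * rows_per_band;
                for (i, px) in chunk.iter_mut().enumerate() {
                    let y = first_row + i / cfg.img_size;
                    let x = i % cfg.img_size;
                    *px = mandel_iter(
                        cfg.max_iter,
                        cfg.re1 + (x as f64) * cfg.x_step,
                        cfg.img1 + (y as f64) * cfg.y_step,
                    );
                }
            });
        }
    });
}
```

Splitting the buffer with `chunks_mut` lets the borrow checker prove the bands do not overlap, which is why the shared copy and the lock are unnecessary in this shape. Whether libdispatch would close the gap with the same band-sized tasks is an open question.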
In any case, I'm tempted to test this, along with a direct unsafe{} FFI version, on FreeBSD, which natively has kqueue, libdispatch, clang and all the bells and whistles, to see how it fares.
Regards.