Hi Rustaceans!
I'm writing a function to compare two RgbImage structs (from the image crate). I want a function that compares both and returns a value in the 0.0 to 1.0 range that indicates how different the two images are, based on their RGB channel values and a simple subtraction. For example, a completely black image would be 1.0 different from a completely white image.
My current diff() function is as follows:
    // Actual diff is weighted by luminosity bias
    const LUMA_R: f64 = 0.2126;
    const LUMA_G: f64 = 0.7152;
    const LUMA_B: f64 = 0.0722;

    pub fn diff(a: &RgbImage, b: &RgbImage) -> f64 {
        let w = a.dimensions().0;
        let h = a.dimensions().1;
        let num_pixels = w * h;

        let mut diff_sum_r: i32 = 0;
        let mut diff_sum_g: i32 = 0;
        let mut diff_sum_b: i32 = 0;

        let samples_a = a.as_flat_samples().samples; // Returns &[u8]
        let samples_b = b.as_flat_samples().samples; // Returns &[u8]

        let skip_step = 1; // Temporary; higher values skip pixels
        let mut pos: usize = 0;
        let pos_step: usize = skip_step * 3; // Fixed since we know the layout is RGBRGBRGB

        for _ in (0..num_pixels).step_by(skip_step) {
            diff_sum_r += (samples_a[pos + 0] as i32 - samples_b[pos + 0] as i32).abs();
            diff_sum_g += (samples_a[pos + 1] as i32 - samples_b[pos + 1] as i32).abs();
            diff_sum_b += (samples_a[pos + 2] as i32 - samples_b[pos + 2] as i32).abs();
            pos += pos_step;
        }

        let lr = LUMA_R / 255.0;
        let lg = LUMA_G / 255.0;
        let lb = LUMA_B / 255.0;
        let diff_sum = diff_sum_r as f64 * lr + diff_sum_g as f64 * lg + diff_sum_b as f64 * lb;

        diff_sum / (num_pixels as f64 / skip_step as f64)
    }
This function works fine. My problem is that it's slower than I'd expect. On a release build, comparing two 1500x1500 pixel images takes about 10ms. That's not terrible, but considering there's other, more complicated code (using clone(), and even a rect painter with put_pixels()) that executes in ~1ms, it seems like I'm doing something wrong.
That 10ms is already after a lot of optimizations! The code used to use pixel enumerations, channel(), etc. The .as_flat_samples().samples call seems to be the closest I could get to the data without many conversions/copying.
I'm new to Rust (I know more about GC languages, and have some C/C++ experience), so I'm sure I just don't know which shortcuts to take to make that comparison more performant.
So are there any tips on what I should be doing? The loop is of course the big bottleneck in the whole execution, but I think I've extracted all I could from it. I hate the two as i32 casts per channel, but they're necessary so I could get abs() to work on what were previously u8s.
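For reference, the cast-free form of that per-channel subtraction can be sketched as below. This is only an illustration, assuming Rust 1.60 or newer (where u8::abs_diff is stable); channel_diff is a hypothetical helper name, not something from the image crate.

```rust
/// Hypothetical helper: absolute difference of two channel values
/// without widening both operands to i32 first.
/// Assumes Rust 1.60+, where `u8::abs_diff` is available.
fn channel_diff(a: u8, b: u8) -> u32 {
    // `abs_diff` returns a u8 and cannot overflow or underflow,
    // so the only conversion needed is for accumulating into a wider sum.
    u32::from(a.abs_diff(b))
}
```

Whether this changes the timing at all would need to be measured; it mainly removes the signed casts from the source.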
Other things I've tried:
- max(a, b) - min(a, b): slower
- if a < b { b - a } else { a - b }: same speed
- higher-level calls like get_pixel, enumerate_pixels: also slower
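One more variant that fits between the manual indexing and the higher-level calls is iterating the two sample slices in lockstep, three bytes at a time. The sketch below assumes tightly packed RGB data of equal length and works on plain &[u8] slices so it stands alone; diff_samples is a hypothetical name, and I have not measured whether it is actually faster.

```rust
// Same luminosity weights as in the function above.
const LUMA_R: f64 = 0.2126;
const LUMA_G: f64 = 0.7152;
const LUMA_B: f64 = 0.0722;

/// Hypothetical helper: weighted difference of two RGB sample slices,
/// returning 0.0 for identical data and 1.0 for black vs. white.
/// Assumes both slices use a packed RGBRGBRGB layout.
fn diff_samples(a: &[u8], b: &[u8]) -> f64 {
    assert_eq!(a.len(), b.len(), "images must have the same size");
    assert_eq!(a.len() % 3, 0, "expected 3 bytes per pixel");

    let num_pixels = a.len() / 3;
    if num_pixels == 0 {
        return 0.0;
    }

    // u64 accumulators so very large images cannot overflow the sums.
    let (mut sum_r, mut sum_g, mut sum_b) = (0u64, 0u64, 0u64);

    // chunks_exact(3) yields one pixel per iteration; zipping the two
    // iterators avoids manual position bookkeeping.
    for (pa, pb) in a.chunks_exact(3).zip(b.chunks_exact(3)) {
        sum_r += u64::from(pa[0].abs_diff(pb[0]));
        sum_g += u64::from(pa[1].abs_diff(pb[1]));
        sum_b += u64::from(pa[2].abs_diff(pb[2]));
    }

    let weighted = sum_r as f64 * LUMA_R + sum_g as f64 * LUMA_G + sum_b as f64 * LUMA_B;
    weighted / (255.0 * num_pixels as f64)
}
```

With the image crate, the slices would presumably come from the same .as_flat_samples().samples call used above. This version drops the skip_step option to keep the sketch short.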
Any other ideas?