Creating mutable sub arrays of Image buffer

Hi, I am new to Rust and am trying to learn enough to evaluate the language by rewriting in Rust a program that I have already written successfully in Kotlin. I will probably ask some pretty stupid questions, since the resource management is totally different from what I am familiar with.

I will be processing an image, hopefully on multiple threads at once. I have written the code to do grey scale conversion the hard way (as a stepping stone to something more useful):

  • Load image
  • Iterate through pixels (x, y) converting each to grey
  • Write the image.

How do I get mutable sub-arrays so that I can use multiple threads? image::deref_mut looks interesting, but the docs look a little dense to my untrained eye:

`impl<P, Container> IndexMut<(u32, u32)> for ImageBuffer<P, Container>

where
P: Pixel + 'static,
P::Subpixel: 'static,
Container: Deref<Target = [P::Subpixel]> + DerefMut, `

I am assuming that the two parameters define the range of the sub-array, but is this (start, end) or (start, length)? Is the resulting structure addressed as a one-dimensional array or as a two-dimensional array (unlikely)?

My apologies if I am barking up the wrong kettle of fish.

Phill

When working with the image crate, IndexMut probably isn't what you're looking for because the borrow checker will restrict access to its returned &mut reference to only one at a time.

However, the crate does offer an optional rayon feature, which already does the heavy lifting of multi-threaded processing for you. Once the feature is enabled, look at the par_pixels_mut method, which can be combined with for_each to modify each of the image's pixels in parallel.


That particular IndexMut impl is most likely for (x, y) coordinates. That is, for indexing a single pixel at a time. People have a really bad habit of not documenting trait impls, so the docs just show the doc comment in the trait itself, which is terrible :frowning:
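To make the (x, y) reading concrete, here is a minimal sketch using a toy type. `ToyRgb` and its `offset` helper are hypothetical and are not the image crate's ImageBuffer; they only assume a row-major buffer of interleaved RGB bytes. The tuple in the index names one pixel, not a range, and the result is that pixel's subpixels.

```rust
use std::ops::{Index, IndexMut};

/// Hypothetical stand-in for an image buffer (NOT the image crate's type):
/// interleaved RGB bytes stored row by row.
struct ToyRgb {
    width: u32,
    height: u32,
    data: Vec<u8>,
}

impl ToyRgb {
    fn new(width: u32, height: u32) -> Self {
        let len = width as usize * height as usize * 3;
        ToyRgb { width, height, data: vec![0; len] }
    }

    /// Offset of the first subpixel of pixel (x, y); panics if out of bounds.
    fn offset(&self, x: u32, y: u32) -> usize {
        assert!(x < self.width && y < self.height, "pixel out of bounds");
        (y as usize * self.width as usize + x as usize) * 3
    }
}

impl Index<(u32, u32)> for ToyRgb {
    type Output = [u8];
    fn index(&self, (x, y): (u32, u32)) -> &[u8] {
        let i = self.offset(x, y);
        &self.data[i..i + 3]
    }
}

impl IndexMut<(u32, u32)> for ToyRgb {
    fn index_mut(&mut self, (x, y): (u32, u32)) -> &mut [u8] {
        let i = self.offset(x, y);
        &mut self.data[i..i + 3]
    }
}

fn main() {
    let mut img = ToyRgb::new(4, 2);
    // The tuple is a coordinate: this writes the single pixel at x = 1, y = 0.
    img[(1, 0)].copy_from_slice(&[10, 20, 30]);
    // Only one `&mut` borrow of `img` can be alive at a time, which is why
    // indexing like this does not help with handing pixels to several threads.
    let px = &mut img[(3, 1)];
    px[0] = 255;
    println!("{:?} {:?}", &img[(1, 0)], &img[(3, 1)]);
}
```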

The best way forward is likely to use the rayon feature, but for the record, you can get the underlying container of (sub)pixels (e.g. RGBRGBRGB…) from the ImageBuffer as a slice, either by a direct deref [1] or the method as_raw(), and then safely partition the slice into mutable subslices with split_at_mut() or get_disjoint_mut().


  1. ImageBuffer implements the "magic" Deref trait, which means you can use any slice methods directly as ImageBuffer methods. ↩︎
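As a rough sketch of the partitioning idea, the following uses only the standard library and a plain `Vec<u8>` standing in for the image's subpixel slice (an assumption; with the image crate you would obtain the mutable slice from the buffer first). The helper names `grey_in_place` and `grey_two_halves` are made up for illustration. `split_at_mut` yields two non-overlapping mutable halves, so each can go to its own thread.

```rust
use std::thread;

/// Average the three channels of every RGB pixel in `samples`, in place.
fn grey_in_place(samples: &mut [u8]) {
    for px in samples.chunks_exact_mut(3) {
        let sum = px[0] as u16 + px[1] as u16 + px[2] as u16;
        px.fill((sum / 3) as u8);
    }
}

/// Split an interleaved RGB buffer at a row boundary and grey each half
/// on a separate thread. `width` and `height` are in pixels.
fn grey_two_halves(samples: &mut [u8], width: usize, height: usize) {
    assert_eq!(samples.len(), width * height * 3);
    let row_len = width * 3;
    let mid = (height / 2) * row_len;
    // The two halves cannot overlap, so the borrow checker accepts
    // two simultaneous mutable borrows.
    let (top, bottom) = samples.split_at_mut(mid);
    thread::scope(|s| {
        s.spawn(|| grey_in_place(top));
        grey_in_place(bottom); // the current thread handles the other half
    });
}

fn main() {
    let (width, height) = (2, 2);
    let mut samples: Vec<u8> = vec![30, 60, 90, 0, 0, 3, 255, 255, 255, 10, 20, 40];
    grey_two_halves(&mut samples, width, height);
    println!("{samples:?}");
}
```

Splitting recursively, or calling `split_at_mut` repeatedly, gives more than two pieces, but at that point the rayon route is probably less work.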


I believe you want as_flat_samples_mut.


Ah, thank you @tuffy, that sounds as if it might be the way to go.....

I need to do some reading about this. I fully expected that, but I am delighted that there are multi-threaded features available.

I am assuming that there are some examples around somewhere that I can use to clarify my understanding? If so a link would be greatly appreciated.

Thanks

Phill

@tuffy, @jdahlstrom, I have looked at Rayon and it seems like a cool encapsulation of the parallelism. My problem with this is that a large number of jobs will be put on the queue (one for each pixel, and some of the images will be very large).

Is there a way to break the image into groups of pixels (for example, scan lines), so that each group is dealt with as a block of pixels while the lines can potentially be processed in parallel with one another? That might avoid some of the overhead of queuing jobs and be a good compromise between efficiency within and between threads.

In general, when using Rayon parallel iterators, Rayon doesn’t create a “job” for each individual item (such as an image pixel here); it chooses a chunk size so as to have enough jobs for all cores, and then processes each chunk serially. You can also tune this behavior with with_min_len() and with_max_len(), or use .par_chunks_mut() to get chunks for you to explicitly process yourself (this could be useful for doing explicit SIMD processing, for example).
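To illustrate the "explicit chunks" idea without pulling in any crate, here is a std-only stand-in: it swaps Rayon's `par_chunks_mut` for the slice method `chunks_mut` plus scoped threads. The function names and the `rows_per_job` parameter are hypothetical. Note one real difference: this sketch starts one OS thread per block, whereas Rayon schedules its chunks onto a fixed thread pool.

```rust
use std::thread;

/// Grey one block of whole scan lines serially (interleaved RGB bytes).
fn grey_rows(block: &mut [u8]) {
    for px in block.chunks_exact_mut(3) {
        let sum = px[0] as u16 + px[1] as u16 + px[2] as u16;
        px.fill((sum / 3) as u8);
    }
}

/// Process the buffer in blocks of `rows_per_job` scan lines, one thread per block.
/// `width` is in pixels; the final block may hold fewer rows.
fn grey_by_row_blocks(samples: &mut [u8], width: usize, rows_per_job: usize) {
    assert!(width > 0 && rows_per_job > 0);
    let block_len = width * 3 * rows_per_job;
    thread::scope(|s| {
        // `chunks_mut` yields disjoint mutable sub-slices, so each can be
        // moved into its own thread; the scope joins them all before returning.
        for block in samples.chunks_mut(block_len) {
            s.spawn(move || grey_rows(block));
        }
    });
}

fn main() {
    // A 1-pixel-wide, 3-row image, processed two rows per job.
    let mut samples: Vec<u8> = vec![3, 6, 9, 0, 0, 0, 9, 9, 12];
    grey_by_row_blocks(&mut samples, 1, 2);
    println!("{samples:?}");
}
```

Picking `rows_per_job` so that the number of blocks is close to the number of cores keeps the thread count reasonable in this sketch.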

Is the call as_chunks_mut?

Using this call I can create an arbitrary number of chunks each with several pixels, then derive a parallel iterator across these chunks, and use Rayon to parallelise the processing?

Hey, thanks @kpreid - that sounds exactly what I would want. I didn’t see that bit in the docs. I will read more closely…..

I don’t care about absolute chunk size - just more than one pixel at a time….

Maintainer of image here. The question overlaps with internal work that we have not published yet, but this has been bugging me for a while, so I wrote matrix-slice when I had the time. You get independent mutable references to blocks of the buffer through the rows of these blocks. It's not fully fleshed out yet, but it is compatible with image's representation. I suppose some interaction would be good, but an external safety review is warranted before adding it, even as an optional dependency.

You can find more context in our old issue here: 2 dimensional slices · Issue #888 · image-rs/image · GitHub, which we had to close for prioritization, since it won't even be fully solved with custom DSTs, which are still a bit away, and GitHub · Where software is built for a different reiteration. Until then, at least that crate exists for the non-owning use cases.


Thanks, that is cool @197g - but I am only just starting with Rust so I don’t really appreciate the technicalities yet. It is reassuring to know that a maintainer of an important package (correct term?) is monitoring this forum. Great stuff!

Ok I have it working multi-threaded using Rayon. Cool.

Would anyone like to critique my code? Is there anything that I could have done better or more neatly?

use std::time::{SystemTime, UNIX_EPOCH};

fn main() {
    use image::{open};
//    use image::{open, Rgb};

    let image = open("bread.png").unwrap();
    let mut buffer = image.into_rgb8();
    let size = buffer.dimensions();
    let width = size.0;
    let height = size.1;
    
    println!("opened image {width},{height}");
    // Convert to grey the hard way and measure the time taken
    let start = SystemTime::now().duration_since(UNIX_EPOCH).unwrap();

    // The slowest way getPixel, modify, putPixel takes 111ms on my machine
    // for x in 0.. width {
    //     for y in 0 ..height {
    //         // use std::ops::Shl;
    //         let mut rgb = buffer.get_pixel(x, y);
    //         let r = rgb[0] as u16;
    //         let g = rgb[1] as u16;
    //         let b = rgb[2] as u16;
    //         let grey = ((r + g + b) / 3) as u8;
    //         buffer.put_pixel(x, y, Rgb([grey, grey, grey]));
    //     }
    // }

    // // Direct modification of pixels in situ takes 78ms on my machine
    // buffer.enumerate_pixels_mut().for_each(
    //     | pix | {
    //         let rgb = pix.2;
    //         let r = rgb[0] as u16;
    //         let g = rgb[1] as u16;
    //         let b = rgb[2] as u16;
    //         let grey = ((r + g + b) / 3) as u8;
    //         rgb[0] = grey;
    //         rgb[1] = grey;
    //         rgb[2] = grey;
    //     }
    // );

    // Parallel processing using Rayon to do the heavy lifting.  Takes 7ms on my machine.
    use rayon::prelude::*;
    let ppm = buffer.par_enumerate_pixels_mut();
    ppm.for_each(
        | pix | {
            let rgb = pix.2;
            let r = rgb[0] as u16;
            let g = rgb[1] as u16;
            let b = rgb[2] as u16;
            let grey = ((r + g + b) / 3) as u8;
            rgb[0] = grey;
            rgb[1] = grey;
            rgb[2] = grey;
        }
    );

    let end = SystemTime::now().duration_since(UNIX_EPOCH).unwrap();
    println!("Finished in {}ms", end.as_millis() - start.as_millis());
    use std::path::Path;
    buffer.save(&Path::new("out.png")).expect("Failed to write");
}