Efficient way to concatenate/append images in Rust

Assuming I have 4 small image buffers with the size 1024x1024, like:

let mut img_buf = <ImageBuffer<Rgba<u8>, _>>::new(1024, 1024);

I need to concatenate them in a way to make a 2048x2048 image.
Creating a new 2048x2048 image buffer and iterating through every pixel works, but it is computationally expensive. I want to do this for many large images, so I wonder if there is a better way.
Basically, here is the trivial way:

use image::{ImageBuffer, Rgba};

let mut img_buf_b = <ImageBuffer<Rgba<u8>, _>>::new(2048, 2048);
let mut img_buf_s_1 = <ImageBuffer<Rgba<u8>, _>>::new(1024, 1024);
let mut img_buf_s_2 = <ImageBuffer<Rgba<u8>, _>>::new(1024, 1024);
let mut img_buf_s_3 = <ImageBuffer<Rgba<u8>, _>>::new(1024, 1024);
let mut img_buf_s_4 = <ImageBuffer<Rgba<u8>, _>>::new(1024, 1024);
// Do something
// Fill each pixel of the big image from the matching small image (i = x, j = y).
for i in 0..2048 {
    for j in 0..2048 {
        if i < 1024 && j < 1024 {
            img_buf_b.put_pixel(i, j, *img_buf_s_1.get_pixel(i, j));
        } else if i < 1024 && j >= 1024 {
            img_buf_b.put_pixel(i, j, *img_buf_s_2.get_pixel(i, j - 1024));
        } else if i >= 1024 && j < 1024 {
            img_buf_b.put_pixel(i, j, *img_buf_s_3.get_pixel(i - 1024, j));
        } else {
            img_buf_b.put_pixel(i, j, *img_buf_s_4.get_pixel(i - 1024, j - 1024));
        }
    }
}

So, is there a more efficient way to append/concatenate the images?
Thank you.

Perhaps you can tell us something about the ImageBuffer type?

Without knowing the types, I'd guess you want get_pixel rather than get_pixel_mut, since you aren't modifying your inputs.

Oh yes, let me edit that.
ImageBuffer is an image representation parameterized over the pixel type and its storage container. It comes from the image-rs crate, a Rust image processing library.

Try this:

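// copy_from is a method of the GenericImage trait, so that trait must be in scope.
// The last two arguments are the (x, y) position of the copied image's top-left corner.
img_buf_b.copy_from(&img_buf_s_1, 0, 0);
img_buf_b.copy_from(&img_buf_s_2, 0, 1024);
img_buf_b.copy_from(&img_buf_s_3, 1024, 0);
img_buf_b.copy_from(&img_buf_s_4, 1024, 1024);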

Also note that in general it is faster to iterate through an ImageBuffer row-wise rather than column-wise. Your original code might be faster if you switch the order of the i and j loops so that the inner loop writes one row instead of one column. Using the ImageBuffer::rows and rows_mut iterators can help, too.
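
To show why row-wise copying helps, here is a minimal sketch that uses only the standard library. It works on plain row-major RGBA byte buffers (the layout ImageBuffer uses for its underlying storage) rather than on ImageBuffer itself. The names blit_rows, tile_2x2 and CHANNELS are made up for this example and are not part of the image crate. Each source row is contiguous in memory, so it can be copied in one copy_from_slice call instead of pixel by pixel.

```rust
/// Bytes per RGBA8 pixel (assumption: 8-bit RGBA, as in the question).
const CHANNELS: usize = 4;

/// Copies `src` (src_w x src_h pixels, row-major RGBA8) into `dst`
/// (dst_w pixels wide) with its top-left corner at (x0, y0).
/// Hypothetical helper for illustration; not an image crate function.
fn blit_rows(
    dst: &mut [u8],
    dst_w: usize,
    src: &[u8],
    src_w: usize,
    src_h: usize,
    x0: usize,
    y0: usize,
) {
    assert!(x0 + src_w <= dst_w, "tile does not fit horizontally");
    assert_eq!(src.len(), src_w * src_h * CHANNELS, "unexpected source size");
    assert!(
        dst.len() >= (y0 + src_h) * dst_w * CHANNELS,
        "tile does not fit vertically"
    );

    let row_bytes = src_w * CHANNELS;
    if row_bytes == 0 {
        return; // nothing to copy
    }
    // One contiguous copy per source row.
    for (row, src_row) in src.chunks_exact(row_bytes).enumerate() {
        let start = ((y0 + row) * dst_w + x0) * CHANNELS;
        dst[start..start + row_bytes].copy_from_slice(src_row);
    }
}

/// Tiles four square tiles of side `tile` into a 2x2 grid, using the same
/// (x, y) offsets as the copy_from calls above.
fn tile_2x2(tiles: [&[u8]; 4], tile: usize) -> Vec<u8> {
    let big = tile * 2;
    let mut out = vec![0u8; big * big * CHANNELS];
    let offsets = [(0, 0), (0, tile), (tile, 0), (tile, tile)];
    for (src, (x0, y0)) in tiles.iter().zip(offsets) {
        blit_rows(&mut out, big, src, tile, tile, x0, y0);
    }
    out
}
```

With real images, the byte slices could come from ImageBuffer::as_raw and the result could be wrapped again with ImageBuffer::from_raw. Whether this beats copy_from in practice depends on the crate version, so it is worth benchmarking both.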