Remove large white strips between images?

I work for a newspaper. They used to put white strips between articles to make the paper look bigger, but now that they have stopped doing that, they want me to go and remove these spaces from images and separate articles. Is there a way to do this using an image library? I have tried Python with some success, but it was quite slow. I have to do this for more than 100k articles, so I am hoping there is a way to do this in Rust.

I can’t really help you, but I can recommend the image crate.

I like the image crate, too. But it only offers very simple image manipulation operations. I don't think you can use the image crate to remove a part of an image as shown in @rustusaodic's example. Personally, if I were hell-bent on using Rust, I'd look into some sort of canvas implementation for rearranging the articles, like rust-skia.
I don't see why using Python is such a bad idea, though. It offers bindings for OpenCV, which is a powerful and very fast image processing library.

I did not know about OpenCV. I'll use Python, thank you :slight_smile:

If you have any ML experience, this feels like a job for an object detection model like YOLO. Something like poly-yolo might suit if the bits you want to keep aren't perfectly rectangular; otherwise YOLOv3 could also work well for you.

I'm no ML engineer so I wouldn't know how to do the training, but there should be loads of tutorials on the internet using TensorFlow and Python.

Obviously, there are going to be analytical solutions which are potentially faster than ML, but often training a model is faster to develop than writing your own code that has to deal with edge cases manually.
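
To give a rough idea of what such an analytical approach could look like, here is a minimal sketch in plain Rust. It assumes the page has already been decoded into a row-major 8-bit grayscale buffer (decoding the file is left out), and that the strips are horizontal and span the full width. The function name `remove_white_strips` and the parameters `min_strip` and `white` are made up for illustration and would need tuning on real scans.

```rust
/// Drops every run of near-white rows that is at least `min_strip` rows tall.
/// Shorter blank runs (for example normal line spacing) are kept untouched.
///
/// `pixels` is a row-major grayscale buffer with `width` bytes per row.
/// A row counts as blank when every pixel in it is >= `white`.
fn remove_white_strips(pixels: &[u8], width: usize, min_strip: usize, white: u8) -> Vec<u8> {
    assert!(width > 0 && pixels.len() % width == 0, "buffer must hold whole rows");

    let rows: Vec<&[u8]> = pixels.chunks_exact(width).collect();
    let is_blank = |row: &[u8]| row.iter().all(|&p| p >= white);

    let mut out = Vec::with_capacity(pixels.len());
    let mut y = 0;
    while y < rows.len() {
        if is_blank(rows[y]) {
            // Measure how tall this run of blank rows is.
            let start = y;
            while y < rows.len() && is_blank(rows[y]) {
                y += 1;
            }
            // Keep the run only if it is too short to be a padding strip.
            if y - start < min_strip {
                for row in &rows[start..y] {
                    out.extend_from_slice(row);
                }
            }
        } else {
            out.extend_from_slice(rows[y]);
            y += 1;
        }
    }
    out
}
```

The height of the result is `out.len() / width`. This is a single pass over the pixels, so it should be cheap per page, but it does nothing about the edge cases mentioned above, such as skewed scans, noisy backgrounds, or vertical strips.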


What format are the "articles" in? What file type?