I have lots (tens of thousands) of media files, some of which are corrupted. I'm looking for a way to check their integrity in Rust, so I need a non-panicking way to parse them (and to validate that, given a resolution of w by h, there are indeed w*h pixels).
Those files are mostly jpeg, png, mp4, with a few files of other formats also present.
What solution would be the fastest? I'm considering
All else being equal, spawning a subprocess per file will always be slower than not doing that.
But you might not want to link against ffmpeg, and you might have trouble finding pure-Rust libraries for all the formats you want to process.
You could use a mixed strategy — check the extension or magic numbers, and then validate it using image or ffmpeg or whatever else can handle the format and is readily available to you.
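To make the magic-number idea concrete, here is a rough sketch that only uses std. The `Kind` enum and the `sniff` / `sniff_file` names are made up for illustration; the byte signatures are the standard PNG and JPEG ones, and the `ftyp` check is a loose test for ISO base media files (it will also match things like MOV or HEIC, so treat it as a routing hint, not proof that the file is MP4).

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Formats we know how to route to a validator (hypothetical enum).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Kind {
    Jpeg,
    Png,
    Mp4,
    Unknown,
}

/// Guess the format from the first bytes of a file.
pub fn sniff(header: &[u8]) -> Kind {
    // 8-byte PNG signature.
    const PNG_SIG: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];
    if header.starts_with(&PNG_SIG) {
        Kind::Png
    } else if header.starts_with(&[0xFF, 0xD8, 0xFF]) {
        // JPEG start-of-image marker followed by the next marker byte.
        Kind::Jpeg
    } else if header.len() >= 8 && &header[4..8] == b"ftyp" {
        // ISO base media file: 4-byte box size, then the "ftyp" box type.
        Kind::Mp4
    } else {
        Kind::Unknown
    }
}

/// Read at most 16 bytes from `path` and classify them.
pub fn sniff_file(path: &Path) -> io::Result<Kind> {
    let mut buf = Vec::with_capacity(16);
    File::open(path)?.take(16).read_to_end(&mut buf)?;
    Ok(sniff(&buf))
}
```

From there you would `match` on the result and hand the path to whichever decoder you have available for that format, with `Kind::Unknown` going to a fallback (or a "needs manual look" list).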
While the spawning overhead might be considerable, ffmpeg is just unreasonably faster than the image crate. Additionally, this looks like a one-shot thing, so you could link against ffmpeg or opencv (for opencv there's already a good crate that will do everything for you. idk about ffmpeg though, maybe write a bit of extern "C" yourself; you're not shipping it anyway).
Can you like take a small sample and try out everything? The task itself is trivial, there's just a lot of volume, so just trying all the options may not be that time-consuming.
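For the "try everything on a sample" approach, a tiny timing harness is enough. This is only a sketch: `Report` and `time_validator` are hypothetical names, and the closure you pass in stands for whichever candidate you are measuring (the image crate, an ffmpeg subprocess, and so on).

```rust
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// Outcome of running one validator over the sample (hypothetical type).
#[derive(Debug)]
pub struct Report {
    pub ok: usize,
    pub bad: usize,
    pub elapsed: Duration,
}

/// Run `validate` on every path in `sample` and record counts and wall time.
/// `validate` returns true when the file looks intact.
pub fn time_validator<F>(sample: &[PathBuf], mut validate: F) -> Report
where
    F: FnMut(&Path) -> bool,
{
    let start = Instant::now();
    let mut ok = 0;
    let mut bad = 0;
    for path in sample {
        if validate(path) {
            ok += 1;
        } else {
            bad += 1;
        }
    }
    Report { ok, bad, elapsed: start.elapsed() }
}
```

Run it once per candidate on the same few hundred files and compare both `elapsed` and the `bad` counts; if two validators disagree about which files are broken, that is worth knowing before you process the whole collection.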
Video is such a mess that you should probably just shell out to ffmpeg, yes. Video decoding is expensive enough that the "start the process" overhead is probably small enough to not care.
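Shelling out could look roughly like this. It assumes an `ffmpeg` binary is on `PATH` and uses the common "decode everything to the null muxer" invocation (`-v error -i FILE -f null -`). The function names are hypothetical, and treating any stderr output as a failure is my own assumption: as far as I know ffmpeg can report decode errors at that log level and still exit with status 0, so the exit code alone may not be enough.

```rust
use std::io;
use std::path::Path;
use std::process::{Command, Stdio};

/// Turn ffmpeg's exit status and stderr into a verdict.
/// Assumption: with `-v error`, anything printed to stderr means trouble.
pub fn verdict(exit_ok: bool, stderr: &str) -> Result<(), String> {
    let msg = stderr.trim();
    if exit_ok && msg.is_empty() {
        Ok(())
    } else if msg.is_empty() {
        Err("ffmpeg failed without a message".to_string())
    } else {
        Err(msg.to_string())
    }
}

/// Fully decode `path` with ffmpeg and discard the output.
/// The outer `io::Result` covers failure to spawn the process at all.
pub fn ffmpeg_check(path: &Path) -> io::Result<Result<(), String>> {
    let output = Command::new("ffmpeg")
        .args(["-v", "error", "-nostdin", "-i"])
        .arg(path)
        .args(["-f", "null", "-"])
        .stdin(Stdio::null())
        .output()?;
    let stderr = String::from_utf8_lossy(&output.stderr);
    Ok(verdict(output.status.success(), &stderr))
}
```

Since each call blocks on a full decode, the easy win for tens of thousands of files is to run several of these at once from a small thread pool rather than to worry about the spawn cost.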