Testing output pdfs

I have a program that produces PDFs, and I'd like to write tests for that output. I care about the "look" of the PDFs, but I don't really mind the internal structure. E.g., I'd like to detect if for some reason a wrong font snuck in, or if the spacing changed noticeably. I'd be fine with occasional false positives, but I'd like to avoid false negatives.

I'm pondering using https://docs.rs/image-compare/latest/image_compare or something similar, which would entail converting the PDFs to images, which is OK.

Any better ideas? Is anyone doing this who can share their experience? Thanks in advance!

Why don't you render those PDFs into images and compare them? You can automate it with tools like Playwright.

Yeah, that was my idea with image_compare, I just wasn't really sure if that's a good idea or if there are caveats I'm not seeing... Not sure about Playwright though, it's not a web app. Or am I missing something?

It seems you want to compare rendered outputs, and you can do that. Playwright is a browser automation toolkit, and for historical reasons web browsers are the most practical way to produce images from PDFs programmatically.

Oh OK, thanks for that, but I can already generate those images programmatically without resorting to a browser, so that's fine.

I was more wondering about pitfalls in image comparison, I guess: which algorithm/similarity measure to use and how well that will work... I'll give crates.io: Rust Package Registry a shot, I think.
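
For reference, here is a minimal sketch of one simple similarity measure, assuming each page has already been rendered to an 8-bit grayscale buffer of identical dimensions. It is not the image_compare API; the names `PageDiff`, `compare_gray` and `pages_match` are made up for illustration, and it uses only the standard library. It reports both a global RMS difference and the fraction of pixels that changed by more than a per-pixel tolerance, because a global average alone can hide a small local change such as a single swapped glyph.

```rust
/// Summary of how two equally sized 8-bit grayscale pages differ.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
pub struct PageDiff {
    /// Root-mean-square pixel difference, normalized to 0.0..=1.0.
    pub rms: f64,
    /// Fraction of pixels whose absolute difference exceeds the tolerance.
    pub changed_fraction: f64,
}

/// Compares two grayscale buffers (one byte per pixel).
/// Returns `None` if the buffers are empty or differ in length,
/// which a test should treat as a failure.
pub fn compare_gray(reference: &[u8], candidate: &[u8], pixel_tolerance: u8) -> Option<PageDiff> {
    if reference.is_empty() || reference.len() != candidate.len() {
        return None;
    }
    let mut sum_sq = 0.0f64;
    let mut changed = 0usize;
    for (&a, &b) in reference.iter().zip(candidate) {
        let d = a.abs_diff(b);
        if d > pixel_tolerance {
            changed += 1;
        }
        let nd = f64::from(d) / 255.0;
        sum_sq += nd * nd;
    }
    let n = reference.len() as f64;
    Some(PageDiff {
        rms: (sum_sq / n).sqrt(),
        changed_fraction: changed as f64 / n,
    })
}

/// Passes only if both measures stay within their limits.
/// The limits are assumptions to be tuned per project.
pub fn pages_match(diff: &PageDiff, max_rms: f64, max_changed_fraction: f64) -> bool {
    diff.rms <= max_rms && diff.changed_fraction <= max_changed_fraction
}
```

In a test this could look like `assert!(pages_match(&compare_gray(&expected, &actual, 8).unwrap(), 0.01, 0.001))`, where the three thresholds are placeholders. Since false positives are acceptable here but false negatives are not, it seems safer to start with tight limits and loosen them only when the renderer's own noise makes tests fail.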

Thanks!
