New crate: subdiff


subdiff is a replacement for diff that can filter out parts of each line before comparison, based on user-provided regular expressions. I found myself wishing for this every few months, so I’m hoping it will be useful to someone else too.

I’m quite happy with how some of the more advanced features have worked out, e.g. the automatic inference of character classes for context lines that did change, but only in parts that the user-supplied regex ignores:

$ subdiff --context-format=cc  -r "^.*:.*:.*:(.*)" boot.old
@@ -49,26 +51,27 @@
 \a+ \d+ \d+:\d+:\d+ \w+ kernel: x86/PAT: MTRRs disabled, skipping PAT initialization too.
 \a+ \d+ \d+:\d+:\d+ \w+ kernel: x86/PAT: Configuration [0-7]: WB  WT  UC- UC  WC  WP  UC  UC  
 \a+ \d+ \d+:\d+:\d+ \w+ kernel: e820: last_pfn = 0x9cf00 max_arch_pfn = 0x400000000
-Nov 09 20:45:39 localhost kernel: Scanning 1 areas for low memory corruption

More examples in the README file.
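
To give a feel for the character-class inference in the output above, here is a rough, std-only sketch of the idea. It is not the code from the crate: `tokenize`, `class_for` and `infer_context` are names made up for this post, and it assumes the two lines have already been judged equal by the user-supplied regex and share the same token structure.

```rust
/// Split a line into maximal ASCII-alphanumeric runs and single separator chars.
fn tokenize(line: &str) -> Vec<&str> {
    let mut tokens = Vec::new();
    let mut start = None; // byte offset where the current alphanumeric run began
    for (i, c) in line.char_indices() {
        if c.is_ascii_alphanumeric() {
            if start.is_none() {
                start = Some(i);
            }
        } else {
            if let Some(s) = start.take() {
                tokens.push(&line[s..i]);
            }
            tokens.push(&line[i..i + c.len_utf8()]);
        }
    }
    if let Some(s) = start {
        tokens.push(&line[s..]);
    }
    tokens
}

fn is_word(tok: &str) -> bool {
    tok.chars().all(|c| c.is_ascii_alphanumeric())
}

/// Narrowest class (in the notation of the output above) covering both tokens.
fn class_for(a: &str, b: &str) -> &'static str {
    let mut chars = a.chars().chain(b.chars());
    if chars.clone().all(|c| c.is_ascii_digit()) {
        r"\d+"
    } else if chars.all(|c| c.is_ascii_alphabetic()) {
        r"\a+"
    } else {
        r"\w+"
    }
}

/// Merge two versions of a line: equal tokens stay literal, differing word
/// tokens become a character class. Returns None if the structure differs.
fn infer_context(old: &str, new: &str) -> Option<String> {
    let (a, b) = (tokenize(old), tokenize(new));
    if a.len() != b.len() {
        return None;
    }
    let mut out = String::new();
    for (x, y) in a.iter().zip(&b) {
        if x == y {
            out.push_str(x);
        } else if is_word(x) && is_word(y) {
            out.push_str(class_for(x, y));
        } else {
            return None;
        }
    }
    Some(out)
}

fn main() {
    let old = "Nov 09 20:45:39 localhost kernel: x86/PAT: ok";
    let new = "Dec 10 08:01:02 myhost1 kernel: x86/PAT: ok";
    // prints: \a+ \d+ \d+:\d+:\d+ \w+ kernel: x86/PAT: ok
    println!("{}", infer_context(old, new).unwrap());
}
```

The sketch picks the narrowest class that covers both versions of a token, which is why a hostname containing a digit ends up as `\w+` rather than `\a+`.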

The tests exhaustively compare subdiff’s output against that of GNU diff for all combinations of a set of lines up to a certain length, and verify that the basic functionality (selecting parts of the line and presenting the differences) works as expected. That said, I wasn’t shy about adding options, and no amount of manual testing can compete with users for breadth of coverage, so here we are :slight_smile:
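
For the curious, the exhaustive part boils down to something like the sketch below. This is a simplified illustration rather than the actual test code: `sequences` is a hypothetical helper, and I am reading "all combinations" as every sequence of lines (with repetition) up to a maximum length, paired with every other such sequence.

```rust
/// All sequences of length 0..=max_len drawn from `alphabet` (repetition allowed).
fn sequences<'a>(alphabet: &[&'a str], max_len: usize) -> Vec<Vec<&'a str>> {
    let mut all: Vec<Vec<&'a str>> = vec![Vec::new()];
    let mut frontier: Vec<Vec<&'a str>> = vec![Vec::new()];
    for _ in 0..max_len {
        let mut next = Vec::new();
        for seq in &frontier {
            for &line in alphabet {
                // Extend every sequence of the previous length by one line.
                let mut longer = seq.clone();
                longer.push(line);
                next.push(longer);
            }
        }
        all.extend(next.iter().cloned());
        frontier = next;
    }
    all
}

fn main() {
    let alphabet = ["a", "b"];
    let files = sequences(&alphabet, 2);
    let mut pairs = 0;
    for old in &files {
        for new in &files {
            // A real test would write `old` and `new` to files, run both
            // tools on them and compare the outputs; here we only count.
            let _ = (old, new);
            pairs += 1;
        }
    }
    // prints: 7 inputs, 49 pairs
    println!("{} inputs, {} pairs", files.len(), pairs);
}
```

The number of inputs grows exponentially with the maximum length, so the alphabet and length have to stay small for this to remain practical.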

Many thanks to the authors of the regex and lcs-diff crates (along with the rest of the ecosystem of course) for making the process of building this utility both possible and enjoyable.