I've been working with sparse files a lot recently, and noticed that this is a bit of a gap in the Rust ecosystem.
hole-punch is a library that gives you a dead simple way to find out which parts of a file have data in them and which are holes.
It aims to be cross-platform, but currently only supports *nix. Windows support is in the works and should hopefully be published in the next few days (depending on how much time I have in front of my Windows computer to test).
It provides an extension trait to File that, at the moment, has a single method. That method returns a list of segments in the file, with each segment identified as either containing data or being a hole. I plan on adding a fallback "check every byte to find chunks of zeros" method, plus a method that tries checking for sparsity at the filesystem level first, then automatically fails over to the fallback method.
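To make the planned "check every byte" fallback concrete, here is a minimal sketch of how such a scan could look. It is not hole-punch's actual code: the `Segment` and `SegmentKind` types, the `scan_zero_blocks` function, and the `block_size` parameter are all made up for illustration. It also works at block granularity rather than byte granularity, so a short run of zeros inside an otherwise non-zero block is still reported as data.

```rust
use std::io::{self, Read};

/// Hypothetical segment kinds for this sketch; not hole-punch's real API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SegmentKind {
    Data,
    Hole,
}

/// A half-open byte range `[start, end)` tagged as data or hole.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Segment {
    pub kind: SegmentKind,
    pub start: u64,
    pub end: u64,
}

/// Reads `reader` to the end and reports all-zero blocks as holes.
/// This only inspects content; it cannot tell a real hole from zeros
/// that are actually stored on disk.
pub fn scan_zero_blocks<R: Read>(mut reader: R, block_size: usize) -> io::Result<Vec<Segment>> {
    assert!(block_size > 0, "block_size must be non-zero");
    let mut buf = vec![0u8; block_size];
    let mut segments: Vec<Segment> = Vec::new();
    let mut offset = 0u64;

    loop {
        // Fill one block, tolerating short reads. `filled` ends up
        // smaller than `block_size` only at end of input.
        let mut filled = 0;
        while filled < block_size {
            match reader.read(&mut buf[filled..]) {
                Ok(0) => break,
                Ok(n) => filled += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }
        if filled == 0 {
            break;
        }

        let kind = if buf[..filled].iter().all(|&b| b == 0) {
            SegmentKind::Hole
        } else {
            SegmentKind::Data
        };
        let end = offset + filled as u64;

        // Extend the previous segment when the kind is unchanged,
        // otherwise start a new one.
        let extended = match segments.last_mut() {
            Some(last) if last.kind == kind => {
                last.end = end;
                true
            }
            _ => false,
        };
        if !extended {
            segments.push(Segment { kind, start: offset, end });
        }

        offset = end;
        if filled < block_size {
            break; // a short block means we hit end of input
        }
    }
    Ok(segments)
}
```

Anything that implements `Read` can be passed in, for example `scan_zero_blocks(File::open(path)?, 4096)`. Taking a generic reader instead of a `File` is just a choice for this sketch so it can be tried on in-memory byte slices.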
Little update: It has Windows support now.
If there are any macOS gurus out there, I would greatly appreciate your input on how getting information about file sparsity works on macOS with APFS, as I do not actually have a Mac.
From what I have seen, APFS definitely has full-blown support for sparse files. It just seems like Apple has not fully updated the userland tools to be aware of them, which is honestly sort of similar to the situation on Windows. Windows does, however, go the extra mile and intercept writes to sparse files to discard zero-only writes, which mitigates the impact of programs that aren't aware of sparse files.
Inspection of some programs I have found that claim to support sparse files on APFS seems to indicate that it is done through the semi-standard *nix way of calling lseek with SEEK_DATA/SEEK_HOLE, in which case this library will work out of the box, but I really need someone with a Mac to test.
The test suite attempts to create sparse files, then tries to SEEK_HOLE/SEEK_DATA through them on *nix platforms. If those calls don't succeed, then it should blow up with ScanError::UnsupportedFileSystem errors.
Of course it may not even build, but that would also be useful information.
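For anyone who wants a file to test against, this is roughly how a sparse file can be produced with nothing but the standard library. It is a sketch and not necessarily how the test suite does it; `write_with_gap` is a name I made up for illustration. Whether the skipped range really becomes a hole depends on the filesystem, and on Windows a file generally has to be marked sparse first, so treat the result as "hopefully sparse".

```rust
use std::fs::File;
use std::io::{self, Seek, SeekFrom, Write};
use std::path::Path;

/// Writes `head`, skips `gap` bytes without writing them, then writes
/// `tail`. Returns the resulting file length in bytes.
///
/// Hypothetical helper for illustration only. If `tail` is empty,
/// nothing is written after the seek and the file is not extended, so
/// the returned length is only accurate for a non-empty `tail`.
pub fn write_with_gap(path: &Path, head: &[u8], gap: u64, tail: &[u8]) -> io::Result<u64> {
    let mut file = File::create(path)?;
    file.write_all(head)?;

    // Seeking past the current end and then writing leaves the skipped
    // range unwritten. Filesystems with sparse file support can store
    // that range as a hole; others fill it with zeros on disk.
    let tail_start = head.len() as u64 + gap;
    file.seek(SeekFrom::Start(tail_start))?;
    file.write_all(tail)?;

    Ok(tail_start + tail.len() as u64)
}
```

Either way, reading the skipped range back yields zeros, so comparing the logical length with the space actually allocated (for example with `du` on *nix) is the quickest way to see whether a hole was really created.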
Tests fail on HFS+ (OS X 10.12.6, pre-APFS. Yes, I'm vulnerable to all the things): https://clbin.com/TXsg6
According to https://en.wikipedia.org/wiki/Sparse_file:
Apple's HFS+ does not provide for sparse files, but in OS X, the virtual file system layer supports storing them in any supported file system, including HFS+[citation needed]. Apple File System (APFS), announced in June 2016 at WWDC, also supports them.
I'll give it another go once I upgrade to 10.15, unless someone else wants to run it on APFS?