Announcing Hole-Punch: A dead simple library for finding holes in sparse files

I've been working with sparse files a lot recently, and noticed that support for them is a bit of a gap in the Rust ecosystem.

hole-punch is a library that gives you a dead simple way to find out which parts of a file have data in them and which are holes.

It aims to be cross-platform, but currently only supports *nix. Windows support is in the works, however, and should hopefully be published in the next few days (depending on how much time I have in front of my Windows computer to test).

It provides an extension trait to File that currently has a single method, which returns a list of segments in the file, each identified as either containing data or being a hole. I plan on adding a fallback "check every byte to find chunks of zeros" method, and a method that tries checking for sparsity at the filesystem level first, then automatically fails over to the fallback method.
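For anyone curious what the planned "check every byte" fallback could look like, here is a rough sketch. It is not code from the crate: `Kind`, `Segment`, and `scan_zero_blocks` are names made up for this post, and the block-at-a-time granularity is an assumption. Note that a run of zeros found this way is only a candidate hole; the filesystem may still have allocated those blocks.

```rust
use std::io::{self, Read};

/// Hypothetical segment kinds (not the crate's actual types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Data,
    Hole,
}

/// A half-open byte range `start..end` tagged with what it contains.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Segment {
    pub kind: Kind,
    pub start: u64,
    pub end: u64,
}

/// Reads `reader` in blocks of `block_size` bytes and reports all-zero
/// blocks as holes, merging adjacent blocks of the same kind.
pub fn scan_zero_blocks<R: Read>(mut reader: R, block_size: usize) -> io::Result<Vec<Segment>> {
    assert!(block_size > 0, "block_size must be non-zero");
    let mut segments: Vec<Segment> = Vec::new();
    let mut buf = vec![0u8; block_size];
    let mut offset = 0u64;
    loop {
        // Fill the buffer as far as possible; a short read is not EOF.
        let mut filled = 0;
        while filled < block_size {
            match reader.read(&mut buf[filled..]) {
                Ok(0) => break,
                Ok(n) => filled += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }
        if filled == 0 {
            break;
        }
        let kind = if buf[..filled].iter().all(|&b| b == 0) {
            Kind::Hole
        } else {
            Kind::Data
        };
        let end = offset + filled as u64;
        // Extend the previous segment if it has the same kind.
        match segments.last_mut() {
            Some(last) if last.kind == kind => last.end = end,
            _ => segments.push(Segment { kind, start: offset, end }),
        }
        offset = end;
        if filled < block_size {
            break; // partial block means we hit EOF
        }
    }
    Ok(segments)
}
```

Anything that implements `Read` works as input, so a `File` or a plain `&[u8]` can be scanned the same way. Picking a `block_size` that matches the filesystem block size would probably make the result line up better with what the filesystem could actually store as a hole.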

https://crates.io/crates/hole-punch

Little update: It has Windows support now.
If there are any macOS gurus out there, I would greatly appreciate your input on how getting information about file sparsity works on macOS with APFS, as I do not actually have a Mac.

You might consider declaring sparse file handling on macOS a non-goal due to strange/lacking support for it.

From what I have seen, APFS definitely has full-blown support for sparse files; it just seems like Apple has not fully updated the userland tools to be aware of them, which is honestly sort of similar to the situation on Windows. Windows does go the extra mile, though, and intercepts writes to sparse files to discard zero-only writes, mitigating the impact of programs that aren't aware of their existence.

Inspection of some programs I have found that claim to support sparse files on APFS seems to indicate that it is done through the semi-standard *nix way of calling lseek with SEEK_DATA/SEEK_HOLE, in which case this library will work out of the box, but I really need someone with a Mac to test.
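For reference, the lseek approach boils down to something like the sketch below. This is not the crate's code: `seek_whence`, `next_data`, and `next_hole` are names invented for this post, the `lseek` binding is declared by hand instead of going through the libc crate, and the constants are the Linux values (3 and 4), so the real path is only compiled on 64-bit Linux. Other platforms may use different values, which is exactly the part that needs checking on a Mac.

```rust
use std::fs::File;
use std::io;

// Linux values for the lseek `whence` argument; assumed, not portable.
pub const SEEK_DATA: i32 = 3;
pub const SEEK_HOLE: i32 = 4;

/// Calls lseek(2) with the given `whence` and returns the resulting offset.
/// Note that this moves the file's current position as a side effect.
#[cfg(all(target_os = "linux", target_pointer_width = "64"))]
fn seek_whence(file: &File, offset: u64, whence: i32) -> io::Result<u64> {
    use std::os::unix::io::AsRawFd;
    unsafe extern "C" {
        // off_t is 64 bits wide on 64-bit Linux.
        fn lseek(fd: i32, offset: i64, whence: i32) -> i64;
    }
    let pos = unsafe { lseek(file.as_raw_fd(), offset as i64, whence) };
    if pos < 0 {
        // ENXIO here means there is no further data (or hole) past `offset`.
        Err(io::Error::last_os_error())
    } else {
        Ok(pos as u64)
    }
}

/// Placeholder for platforms this sketch does not cover.
#[cfg(not(all(target_os = "linux", target_pointer_width = "64")))]
fn seek_whence(_file: &File, _offset: u64, _whence: i32) -> io::Result<u64> {
    Err(io::Error::new(
        io::ErrorKind::Unsupported,
        "SEEK_DATA/SEEK_HOLE are not wired up for this platform in the sketch",
    ))
}

/// Offset of the first byte of data at or after `offset`.
pub fn next_data(file: &File, offset: u64) -> io::Result<u64> {
    seek_whence(file, offset, SEEK_DATA)
}

/// Offset of the first hole at or after `offset` (end of file counts as a hole).
pub fn next_hole(file: &File, offset: u64) -> io::Result<u64> {
    seek_whence(file, offset, SEEK_HOLE)
}
```

Walking a file is then a loop that alternates `next_data` and `next_hole` from the last returned offset until `next_data` fails with ENXIO. On filesystems without real hole tracking, Linux reports the whole file as a single data run followed by the implicit hole at end of file.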

If you give me some steps to follow, I'd be happy to test it. I have an up-to-date Mac.

  1. Clone https://gitlab.com/asuran-rs/hole-punch
  2. cargo test

The test suite attempts to create sparse files, then tries to SEEK_HOLE/SEEK_DATA through them on *nix platforms. If those calls don't succeed, then it should blow up with ScanError::UnsupportedFileSystem errors.
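To give an idea of what "create sparse files" means in practice, here is a minimal sketch of one common way to do it with just the standard library. It is not the crate's actual test code, and `write_sparse_candidate` is a made-up name. Seeking past the end of a fresh file and writing there leaves a gap that reads back as zeros; whether that gap is really stored as a hole is up to the filesystem (and on Windows a file generally has to be marked sparse first).

```rust
use std::fs::File;
use std::io::{self, Seek, SeekFrom, Write};
use std::path::Path;

/// Creates `path` with an unwritten gap of `hole_len` bytes followed by
/// `tail`, and returns the file's logical length.
pub fn write_sparse_candidate(path: &Path, hole_len: u64, tail: &[u8]) -> io::Result<u64> {
    let mut file = File::create(path)?;
    // Skip over the region we want to leave unwritten.
    file.seek(SeekFrom::Start(hole_len))?;
    file.write_all(tail)?;
    file.sync_all()?;
    // Logical length counts the gap, even if no blocks back it on disk.
    Ok(file.metadata()?.len())
}
```

The returned length is always `hole_len` plus the tail length; the on-disk allocation is what differs between filesystems, which is why the scan can legitimately come back with an unsupported-filesystem error.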

Of course it may not even build, but that would also be useful information.

Tests fail on HFS+ (OS X 10.12.6, pre-APFS; yes, I'm vulnerable to all the things): https://clbin.com/TXsg6

According to https://en.wikipedia.org/wiki/Sparse_file:

Apple's HFS+ does not provide for sparse files, but in OS X, the virtual file system layer supports storing them in any supported file system, including HFS+[citation needed]. Apple File System (APFS), announced in June 2016 at WWDC, also supports them.

I'll give it another go once I upgrade to 10.15, unless someone else wants to run it on APFS?

After applying this patch:

diff --git a/src/lib.rs b/src/lib.rs
index df11353..579f281 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -16,6 +16,7 @@ cfg_if::cfg_if! {
     if #[cfg(any(target_os = "linux",
                  target_os = "android",
                  target_os = "freebsd",
+                 target_os = "macos",
     ))]{
         mod unix;
     } else if #[cfg(windows)] {

the build fails with:

error[E0432]: unresolved imports `libc::SEEK_DATA`, `libc::SEEK_HOLE`

https://clbin.com/ADVfE

So.. boo

Continued at https://gitlab.com/asuran-rs/hole-punch/-/issues/1
