Reading from a TAR file as a File/Cursor

Hi. I'm trying to use the tar-rs library in what might be a unique (or crazy!) manner. Long story short, I have a TAR file that contains several files, and I need to feed one of them into a partial decompression engine. If I give the engine a File (or, more accurately, anything that implements the Read and Seek traits), it's happy. (I'm guessing a Cursor would be okay too.) I know I can extract the compressed file from the TAR file into its own file. The problem is that some of these files may be several gigabytes in size. Therefore, I'd like to come up with a scheme that takes the Entry and turns it into something that implements Read and Seek without copying it out first. Maybe my Rust-fu isn't good enough, but I went over the tar-rs code and couldn't find anything that looked like it'd work. I also couldn't find any wrappers that accomplish what I'm seeking.


Just browsing docs, but it looks like you want to:

1. Start with a reader that can seek.
2. Make that into an Archive.
3. Get ahold of the desired Entry.
4. Check the Header to make sure you have a non-sparse file entry.
5. Get the entry size.
6. Get the raw_file_position.
7. Recover your underlying reader.
8. Seek to the position.
9. Take the entry size.

Then you'll have a Read + Seek that should correspond to the file in the tar.
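The last two steps (seek to the entry's position, then take its size) can be sketched with the standard library alone. This is only an illustration under stated assumptions: `take_entry` is a hypothetical helper, and the `offset` and `size` arguments stand in for the values you would get from the entry's raw_file_position and size. Note that the result is a `std::io::Take`, which implements Read but not Seek.

```rust
use std::io::{self, Read, Seek, SeekFrom, Take};

/// Hypothetical helper: position `reader` at `offset` and limit it to `size` bytes.
/// `offset` and `size` are assumed to come from the tar entry
/// (its raw file position and its size).
fn take_entry<R: Read + Seek>(mut reader: R, offset: u64, size: u64) -> io::Result<Take<R>> {
    // Jump to the first byte of the entry's data inside the archive.
    reader.seek(SeekFrom::Start(offset))?;
    // Cap reads at `size` bytes so we never run into the next entry.
    Ok(reader.take(size))
}
```

Reading the returned value to the end yields exactly the entry's bytes, without copying the entry to a separate file first.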



Thanks! That almost got me there. The fly in the ointment appears to be the fact that Take doesn't implement the Seek trait. I tried to use a Cursor wrapper but it requires whatever's placed in it to implement the AsRef<[u8]> trait, which Take doesn't do. Uggh! Thoughts?

Yeah, thinking about it, there's no universally unsurprising implementation of Seek for Take, and even if it had one, it wouldn't involve pretending the current position was the start.

You need something that takes both an ending and beginning position and an underlying Read + Seek, and acts like that's the entire "file" both in terms of reading and seeking. It would have to translate seek positions for all the possible seek strategies, e.g. not allowing seeking before the beginning of the entry. Such a thing might exist as a crate, or you could create your own.

Here's a POC that could use better errors and needs a lot more testing. And find_it should consume the reader on failure. But if you don't find anything in the ecosystem, maybe it's a good start.
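For readers without access to that POC, here is an independent, minimal sketch of the kind of wrapper described above. It is not the code from the reply, and it does not touch tar-rs at all: `EntryWindow` is a hypothetical name, and `start` and `len` are assumed to be the entry's raw file position and size. All three seek strategies are translated to window-relative positions, and seeking before the start of the window is rejected.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical wrapper: presents bytes [start, start + len) of `inner`
/// as if they were a complete file of their own.
struct EntryWindow<R> {
    inner: R,
    start: u64, // absolute offset of the window in `inner`
    len: u64,   // length of the window in bytes
    pos: u64,   // current position, relative to `start`
}

impl<R: Read + Seek> EntryWindow<R> {
    fn new(mut inner: R, start: u64, len: u64) -> io::Result<Self> {
        inner.seek(SeekFrom::Start(start))?;
        Ok(EntryWindow { inner, start, len, pos: 0 })
    }
}

impl<R: Read + Seek> Read for EntryWindow<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Never read past the end of the window.
        let remaining = self.len.saturating_sub(self.pos);
        let max = remaining.min(buf.len() as u64) as usize;
        if max == 0 {
            return Ok(0);
        }
        let n = self.inner.read(&mut buf[..max])?;
        self.pos += n as u64;
        Ok(n)
    }
}

impl<R: Read + Seek> Seek for EntryWindow<R> {
    fn seek(&mut self, from: SeekFrom) -> io::Result<u64> {
        // Compute the window-relative target in i128 so nothing overflows.
        let target = match from {
            SeekFrom::Start(n) => n as i128,
            SeekFrom::End(d) => self.len as i128 + d as i128,
            SeekFrom::Current(d) => self.pos as i128 + d as i128,
        };
        let invalid = || io::Error::new(io::ErrorKind::InvalidInput, "seek outside of entry");
        // Rejects negative targets (before the start of the window).
        let target = u64::try_from(target).map_err(|_| invalid())?;
        let absolute = self.start.checked_add(target).ok_or_else(invalid)?;
        self.inner.seek(SeekFrom::Start(absolute))?;
        self.pos = target;
        Ok(target)
    }
}
```

As with a regular file, this sketch allows seeking past the end of the window; subsequent reads then return zero bytes. It assumes nothing else moves the underlying reader while the wrapper owns it.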


Thanks! I'll play with this ASAP and see what comes up.
