[Solved] Tar-rs: 'failed to iterate over archive' when unpacking tar

I’m currently getting an error from tar-rs when attempting to unpack a tarball. The documentation for the unpack method states that:

This operation is relatively sensitive in that it will not write files outside of the path specified by dst. Files in the archive which have a ‘..’ in their path are skipped during the unpacking process.

However, when I use it to write the entries of a tar archive into a directory, I get the following error: failed to iterate over archive. I have gotten this error with several different tar files, so I assume it has to do with the entries.

Below is the code where this method is used:

fn extract_tar(&self, path: &str) -> Result<(), Box<error::Error>> {

Looking into the source, I found my particular error but couldn’t deduce what causes it.

Can you upload your failing tar somewhere? Chances are you have extended headers in your archive (usually GNU extensions to support long filenames). Instead of directly using Archive.unpack, you probably want to iterate over the archive entries yourself (which will handle GNU extensions correctly by default) and call .unpack on each individual entry.

I’ve raised an issue on tar-rs which I may look at in a week or so - https://github.com/alexcrichton/tar-rs/issues/56

I was testing this using a fish shell tarball pulled from Github. I had seen that TODO but wasn’t sure if that was the problem. I’ll try iterating through the entries and unpacking them.

Ah, currently the tar-rs library only works with uncompressed tarballs; it won’t automatically decompress the tarball for you. You’ll probably want to combine it with flate2 for decoding gzip compression.

Note that the GNU extensions mentioned in the connected issue were actually implemented a while ago; I just forgot to remove the TODO!

Turns out that archive uses posix global headers, which IIRC aren’t implemented in tar-rs at the moment.

Ah, I see. Is there any way to check whether a tarball is compressed or not? I’m not entirely sure how to use flate2 with a downloaded tarball.
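One way to check, sketched here with only the standard library, is to look at the first two bytes of the file: a gzip stream starts with the magic bytes 0x1f 0x8b. The helper name looks_like_gzip is hypothetical, and this only detects gzip, not other compression formats.

```rust
use std::io::{self, Read};

/// Returns true if the stream starts with the two gzip magic bytes.
/// (`looks_like_gzip` is a hypothetical helper for illustration.)
fn looks_like_gzip<R: Read>(mut reader: R) -> io::Result<bool> {
    let mut magic = [0u8; 2];
    match reader.read_exact(&mut magic) {
        Ok(()) => Ok(magic == [0x1f, 0x8b]),
        // Fewer than two bytes available: this cannot be a gzip stream.
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => Ok(false),
        Err(e) => Err(e),
    }
}
```

Note that the check consumes the bytes it inspects, so a file would need to be rewound to the start (or reopened) before being handed to a decoder.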

After reading the documentation, I’m unsure which method to use. Say I have a tarball file.tar.gz. The documentation suggests I should use the read/write traits, so I came up with this:

// path is a &str and path_tmp is the same path with '_tmp' appended to it.
let mut dec = read::GzDecoder::new(try!(File::open(path))).unwrap();
let mut buff = Vec::new();
try!(dec.read(&mut buff));

This gives me the error: corrupt gzip stream does not have a matching checksum.

Should I be using read/write traits or the Decompress struct?

Ah you can probably just get away with:

let data = try!(File::open(path));
let decompressed = try!(GzDecoder::new(data));
let mut archive = Archive::new(decompressed);

The Archive type can take any raw Read type (e.g. the decoded version of the compressed file). If you’re still seeing that it’s a corrupt gzip stream, then you may want to try the gunzip program on the tarball to see if that succeeds. If it does and flate2 doesn’t, then it’s a bug in flate2.

Some notes on the code snippet you wrote though:

  • You shouldn’t need a temporary file as you can pass GzDecoder directly to Archive
  • Ok(try!(expr)) is generally the same as expr
  • foo.read(&mut Vec::new()) won’t actually read anything. The read function reads into a mutable slice, and an empty vector coerces to an empty slice. You’ll want to use read_to_end to read the entire contents of a stream into a Vec
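To make the last point concrete, here is a small standard-library sketch contrasting the two calls. The function names are made up for illustration, and a byte slice stands in for the real stream.

```rust
use std::io::{self, Read};

/// Calls `read` with an empty Vec. The Vec coerces to an empty slice,
/// so there is no room to read into and the call returns 0 bytes.
fn read_into_empty_vec<R: Read>(mut reader: R) -> io::Result<usize> {
    let mut buf: Vec<u8> = Vec::new();
    reader.read(&mut buf)
}

/// Calls `read_to_end`, which grows the Vec as needed until EOF.
fn read_everything<R: Read>(mut reader: R) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf)?;
    Ok(buf)
}
```

With a five-byte input such as &b"hello"[..], the first function reports 0 bytes read while the second returns all five bytes.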

Thanks! That seems to work perfectly.

I ran into this error today for a different reason, so I'm sharing it for others that might find this page as I did. My file was downloaded from the internet (into a tempfile) and then I was trying to un-tar the file. I received unexpected EOF because I was at the end of the file. This was resolved with file.seek(SeekFrom::Start(0)).
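A minimal sketch of that situation, using only the standard library: after writing, the stream position sits at the end, so a read returns nothing until the stream is rewound. The function name write_then_read is hypothetical, and any Read + Write + Seek type (an in-memory std::io::Cursor here, a temp file in the real case) is assumed to behave the same way.

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

/// Writes `payload` to `stream`, then reads the stream to its end,
/// optionally seeking back to the start first.
/// (`write_then_read` is a hypothetical helper for illustration.)
fn write_then_read<S: Read + Write + Seek>(
    mut stream: S,
    payload: &[u8],
    rewind: bool,
) -> io::Result<Vec<u8>> {
    stream.write_all(payload)?;
    if rewind {
        // Without this, the position stays just past the written data.
        stream.seek(SeekFrom::Start(0))?;
    }
    let mut out = Vec::new();
    stream.read_to_end(&mut out)?;
    Ok(out)
}
```

Called on std::io::Cursor::new(Vec::new()) with rewind set to false, this returns an empty Vec; with rewind set to true it returns the payload.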