How to extract a ZipArchive without writing it to disk first?

For context: a (government) website publishes information I want to use, but instead of offering an API it publishes XML files, each of which is compressed individually with zip. That is why I would like to store them differently on my end.

So now I want to access the content of these files. I don’t understand exactly how the zip crate works, and I have not found a way to hand it the raw bytes of a zip file in order to create a ZipArchive.

Code Example:

use zip::read::ZipArchive;
use zip::result::ZipError;
/// The byte vector is what we get from scraping the website with curl. It is the raw binary
/// content of the zip file; written to disk, it is a valid zip archive that can be
/// extracted (I checked that this works).
fn get_zip_archive_from_zip_bytes(bytes: &Vec<u8>) -> Result<ZipArchive<&[u8]>, ZipError> {
	//Read is implemented for &[u8] but not for Vec<u8>
	let zip_bytes: &[u8] = bytes;
	zip::ZipArchive::new(zip_bytes)
}

If I try this, it won’t compile: rustc complains that &[u8] does not implement the Seek trait. It does not give a hint as to what kind of binary input would be appropriate instead.

You can wrap the bytes in a std::io::Cursor, which implements Seek on top of any in-memory buffer:

let zip_bytes = Cursor::new(bytes);

Side-note: &Vec<u8> is an unusual type to take as a parameter. It's more common to just use &[u8] directly or use Vec<u8> if you want to pass ownership.
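To illustrate why Cursor helps, here is a minimal standard-library sketch. The helper `read_tail` is hypothetical and not part of the zip crate; it only mimics the kind of access a zip reader needs, since a zip file keeps its index (the central directory) at the end and the reader has to jump there first.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

/// Hypothetical helper: reads the last `n` bytes of any seekable reader.
fn read_tail<R: Read + Seek>(mut reader: R, n: usize) -> io::Result<Vec<u8>> {
    // Jump to `n` bytes before the end; this fails if the input is shorter than `n`.
    reader.seek(SeekFrom::End(-(n as i64)))?;
    let mut tail = Vec::with_capacity(n);
    reader.read_to_end(&mut tail)?;
    Ok(tail)
}

fn main() -> io::Result<()> {
    let bytes: Vec<u8> = b"header....payload....TAIL".to_vec();

    // `read_tail(&bytes[..], 4)` would not compile: `&[u8]` is `Read` but not `Seek`.
    // Wrapping the slice in a `Cursor` adds a position, and with it `Seek`.
    let tail = read_tail(Cursor::new(&bytes[..]), 4)?;
    assert_eq!(tail, b"TAIL");

    // A `Cursor` can also own the buffer, which avoids borrowing from the caller.
    let tail = read_tail(Cursor::new(bytes), 4)?;
    assert_eq!(tail, b"TAIL");
    Ok(())
}
```

Both Cursor<&[u8]> and Cursor<Vec<u8>> satisfy a `Read + Seek` bound, so which one to use mostly depends on whether the caller should keep ownership of the buffer.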


Thank you, I think this could be a solution and &Vec was indeed a typo/mishap.

For posterity’s sake, this is the final compiling version.

use std::io::Cursor;

fn get_zip_archive(bytes: Vec<u8>) -> Result<ZipArchive<Cursor<Vec<u8>>>, ZipError> {
	let zip_bytes = Cursor::new(bytes);
	zip::ZipArchive::new(zip_bytes)
}
