Read Compressed, Little-Endian File on Big-Endian Machine

I've got a binary file that's compressed with LZSS, and the data is stored in little-endian layout. I'm trying to decode the file on my PC, which is big-endian, but I haven't had any luck. I'm using the compression crate to decode the LZSS data. Does anyone have any ideas for getting around the endianness? Right now, it seems like I'm better off learning the LZSS compression algorithm and writing my own decoder that can take endianness into account. Either that, or I could run my program on a Raspberry Pi, since it's little-endian. I'm trying the Pi route right now to see if I can rule out endianness.

Both filesystems and most compression algorithms, including LZSS, operate on byte streams and don't care about endianness.

What does "little-endian layout" mean here? Does it mean UTF-16LE encoded text data?

By "little-endian layout" I only meant that if the compression algorithm needed to interpret multiple bytes as numeric data at any point (i16, u16, i32, u32, f32, f64, etc.), then the bytes would be read in reverse order on a big-endian machine. The header data of the image files contained in those LZSS blobs needed to be reversed.
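For the header fields specifically, a minimal sketch of host-independent parsing is below. The `ImageHeader` layout (two u16 fields followed by a u32) is purely hypothetical and only there for illustration; the real header layout of these files isn't known here. The point is that `from_le_bytes` gives the same result on little-endian and big-endian machines, so no manual byte reversal is needed.

```rust
/// Hypothetical header layout, for illustration only:
/// width (u16), height (u16), data_len (u32), all stored little-endian.
#[derive(Debug, PartialEq)]
struct ImageHeader {
    width: u16,
    height: u16,
    data_len: u32,
}

/// Parses the first 8 bytes of `bytes` as an `ImageHeader`.
/// Returns `None` if fewer than 8 bytes are available.
fn parse_header(bytes: &[u8]) -> Option<ImageHeader> {
    let b = bytes.get(..8)?;
    Some(ImageHeader {
        // from_le_bytes treats the first byte as the least significant one,
        // regardless of the byte order of the machine running the code.
        width: u16::from_le_bytes([b[0], b[1]]),
        height: u16::from_le_bytes([b[2], b[3]]),
        data_len: u32::from_le_bytes([b[4], b[5], b[6], b[7]]),
    })
}
```

Calling `parse_header` on the decompressed bytes should behave the same on the PC and on the Pi, which would take endianness out of the picture for the header.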

It sounds like this isn't the case though. I could be using the compression crate incorrectly, since I'm still figuring out how LZSS works. I don't know where to get the window size, search buffer size, or look-ahead size. I'm not sure if they're contained within the files themselves, or if they're hard-coded constants that just need to be "known" for all of those files.
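To make it clearer where those parameters show up, here is a rough sketch of a byte-oriented LZSS decoder. Everything about the format is an assumption: it follows one commonly seen variant (4096-byte window pre-filled with spaces, 18-byte maximum match, flag bytes read least significant bit first, with a 1 bit meaning a literal), and the files in question may well use different values or a different token layout. This is not the compression crate's API, just a hand-written illustration.

```rust
// Assumed parameters; the real files may use different ones.
const WINDOW_SIZE: usize = 4096; // size of the sliding window (ring buffer)
const MAX_MATCH: usize = 18; // longest back-reference (look-ahead size)
const MIN_MATCH: usize = 3; // shortest back-reference worth encoding

/// Decodes an LZSS stream in the assumed format described above.
/// Stops quietly when the input runs out.
fn lzss_decode(input: &[u8]) -> Vec<u8> {
    let mut window = [0x20u8; WINDOW_SIZE]; // assumed initial fill: spaces
    let mut pos = WINDOW_SIZE - MAX_MATCH; // assumed initial write position
    let mut out = Vec::new();
    let mut bytes = input.iter().copied();

    // Each flag byte describes the next eight tokens, lowest bit first.
    while let Some(flags) = bytes.next() {
        for bit in 0..8u32 {
            if flags & (1u8 << bit) != 0 {
                // Literal token: one byte copied straight to the output.
                let b = match bytes.next() {
                    Some(b) => b,
                    None => return out,
                };
                out.push(b);
                window[pos] = b;
                pos = (pos + 1) % WINDOW_SIZE;
            } else {
                // Reference token: 12-bit window offset plus 4-bit length,
                // packed into two bytes and pulled apart with shifts and masks.
                let (lo, hi) = match (bytes.next(), bytes.next()) {
                    (Some(lo), Some(hi)) => (lo, hi),
                    _ => return out,
                };
                let offset = usize::from(lo) | ((usize::from(hi) & 0xF0) << 4);
                let len = (usize::from(hi) & 0x0F) + MIN_MATCH;
                for i in 0..len {
                    let b = window[(offset + i) % WINDOW_SIZE];
                    out.push(b);
                    window[pos] = b;
                    pos = (pos + 1) % WINDOW_SIZE;
                }
            }
        }
    }
    out
}
```

In this sketch the window size, match lengths, and initial fill are all compile-time constants, which matches the "hard-coded and just known" case. If decoding with one set of constants produces garbage, a different set (or a different flag polarity) is a more likely culprit than the host's byte order, because the decoder only ever handles one byte at a time.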

Typically, on-disk formats (i.e. any format that might be seen by another computer) will have their layouts explicitly specified, and well-written libraries for them will handle endianness correctly as a consequence.

This is especially so for compression formats, where compact representation is paramount, and so you tend to have things like “these 5 bits are the offset into array $foo, these 3 bits are the 0-based bias for the $frobinator”; there is no concept of endianness there!
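As a small illustration of that kind of bit-level field, here is a sketch that splits one byte into a 5-bit field and a 3-bit field. The field names and the layout (index in the high bits, bias in the low bits) are made up for the example. Shifts and masks work on values, not on memory layout, so the result is the same on any host.

```rust
/// Splits a byte into a hypothetical 5-bit `index` (high bits)
/// and a hypothetical 3-bit `bias` (low bits).
fn unpack(byte: u8) -> (u8, u8) {
    let index = byte >> 3; // top five bits
    let bias = byte & 0b0000_0111; // bottom three bits
    (index, bias)
}

/// Inverse of `unpack`; values that don't fit their field are masked down.
fn pack(index: u8, bias: u8) -> u8 {
    ((index & 0b0001_1111) << 3) | (bias & 0b0000_0111)
}
```

For example, `unpack(0b1011_0101)` gives `(22, 5)`, and `pack(22, 5)` gives the original byte back.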

Ok, I see. So compression generally doesn't have concepts of 16-bit and 32-bit data. It's all just raw bits and sometimes bytes, and if operations span multiple bytes, it's about jumping between bytes, never really treating multiple bytes as one solid concept such as a 16-bit or 32-bit datatype.

I think I understand now. I haven't really done anything with compression, so these concepts are really new to me. I'm trying to decompress some image files from the PSP version of Final Fantasy IV, and this is somehow the first time I've ever had to learn about compression algorithms.
