Reading and Manipulating Image Files

Hello all,

Minimal Reproducible Example: https://replit.com/@Sky020/ImageToVec#src/main.rs

Objective: I want to understand what std::fs::read does to an image file, with the hope of being able to parse/manipulate the data.

Problem: I have a 10x10 PNG I am opening and reading with std::fs::read. What I expected was either a [u8; 10*10*4] or [u8; 10*10*3], as the result. That is, I expected 100 pixels each with RGB/RGBA values.

However, as can be seen in the example, the len() of the data is 383. This is neither 300 nor 400.

Question: What do those 383 bytes consist of for a 10x10 PNG?

Extra: I have considered that the image is not actually 10x10, but all I have to go on is the fact that I used the Windows built-in tool to create the 10x10 image, and this is what Windows says:
[screenshot]

If it has something to do with the Bit depth, then please go easy on me, as I am out of my depth with such.

Any clarification is appreciated.

This isn't particularly a Rust-specific question, since std::fs::read just reads the whole of the file, but the PNG format is more complex than just providing RGB values for each pixel (because, apart from anything else, how would you tell the difference between a 10x10 picture and a 1x100 picture?). Roughly, a PNG file consists of an 8-byte signature at the start to identify the file as a PNG file, followed by a header chunk that contains information like the height and width of the image and information on how colours are represented. There may then be multiple chunks of image data, a palette of colours, and other ancillary data. Without spending the time to decode the file you have there, it's hard to say exactly where the length is coming from or what colour representation it's using, but if you want to read PNG files using Rust, you may find the png crate useful.
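To make the layout described above concrete, here is a minimal sketch that checks the 8-byte signature and pulls the width, height, bit depth and colour type out of the IHDR chunk using only the standard library. The names `Header`, `read_header`, `be_u32` and `sample_bytes` are made up for illustration, and the sample bytes are hand-built with a placeholder checksum rather than taken from the file in the question. The sketch does not verify checksums.

```rust
/// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

/// The IHDR fields this sketch reads (hypothetical helper type).
#[derive(Debug, PartialEq)]
struct Header {
    width: u32,
    height: u32,
    bit_depth: u8,
    colour_type: u8,
}

/// PNG stores multi-byte integers in big-endian order.
fn be_u32(b: &[u8]) -> u32 {
    u32::from_be_bytes([b[0], b[1], b[2], b[3]])
}

/// Returns the header, or None if the bytes do not start like a PNG.
fn read_header(data: &[u8]) -> Option<Header> {
    // signature (8) + length (4) + chunk type (4) + 13 bytes of IHDR data
    if data.len() < 29 || data[..8] != PNG_SIGNATURE {
        return None;
    }
    // The first chunk must be IHDR and its data is always 13 bytes long.
    if be_u32(&data[8..12]) != 13 || &data[12..16] != b"IHDR" {
        return None;
    }
    Some(Header {
        width: be_u32(&data[16..20]),
        height: be_u32(&data[20..24]),
        bit_depth: data[24],
        colour_type: data[25],
    })
}

/// Hand-built start of a 10x10, 8-bit RGBA PNG (assumption: the CRC is a
/// placeholder, so this is not a valid file, only enough to parse).
fn sample_bytes() -> Vec<u8> {
    let mut v = PNG_SIGNATURE.to_vec();
    v.extend_from_slice(&13u32.to_be_bytes()); // chunk data length
    v.extend_from_slice(b"IHDR"); // chunk type
    v.extend_from_slice(&10u32.to_be_bytes()); // width
    v.extend_from_slice(&10u32.to_be_bytes()); // height
    // bit depth, colour type (6 = RGBA), compression, filter, interlace
    v.extend_from_slice(&[8, 6, 0, 0, 0]);
    v.extend_from_slice(&[0, 0, 0, 0]); // placeholder checksum
    v
}

fn main() {
    match read_header(&sample_bytes()) {
        Some(h) => println!(
            "{}x{}, {} bits per channel, colour type {}",
            h.width, h.height, h.bit_depth, h.colour_type
        ),
        None => println!("not a PNG"),
    }
}
```

For a real file, the bytes returned by std::fs::read can be passed to `read_header` in place of `sample_bytes()`.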

I've had a look at the file a bit more and it seems your file consists of the following parts (note that each chunk is at least 12 bytes because it has a 4 byte length, 4 byte chunk type and 4 byte checksum):

  • 8 byte signature
  • 25 byte IHDR chunk (containing the header information)
  • 13 byte sRGB chunk (indicating that the image uses the standard sRGB colour space)
  • 16 byte gAMA chunk (defining a gamma value to adjust brightness)
  • 21 byte pHYs chunk (specifying the intended pixel size)
  • 288 byte IDAT chunk (the actual image data)
  • 12 byte IEND chunk (marking the end of the file)
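A breakdown like the one above can be produced by walking the file chunk by chunk. The following is a rough sketch using only the standard library; `list_chunks`, `be_u32` and `sample_bytes` are hypothetical names, the sample is a hand-built signature plus empty-looking IHDR and IEND chunks with placeholder checksums, and the checksums are not verified.

```rust
/// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

/// PNG stores multi-byte integers in big-endian order.
fn be_u32(b: &[u8]) -> u32 {
    u32::from_be_bytes([b[0], b[1], b[2], b[3]])
}

/// Returns (chunk type, total chunk size in bytes) for each chunk, where the
/// total includes the 4-byte length, 4-byte type and 4-byte checksum.
/// Returns None if the signature is wrong or a chunk runs past the end.
fn list_chunks(data: &[u8]) -> Option<Vec<(String, usize)>> {
    if data.len() < 8 || data[..8] != PNG_SIGNATURE {
        return None;
    }
    let mut chunks = Vec::new();
    let mut pos = 8; // skip the signature
    while pos + 12 <= data.len() {
        let data_len = be_u32(&data[pos..pos + 4]) as usize;
        let kind = String::from_utf8_lossy(&data[pos + 4..pos + 8]).into_owned();
        let total = data_len.checked_add(12)?;
        let end = pos.checked_add(total)?;
        if end > data.len() {
            return None; // declared length does not fit in the file
        }
        let is_end = kind == "IEND";
        chunks.push((kind, total));
        pos = end;
        if is_end {
            break;
        }
    }
    Some(chunks)
}

/// Hand-built sample: signature, IHDR and IEND (assumption: checksums are
/// placeholders, so this is only good enough for walking the structure).
fn sample_bytes() -> Vec<u8> {
    let mut v = PNG_SIGNATURE.to_vec();
    v.extend_from_slice(&13u32.to_be_bytes());
    v.extend_from_slice(b"IHDR");
    v.extend_from_slice(&[0, 0, 0, 10, 0, 0, 0, 10, 8, 6, 0, 0, 0]);
    v.extend_from_slice(&[0, 0, 0, 0]); // placeholder checksum
    v.extend_from_slice(&0u32.to_be_bytes());
    v.extend_from_slice(b"IEND");
    v.extend_from_slice(&[0, 0, 0, 0]); // placeholder checksum
    v
}

fn main() {
    let bytes = sample_bytes();
    match list_chunks(&bytes) {
        Some(chunks) => {
            println!("8 byte signature");
            for (kind, total) in &chunks {
                println!("{} byte {} chunk", total, kind);
            }
        }
        None => println!("not a well-formed PNG"),
    }
}
```

Feeding it the result of std::fs::read on a real PNG should list every chunk in the file, and the sizes plus the 8-byte signature should add up to the file length.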

Removing the length, chunk type and checksum from the IDAT chunk leaves 276 bytes of image data. That is fewer bytes than there are pixel values because PNG image data is passed through a filtering algorithm and then compressed with DEFLATE. Since your file is using RGBA colours with 8 bits per channel, it looks like that is compressed from the expected 400 bytes of pixel data, plus one filter-type byte at the start of each row (although I haven't tried decompressing it).
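As a small worked example of the arithmetic: before compression, each row of a non-interlaced PNG is prefixed with one byte naming the filter used for that row, so the data fed to DEFLATE is slightly larger than width × height × channels. The function name `filtered_len` is made up for this sketch, and it assumes a non-interlaced image.

```rust
/// Size in bytes of the filtered image data before DEFLATE compression,
/// assuming a non-interlaced image (hypothetical helper for illustration).
fn filtered_len(width: usize, height: usize, channels: usize, bit_depth: usize) -> usize {
    // Bits per row, rounded up to a whole number of bytes.
    let row_bytes = (width * channels * bit_depth + 7) / 8;
    // Each row carries one extra filter-type byte in front of its pixels.
    height * (1 + row_bytes)
}

fn main() {
    // 10x10 RGBA, 8 bits per channel: 10 * (1 + 40) = 410
    println!("RGBA: {}", filtered_len(10, 10, 4, 8));
    // 10x10 RGB, 8 bits per channel: 10 * (1 + 30) = 310
    println!("RGB:  {}", filtered_len(10, 10, 3, 8));
}
```

The 276 bytes inside the IDAT chunk also include the zlib wrapper around the DEFLATE stream (a 2-byte header and a 4-byte Adler-32 checksum), so the compressed pixel data itself is a little smaller still.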

PNG is compressed. If you want just the bytes -- no headers, no colour information, no compression, etc -- then you could export it as "Raw image data" from GIMP (or similar in other applications).

But also, here's the obligatory mention of

@jameseb7 Thank you. That is exactly the information I was looking for.

@scottmcm I am unfamiliar with GIMP. So, will look into it.

In the end, I have gone with using the image crate. Originally, I had hoped to avoid external crates, but considering I am writing this application for multiple image types, I will byte the bullet; the crate is awesome, but I just like trying native development to learn.

If you really want to do it by hand for learning purposes: https://datatracker.ietf.org/doc/html/rfc2083

But you should use one of the crates for a "real" use. They'll be faster and probably better hardened against weird inputs.
