I want to be able to read and write some metadata consisting of
- 2 Strings rarely exceeding 20 characters each
- 4 integers

to/from a JPEG, in pure Rust.
Can you recommend any crates that might help with this, or offer any other relevant advice?
This looks like a good crate for your needs: crates.io: Rust Package Registry
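Whichever crate ends up writing the segment, the two strings and four integers first have to become a byte payload. The following is only a sketch of one possible layout, using the standard library alone. The Metadata struct, its field names, the i32 width and the length-prefixed format are all assumptions made for illustration; none of them come from this thread or from any crate.

```rust
// Hypothetical record: the struct, field names and i32 width are assumptions.
#[derive(Debug, PartialEq)]
pub struct Metadata {
    pub strings: [String; 2],
    pub numbers: [i32; 4],
}

/// Encode each string as a 1-byte length followed by its UTF-8 bytes,
/// then each integer as 4 big-endian bytes.
/// Returns None if a string is longer than 255 bytes.
pub fn encode(meta: &Metadata) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for s in &meta.strings {
        if s.len() > 255 {
            return None;
        }
        out.push(s.len() as u8);
        out.extend_from_slice(s.as_bytes());
    }
    for n in &meta.numbers {
        out.extend_from_slice(&n.to_be_bytes());
    }
    Some(out)
}

/// Decode the layout produced by `encode`.
/// Returns None on truncated input, invalid UTF-8, or trailing bytes.
pub fn decode(mut bytes: &[u8]) -> Option<Metadata> {
    let mut strings = [String::new(), String::new()];
    for slot in &mut strings {
        let (&len, rest) = bytes.split_first()?;
        let len = len as usize;
        if rest.len() < len {
            return None;
        }
        let (text, rest) = rest.split_at(len);
        *slot = std::str::from_utf8(text).ok()?.to_owned();
        bytes = rest;
    }
    let mut numbers = [0i32; 4];
    for slot in &mut numbers {
        if bytes.len() < 4 {
            return None;
        }
        let (raw, rest) = bytes.split_at(4);
        *slot = i32::from_be_bytes([raw[0], raw[1], raw[2], raw[3]]);
        bytes = rest;
    }
    if bytes.is_empty() {
        Some(Metadata { strings, numbers })
    } else {
        None
    }
}
```

The resulting Vec<u8> could then be used as the contents of a COM or APPn segment. With strings of around 20 characters the payload stays far below the roughly 64 KiB limit of a single JPEG segment.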
I've been banging my head against a brick wall for a while.
I can add a new segment to a JPEG in memory and write the result to disk, but when I read it back in from disk, the new segment does not appear.
use std::env::args;
use img_parts::jpeg::{self, JpegSegment};

const OUR_MARKER: u8 = jpeg::markers::COM;

fn main() {
    let mut args = args();
    let _executable = args.next();
    let path = args.next().unwrap();
    println!("Opening {path}");
    let input = std::fs::read(&path).unwrap();
    let mut jpeg = jpeg::Jpeg::from_bytes(input.into()).unwrap();
    report_segments(&jpeg, OUR_MARKER, "Segments when loading");
    let segments = jpeg.segments_mut();
    segments.push(JpegSegment::new_with_contents(
        OUR_MARKER,
        img_parts::Bytes::from("**** WE WROTE THIS: THIS IS OURS *****"),
    ));
    report_segments(&jpeg, OUR_MARKER, "Segments before writing");
    let output = std::fs::File::create(&path).unwrap();
    let bytes_written = jpeg.clone().encoder().write_to(output).unwrap();
    println!("Wrote {bytes_written} bytes to {path}");
    let it = jpeg.segment_by_marker(OUR_MARKER);
    println!(
        "Our new segment retrieved: {it:?} {contents:?}",
        contents = std::str::from_utf8(it.unwrap().contents())
    );
    println!("Re-reading the jpeg we just wrote");
    let new_input = std::fs::read(&path).unwrap();
    let new_jpeg = jpeg::Jpeg::from_bytes(new_input.into()).unwrap();
    report_segments(&new_jpeg, OUR_MARKER, "Segments after re-reading");
}

fn report_segments(jpeg: &jpeg::Jpeg, our_marker: u8, msg: &str) {
    println!("=============== {msg} ===============");
    for (n, segment) in jpeg.segments().iter().enumerate() {
        let marker = segment.marker();
        println!("Marker {n:2} in input: {marker}");
    }
    println!("---- Contents of OUR segments ----");
    for segment in jpeg.segments_by_marker(our_marker) {
        let contents = std::str::from_utf8(segment.contents()).unwrap();
        println!(" {contents}");
    }
    println!("---- End of segment report ----");
}
cargo run --release --bin wtf -- /tmp/test.jpg
Finished `release` profile [optimized] target(s) in 0.18s
Running `target/release/wtf /tmp/test.jpg`
Opening /tmp/test.jpg
=============== Segments when loading ===============
Marker 0 in input: 224
Marker 1 in input: 219
Marker 2 in input: 219
Marker 3 in input: 192
Marker 4 in input: 196
Marker 5 in input: 196
Marker 6 in input: 196
Marker 7 in input: 196
Marker 8 in input: 218
---- Contents of OUR segments ----
---- End of segment report ----
=============== Segments before writing ===============
Marker 0 in input: 224
Marker 1 in input: 219
Marker 2 in input: 219
Marker 3 in input: 192
Marker 4 in input: 196
Marker 5 in input: 196
Marker 6 in input: 196
Marker 7 in input: 196
Marker 8 in input: 218
Marker 9 in input: 254
---- Contents of OUR segments ----
**** WE WROTE THIS: THIS IS OURS *****
---- End of segment report ----
Wrote 5792 bytes to /tmp/test.jpg
Our new segment retrieved: Some(JpegSegment { marker: 254 }) Ok("**** WE WROTE THIS: THIS IS OURS *****")
Re-reading the jpeg we just wrote
=============== Segments after re-reading ===============
Marker 0 in input: 224
Marker 1 in input: 219
Marker 2 in input: 219
Marker 3 in input: 192
Marker 4 in input: 196
Marker 5 in input: 196
Marker 6 in input: 196
Marker 7 in input: 196
Marker 8 in input: 218
---- Contents of OUR segments ----
---- End of segment report ----
Can you spot what I'm doing wrong?
All I can think of is that you are reading stale data. You already have a handle to the file here:
let output = std::fs::File::create(&path).unwrap();
Just re-use it to get the file contents:
-let new_input = std::fs::read(&path).unwrap();
+let mut new_input = Vec::new();
+output.read_to_end(&mut new_input).unwrap();
The on-disk file grows by N bytes, every time I run this program, but the new segments never appear when it is read in.
Even new processes don't find the new segments when reading in a file that occupies more space than it did before the previous execution of the program.
I think this is inconsistent with the stale data hypothesis, unless I'm failing to grasp your point.
...
00001630: 669c a5c3 0268 df6d 7485 8978 69b3 d485 f....h.mt..xi...
00001640: 4a19 8317 2055 dff8 d63f ffd9 fffe 0028 J... U...?.....(
00001650: 2a2a 2a2a 2057 4520 5752 4f54 4520 5448 **** WE WROTE TH
00001660: 4953 3a20 5448 4953 2049 5320 4f55 5253 IS: THIS IS OURS
00001670: 202a 2a2a 2a2a fffe 0028 2a2a 2a2a 2057 *****...(**** W
00001680: 4520 5752 4f54 4520 5448 4953 3a20 5448 E WROTE THIS: TH
00001690: 4953 2049 5320 4f55 5253 202a 2a2a 2a2a IS IS OURS *****
000016a0: fffe 0028 2a2a 2a2a 2057 4520 5752 4f54 ...(**** WE WROT
000016b0: 4520 5448 4953 3a20 5448 4953 2049 5320 E THIS: THIS IS
000016c0: 4f55 5253 202a 2a2a 2a2a fffe 0028 2a2a OURS *****...(**
000016d0: 2a2a 2057 4520 5752 4f54 4520 5448 4953 ** WE WROTE THIS
000016e0: 3a20 5448 4953 2049 5320 4f55 5253 202a : THIS IS OURS *
000016f0: 2a2a 2a2a fffe 0028 2a2a 2a2a 2057 4520 ****...(**** WE
00001700: 5752 4f54 4520 5448 4953 3a20 5448 4953 WROTE THIS: THIS
00001710: 2049 5320 4f55 5253 202a 2a2a 2a2a fffe IS OURS *****..
00001720: 0028 2a2a 2a2a 2057 4520 5752 4f54 4520 .(**** WE WROTE
00001730: 5448 4953 3a20 5448 4953 2049 5320 4f55 THIS: THIS IS OU
00001740: 5253 202a 2a2a 2a2a ffed 0028 2a2a 2a2a RS *****...(****
00001750: 2057 4520 5752 4f54 4520 5448 4953 3a20 WE WROTE THIS:
00001760: 5448 4953 2049 5320 4f55 5253 202a 2a2a THIS IS OURS ***
00001770: 2a2a ffed 0028 2a2a 2a2a 2057 4520 5752 **...(**** WE WR
00001780: 4f54 4520 5448 4953 3a20 5448 4953 2049 OTE THIS: THIS I
00001790: 5320 4f55 5253 202a 2a2a 2a2a ffed 0028 S OURS *****...(
000017a0: 2a2a 2a2a 2057 4520 5752 4f54 4520 5448 **** WE WROTE TH
000017b0: 4953 3a20 5448 4953 2049 5320 4f55 5253 IS: THIS IS OURS
000017c0: 202a 2a2a 2a2a ffed 0028 2a2a 2a2a 2057 *****...(**** W
000017d0: 4520 5752 4f54 4520 5448 4953 3a20 5448 E WROTE THIS: TH
000017e0: 4953 2049 5320 4f55 5253 202a 2a2a 2a
There is one fffe for each time it was run with COM and one ffed for each time it was executed using APP13 as the marker. So, clearly the segment is added to the file each time the program runs, but it is not being discovered when it is read back in.
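To read the dump above: each appended segment starts with 0xFF and the marker byte (fe for COM, ed for APP13), followed by a two-byte big-endian length that counts itself plus the payload. The message is 38 bytes long, hence the 0028 (40) after every marker. A minimal sketch of that layout, using only the standard library; segment_bytes is a hypothetical helper name, not part of any crate:

```rust
/// Serialize one JPEG marker segment: 0xFF, the marker byte, a 2-byte
/// big-endian length (which includes the two length bytes themselves),
/// then the payload. Returns None if the payload does not fit in one segment.
pub fn segment_bytes(marker: u8, contents: &[u8]) -> Option<Vec<u8>> {
    let len = contents.len().checked_add(2)?;
    if len > 0xFFFF {
        return None;
    }
    let mut out = vec![0xFF, marker];
    out.extend_from_slice(&(len as u16).to_be_bytes());
    out.extend_from_slice(contents);
    Some(out)
}
```

Calling segment_bytes(0xFE, b"**** WE WROTE THIS: THIS IS OURS *****") yields bytes beginning ff fe 00 28, matching what appears after the ffd9 (EOI) marker in the dump.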
This is insane: two JPEGs containing identical bytes have different segments, according to the segments and segment_by_marker methods!
use std::{env::args, io::Write};
use std::fs::File;
use std::path::Path;
use img_parts::jpeg::{self, JpegSegment, Jpeg};

const OUR_MARKER: u8 = jpeg::markers::APP7;

fn main() {
    // Get path of JPEG from CLI
    let mut args = args();
    let _executable = args.next();
    let path = args.next().unwrap();

    // Read a JPEG and report its segments
    let mut jpeg_original = read_jpeg(&path);
    report_segments(&jpeg_original, "Segments when first loaded");

    // Add our segment to a clone of the JPEG, and report the segments: new segment is found
    let jpeg_before_new_segment = jpeg_original.clone();
    let segments = jpeg_original.segments_mut();
    segments.push(make_our_segment());
    let jpeg_with_new_segment = jpeg_original; // this is a move
    report_segments(&jpeg_with_new_segment, "Segments after pushing new segment");

    // The bytes method gives something different before/after the segment has been added
    assert!(!compare_bytes_in_jpeg(jpeg_before_new_segment, jpeg_with_new_segment.clone()));

    // Recreate the JPEG by roundtrip via bytes
    let jpeg_via_bytes = bytes_to_jpeg(&jpeg_to_bytes(jpeg_with_new_segment.clone()));
    // ... the bytes in the new JPEG are identical to the ones in the old one ...
    assert!(compare_bytes_in_jpeg(jpeg_with_new_segment.clone(), jpeg_via_bytes.clone()));
    // ... but the new segment is not found in the copy-via-roundtrip
    report_segments(&jpeg_via_bytes, "Segments after pushing new segment and in-memory roundtrip");

    // Recreate the JPEG by roundtrip via file
    write_jpeg(jpeg_with_new_segment.clone(), &mut File::create(&path).unwrap());
    let jpeg_via_file = read_jpeg(&path);
    // The bytes are identical again ...
    assert!(compare_bytes_in_jpeg(jpeg_with_new_segment.clone(), jpeg_via_file.clone()));
    // ... but the new segment is missing, again.
    report_segments(&jpeg_via_file, "Segments after roundtrip via file");

    // Sanity check: is the new segment still present in the original file
    report_segments(&jpeg_with_new_segment, "Sanity check: segments in the only place where the new segment was found");
}

fn report_segments(jpeg: &jpeg::Jpeg, msg: &str) {
    println!("\n=============== {msg} ===============");
    println!("Bytes in JPEG = {}", jpeg_to_bytes(jpeg.clone()).len());
    for (n, segment) in jpeg.segments().iter().enumerate() {
        let marker = segment.marker();
        println!("Marker {n:2} in input: {marker:x}");
    }
    println!("---- Looking for OUR segments ----");
    if let Some(our_segment) = jpeg.segment_by_marker(OUR_MARKER) {
        let contents = std::str::from_utf8(our_segment.contents()).unwrap();
        println!("Our segment was found: `{our_segment:x?} {contents}`.")
    } else {
        println!("Our segment was NOT found.")
    }
    println!("---- End of segment report ----");
}

fn make_our_segment() -> JpegSegment {
    JpegSegment::new_with_contents(
        OUR_MARKER,
        img_parts::Bytes::from("<THIS IS DRIVING ME NUTS>")
    )
}

fn compare_bytes_in_jpeg(a: Jpeg, b: Jpeg) -> bool {
    let a = jpeg_to_bytes(a);
    let b = jpeg_to_bytes(b);
    a == b
    // println!("The bytes in these two JPEGs are {}", if a == b { "IDENTICAL" } else { "DIFFERENT" });
    // println!("Lengths: {} {}", a.len(), b.len());
    // for (i, (a, b)) in a.into_iter().zip(b).enumerate() {
    //     print!("{a:02x} {b:02x} {} ", if a==b {""} else {"XXXX"});
    //     if i%16 == 15 { println!(); }
    // }
    // println!();
}

fn bytes_to_jpeg(bytes: &[u8]) -> Jpeg {
    Jpeg::from_bytes(bytes.to_owned().into()).unwrap()
}

fn jpeg_to_bytes(jpeg: Jpeg) -> Vec<u8> {
    let mut bytes = vec![];
    write_jpeg(jpeg, &mut bytes);
    bytes
}

fn write_jpeg(jpeg: Jpeg, sink: &mut impl Write) {
    jpeg.encoder().write_to(sink).unwrap();
}

fn read_jpeg(path: impl AsRef<Path>) -> Jpeg {
    bytes_to_jpeg(&std::fs::read(&path).unwrap())
}
That's interesting. I'm as clueless as you are here. Personally, I would ask the crate's author, who is more likely to have insight into what's going on here.
Looking at the implementation, it seems that the from_bytes function stops looking for any further segments once it has found the EOI (End of Image) marker, and all the segments I'm adding are placed after EOI.
Looks like a bug. I'll open an issue.
That's probably why the examples use insert rather than push to add new segments.
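The effect described in the last few replies can be reproduced without any crate. What follows is a deliberately simplified sketch, not the crate's actual implementation: scan_markers, segment and toy_jpeg are hypothetical helpers, and the "JPEG" they build is only JPEG-shaped (it is not a decodable image). The scanner walks the length-prefixed segments, skips the compressed data after SOS, and stops at EOI, so anything appended after EOI is never visited, while a segment placed before SOS is found.

```rust
const SOI: u8 = 0xD8; // start of image
const EOI: u8 = 0xD9; // end of image
const SOS: u8 = 0xDA; // start of scan (compressed data follows)

/// Return the marker bytes of all segments found between SOI and EOI.
/// Scanning stops at EOI, so trailing bytes are silently ignored.
pub fn scan_markers(bytes: &[u8]) -> Option<Vec<u8>> {
    if !bytes.starts_with(&[0xFF, SOI]) {
        return None;
    }
    let mut pos = 2;
    let mut markers = Vec::new();
    loop {
        if *bytes.get(pos)? != 0xFF {
            return None;
        }
        let marker = *bytes.get(pos + 1)?;
        if marker == EOI {
            return Some(markers);
        }
        markers.push(marker);
        let len = u16::from_be_bytes([*bytes.get(pos + 2)?, *bytes.get(pos + 3)?]) as usize;
        if len < 2 {
            return None;
        }
        pos += 2 + len;
        if marker == SOS {
            // Skip compressed data: inside it, 0xFF is followed either by
            // 0x00 (byte stuffing) or by a restart marker 0xD0..=0xD7.
            loop {
                let byte = *bytes.get(pos)?;
                let next = *bytes.get(pos + 1)?;
                if byte == 0xFF && next != 0x00 && !(0xD0..=0xD7).contains(&next) {
                    break;
                }
                pos += 1;
            }
        }
    }
}

/// Serialize one segment: 0xFF, marker, big-endian length, payload.
/// Assumes the payload is short enough for the length to fit in a u16.
fn segment(marker: u8, contents: &[u8]) -> Vec<u8> {
    let len = (contents.len() + 2) as u16;
    let mut out = vec![0xFF, marker];
    out.extend_from_slice(&len.to_be_bytes());
    out.extend_from_slice(contents);
    out
}

/// Build a JPEG-shaped byte string with optional extra bytes placed
/// either before the SOS segment or after the EOI marker.
pub fn toy_jpeg(extra_before_sos: &[u8], extra_after_eoi: &[u8]) -> Vec<u8> {
    let mut out = vec![0xFF, SOI];
    out.extend(segment(0xE0, b"JFIF\0"));
    out.extend_from_slice(extra_before_sos);
    out.extend(segment(SOS, &[0x01, 0x02]));
    out.extend_from_slice(&[0x12, 0x34, 0xFF, 0x00, 0x56]); // fake compressed data
    out.extend_from_slice(&[0xFF, EOI]);
    out.extend_from_slice(extra_after_eoi);
    out
}
```

With a COM segment passed as extra_after_eoi, scan_markers reports the same markers as for the unmodified file; passed as extra_before_sos, the COM marker shows up. In the program from this thread, the corresponding change would presumably be to call insert on the collection returned by segments_mut, at an index before the SOS segment, instead of push. That is an untested assumption based on the replies above.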