Algos work entirely differently in Python and Rust?

I've been trying to write a decoding function in Rust based on a common PRNG. I first implemented it in Python for testing, and it worked as expected. Then I wrote the real implementation in Rust, and that fails miserably. What I've discovered so far is that the seed calculation works, but the actual byte decoding produces incorrect output.

Working Python test implementation:

import sys, struct

if len(sys.argv) > 1:
	with open(sys.argv[1], "r+b") as f:
		buf = f.read()
		seed = 0x19000000

		with open(sys.argv[1] + ".out", "w+b") as f2:
			for b in buf:
				seed = (seed * 0x41C64E6D + 12345) & 0xFFFFFFFF
				print(b ^ (seed >> 24))
				out = struct.pack('B', b ^ (seed >> 24))
				f2.write(out)

Rust implementation:

/// Decodes a block of data
fn decode<T>(buf: &mut T, data: &mut [u8]) -> anyhow::Result<()>
where
	T: Read + Seek,
{
	let pos = buf.stream_position()? as u32;
	buf.read(&mut data[..])?;

	// Decoding uses a common PRNG algorithm
	let mut seed = 0x19000000 + pos;
	data.iter_mut().for_each(|b| {
		seed = seed.wrapping_mul(0x41C64E6D).wrapping_add(12345);
		*b = ((*b as u32) ^ seed >> 24) as u8;
		dbg!(*b);
	});

	Ok(())
}

Note that read is not guaranteed to fill the given slice. Instead, it returns the number of bytes it actually read.
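To make that concrete, here is a minimal sketch. `OneByte` is a made-up reader for illustration only (it is not from the code above); it hands out at most one byte per call, which the `Read` contract allows. A single `read` then fills only part of the slice, while `read_exact` keeps calling until the slice is full.

```rust
use std::io::{self, Read};

/// Hypothetical reader that returns at most one byte per `read` call.
/// Real readers (files, sockets, pipes) may also return short reads.
struct OneByte<'a>(&'a [u8]);

impl Read for OneByte<'_> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        let rest = self.0;
        if rest.is_empty() || out.is_empty() {
            return Ok(0); // nothing left to read, or nowhere to put it
        }
        out[0] = rest[0];
        self.0 = &rest[1..];
        Ok(1) // only one byte was filled, whatever the size of `out`
    }
}

/// Counts how many `read` calls it takes to fill `data` from `r`.
fn calls_to_fill<R: Read>(r: &mut R, data: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    let mut calls = 0;
    while filled < data.len() {
        let n = r.read(&mut data[filled..])?;
        if n == 0 {
            break; // end of input before the slice was full
        }
        filled += n;
        calls += 1;
    }
    Ok(calls)
}
```

With a reader like this, decoding right after a single `read` would run the PRNG over bytes that were never filled in.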

Also, why are you adding pos to the seed in the Rust implementation?

Well, in the Python one, for the sake of testing, I decoded the whole buffer. In a real-world scenario, I need this to work on chunks at arbitrary offsets in a buffer.

Additionally, for good measure, I put in some extra checks in the decode algo, but for the immediate problem, it's a moot point, because I get the same result regardless.

/// Decodes a block of data
fn decode<T>(buf: &mut T, data: &mut [u8]) -> Result<(), ResBinErr>
where
	T: Read + Seek,
{
	let pos: u32;
	if let Ok(p) = buf.stream_position() {
		pos = p as u32;
	} else {
		return Err(ResBinErr::StreamPos);
	}

	if let Ok(sz) = buf.read(&mut data[..]) {
		if sz != data.len() {
			return Err(ResBinErr::DecReadLen(sz));
		}
	} else {
		return Err(ResBinErr::DecRead);
	}

	// Decoding uses a common PRNG algorithm
	let mut seed = 0x19000000 + pos;
	data.iter_mut().for_each(|b| {
		seed = seed.wrapping_mul(0x41C64E6D).wrapping_add(12345);
		*b = ((*b as u32) ^ seed >> 24) as u8;
		dbg!(*b);
	});

	Ok(())
}

Then you'll have to start from the nth number yielded by your PRNG, rather than adding n to the base seed.
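A rough sketch of what that could look like, assuming the data was encoded in one pass from the base seed (as the Python test does) and that the offset is a byte count from the start of that stream. The names `step`, `skip` and `decode_chunk` are made up for this example. `skip` jumps ahead n steps by repeatedly squaring the affine map x -> a*x + c; calling `step` in a loop n times gives the same state and is fine for small offsets.

```rust
const MUL: u32 = 0x41C6_4E6D;
const INC: u32 = 12345;
const BASE_SEED: u32 = 0x1900_0000;

/// One PRNG step, same recurrence as in the posted code (mod 2^32).
fn step(seed: u32) -> u32 {
    seed.wrapping_mul(MUL).wrapping_add(INC)
}

/// PRNG state after `n` steps from `seed`, in O(log n) multiplications.
fn skip(seed: u32, mut n: u64) -> u32 {
    // Accumulated map x -> acc_mul * x + acc_inc, starting as the identity.
    let (mut acc_mul, mut acc_inc) = (1u32, 0u32);
    // Map for 2^k steps, starting with a single step (k = 0).
    let (mut cur_mul, mut cur_inc) = (MUL, INC);
    while n > 0 {
        if n & 1 == 1 {
            // Compose the accumulated map with the current 2^k-step map.
            acc_mul = acc_mul.wrapping_mul(cur_mul);
            acc_inc = acc_inc.wrapping_mul(cur_mul).wrapping_add(cur_inc);
        }
        // Square the current map: 2^k steps become 2^(k+1) steps.
        cur_inc = cur_mul.wrapping_add(1).wrapping_mul(cur_inc);
        cur_mul = cur_mul.wrapping_mul(cur_mul);
        n >>= 1;
    }
    acc_mul.wrapping_mul(seed).wrapping_add(acc_inc)
}

/// Decodes `data` in place, where `data` begins `offset` bytes into the stream.
fn decode_chunk(data: &mut [u8], offset: u64) {
    let mut seed = skip(BASE_SEED, offset);
    for b in data.iter_mut() {
        seed = step(seed);
        *b ^= (seed >> 24) as u8;
    }
}
```

Under those assumptions, decoding a buffer in one call and decoding it as two chunks with `decode_chunk(&mut buf[..k], 0)` and `decode_chunk(&mut buf[k..], k as u64)` should give identical bytes. If the format really does derive the seed as base seed plus offset, this does not apply.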

The same applies if you write the chunked decoding algorithm in Python, and likewise if you want to encode chunks at an offset.

I would avoid checking the length by hand and use read_exact instead of read. Also consider using map_err and ? to shorten your error handling code.
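Something along these lines, as a sketch only. The `ResBinErr` enum here is a stand-in with just the two variants this function needs, since the real definition wasn't posted. The seed is still derived exactly as in your code; I only switched the addition to wrapping_add so a large position can't panic in debug builds.

```rust
use std::io::{Read, Seek};

/// Stand-in for the poster's error type; the real definition was not shown.
#[derive(Debug)]
enum ResBinErr {
    StreamPos,
    DecRead,
}

/// Decodes a block of data
fn decode<T>(buf: &mut T, data: &mut [u8]) -> Result<(), ResBinErr>
where
    T: Read + Seek,
{
    // map_err converts the io::Error, then ? returns early on failure.
    let pos = buf.stream_position().map_err(|_| ResBinErr::StreamPos)? as u32;

    // read_exact either fills the whole slice or returns an error,
    // so no separate length check is needed.
    buf.read_exact(data).map_err(|_| ResBinErr::DecRead)?;

    // Same seed derivation and recurrence as in the posted code.
    let mut seed = 0x1900_0000u32.wrapping_add(pos);
    for b in data.iter_mut() {
        seed = seed.wrapping_mul(0x41C6_4E6D).wrapping_add(12345);
        *b ^= (seed >> 24) as u8;
    }

    Ok(())
}
```

One difference from the manual check: a short read now surfaces as DecRead rather than DecReadLen(sz), because read_exact does not report how many bytes it managed to read.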


I'm aware of ?, but I'll take this into consideration.

Update: Okay, so I examined the bytes that were read in, and they don't match. The algo is correct, but it's not reading from the correct offset.