Parsing raw bytes into signed integers


#1

Purely for educational reasons, I’m attempting to parse raw bytes into signed integers using only safe Rust features.

I’ve already written some code that does this for unsigned integers. Does the core code need to change for signed integers, or do I just need to cast the final result with as to a signed type instead of an unsigned one?

In C, I would have used some casting and raw pointer stuff but in safe Rust it’s a bit different and I’ve been fighting with the compiler a bit.

What I have:

    pub fn bvec_to_uint(bvec: &[u8]) -> UintResult
    {
        let bvec_size = bvec.len();
        
        match bvec_size
        {
            1 => return UintResult::Result8(bvec[0]),
            2 =>
            {
                let temp = ((bvec[0] as u16) << 8) | bvec[1] as u16;
                return UintResult::Result16(temp);
            },
            4 =>
            {
                let temp: u32 = (bvec[0] as u32) << 24 as u32;
                let temp2: u32 = (bvec[1] as u32) << 16 as u32;
                let temp3: u32 = (bvec[2] as u32) << 8 as u32;
                let result = temp | temp2 | temp3 | bvec[3] as u32;
                return UintResult::Result32(result);
            },
            8 =>
            {
                let temp: u64 = (bvec[0] as u64) << 56 as u64;
                let temp2: u64 = (bvec[1] as u64) << 48 as u64;
                let temp3: u64 = (bvec[2] as u64) << 40 as u64;
                let temp4: u64 = (bvec[3] as u64) << 32 as u64;
                let temp5: u64 = (bvec[4] as u64) << 24 as u64;
                let temp6: u64 = (bvec[5] as u64) << 16 as u64;
                let temp7: u64 = (bvec[6] as u64) << 8 as u64;
                let result = temp | temp2 | temp3 | temp4 | temp5 | temp6 | temp7 | bvec[7] as u64;
                return UintResult::Result64(result);
            },
            _ => UintResult::ErrorImproperSize
        }
    }

I introduced the temp variables in lines like let result = temp | temp2 | temp3 | temp4 | temp5 | temp6 | temp7 | bvec[7] as u64; because the compiler rejected a direct | between two values of different integer types.

Additionally, I tried simply changing the type in that final as cast from u to i, along with the return types, but I ran into some trouble. Is there anything else that needs to be considered here to interpret bytes as signed integers?


#2

If it’s for educational purposes, then instead of your manually unrolled variant I would have used a loop like this:

    let mut a: u32 = 0;
    for (n, b) in buf.iter().enumerate() {
        a |= (*b as u32) << (8*(3 - n));
    }
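For the signed case from the original question, here is a minimal sketch, assuming the bytes are big-endian two's complement. It assembles the unsigned value exactly as in the loop above and then casts with as; between integer types of the same width, that cast keeps the bit pattern and only changes how it is interpreted. The name be_bytes_to_i32 and the Option return type are made up for illustration and are not from the original code.

```rust
/// Hypothetical helper (not from the original post): parses exactly
/// 4 big-endian bytes as an i32, returning None for any other length.
fn be_bytes_to_i32(buf: &[u8]) -> Option<i32> {
    // Guard the length so that `3 - n` below cannot underflow.
    if buf.len() != 4 {
        return None;
    }
    let mut a: u32 = 0;
    for (n, b) in buf.iter().enumerate() {
        // The first byte is the most significant one (big-endian).
        a |= (*b as u32) << (8 * (3 - n));
    }
    // Same-width cast: the bits are unchanged and are simply
    // reinterpreted as a two's complement signed value.
    Some(a as i32)
}
```

With this, be_bytes_to_i32(&[0xFF, 0xFF, 0xFF, 0xFE]) gives Some(-2). If the input is already a fixed-size array, the standard library's i32::from_be_bytes([0xFF, 0xFF, 0xFF, 0xFE]) produces the same -2 without any manual shifting.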

As bonus problems:

  • try to write it in a more “functional” style, i.e. using only iterator adapters without an explicit loop
  • try to write a function that can generically convert a buffer to a given type (you will need to write a custom trait and implement it for u8, u16, u32, u64 and u128)
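One possible shape for both exercises, offered as a sketch rather than the intended solution: the trait name FromBeSlice, its method from_be_slice, and the macro are all invented here. It assumes big-endian input and a slice whose length exactly matches the target type. The signed types are included as well, since the original question was about them; the per-byte as cast and the shifts behave the same way on the bit level.

```rust
/// Hypothetical trait for the exercise: build an integer from a
/// big-endian byte slice whose length matches the type's size.
trait FromBeSlice: Sized {
    fn from_be_slice(buf: &[u8]) -> Option<Self>;
}

// A macro avoids writing the same impl by hand for every integer type.
macro_rules! impl_from_be_slice {
    ($($t:ty),*) => {$(
        impl FromBeSlice for $t {
            fn from_be_slice(buf: &[u8]) -> Option<Self> {
                if buf.len() != std::mem::size_of::<$t>() {
                    return None;
                }
                // Iterator adapters only, no explicit loop: walk the
                // bytes from least to most significant, shift each one
                // into position, then OR everything together.
                Some(
                    buf.iter()
                        .rev()
                        .enumerate()
                        .map(|(n, &b)| (b as $t) << (8 * n))
                        .fold(0, |acc, x| acc | x),
                )
            }
        }
    )*};
}

impl_from_be_slice!(u8, u16, u32, u64, u128, i8, i16, i32, i64, i128);
```

For example, u16::from_be_slice(&[0x01, 0x02]) gives Some(0x0102), i16::from_be_slice(&[0xFF, 0xFE]) gives Some(-2), and a slice of the wrong length gives None. The largest shift is 8 * (size - 1), which always stays below the type's bit width, so the shift cannot overflow even for the one-byte types.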