Rust Generics - could this be simplified?


#1

Hi *,

I’m pretty new to Rust. After coding these two functions:

pub fn four_byte_le_to_u32(data: &[u8]) -> u32 {

    let size: usize = 4;

    if data.len() != size {
        panic!("Length of input-data is not {}: {}", size, data.len());
    }

    let mut result: u32 = 0;

    for x in 0..size {
        result += (data[x] as u32 * (1 << (x * 8)));
    }

    return result;
}

pub fn eight_byte_le_to_u64(data: &[u8]) -> u64 {

    let size: usize = 8;

    if data.len() != size {
        panic!("Length of input-data is not {}: {}", size, data.len());
    }

    let mut result: u64 = 0;

    for x in 0..size {
        result += (data[x] as u64 * (1 << (x * 8)));
    }

    return result;
}

I came to believe that now is the time to learn something about Rust generics.

After some four hours, which were a rather rough ride, I came up with this generic version, which compiles and works:

pub fn byte_converter<T>(data: &[u8]) -> T where T: default::Default + From<u8> + ops::AddAssign + ops::Mul<Output = T> + num::NumCast + ops::Shl<T, Output = T> {
    let size: usize = mem::size_of::<T>();

    if data.len() != size {
        panic!("Length of input-data is not {}: {}", size, data.len());
    }

    let mut result: T = Default::default();

    for x in 0..size {
        let g = <T as From<u8>>::from(data[x]);
        let one: T = NumCast::from(1).unwrap();
        let eight: T = NumCast::from(8).unwrap();
        let xx: T = NumCast::from(x).unwrap();

        result += (g * (one << (xx * eight)));
    }

    return result;
}

Alas, it looks kinda complicated and ugly. Sorting out the necessary traits was really cumbersome.

Since I’m a real Rust newbie here’s the question:

Could this be done cleaner, better, shorter?


#2

Since you’re already using num, you might as well use a constraint like Num or PrimInt and it will give you many of your current constraints “for free”.


#3

For cases where you don’t really need generics (i.e. with support for arbitrary types and extensible for new types), and you just want to save typing when handling two or three cases, macros are much simpler and work quite well.
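As a rough sketch of what the macro route could look like for the two functions from the question: the macro name `le_converters` is made up for this illustration, and the body keeps the original panic-on-wrong-length behaviour.

```rust
use std::mem::size_of;

// Hypothetical macro (name invented for this sketch): generates one
// little-endian converter function per `name => type` pair.
macro_rules! le_converters {
    ($($name:ident => $t:ty),* $(,)?) => {
        $(
            pub fn $name(data: &[u8]) -> $t {
                let size = size_of::<$t>();
                if data.len() != size {
                    panic!("Length of input-data is not {}: {}", size, data.len());
                }
                let mut result: $t = 0;
                for (i, &byte) in data.iter().enumerate() {
                    // Byte `i` supplies bits 8*i .. 8*i+8 of the result.
                    result |= (byte as $t) << (i * 8);
                }
                result
            }
        )*
    };
}

le_converters! {
    four_byte_le_to_u32 => u32,
    eight_byte_le_to_u64 => u64,
}
```

Adding a `u16` or `u128` variant is then one more line in the invocation, with no trait bounds to work out.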


#4

In addition to what others have said, you can probably drop Default and NumCast; since you require From<u8>, you can materialize T values like this instead:

let mut result: T = 0.into();
for x in 0..size {
    let g: T = data[x].into();
    let one: T = 1.into();
    let eight: T = 8.into();
    let xx: T = T::from(x as u8);

    result += (g * (one << (xx * eight)));
}
return result;

Also, I realize that the thrust of your post is about generics, but thought I’d mention the byteorder crate that deals with reading/writing values from/to byte buffers - it’s another take on tasks of this nature.
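byteorder is an external crate, so the sketch below is not its API. It shows the same idea with the standard library only, assuming a toolchain recent enough to have `from_le_bytes` and the slice-to-array `TryFrom` impl; the function name `le_bytes_to_u32` is invented for the example.

```rust
use std::convert::TryInto;

// Std-only sketch (not the byteorder API): convert exactly four
// little-endian bytes into a u32, reporting a wrong length as an Err.
fn le_bytes_to_u32(data: &[u8]) -> Result<u32, &'static str> {
    // Slice -> fixed-size array; fails unless the slice has length 4.
    let bytes: [u8; 4] = data.try_into().map_err(|_| "data is the wrong size")?;
    Ok(u32::from_le_bytes(bytes))
}
```

With this, `le_bytes_to_u32(&[1, 2, 3, 4])` gives `Ok(0x04030201)`, and a three-byte slice gives an `Err`.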


#5

Look at what I did here, which is simpler.

extern crate num;
use num::traits::{PrimInt, FromPrimitive};

fn byte_converter<T>(data: &[u8]) -> Result<T, &'static str>
  where T: PrimInt + FromPrimitive,
{
    let size: usize = ::std::mem::size_of::<T>();

    if data.len() != size {
        return Err("data is the wrong size");
    }

    let mut result = T::zero();

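    for x in 0..data.len() {
        // T::from here is NumCast::from; it returns None if the value does not fit in T.
        let part = T::from(data[x] as u64 * (1 << (x * 8))).ok_or("value does not fit in T")?;
        result = result + part;
    }

    Ok(result)
}

Panicking in what appears to be a library function is generally a bad idea, so I changed it to return a Result instead. This version also has no unwrap left in the function: a value that does not fit in T comes back as an Err instead of a panic.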


#6

Here is my take on the problem, showing three alternative solutions. Having to use macros sounds like a failure of (the usability of) Rust generics:

// u128 has been stable since Rust 1.26; older nightlies needed #![feature(i128_type)].

use std::ops::{Add, Shl};
use std::mem::size_of;

fn bytes_le_to_u32(data: &[u8; 4]) -> u32 {
    let mut result = 0;
    // Little-endian: start from the last (most significant) byte.
    for &d in data.iter().rev() {
        result = (result << 8) + u32::from(d);
    }
    result
}

fn bytes_le_to_u64(data: &[u8; 8]) -> u64 {
    let mut result = 0;
    // Little-endian: start from the last (most significant) byte.
    for &d in data.iter().rev() {
        result = (result << 8) + u64::from(d);
    }
    result
}

fn bytes_le_to_u128(data: &[u8; 16]) -> u128 {
    let mut result = 0;
    // Little-endian: start from the last (most significant) byte.
    for &d in data.iter().rev() {
        result = (result << 8) + u128::from(d);
    }
    result
}


macro_rules! generate_bytes_converter {
    ($func_name:ident, $t:ty) => (
        fn $func_name(data: &[u8]) -> Result<$t, &'static str> {
            type T = $t;
            let size: usize = size_of::<T>();

            if data.len() != size {
                return Err("Length of data is wrong.");
            }

            let mut result = 0;
            // Little-endian: start from the last (most significant) byte.
            for &d in data.iter().rev() {
                result = (result << 8) + T::from(d);
            }
            Ok(result)
        }
    )
}

generate_bytes_converter!(bytes_le_to_u32b, u32);
generate_bytes_converter!(bytes_le_to_u64b, u64);
generate_bytes_converter!(bytes_le_to_u128b, u128);


// num::Unsigned not defined for u128
trait Unsigned {}
impl Unsigned for u8 {}
impl Unsigned for u16 {}
impl Unsigned for u32 {}
impl Unsigned for u64 {}
impl Unsigned for u128 {}


fn byte_converter<T>(data: &[u8]) -> Result<T, &'static str>
where T: Unsigned + From<u8> + Add<Output = T> + Shl<T, Output = T> {
    if data.len() != size_of::<T>() {
        return Err("Length of data is wrong.");
    }

    let mut result = 0.into();
    // Little-endian: start from the last (most significant) byte.
    for &d in data.iter().rev() {
        result = (result << 8.into()) + T::from(d);
    }
    Ok(result)
}


fn main() {
    let a4 = [1, 10, 100, 255];
    let a8 = [1, 10, 100, 255, 1, 10, 100, 255];
    let a16 = [1, 10, 100, 255, 1, 10, 100, 255, 1, 10, 100, 255, 1, 10, 100, 255];

    println!("{}", bytes_le_to_u32(&a4));
    println!("{}", bytes_le_to_u64(&a8));
    println!("{}\n", bytes_le_to_u128(&a16));

    println!("{:?}", bytes_le_to_u32b(&a4));
    println!("{:?}", bytes_le_to_u64b(&a8));
    println!("{:?}\n", bytes_le_to_u128b(&a16));

    let x: Result<u32, _> = byte_converter(&a4);
    println!("{:?}", x);
    let y: Result<u64, _> = byte_converter(&a8);
    println!("{:?}", y);
    let z: Result<u128, _> = byte_converter(&a16);
    println!("{:?}\n", z);
}

Once we can use consts in generics, the third version becomes nicer and it can return a T instead of a Result.
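Const generics have since been stabilized (Rust 1.51), so here is one possible shape for that idea. It is only a sketch under assumptions: the helper trait `FromLeBytes` is invented for this example, and it delegates to the standard `from_le_bytes` methods instead of the shift loop above. A helper trait is used because, as far as I know, stable Rust cannot yet state "N equals size_of::<T>()" directly in a bound.

```rust
/// Hypothetical helper trait (not from the thread): an integer type that
/// can be built from exactly `N` little-endian bytes.
trait FromLeBytes<const N: usize>: Sized {
    fn from_le(data: &[u8; N]) -> Self;
}

// One impl per type ties each integer type to its byte width.
impl FromLeBytes<4> for u32 {
    fn from_le(data: &[u8; 4]) -> Self {
        u32::from_le_bytes(*data)
    }
}

impl FromLeBytes<8> for u64 {
    fn from_le(data: &[u8; 8]) -> Self {
        u64::from_le_bytes(*data)
    }
}

impl FromLeBytes<16> for u128 {
    fn from_le(data: &[u8; 16]) -> Self {
        u128::from_le_bytes(*data)
    }
}

/// The array length `N` is part of the argument type, so a wrong-sized
/// input is rejected at compile time and no `Result` is needed.
fn byte_converter<T, const N: usize>(data: &[u8; N]) -> T
where
    T: FromLeBytes<N>,
{
    T::from_le(data)
}
```

A call such as `let x: u32 = byte_converter(&[1, 2, 3, 4]);` compiles, while passing a five-element array for a `u32` does not, because no `FromLeBytes<5>` impl exists for `u32`.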


#7

Hi *,

I’m overwhelmed by the numerous and helpful answers. Thanks for all of them.

Rust indeed has a great community, that’s for sure.

I personally like coder543’s solution for the sake of readability, and he’s definitely right about using Results over panics in production code :wink:

Again, thanks to you all!