Trait for "implements le_bytes"?

I'm trying to write a generic function that uses a macro, which in turn uses from_le_bytes and std::mem::size_of::<T>(). Adding a type constraint of "Sized" lets size_of work. But what constraint is needed to make "from_le_bytes", which every primitive numeric type provides, work?

fn UnpackTextureEntryField<T: Sized>(out: &mut [T], b: &[u8])  { 
  ....
    out[0] = decodebuiltintype!(T, b, offset);                     // get first value
   ....



error[E0599]: no function or associated item named `from_le_bytes` found for type parameter `T` in the current scope
    --> src/messages/helpers.rs:26:23
     |
26   |         let v = $typ::from_le_bytes(
     |                       ^^^^^^^^^^^^^ function or associated item not found in `T`
     | 
    ::: src/common/objecttypes.rs:1417:26
     |
1417 |                 out[j] = decodebuiltintype!(T, b, offset);  // override this entry
     |                          -------------------------------- in this macro invocation
     |
     = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)

I don't believe there is such a trait in the standard library. You could make your own FromLeBytes trait and implement it for all relevant types.
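A minimal sketch of what such a trait could look like, using only the standard library. The names FromLeBytes, from_le_slice, impl_from_le_bytes and decode_field are made up for illustration and are not part of std. Note that size_of::<T>() only matches the wire size for the primitives implemented here.

```rust
use std::mem::size_of;

/// Hypothetical trait (not in std): decode a little-endian value
/// from the front of a byte slice.
trait FromLeBytes: Sized {
    fn from_le_slice(bytes: &[u8]) -> Option<Self>;
}

// Implement the trait for each primitive by forwarding to the type's
// inherent from_le_bytes function.
macro_rules! impl_from_le_bytes {
    ($($t:ty),* $(,)?) => {$(
        impl FromLeBytes for $t {
            fn from_le_slice(bytes: &[u8]) -> Option<Self> {
                // get() yields None instead of panicking on a short slice.
                let head = bytes.get(..size_of::<$t>())?;
                let mut raw = [0u8; size_of::<$t>()];
                raw.copy_from_slice(head);
                Some(<$t>::from_le_bytes(raw))
            }
        }
    )*};
}

impl_from_le_bytes!(u8, u16, u32, u64, i8, i16, i32, i64, f32, f64);

/// Generic counterpart of a "decode at offset" helper: reads one T at
/// *offset and advances the offset past it.
fn decode_field<T: FromLeBytes>(buf: &[u8], offset: &mut usize) -> Result<T, &'static str> {
    let rest = buf.get(*offset..).ok_or("offset past end of buffer")?;
    let value = T::from_le_slice(rest).ok_or("buffer too short")?;
    *offset += size_of::<T>();
    Ok(value)
}
```

With the bound T: FromLeBytes in place, a generic function can call decode_field::<T>(b, &mut offset) for any of the listed primitives.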

A trait-based approach seems reasonable, so I'm trying that. It almost compiles.

trait SerializableSized {
    //  This default implementation is used for built-in types.
    //  Overridden for user types.
    const serialized_size: usize = std::mem::size_of::<Self>;
}
trait Serializable: SerializableSized {
    //  Default implementation is used for built-in types.  
    //  Overridden for user types.
    fn our_from_le_bytes(b: &[Self; Self::serialized_size]) -> Result<LLVector2, &'static str> where Self: std::marker::Sized { Self::from_le_bytes(b) }
    fn our_to_le_bytes(&self) -> [Self; Self::serialized_size] where Self: std::marker::Sized { self.to_le_bytes() }
} 

//  Built in types use defaults.
impl Serializable for u16 {}
impl SerializableSized for u16{}
impl Serializable for u32 {}
impl SerializableSized for u32{}
impl Serializable for f32 {}
impl SerializableSized for f32{}

///  vector2 to 8 bytes (32 bit floats)
#[derive(Copy, Clone, PartialEq)] // floats don't compare equal if NaN, so only partial equal is allowed.
pub struct LLVector2 {
    pub x: f32,
    pub y: f32,
}

impl SerializableSized for LLVector2 {   
    const serialized_size: usize = 8;
}

The first error message is:

error[E0599]: no function or associated item named `serialized_size` found for type parameter `Self` in the current scope
  --> src/traitpractice5.rs:39:43
   |
39 |     fn our_from_le_bytes(b: &[Self; Self::serialized_size]) -> Result<LLVector2, &'static str> where Self: std::marker::Sized { Self::fro...
   |                                           ^^^^^^^^^^^^^^^ function or associated item not found in `Self`
   |
   = help: items from traits can only be used if the type parameter is bounded by the trait
help: the following trait defines an item `serialized_size`, perhaps you need to add another supertrait for it:
   |
36 | trait Serializable: SerializableSized + traitpractice5::SerializableSized {

The compiler's help system is telling me to add a trait restriction it already has. If I put in exactly what the compiler suggests, it can't find "traitpractice5", which is the current module.

This seems to need a supertype to make it work, because I need the "serializable size" as a compile time constant for an array declaration. But I'm doing something wrong in setting this up.

(All this is to exactly replicate the marshalling of a C++ program, byte for byte.)

Looks like you've hit https://github.com/rust-lang/rust/issues/43408. In particular, the duplicate issue https://github.com/rust-lang/rust/issues/72192 demonstrates exactly this problem. Const generics are still a bit of a work in progress in Rust; I would imagine this gets fixed before the feature gets stabilized.

In addition, the error message is really bad. It looks like fixing the error message is dependent on fixing the issue itself as well... Unfortunate all around.

I think the only solution that works here is to return a Vec<u8> and take a &[u8] instead. If that's acceptable and we change your example code to that, a couple more issues crop up.

First, Self needs to be Sized in SerializableSized, and size_of needs to be called (it's missing the parentheses that make it a function call):

trait SerializableSized: Sized {
    const serialized_size: usize = std::mem::size_of::<Self>();
}

Second, our_from_le_bytes and our_to_le_bytes use from_le_bytes and to_le_bytes, which are not guaranteed to be defined for the type implementing the trait, meaning they cannot be used in the default implementation. IIRC this is a difference in how Rust and C++ do generics.

I'm not sure what the best way to proceed from there is, as I'm not completely sure what the intended functionality is, but hopefully this'll help you forwards.

What a mess. That issue has 74 attached bug reports. Using generics to handle data of varying size is one of the main real-world use cases for generics, after all. This bug has been open since 2017. So it's not likely to be fixed soon.

The excuse for not fixing it given in 43408 really applies only when you're creating the size in the same trait. The "supertrait" workaround the compiler suggests as "help" ought to work, but doesn't.

Using Vec is out. I'd be allocating huge numbers of Vec structures for sizes in the 1-8 byte range. This is deserializing game messages coming in via UDP at high speed, and I want that path allocation-free. I'm writing safe Rust to decode data previously read by a horrid mess of C++ pointer arithmetic.
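One allocation-free possibility, sketched here under assumptions, is to make the byte array an associated type rather than writing [u8; Self::serialized_size] in the trait signature. That way each impl names a concrete array type and no array length depends on an associated constant. The names LeBytes, from_le, to_le, impl_le_bytes and read_field are hypothetical; only from_le_bytes, to_le_bytes and size_of come from std.

```rust
/// Hypothetical trait: the byte-array type is an associated type, so the
/// trait itself never mentions an array length.
trait LeBytes: Sized {
    type Bytes: AsRef<[u8]> + AsMut<[u8]> + Default;
    fn from_le(bytes: Self::Bytes) -> Self;
    fn to_le(&self) -> Self::Bytes;
}

// Primitives forward to their inherent from_le_bytes / to_le_bytes.
macro_rules! impl_le_bytes {
    ($($t:ty),* $(,)?) => {$(
        impl LeBytes for $t {
            type Bytes = [u8; std::mem::size_of::<$t>()];
            fn from_le(bytes: Self::Bytes) -> Self { <$t>::from_le_bytes(bytes) }
            fn to_le(&self) -> Self::Bytes { self.to_le_bytes() }
        }
    )*};
}
impl_le_bytes!(u16, u32, f32);

#[derive(Copy, Clone, Debug, PartialEq)]
pub struct LLVector2 {
    pub x: f32,
    pub y: f32,
}

// User types pick their own wire size (8 bytes: two 32-bit floats).
impl LeBytes for LLVector2 {
    type Bytes = [u8; 8];
    fn from_le(b: [u8; 8]) -> Self {
        LLVector2 {
            x: f32::from_le_bytes([b[0], b[1], b[2], b[3]]),
            y: f32::from_le_bytes([b[4], b[5], b[6], b[7]]),
        }
    }
    fn to_le(&self) -> [u8; 8] {
        let mut out = [0u8; 8];
        out[..4].copy_from_slice(&self.x.to_le_bytes());
        out[4..].copy_from_slice(&self.y.to_le_bytes());
        out
    }
}

/// Reads one T at *offset using a stack-allocated array, then advances.
fn read_field<T: LeBytes>(buf: &[u8], offset: &mut usize) -> Result<T, &'static str> {
    let mut raw = <T::Bytes as Default>::default();
    let end = offset.checked_add(raw.as_ref().len()).ok_or("offset overflow")?;
    let src = buf.get(*offset..end).ok_or("message too short")?;
    raw.as_mut().copy_from_slice(src);
    *offset = end;
    Ok(T::from_le(raw))
}
```

The Default bound limits this sketch to arrays of at most 32 bytes, which covers the 1-8 byte fields mentioned above.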

Also, it's a problem that from_le_bytes, although implemented for u8 to u128, i8 to i128, and f32 to f64, does not apparently have a trait you can use to tell the trait system it is available. "You could also make your own FromLeBytes trait and implement it for all relevant types" is not particularly helpful.

I tried GenericArray as a workaround. Coding up the small integers as typenames is just too cute.

I can get this done by having two copies of the relevant code, one for built-in types and one for my types. That's easier than all the compiler bug workarounds above.

Thanks.

Even the workaround didn't work. You cannot use T::from_le_bytes in a generic. The function from_le_bytes, and the other 60+ functions of that family, are all duck typed - they're just function members of the impl of each type that happen to have the same name across multiple types. The generic system can't handle that. Even if the generic is restricted to PrimInt or some such, it doesn't help, because those functions are not part of any trait.

This sucks.

Can you provide a simplified example on the playground that we could experiment with? It doesn't look like your snippet with SerializableSized contains a decodebuiltintype!() macro.

What about using something like arrayvec, which is backed by a normal array? Otherwise you could use smallvec to optimise for the case when there are few elements, but fall back to allocation if you get more.

Here's "decodebuiltintype!". This works if not put inside a generic. Playground version to follow. Why not use Vec or Smallvec? Because if this is done right, from_le_bytes should all compile down to one move instruction. I'm trying to re-do some efficient but ugly C++ game code in Rust.

/// Decode built-in type (type, buf, offset) -> value
#[macro_export]
macro_rules! decodebuiltintype {
    // get next field from buffer at offset, given type
    ($typ:ident, $buf:ident, $ix:ident) => {{
        if *$ix + std::mem::size_of::<$typ>() > $buf.len() {
            return Err("Msg too long");
        } // avoid running off end of buffer
        let v = $typ::from_le_bytes(
            $buf[*$ix..*$ix + std::mem::size_of::<$typ>()]
                .try_into()
                .unwrap(),
        ); // fetch one variable from byte string
        *$ix = *$ix + std::mem::size_of::<$typ>(); // advance index
        v // return result
    }};
}

How about creating a Deserialize trait like this. The idea is that we will return an instance of Self and the unread bytes in the happy case, or a DeserializeError if parsing fails (e.g. bounds checks fail due to insufficient data).

trait Deserialize: Sized {
    fn deserialize(buffer: &[u8]) -> Result<(Self, &[u8]), DeserializeError>;
}

struct DeserializeError;

Then we use the byteorder crate to implement deserialize() using LittleEndian::read_u32() and friends. A macro makes this process a lot less repetitive.

use byteorder::{ByteOrder, LittleEndian};

macro_rules! impl_deserialize {
    ($type:ty, $method:ident) => {
        impl Deserialize for $type {
            fn deserialize(buffer: &[u8]) -> Result<(Self, &[u8]), DeserializeError> {
                if buffer.len() < std::mem::size_of::<Self>() {
                    return Err(DeserializeError);
                }

                let (head, rest) = buffer.split_at(std::mem::size_of::<Self>());
                let value = LittleEndian::$method(head);

                Ok((value, rest))
            }
        }
    };
}

impl_deserialize!(u16, read_u16);
impl_deserialize!(i16, read_i16);
impl_deserialize!(u32, read_u32);
impl_deserialize!(i32, read_i32);
impl_deserialize!(u64, read_u64);
impl_deserialize!(i64, read_i64);
impl_deserialize!(f64, read_f64);
impl_deserialize!(f32, read_f32);

(if you want to be generic over endianness, the Deserialize trait would accept a type parameter, turning it into Deserialize<T> where T: ByteOrder)

And now we've done the primitives, you can use it to deserialize more complex types:

#[derive(Copy, Clone, PartialEq)] // floats don't compare equal if NaN, so only partial equal is allowed.
pub struct LLVector2 {
    pub x: f32,
    pub y: f32,
}

impl Deserialize for LLVector2 {
    fn deserialize(buffer: &[u8]) -> Result<(Self, &[u8]), DeserializeError> {
        let (x, buffer) = f32::deserialize(buffer)?;
        let (y, buffer) = f32::deserialize(buffer)?;

        Ok((LLVector2 { x, y }, buffer))
    }
}

(playground)

Inspecting the assembly shows simple move instructions and a bit of shuffling to create the rest slice, as you'd expect:

<f64 as playground::Deserialize>::deserialize:
	mov	rax, rdi
	cmp	rdx, 8
	jae	.LBB2_1
	mov	qword ptr [rax + 8], 0
	ret

.LBB2_1:
	add	rdx, -8
	mov	rcx, qword ptr [rsi]
	add	rsi, 8
	mov	qword ptr [rax], rcx
	mov	qword ptr [rax + 8], rsi
	mov	qword ptr [rax + 16], rdx
	ret

I'm not sure why LLVM couldn't coalesce the x and y bounds checks into a single cmp rdx, 8; jb .LBB0_3, though.

<playground::LLVector2 as playground::Deserialize>::deserialize:
	mov	rax, rdi
	cmp	rdx, 4
	jb	.LBB0_3
	mov	rcx, rdx
	and	rcx, -4
	cmp	rcx, 4
	jne	.LBB0_2

.LBB0_3:
	mov	qword ptr [rax + 8], 0
	ret

.LBB0_2:
	mov	rcx, qword ptr [rsi]
	add	rsi, 8
	add	rdx, -8
	mov	qword ptr [rax], rcx
	mov	qword ptr [rax + 8], rsi
	mov	qword ptr [rax + 16], rdx
	ret
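If the two bounds checks matter, one option is to make the single check explicit by taking all eight bytes at once. The following is a std-only sketch that swaps byteorder for f32::from_le_bytes. The free function deserialize_vector2 is a hypothetical name, and whether this actually produces tighter code than the version above has not been verified here.

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
pub struct LLVector2 {
    pub x: f32,
    pub y: f32,
}

#[derive(Debug, PartialEq)]
struct DeserializeError;

/// Hypothetical variant: one length check covers both fields.
fn deserialize_vector2(buffer: &[u8]) -> Result<(LLVector2, &[u8]), DeserializeError> {
    if buffer.len() < 8 {
        return Err(DeserializeError);
    }
    // split_at cannot panic here because of the check above.
    let (head, rest) = buffer.split_at(8);
    let x = f32::from_le_bytes([head[0], head[1], head[2], head[3]]);
    let y = f32::from_le_bytes([head[4], head[5], head[6], head[7]]);
    Ok((LLVector2 { x, y }, rest))
}
```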

Here's a playground with what I was trying to do.

The generic won't find T::from_le_bytes because there is no trait which contains that function.

Most of the other problems come from trying to work around that.

Creating all those custom readers is an option, but seems overkill.

Tried another approach. See playground.

This is a different workaround, defining a generic_to_le_bytes, etc. for all the relevant types. But those can't all be in one trait, because they return different sized fixed arrays, and array sizes can't be parameterized in traits, per bug #43408. So I made one trait for each size of object. Then I tried to use that by writing

fn UnpackTextureEntryFieldBuiltin<T: Sized + Serializable2 + Serializable4>

which doesn't work because "+" in this context seems to mean AND, not OR. (The Rust reference is silent on that.) Also, the compiler seems to be trying to disambiguate which function is being called from the trait definition, not its implementation, resulting in two functions with different "self" types being diagnosed as a conflict.

All this is for a basic generic operation which should Just Work - defining the same function on all the built-in numeric types and having the right one used in generic functions. Something's wrong here.

Well, I finally came up with a solution that uses macros instead of traits. Not as elegant, but works. Thanks, everyone.
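The actual macro isn't shown in this thread, so the following is only a guess at the general shape of a macro-based approach: a macro that stamps out one non-generic unpack function per type. Because the type is substituted before type checking, <$t>::from_le_bytes resolves to the inherent function of each concrete type. The names define_unpack, unpack_u16 and unpack_f32 are hypothetical.

```rust
/// Hypothetical macro: generates one non-generic unpack function per type.
macro_rules! define_unpack {
    ($name:ident, $t:ty) => {
        /// Fills `out` with little-endian values read from the front of `b`.
        fn $name(out: &mut [$t], b: &[u8]) -> Result<(), &'static str> {
            const SIZE: usize = std::mem::size_of::<$t>();
            let needed = out.len().checked_mul(SIZE).ok_or("length overflow")?;
            if b.len() < needed {
                return Err("message too short");
            }
            // chunks_exact yields SIZE-byte pieces; zip stops once `out` is full.
            for (slot, chunk) in out.iter_mut().zip(b.chunks_exact(SIZE)) {
                let mut raw = [0u8; SIZE];
                raw.copy_from_slice(chunk);
                *slot = <$t>::from_le_bytes(raw);
            }
            Ok(())
        }
    };
}

define_unpack!(unpack_u16, u16);
define_unpack!(unpack_f32, f32);
```

The cost of this design is one generated function per type instead of a single generic one, which is the trade-off described above.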

I rather get the feeling that the trait system is designed to let people construct the Rust equivalent of derived classes. It works fine for that. Generics which don't map to that model don't seem to fit well into the trait system.

I'll note that this started from

It's not terribly surprising to me that the only practical way to wrap a macro is with another macro. If the generic was using a generic it wouldn't be a big deal, and a macro can also easily use a generic, but the compilation and checking models of a macro and a generic are different enough that it's quite common that a macro cannot be reasonably encapsulated by a generic.
