Using serde to deserialize a Vec<u8> into a struct (fails)


#1

Hi all, I have a question about using Serde to deserialize a Vec<u8> into a struct.

#[macro_use]
extern crate serde_derive;

extern crate bincode;
use bincode::{serialize, deserialize, Infinite};

#[derive(Serialize, Deserialize, Debug)]
pub struct Message {
    ty: i32,
    len: i32,
    msg: Vec<u8>,
}

fn main() {
    let mut v = Vec::new();
    v.push(b'1');
    v.push(b'2');
    v.push(b'0');
    v.push(b'0'); // ty = 12

    v.push(b'2');
    v.push(b'0');
    v.push(b'0');
    v.push(b'0'); // len = 2

    v.push(b'A'); // msg[0]
    v.push(b'A'); // msg[1]
    v.push(b'A'); // msg[2]
    // .....

    let decoded: Result<Message, _> = deserialize(&v);
    match decoded {
        Ok(value) => println!("success"),
        Err(e) => println!("error '{}'", e),
    }
}

Running this program with cargo yields

error 'IoError: failed to fill whole buffer'

I tried encoding the vector using serde serialize(), and then decoding that, but I get the same error. I feel like I’m missing something. If I remove the two i32 fields and only push the b'A' bytes, it works. I feel like the answer is silly, but does anyone see what is wrong with the above program?

Edit: My major assumption is that deserializing the 4 bytes into an integer works. I tried modifying my struct so that the Vec is gone, and it works. The combination of i32 fields with the Vec<u8> gives me this error, and I’m not sure why.


#2

Your v is missing the part that says how many elements are in msg. Check the serialized bytes again:

#[macro_use]
extern crate serde_derive;

extern crate bincode;
use bincode::Infinite;

#[derive(Serialize, Deserialize, Debug)]
pub struct Message {
    ty: i32,
    len: i32,
    msg: Vec<u8>,
}

fn main() {
    let v = Message {
        ty: 12,
        len: 2,
        msg: b"AAA".to_vec(),
    };
    println!("{:?}", bincode::serialize(&v, Infinite).unwrap());
}

#3

The code @dtolnay posted outputs:
[12, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 65, 65, 65]

Notice how each i32 is represented in its little-endian binary form: not as decimal text like b"12", but as the four bytes [12u8, 0u8, 0u8, 0u8].
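
To make the difference concrete, here is a small std-only sketch (no bincode involved) that reads four bytes as a little-endian i32. The helper name read_i32_le is made up for illustration. It compares the binary form from the output above with the ASCII digits pushed in the first post:

```rust
/// Hypothetical helper: interpret four bytes as a little-endian i32.
fn read_i32_le(bytes: [u8; 4]) -> i32 {
    i32::from_le_bytes(bytes)
}

fn main() {
    // The binary form of 12, as it appears in the serialized output.
    println!("{}", read_i32_le([12, 0, 0, 0]));
    // The ASCII characters '1', '2', '0', '0' are the bytes 49, 50, 48, 48,
    // which read back as a very different number.
    println!("{}", read_i32_le([b'1', b'2', b'0', b'0']));
}
```

So the bytes pushed in the original program do form valid i32 values, just not 12 and 2.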

A Vec is represented as its length (a 64-bit integer, here the bytes 3, 0, 0, 0, 0, 0, 0, 0) followed by its contents. Here’s the responsible piece of code in the bincode crate:

fn serialize_seq(self, len: Option<usize>) -> Result<Self::SerializeSeq> {
    let len = try!(len.ok_or(ErrorKind::SequenceMustHaveLength));
    try!(self.serialize_u64(len as u64));
    Ok(Compound {ser: self})
}
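
As a rough sketch of that layout, the following builds the same bytes by hand using only the standard library. The function encode_message is hypothetical and is not part of bincode. It assumes the fixed-width little-endian encoding seen in the output above, which other bincode versions or configurations may not use:

```rust
/// Hypothetical helper: lay out ty, len and msg the way the output above shows.
/// Assumes fixed-width little-endian integers and a u64 length prefix.
fn encode_message(ty: i32, len: i32, msg: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + 4 + 8 + msg.len());
    out.extend_from_slice(&ty.to_le_bytes()); // 4 bytes
    out.extend_from_slice(&len.to_le_bytes()); // 4 bytes
    out.extend_from_slice(&(msg.len() as u64).to_le_bytes()); // 8-byte length prefix
    out.extend_from_slice(msg); // the contents themselves
    out
}

fn main() {
    println!("{:?}", encode_message(12, 2, b"AAA"));
}
```

Note that the struct's own len field and the Vec's length prefix are separate things: the prefix is written regardless of what len says.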

#4

Oh gosh, okay. So serde automatically encodes the length of the vector. I don’t know why I assumed that only the contents would be serialized.

Seeing the byte representation really helped my understanding, thanks @bugaevc and @dtolnay.

The challenge now is to work out whether I could have figured this out myself. I’ll have to think further about how I would have gotten there from the error message without your help.

error 'IoError: failed to fill whole buffer'

The error message seems like it was just a Result<> propagated up the call stack. Maybe seeing the call stack would have been helpful? I’m just tossing ideas out there.
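
For what it’s worth, here is a hand-written decoder sketch that shows where an error like this can come from. It uses only the standard library and is not bincode’s implementation. The names decode_message and Message are illustrative, and it assumes the fixed-width little-endian layout discussed above. With the bytes from the first post, the two i32 reads succeed, and the read of the 8-byte length prefix then runs out of input with only the three b'A' bytes left:

```rust
use std::io::{self, Read};

#[derive(Debug, PartialEq)]
struct Message {
    ty: i32,
    len: i32,
    msg: Vec<u8>,
}

/// Hypothetical decoder for the layout: i32, i32, u64 length prefix, contents.
fn decode_message(mut input: &[u8]) -> io::Result<Message> {
    let mut word = [0u8; 4];
    input.read_exact(&mut word)?;
    let ty = i32::from_le_bytes(word);
    input.read_exact(&mut word)?;
    let len = i32::from_le_bytes(word);

    // The Vec's length prefix: this read fails if fewer than 8 bytes remain.
    let mut prefix = [0u8; 8];
    input.read_exact(&mut prefix)?;
    let count = u64::from_le_bytes(prefix);

    // Refuse to allocate more than the input could possibly contain.
    if count > input.len() as u64 {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "length prefix exceeds remaining input",
        ));
    }
    let mut msg = vec![0u8; count as usize];
    input.read_exact(&mut msg)?;
    Ok(Message { ty, len, msg })
}

fn main() {
    // The eleven bytes pushed in the first post.
    match decode_message(b"12002000AAA") {
        Ok(value) => println!("success: {:?}", value),
        Err(e) => println!("error kind: {:?}", e.kind()),
    }
}
```

The failing call is std’s read_exact, which reports an UnexpectedEof error, so a backtrace would most likely have pointed at the length read rather than at the integer fields.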

Luckily for me however this community is super helpful!


#5

Well, in general if you want error messages then Bincode is probably not going to be your thing. The entire point is to be compact and fast. JSON or YAML or almost any other format would have given a nice message.

Bincode is fine as long as you stick to the rule of only deserializing data that was serialized by Bincode, so that sounds like where things went wrong here.


#6

Well, in general if you want error messages then Bincode is probably not going to be your thing. The entire point is to be compact and fast. JSON or YAML or almost any other format would have given a nice message.

Fair.

Are you saying decoding a raw vector like this is different than if I had called serialize() on the vector first?


#7

Let’s talk about JSON first. In some use cases you may have a remote server that exposes a non-negotiable JSON interface that you need to handle in your Rust application. Or you may have JSON input that is typed in by a human into a configuration file. Or any number of sources of JSON that are not originally produced by your application. Good JSON libraries are designed to support these use cases and handle errors gracefully with helpful messages.

Compact binary formats like Bincode are designed for a different use case. Your application serializes some data, writes it to a file or transmits it somewhere, and then deserializes it back to the original data structure. The source of the binary data is your program having serialized it. If you have data coming from somewhere other than your program serializing it, then Bincode won’t provide any hand-holding.