Let's start by taking a look at rustc-serialize's documentation. As you can see, it contains three modules: one for encoding to and decoding from base64, another for hex, and another for JSON. Serde (in the most reductive sense) does much the same.
You can think of both crates as providing you with some ways to take data from your Rust programs, such as a struct or arrays of bytes, and turn that data into "friendly" (i.e. human-readable) formats. These kinds of formats are often referred to as text-based formats. They are useful in cases where we want humans to be able to read the data or we need to use it in a situation where we want strings rather than blobs of arbitrary memory contents.
That leads us to binary formats. A binary format is essentially just a block of memory: a bunch of bytes, organized in some agreed-upon order. I should point out that the big distinction here really is in the representation of the data, because of course no matter what you do, when you send data over the network it's transmitted as a bunch of bytes. With a binary format, you take the literal contents of your data (that is, the actual bytes underneath it all) and you lay those bytes out in one big array. Now you can send that array of bytes out over the network, and the recipient, provided it understands the protocol you're using, should be able to reconstruct the data you sent just by reading the appropriate number of bytes into the correct variables (or struct fields).
Let's look at an example. Suppose we had the following struct.
struct Message {
ip: [u8; 4],
port: u16,
body: Vec<u8>
}
If we created an instance of it like this:
// Note, I haven't run this code but it should illustrate the point
let msg = Message {
ip: [127, 0, 0, 1],
port: 8080,
body: vec![104, 101, 108, 108, 111] // The string "hello" encoded to ASCII bytes
};
The JSON representation of this struct is relatively human readable (save for the body).
{
"ip": [127, 0, 0, 1],
"port": 8080,
"body": [104, 101, 108, 108, 111]
}
But look at all the wasted bytes! Surely we don't need to include the {} characters, or the property names "ip", "port", "body", the colons, the spaces, the commas, the newlines... That's a lot of wasted space (when you're transmitting millions of such packets every minute). Instead, we could design a protocol (like DNS, TCP, etc.) that says: "I will send you a sequence of bytes. They are as follows:
- The first four bytes will be IP address octets
- The following two bytes will be a port number
- Every byte after that is part of the body"
Now I can lay out a byte array like so:
|127|0|0|1|80|80|104|101|108|108|111|
and just send that! Much simpler and more efficient.
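Here is a rough sketch of what packing and unpacking that layout could look like using only the standard library. The `encode` and `decode` functions are names I made up for this toy protocol; they are not part of Serde or rustc-serialize. I'm also assuming the port is written as two bytes in big-endian (network) order, so the port bytes come out as 31 and 144 rather than the simplified |80|80| in the diagram.

```rust
// Toy protocol from the text: 4 IP octets, 2 port bytes, then the body.
// `encode` and `decode` are hypothetical helpers, not a library API.
struct Message {
    ip: [u8; 4],
    port: u16,
    body: Vec<u8>,
}

// Lay the fields out back to back in one byte array.
fn encode(msg: &Message) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + 2 + msg.body.len());
    out.extend_from_slice(&msg.ip);
    // Assumption: the port goes on the wire in big-endian order.
    out.extend_from_slice(&msg.port.to_be_bytes());
    out.extend_from_slice(&msg.body);
    out
}

// Read the fields back by position. Returns None if the input is too
// short to hold the fixed-size part (IP + port).
fn decode(bytes: &[u8]) -> Option<Message> {
    if bytes.len() < 6 {
        return None;
    }
    let ip = [bytes[0], bytes[1], bytes[2], bytes[3]];
    let port = u16::from_be_bytes([bytes[4], bytes[5]]);
    // Every byte after the first six is part of the body.
    let body = bytes[6..].to_vec();
    Some(Message { ip, port, body })
}

fn main() {
    let msg = Message {
        ip: [127, 0, 0, 1],
        port: 8080,
        body: b"hello".to_vec(),
    };

    let wire = encode(&msg);
    println!("{:?}", wire); // [127, 0, 0, 1, 31, 144, 104, 101, 108, 108, 111]

    let back = decode(&wire).expect("at least six bytes");
    assert_eq!(back.ip, msg.ip);
    assert_eq!(back.port, msg.port);
    assert_eq!(back.body, msg.body);
}
```

Notice that nothing in the byte array says which bytes are the IP and which are the port. Both sides simply have to agree on the layout, which is exactly why this kind of thing is usually left to you.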
There are other performance-related reasons for using binary formats, but the gist is that it's mostly about efficiency. Serde and rustc-serialize are concerned with encoding/serializing data into text-based, human-friendly formats, and less with packing data into byte arrays. That's the kind of thing you can only do if you understand the layout of those byte arrays, and so it's often left to you to handle.
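Please understand that I'm fudging over some details for the sake of simplicity. For instance, 8080 doesn't fit in a single byte, so the |80|80| above is only a stand-in. As a real 16-bit value in big-endian order, the port would be the two bytes 31 and 144 (0x1F 0x90).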
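To make the port detail concrete, here is a small sketch of how a u16 can be split into two bytes by hand and put back together. The helper names `port_to_wire` and `port_from_wire` are made up for illustration, and big-endian order is my assumption (it's the usual convention for network protocols); a protocol is free to specify little-endian instead.

```rust
// Split a 16-bit port into two bytes, high byte first (big-endian).
// Hypothetical helper, written out by hand to show the arithmetic.
fn port_to_wire(port: u16) -> [u8; 2] {
    [(port >> 8) as u8, (port & 0xFF) as u8]
}

// Rebuild the port from its two bytes.
fn port_from_wire(bytes: [u8; 2]) -> u16 {
    ((bytes[0] as u16) << 8) | (bytes[1] as u16)
}

fn main() {
    let wire = port_to_wire(8080);
    println!("{:?}", wire); // [31, 144], because 31 * 256 + 144 == 8080

    // The standard library does the same thing for us.
    assert_eq!(wire, 8080u16.to_be_bytes());
    assert_eq!(port_from_wire(wire), 8080);

    // Little-endian puts the same two bytes in the opposite order.
    println!("{:?}", 8080u16.to_le_bytes()); // [144, 31]
}
```

If the sender and the receiver disagree on the byte order, the receiver will read a completely different port number, which is why a protocol specification has to spell it out.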
If you check out the RFCs specifying protocols like DNS, you'll get a much more detailed look at how binary formats are used and defined.
I do hope this helps.