std::io::Read into a Vector buffer

I am implementing a function that reads data from an input that implements the Read trait (normally a TcpStream).

That data follows the MySQL protocol: the first 4 bytes indicate the length of the packet, and the rest is the body.

The idea is to read 4 bytes to know the length, and then read the rest of the body with that length.

The first part could be read by allocating a buffer with a known size of 4.

But for the second part, we'd need to allocate a buffer with a dynamic size.

use std::io::Read;

fn main() {
    let input = [0x01, 0x00, 0x00, 0x01];
    read_packets(&input[0..4]);
}

fn read_packets<A>(mut input: A)
where
    A: Read,
{
    // let mut header = [0u8; 4]; // works, but does not help when the length is dynamic
    let mut header: Vec<u8> = Vec::with_capacity(4); // does not work
    // let mut header = BytesMut::with_capacity(4); // does not work either
    let n = input.read(&mut header).unwrap();
    println!("read bytes: {}", n);
}

For some reason, read always reads zero bytes when I pass it a Vec or a BytesMut.

I'd appreciate some pointers to help me figure out why.

This is because &mut header is coerced into a slice of length zero. The coercion doesn't use the Vec's capacity; it uses its length as the end of the slice. Support for reading into uninitialized memory is currently poor.

fn read(&mut self, buf: &mut [u8]) -> Result<usize>

Pull some bytes from this source into the specified buffer

(from the std::io::Read documentation)

An empty Vec has an empty buffer. read is a relatively low-level API: it just takes a slice and populates it.
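For example (a minimal sketch): giving the Vec an actual length, rather than just capacity, makes read see a 4-byte slice it can fill:

use std::io::Read;

fn main() {
    let input = [0x01, 0x00, 0x00, 0x01];
    // vec![0u8; 4] creates a Vec with *length* 4 (all zeros), not just
    // capacity 4, so `&mut header` coerces to a 4-byte slice.
    let mut header = vec![0u8; 4];
    let mut reader = &input[..]; // &[u8] implements Read
    let n = reader.read(&mut header).unwrap();
    println!("read bytes: {}", n); // prints: read bytes: 4
}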

You probably want read_to_end instead.


Thanks, now I understand it!

read_to_end reads until EOF, which for a TcpStream means reading until the connection is closed. In a request-response flow, that might not be what I want.

I guess I could work around this by reading in a loop into a preallocated slice of 1024 bytes and appending that to the vector.
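Roughly something like this (a sketch; the read_n helper and the chunk size are just illustrative):

use std::io::Read;

// Hypothetical helper: read exactly `remaining` bytes in 1024-byte chunks.
fn read_n<R: Read>(input: &mut R, mut remaining: usize) -> std::io::Result<Vec<u8>> {
    let mut out = Vec::with_capacity(remaining);
    let mut chunk = [0u8; 1024];
    while remaining > 0 {
        let n = input.read(&mut chunk[..remaining.min(1024)])?;
        if n == 0 {
            break; // EOF before all the bytes arrived
        }
        out.extend_from_slice(&chunk[..n]);
        remaining -= n;
    }
    Ok(out)
}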

I don't think that's true. The way I read the documentation, EOF means there are no more bytes right now; it doesn't mean the source is exhausted forever. And I would expect the connection not to be closed until it goes out of scope (that may depend on what library you're using; I don't do much network stuff). Truth is, I have no idea what I'm talking about.

In any case, if you want to limit the size read at a time to 1024, you can use take(1024) in combination with read_to_end.
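For instance (a sketch; the helper name is made up, and by_ref keeps the reader usable afterwards):

use std::io::Read;

fn read_up_to_1024<R: Read>(input: &mut R) -> std::io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // take(1024) makes read_to_end stop after at most 1024 bytes
    // instead of only at EOF; by_ref() avoids consuming `input`.
    input.by_ref().take(1024).read_to_end(&mut buf)?;
    Ok(buf)
}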

No, read_to_end will read until the stream is closed. The stream can be closed without being dropped, if the other end of the connection closes it.

If you want to read a certain number of bytes, read_exact is the tool you should use.
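Applied to the original use case, something like this (a sketch, assuming, as in the question, that the first 4 header bytes encode the body length; here it's parsed as a little-endian u32, while the real MySQL header is 3 length bytes plus a sequence byte, so adjust the parsing to the actual protocol):

use std::io::Read;

fn read_packet<R: Read>(input: &mut R) -> std::io::Result<Vec<u8>> {
    // Fixed-size header: read_exact fills the whole buffer or errors out.
    let mut header = [0u8; 4];
    input.read_exact(&mut header)?;

    // Assumption: the header is the body length as a little-endian u32.
    let len = u32::from_le_bytes(header) as usize;

    // Now that the length is known, allocate an initialized buffer of
    // exactly that size and fill it completely.
    let mut body = vec![0u8; len];
    input.read_exact(&mut body)?;
    Ok(body)
}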

