Vec::with_capacity + read_to_end = overallocation

jkugelman · September 21, 2021, 3:19am

I'm writing a program that reads large files into memory and I've been trying to optimize memory allocation. I thought I could use Vec::with_capacity to ensure that read_to_end would have a perfectly sized buffer, but it consistently doubles the capacity. If I read a 1MB file into a 1MB Vec it's always 2MB after the read finishes.

Is this a flaw in Read::read_to_end? Or am I doing something wrong?

Here's an example that runs on the playground:

use std::io::{Cursor, Read};

fn main() {
    let data = [0u8; 1024];
    let mut vec = Vec::with_capacity(data.len());
    let mut cursor = Cursor::new(&data);

    dbg!(vec.len(), vec.capacity());
    cursor.read_to_end(&mut vec).unwrap();
    dbg!(vec.len(), vec.capacity());
}

It prints:

[src/main.rs:7] vec.len() = 0
[src/main.rs:7] vec.capacity() = 1024
[src/main.rs:9] vec.len() = 1024
[src/main.rs:9] vec.capacity() = 2048

It reserves space for 1KB, reads 1KB of data, and ends up with a 2KB Vec.

If I reserve space for an extra byte it doesn't grow the capacity:

let mut vec = Vec::with_capacity(data.len() + 1);

Output:

[src/main.rs:8] vec.len() = 0
[src/main.rs:8] vec.capacity() = 1025
[src/main.rs:10] vec.len() = 1024
[src/main.rs:10] vec.capacity() = 1025

Is that what I need to do? I don't love it...

jkugelman · September 21, 2021, 3:32am

I believe the code responsible is here, in read_to_end_with_reservation:

loop {
    if g.len == g.buf.len() {
        /* reserve more space in g.buf */
    }

    let buf = &mut g.buf[g.len..];
    match r.read(buf) {
        Ok(0) => return /* finished */,
        Ok(n) => { /* normal case */ }
        Err(ref e) if e.kind() == ErrorKind::Interrupted => {}
        Err(e) => return Err(e),
    }
}

If the buffer is perfectly sized then after the final Ok(n) fills the buffer it will loop around one more time, reserve more space, and then perform a final read which returns Ok(0). That final "reserve more space" iteration is what causes the overallocation.

I can envision ways to improve this. For instance, when g.len == g.buf.len() it could attempt a read into a small side buffer. If that read returns Ok(0) then it's done. If it doesn't then this code could then and only then extend g.buf and move the small bit of data over.

Would this be a reasonable change or is it too complicated and/or inefficient? What do you think?

drewkett · September 21, 2021, 4:25am

No idea what the actual performance implications would be but my guess is that it’s not that uncommon to fill the whole buffer so which would lead to an extra small buffer read call in many situations. If you know the size of the file you are reading, you can use read_exact instead, and just make one extra call to read with a small buffer if you have reason to believe the size of the file may be different than you expect.

jkugelman · September 21, 2021, 12:48pm

I'm using read_to_end so I don't have to initialize the memory.

(read_exact would also introduce a small TOCTOU window, though that's probably not important.)

Michael-F-Bryan · September 21, 2021, 1:19pm

Out of curiosity, I checked what std::fs::read does and they've got a little helper function which does the same thing.

pub fn read<P: AsRef<Path>>(path: P) -> io::Result<Vec<u8>> {
    fn inner(path: &Path) -> io::Result<Vec<u8>> {
        let mut file = File::open(path)?;
        let mut bytes = Vec::with_capacity(initial_buffer_size(&file));
        file.read_to_end(&mut bytes)?;
        Ok(bytes)
    }
    inner(path.as_ref())
}

fn initial_buffer_size(file: &File) -> usize {
    // Allocate one extra byte so the buffer doesn't need to grow before the
    // final `read` call at the end of the file.  Don't worry about `usize`
    // overflow because reading will fail regardless in that case.
    file.metadata().map(|m| m.len() as usize + 1).unwrap_or(0)
}

(the shenanigans with fn inner() are an optimisation to help reduce compile times)

jkugelman · September 22, 2021, 3:09am

I submitted pull request #89165 to fix this. Improving read_to_end allowed me to get rid of the + 1 in initial_buffer_size, as well as eliminate workaround code in take().

I don't actually know if this is something people want fixed, but even if it's not it was fun hacking on the standard library code. I tell you what, I'm pretty darn nervous. I've never tried to submit a patch to such a high visibility project. Thank goodness the Rust community is so friendly...!

jkugelman · October 5, 2021, 12:55am

It got merged! Hooray!

@josh and @Mark_Simulacrum were so incredibly helpful. And thorough, wow! Seeing how careful they are makes me want to be a better developer and reviewer at my day job. I love how seriously they took such a small PR. It's really something.

Thank you both, and thank you to the whole Rust community for being so wonderful. Rust has really fanned the flames of my love for programming since I found it two years ago.

stonerfish · October 5, 2021, 1:16am

Thank you sir for your work. The way I see it is that is a very needed patch. Many may never even know how much memory you saved them.

josh · October 5, 2021, 1:21am

Thank you for your work! You were a pleasure to work with.

jkugelman · October 5, 2021, 1:52am

I sure did. I fixed everything related I could find.

Take no longer needs to override Read::read_to_end.
The reservation_size callback that allowed Take to inhibit the previous over-allocation behavior isn't needed.
fs::read doesn't need to reserve an extra byte in initial_buffer_size.
There was a unit test that specifically checked that Read::read_to_end does over-allocate. I removed that test.

That's a great question. I don't know what the official policy is--if there is one. My assumption is that everyone should fix everything they can to the best of their ability. I'd feel bad knowingly submitting a partial PR. I don't want make work for other people. And I don't want to rely on others noticing that it's unfinished. I know when I review PRs it's easy to critique the changes in front of me but terribly hard to spot the absence of changes.

stonerfish · October 5, 2021, 2:22am

Thank you. I deleted my question because I started reading your patch and see you dug around and fixed other stuff too. Super awesome.
I dug around a bit myself. (Forgive me, I am a construction worker and just program for fun and on my own. I see everything as a nail) I found this part in std::ffi::c_str

        // Specialization for avoiding reallocation.
        impl SpecIntoVec for &'_ [u8] {
            fn into_vec(self) -> Vec<u8> {
                let mut v = Vec::with_capacity(self.len() + 1);
                v.extend(self);
                v
            }
        }
        impl SpecIntoVec for &'_ str {
            fn into_vec(self) -> Vec<u8> {
                let mut v = Vec::with_capacity(self.len() + 1);
                v.extend(self.as_bytes());
                v
            }
        }

are those 'len() + 1' parts also not needed?

Hyeonu · October 5, 2021, 2:25am

I guess that extra slot is required to insert a nil char since it's the C string.

stonerfish · October 5, 2021, 2:59am

Oh, I think I find it now. The specialization is increasing the vec, so when _new() calls from_vec_unchecked() the reserve_exact(1) will do nothing because the space is already reserved.

system · January 3, 2022, 3:00am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
BufReader bug with Vec::with_capacity! help	17	369	October 10, 2024
Reserve truly exact capacity for Vec	20	2036	July 17, 2024
Reading from stdin: performance	22	11377	January 12, 2023
Why tokio read_to_end only fill the buffer capacity help	4	913	May 16, 2022
Handling OutOfMemory in Write for Vec<u8>	10	287	October 7, 2025

Vec::with_capacity + read_to_end = overallocation

Related topics