Is Vec::with_capacity like Vec::new with Vec::reserve or Vec::new with Vec::reserve_exact?

jbe · August 24, 2022, 7:16am

I wonder if

let vec = Vec::<T>::with_capacity(n);

is the same as

let vec = Vec::<T>::new();
vec.reserve(n);

or

let vec = Vec::<T>::new();
vec.reserve_exact(n);

Also, when I write vec![x; n], is the behavior more like reserve or reserve_exact?

I tried to look into RawVec, but I find the code a bit confusing (yet).

The documentation on with_capacity, reserve, and reserve_exact allow each of those three functions/methods to allocate some extra space.

I would assume that Vec::with_capacity behaves more like Vec::new with a subsequent Vec::reserve_exact, but if I understand it right, the documentation isn't clear on that.

semicoleon · August 24, 2022, 7:37am

I haven't dug into the implementation, but the documentation for with_capacity clearly states it can intentionally allocate more than what is asked for.

This method is allowed to allocate for more elements than capacity.

Comparing that to the documentation for reserve_exact

this will not deliberately over-allocate to speculatively avoid frequent allocations.

I think it's pretty clear that with_capacity is intended to behave like reserve. So regardless of how it actually behaves today you should really treat it like reserve.

jbe · August 24, 2022, 7:54am

I disagree because reserve_exact also says:

After calling reserve_exact, capacity will be greater than or equal to self.len() + additional. […] Note that the allocator may give the collection more space than it requests. Therefore, capacity can not be relied upon to be precisely minimal.

So it's not clear at all that Vec::with_capacity really is intended to behave like reserve.

The difference between reserve and reserve_exact seems more to be that the first deliberately reserves more while the latter may still reserve more than requested non-deliberately.

I think it would make more sense if Vec::with_capacity behaves like Vec::reserve_exact here. Otherwise there would be a lot of unnecessary allocation in cases where the maximum Vec size is known.

But this isn't documented (I think).

quinedot · August 24, 2022, 8:09am

See here. In short, the intention of with_capacity and reserve_exact are the same, but a given allocator (under the accepted RFC) may only allocate by given size multiples (say), so "exact" is a misnomer; the reality is "smallest possible which satisfies the desired capacity / growth", aka sufficient. As opposed to a power-of-two capacity or the like.

(Think an arena allocator that hands out whole pages only or such.)

semicoleon · August 24, 2022, 8:14am

Interesting, I think the documentation on the method could be a little bit clearer that with_capacity behaves like reserve_exact then, especially since reserve_exact is so much more detailed about what guarantees it provides.

Ah those changes aren't merged yet so maybe the method docs will be up to date by the time that description is merged

jbe · August 24, 2022, 8:18am

So the current documentation is incorrect/inconsistent then, because it (currently) says in Vec#guarantees:

vec![x; n], vec![a, b, c, d], and Vec::with_capacity(n), will all produce a Vec with exactly the requested capacity.

And in Vec::with_capacity:

Constructs a new, empty Vec<T> with at least the specified capacity.

I'm happy there is a PR attempting to fix this.

steffahn · August 24, 2022, 8:54am

It’s easy to rule one of them out, isn’t it?

fn with_capacity_1<T>(n: usize) -> Vec<T> {
    let mut v = Vec::new();
    v.reserve(n);
    v
}

fn with_capacity_2<T>(n: usize) -> Vec<T> {
    let mut v = Vec::new();
    v.reserve_exact(n);
    v
}

fn with_capacity_real<T>(n: usize) -> Vec<T> {
    Vec::with_capacity(n)
}

fn main() {
    dbg!(with_capacity_1::<u8>(1).capacity()); //    -> 8
    dbg!(with_capacity_2::<u8>(1).capacity()); //    -> 1
    dbg!(with_capacity_real::<u8>(1).capacity()); // -> 1
}

In case you’re wondering why 8:

    // Tiny Vecs are dumb. Skip to:
    // - 8 if the element size is 1, because any heap allocators is likely
    //   to round up a request of less than 8 bytes to at least 8 bytes.
    // - 4 if elements are moderate-sized (<= 1 KiB).
    // - 1 otherwise, to avoid wasting too much space for very short Vecs.
    pub(crate) const MIN_NON_ZERO_CAP: usize = if mem::size_of::<T>() == 1 {
        8
    } else if mem::size_of::<T>() <= 1024 {
        4
    } else {
        1
    };

raw_vec.rs - source

jbe · August 24, 2022, 9:40am

Hmmm… right, I should have tried myself .

However, note that all three variants might (theoretically) still behave differently in other cases. Even with PR 99790, there wouldn't be a guarantee that Vec::with_capacity follows the exact same strategy as Vec::reserve_exact.

afetisov · August 24, 2022, 11:28am

I would be very surprised if Vec::with_capacity didn't behave as reserve_exact. The name suggests that Vec::with_capacity(n).capacity == n, which is the behaviour of reserve_exact. I'm sure there are tests for that condition somewhere. Even if not in Rust itself, some crates will certainly rely on that assumption and test for it.

steffahn · August 24, 2022, 3:42pm

If that’s the case, then the name is apparently problematic, since … well … it’s easy to falsify this equation

fn equation<T>(n: usize) {
    assert!(Vec::<T>::with_capacity(n).capacity() == n);
}

#[test]
fn test() {
    equation::<()>(0);
}

running 1 test
test test ... FAILED

failures:

---- test stdout ----
thread 'test' panicked at 'assertion failed: Vec::<T>::with_capacity(n).capacity() == n', src/lib.rs:2:5

afetisov · August 24, 2022, 4:43pm

Yes, zero-sized types are an edge case, but the claim is true for non-zero sized types, which is the actually important part (how much memory did I allocate and how many elements can I push without reallocation?).

The documentation could be improved, though. I would also prefer if zero-sized types followed the general algorithm and didn't return an usize::MAX as capacity, but whatever. General specialization could likely result in weirder outcomes.

scottmcm · August 24, 2022, 5:40pm

I'm going to ignore the current implementation details for this answer, and try to discuss it more conceptually.

The really important distinction between reserve and reserve_exact is that the former in a loop is linear cost but the latter can be quadratic cost.

But reserve does that by knowing what the previous capacity was. And of course with_capacity doesn't know what the previous capacity was. And it's not an instance method, so of course it's not being called on an instance in a loop.

So thus with_capacity is more like reserve_exact than reserve, because it doesn't have the information (nor the type signature) it would need to be able to offer amortizing.

system · November 22, 2022, 5:41pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
Reserve truly exact capacity for Vec	18	446	April 12, 2024
`Vec::with_capacity` does not painc as the documentation says help	10	1547	May 19, 2022
Needlessly weak promises on vector capacity? help	8	807	January 12, 2023
HashSet capacity is more than it was requested help	5	749	December 30, 2019
Guarantee that Vec::default() does not allocate? help	30	1110	November 20, 2022

Is Vec::with_capacity like Vec::new with Vec::reserve or Vec::new with Vec::reserve_exact?

Related Topics