Is Vec::with_capacity like Vec::new with Vec::reserve or Vec::new with Vec::reserve_exact?

I wonder if

let vec = Vec::<T>::with_capacity(n);

is the same as

let vec = Vec::<T>::new();
vec.reserve(n);

or

let vec = Vec::<T>::new();
vec.reserve_exact(n);

Also, when I write vec![x; n], is the behavior more like reserve or reserve_exact?

I tried to look into RawVec, but I find the code a bit confusing (yet).

The documentation on with_capacity, reserve, and reserve_exact allow each of those three functions/methods to allocate some extra space.

I would assume that Vec::with_capacity behaves more like Vec::new with a subsequent Vec::reserve_exact, but if I understand it right, the documentation isn't clear on that.

I haven't dug into the implementation, but the documentation for with_capacity clearly states it can intentionally allocate more than what is asked for.

This method is allowed to allocate for more elements than capacity.

Comparing that to the documentation for reserve_exact

this will not deliberately over-allocate to speculatively avoid frequent allocations.

I think it's pretty clear that with_capacity is intended to behave like reserve. So regardless of how it actually behaves today you should really treat it like reserve.

2 Likes

I disagree because reserve_exact also says:

After calling reserve_exact, capacity will be greater than or equal to self.len() + additional. […] Note that the allocator may give the collection more space than it requests. Therefore, capacity can not be relied upon to be precisely minimal.

So it's not clear at all that Vec::with_capacity really is intended to behave like reserve.

The difference between reserve and reserve_exact seems more to be that the first deliberately reserves more while the latter may still reserve more than requested non-deliberately.

I think it would make more sense if Vec::with_capacity behaves like Vec::reserve_exact here. Otherwise there would be a lot of unnecessary allocation in cases where the maximum Vec size is known.

But this isn't documented (I think).

1 Like

See here. In short, the intention of with_capacity and reserve_exact are the same, but a given allocator (under the accepted RFC) may only allocate by given size multiples (say), so "exact" is a misnomer; the reality is "smallest possible which satisfies the desired capacity / growth", aka sufficient. As opposed to a power-of-two capacity or the like.

(Think an arena allocator that hands out whole pages only or such.)

9 Likes

Interesting, I think the documentation on the method could be a little bit clearer that with_capacity behaves like reserve_exact then, especially since reserve_exact is so much more detailed about what guarantees it provides.

Ah those changes aren't merged yet so maybe the method docs will be up to date by the time that description is merged

So the current documentation is incorrect/inconsistent then, because it (currently) says in Vec#guarantees:

vec![x; n], vec![a, b, c, d], and Vec::with_capacity(n), will all produce a Vec with exactly the requested capacity.

And in Vec::with_capacity:

Constructs a new, empty Vec<T> with at least the specified capacity.

I'm happy there is a PR attempting to fix this. :+1:

3 Likes

It’s easy to rule one of them out, isn’t it?

fn with_capacity_1<T>(n: usize) -> Vec<T> {
    let mut v = Vec::new();
    v.reserve(n);
    v
}

fn with_capacity_2<T>(n: usize) -> Vec<T> {
    let mut v = Vec::new();
    v.reserve_exact(n);
    v
}

fn with_capacity_real<T>(n: usize) -> Vec<T> {
    Vec::with_capacity(n)
}

fn main() {
    dbg!(with_capacity_1::<u8>(1).capacity()); //    -> 8
    dbg!(with_capacity_2::<u8>(1).capacity()); //    -> 1
    dbg!(with_capacity_real::<u8>(1).capacity()); // -> 1
}

In case you’re wondering why 8:

    // Tiny Vecs are dumb. Skip to:
    // - 8 if the element size is 1, because any heap allocators is likely
    //   to round up a request of less than 8 bytes to at least 8 bytes.
    // - 4 if elements are moderate-sized (<= 1 KiB).
    // - 1 otherwise, to avoid wasting too much space for very short Vecs.
    pub(crate) const MIN_NON_ZERO_CAP: usize = if mem::size_of::<T>() == 1 {
        8
    } else if mem::size_of::<T>() <= 1024 {
        4
    } else {
        1
    };

raw_vec.rs - source

4 Likes

Hmmm… right, I should have tried myself :sweat_smile:.

However, note that all three variants might (theoretically) still behave differently in other cases. Even with PR 99790, there wouldn't be a guarantee that Vec::with_capacity follows the exact same strategy as Vec::reserve_exact.

2 Likes

I would be very surprised if Vec::with_capacity didn't behave as reserve_exact. The name suggests that Vec::with_capacity(n).capacity == n, which is the behaviour of reserve_exact. I'm sure there are tests for that condition somewhere. Even if not in Rust itself, some crates will certainly rely on that assumption and test for it.

If that’s the case, then the name is apparently problematic, since … well … it’s easy to falsify this equation

fn equation<T>(n: usize) {
    assert!(Vec::<T>::with_capacity(n).capacity() == n);
}

#[test]
fn test() {
    equation::<()>(0);
}
running 1 test
test test ... FAILED

failures:

---- test stdout ----
thread 'test' panicked at 'assertion failed: Vec::<T>::with_capacity(n).capacity() == n', src/lib.rs:2:5

Yes, zero-sized types are an edge case, but the claim is true for non-zero sized types, which is the actually important part (how much memory did I allocate and how many elements can I push without reallocation?).

The documentation could be improved, though. I would also prefer if zero-sized types followed the general algorithm and didn't return an usize::MAX as capacity, but whatever. General specialization could likely result in weirder outcomes.

I'm going to ignore the current implementation details for this answer, and try to discuss it more conceptually.

The really important distinction between reserve and reserve_exact is that the former in a loop is linear cost but the latter can be quadratic cost.

But reserve does that by knowing what the previous capacity was. And of course with_capacity doesn't know what the previous capacity was. And it's not an instance method, so of course it's not being called on an instance in a loop.

So thus with_capacity is more like reserve_exact than reserve, because it doesn't have the information (nor the type signature) it would need to be able to offer amortizing.

3 Likes