Controlled Vec expansion

Their requirement is to use the same algorithm for growing capacity that std Vec uses, even if that changes in the future. So it would fail if that changed in std Vec. That's why I suggested a test to detect a change in the algorithm.

That does not answer my question, as I said earlier:

If you want to control the resizing strategy use reserve_exact. If you want to use Vec's default resizing strategy use reserve.

So which case isn't covered?

"Only control the resizing strategy if the default strategy would exceed max_capacity this particular time."

Something like:

/// Reserves capacity for at least `additional` more elements to be inserted
/// in the given `Vec<T>`. The collection may reserve more space to
/// speculatively avoid frequent reallocations.  However, the collection
/// will never speculatively reserve more than this would:
/// ```
/// vec.reserve_exact(max_capacity.saturating_sub(vec.capacity()))
/// ```
pub fn reserve_up_to(&mut self, additional: usize, max_capacity: usize) {

Which you could write yourself if you had:

/// Returns the number of additional elements that the collection will
/// reserve space for if you call `reserve(additional)`.  That is, the
/// following two invocations would request the same amount of memory
/// from the allocator if reallocation is required:
/// ```
/// vec.reserve(additional);
/// ```
/// ```
/// vec.reserve_exact(vec.additional_capacity_of_reserve(additional));
/// ```
/// Note that the allocator may give the collection more space than it
/// requests. Therefore, the additional capacity that will be requested
/// cannot be relied upon to be exactly equal to the amount of new
/// capacity that will result from calling `reserve(additional)`.
pub fn additional_capacity_of_reserve(&self, additional: usize) -> usize {

...provided I have understood this thread completely.

5 Likes

Using the std Vec's algorithm (even if it changes in the future) for increasing capacity, but within a given capacity bound. I.e., what @quinedot said.

Are you sure realloc cannot extend a previous allocation?

[EDIT]: I'm actually puzzled. Vec does not use realloc. Old discussions claim it was. I've also read that realloc does not work well with modern allocators and that in the end it allocates an entirely new region.

Can anybody with deep knowledge of allocation strategies clarify this?

Yes, you dissected the problem in much more precise term than I did, thanks. Sharp as usual.

Presently, additional_capacity_of_reserve is essentially length * 2. But I would have liked not to rely on that. Of course it is a possibility to just rely on that, hope for the future, propose a new method etc.

It is also a correct concern, raised by @tczajka , that in the end you will use additional space because of reallocation (at some point, you'll have 3/2n words in memory in the best case).

1 Like

Ah, that's not the default resizing strategy anymore, that's something else!

But that outcome can be achieved by calling reserve followed by a shrink_to.

It does. alloc.rs - source

I really don't know how I missed but, yes, it does. That's why @quinedot 's solution (the second method) would work for me.

Maybe that's something we can try to push through the stabilization process. It is really non-invasive.

No, you cannot allocate and shrink—even for a little time, you'll use more memory than needed.

Not on any OS that has lazy pages / zero pages. You temporarily create a virtual reservation but no physical pages will be used.

Well, at least, it's logically impossible for this to always happen. There might be another allocation occupying the address space immediately after the one you want to change, which would make it impossible to extend the existing allocation.

(And if you preemptively add gaps in address space to make room, then you now have an allocator where every allocation occupies at least a page. Very inefficient for small allocations.)

2 Likes

Yes, but it is possible to move the vector in virtual memory by just changing the entries in the page table rather than copying the data in physical memory, so this won't require using more memory during the move. This is what the mremap system call does.

Typically there are two strategies in an allocator:

  • large allocations (say, >= 128 KB or so) will call mmap (and mremap if you do realloc) separately for each allocation
  • small allocations will use a different algorithm that will try to find and reuse space within a single larger mmap

So for small allocations you may have moving overhead during realloc, but not for large allocations.

So, basically, assuming realloc is a bit magical (when it can be), all we need is an
additional_capacity_of_reserve that exposes the intentions of the growth logic. At that point one can decide at each push whether to just delegate to the underlying implementation, or to perform a reserve_exact first.

But without that little bit of help from Vec, there's not much one can do (besides assuming doubling logic, of course).

Possible, yes. What I am saying is: it would be practically impossible for realloc to give an unconditional guarantee that the allocation is not moved. If you wanted to do this it would be only feasible for large allocations and would require a specific contract with the allocator.

With existing unstable API, you can probably do this already! It’s a two-step process: Wrap the allocator with a wrapper that refuses to accept allocation/growth requests larger than a specific limit value (size_of::<T> * actual_limit), handle actual growing by first trying .try_reserve, and then following up with .reserve_exact(actual_limit - current_capacity) if that failed.

4 Likes