(This example shows how to do it with with_capacity, but it could also be done with reserve; Miri accepts it as well.)
My question is: if it's possible, why isn't it provided by the standard library? Why is reserve_exact not really exact?
This inexactness can be really annoying. For example, if I want to implement a fixed-capacity ring buffer, I would like to use VecDeque, but I can't without unsafe code or an additional capacity field, because with_capacity and reserve_exact are not exact.
EDIT: updated the code to prevent a soundness issue; see comment.
Can you elaborate on what code you're trying to write and how that depends on the exact capacity field?
By "not really exact", are you referring to what the .capacity() method might return after Vec::with_capacity, or to the amount of physical memory actually allocated by the allocator?
reserve_exact is exact to the same extent that your code is, i.e. up to the exactness of your memory allocator. Calling malloc(n) (or any equivalent, like Rust's alloc or C++'s new) doesn't mean that exactly n bytes of memory will be allocated. Allocators reserve memory in specifically sized chunks, depending on their implementation, and those chunks are almost always larger than the memory you requested, unless you specifically tailor your allocation sizes to the allocator. For example, almost no allocator will ever give you exactly 1 byte of memory. It will give you something more reasonably sized, like 8 or 16 bytes, but if you asked it for 1 byte you wouldn't know, even though the extra memory would be reserved.
I gave an example of the code I would like to write at the end: a ring buffer with fixed capacity using VecDeque, which is currently impossible in safe code without an additional capacity field, because the VecDeque capacity cannot be trusted.
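For reference, the workaround mentioned here (a separate capacity field) can be sketched in safe code roughly as follows. `FixedRing` and its methods are hypothetical names made up for illustration, not an existing API; the sketch relies only on the documented guarantee that `VecDeque::with_capacity` provides at least the requested capacity.

```rust
use std::collections::VecDeque;

/// Hypothetical fixed-capacity ring buffer. The limit lives in its own field
/// instead of being read back from `VecDeque::capacity()`.
struct FixedRing<T> {
    buf: VecDeque<T>,
    cap: usize,
}

impl<T> FixedRing<T> {
    fn new(cap: usize) -> Self {
        assert!(cap > 0, "capacity must be non-zero");
        Self { buf: VecDeque::with_capacity(cap), cap }
    }

    /// Pushes `value`; if the buffer was full, the oldest element is
    /// evicted and returned.
    fn push(&mut self, value: T) -> Option<T> {
        let evicted = if self.buf.len() == self.cap {
            self.buf.pop_front()
        } else {
            None
        };
        self.buf.push_back(value);
        evicted
    }

    fn len(&self) -> usize {
        self.buf.len()
    }
}

fn main() {
    let mut ring = FixedRing::new(3);
    for i in 0..5 {
        if let Some(old) = ring.push(i) {
            println!("evicted {old}");
        }
    }
    println!("len = {}", ring.len());
}
```

The cost is one extra `usize` per buffer, which is exactly the redundancy being complained about.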
By "not really exact", I'm indeed talking about the capacity() of the returned vector.
When I only call reserve_exact it seems to work, meaning the returned capacity is what I specified (irrespective of how much the underlying allocator returned):
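A minimal sketch of such a check (not the snippet from the original post; `capacity_after_reserve_exact` is a hypothetical helper). It only asserts the documented lower bound, since the docs do not promise equality:

```rust
/// Hypothetical helper: reserve on an empty Vec and report the capacity.
fn capacity_after_reserve_exact(additional: usize) -> usize {
    let mut v: Vec<u8> = Vec::new();
    v.reserve_exact(additional);
    v.capacity()
}

fn main() {
    let cap = capacity_after_reserve_exact(10);
    // Documented guarantee: at least the requested capacity, not necessarily equal.
    assert!(cap >= 10);
    println!("requested 10, capacity() reports {cap}");
}
```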
My issue is not about the allocator internals; it is about the value of VecDeque::capacity not being reliable according to the documentation, even though there is a way to get a reliable value.
I mean, I really think Vec/VecDeque should provide a function that sets the capacity field exactly (except for ZSTs, maybe), so that one can rely on it, to build a ring buffer for example.
Indeed, it seems to work, and according to the stdlib code it should always work, but the documentation states it cannot be trusted, so I cannot (and don't want to) build my ring buffer on something documented as untrustworthy.
Looking through the relevant allocator API, it seems the approach there is:
1. You request a desired size.
2. The allocator might give back more (and reports this via the length of the returned slice).
3. You choose a length anywhere in the range from the requested size to the reported actual size and work with that length (i.e. feel free to discard any other length information besides this one).
4. When deallocating, pass layout information with this length, and you're good.
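The steps above can be sketched on stable Rust with a toy wrapper around `std::alloc`. Everything here is an assumption for illustration: `toy_allocate`, `toy_deallocate`, `class_layout` and the 16-byte size class are made up, and this is not how the unstable `Allocator` trait or `Vec` is implemented. The point is only that any length between the requested and the reported size identifies the same block.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr::NonNull;

/// Hypothetical size-class granularity of the toy allocator.
const CLASS: usize = 16;

/// Rounds `size` up to the toy size class and builds the real layout.
fn class_layout(size: usize) -> Layout {
    assert!(size > 0, "zero-sized requests are not handled in this sketch");
    let rounded = (size + CLASS - 1) / CLASS * CLASS;
    Layout::from_size_align(rounded, 1).expect("size too large")
}

/// Steps 1 and 2: request `requested` bytes, get back the pointer and the
/// (possibly larger) size that was actually reserved.
fn toy_allocate(requested: usize) -> (NonNull<u8>, usize) {
    let layout = class_layout(requested);
    // SAFETY: `layout` has a non-zero size.
    let ptr = unsafe { alloc(layout) };
    match NonNull::new(ptr) {
        Some(p) => (p, layout.size()),
        None => handle_alloc_error(layout),
    }
}

/// Step 4: deallocate using whichever length the caller kept.
///
/// # Safety
/// `ptr` must come from `toy_allocate(requested)` and `remembered` must lie
/// between `requested` and the size reported by that call.
unsafe fn toy_deallocate(ptr: NonNull<u8>, remembered: usize) {
    // Every length in the valid range rounds to the same size class, so the
    // layout handed to `dealloc` equals the one used in `alloc`.
    unsafe { dealloc(ptr.as_ptr(), class_layout(remembered)) };
}

fn main() {
    let requested = 10;
    let (ptr, reported) = toy_allocate(requested);
    // Step 3: keep any length in requested..=reported; here, the requested one.
    let chosen = requested;
    assert!(requested <= chosen && chosen <= reported);
    println!("requested {requested}, reported {reported}, kept {chosen}");
    // SAFETY: `ptr` came from `toy_allocate` and `chosen` is in the valid range.
    unsafe { toy_deallocate(ptr, chosen) };
}
```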
The methods on Vec seem to be documented with the possibility in mind that, in step 3 above, a larger length (in the form of a larger capacity) could be stored in the Vec, in order to let the user benefit from the "free extra storage". The implementation, however, does not currently do this, except for the zero-sized case, where the capacity always is and stays maximal. (And allocators are not involved there anyway.)
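The zero-sized case is easy to observe. A small sketch, relying on the documented behavior of `Vec::with_capacity` for zero-sized element types:

```rust
fn main() {
    // For a zero-sized T, the reported capacity is usize::MAX regardless of
    // what was requested.
    let mut units: Vec<()> = Vec::with_capacity(10);
    assert_eq!(units.capacity(), usize::MAX);

    // Pushing or reserving does not change it; no allocation is involved.
    units.push(());
    units.reserve_exact(1_000);
    assert_eq!(units.capacity(), usize::MAX);
    println!("len = {}, capacity = {}", units.len(), units.capacity());
}
```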
The use case of reserve_exact where you just "don't want to waste memory", because you will never (or are unlikely to) need more space than the given capacity, doesn't mind the extra free capacity you get, because it doesn't cost anything (the motivation was just "don't waste memory").
Of course reserve has additional considerations of wanting the exponential growth strategy for low amortized overhead.
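To make the difference in growth strategy visible, here is a rough sketch that counts how often the reported capacity changes while pushing elements one at a time. `count_capacity_changes` is a hypothetical helper, and the exact counts depend on the current standard library implementation, which is not a documented guarantee:

```rust
/// Pushes `n` elements, reserving room for one more whenever the Vec is full,
/// and counts how often `capacity()` changes along the way.
fn count_capacity_changes(n: usize, exact: bool) -> usize {
    let mut v: Vec<u32> = Vec::new();
    let mut changes = 0;
    let mut last = v.capacity();
    for i in 0..n {
        if v.len() == v.capacity() {
            if exact {
                v.reserve_exact(1);
            } else {
                v.reserve(1);
            }
        }
        v.push(i as u32);
        if v.capacity() != last {
            changes += 1;
            last = v.capacity();
        }
    }
    changes
}

fn main() {
    let n = 1_000;
    println!("reserve:       {} capacity changes", count_capacity_changes(n, false));
    println!("reserve_exact: {} capacity changes", count_capacity_changes(n, true));
}
```

With an amortized strategy the number of changes grows far more slowly than the number of pushes, which is the low overhead that reserve is after.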
In principle it seems entirely viable to offer two different methods for step 3.
one being to choose to keep the original requested capacity, guaranteed, so you can use it for your program logic without having to save it redundantly a second time. (Remember that it's totally fine to keep the original capacity as the information used on the Allocator::deallocate call; having to store two subtly different capacity values is indeed entirely redundant.)
and the other being to pass the additional capacity reported by the allocator through to the user (I'm not even sure why the current implementation doesn't do that; does anyone have any ideas/sources on that?[1])
On the other hand, perhaps the answer here is just "the capacity field is not intended to be used for program logic" and redundantly storing the requested capacity in a separate field is considered sufficiently low-overhead as to not matter.
I think the issue I linked above mentions doing this:
Jemalloc itself exposes a malloc_usable_size function which can be used to determine how much capacity was actually allocated for a pointer, as well as nallocx which can be used to determine how much capacity will be allocated for a given size and alignment. Vec can in principle query one of these methods and update its capacity field with the result.
The question at hand is: is this ever worth it to check usable_capacity, and is it ever undesirable?
When I ran some tests, it seemed to always end up using the optimized case, so no actual problem -- but the Vec::with_capacity() documentation suggests that one shouldn't make that assumption:
Constructs a new, empty Vec<T> with at least the specified capacity.
(emphasis mine)
(Bigger picture: We should stop mixing Vec<u8> and Bytes, but this is where my personal "is there a call that guarantees cap == len?" story comes from).
The size of T times the capacity (ie. the allocated size in bytes) needs to be the same size as the pointer was allocated with. (Because similar to alignment, dealloc must be called with the same layout size.)
[...]
capacity needs to be the capacity that the pointer was allocated with.
[...]
These requirements are always upheld by any ptr that has been allocated via Vec<T>. Other allocation sources are allowed if the invariants are upheld.
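As a concrete instance of the last sentence, a minimal sketch of taking a Vec apart and rebuilding it from its own parts (`round_trip` is a hypothetical helper name):

```rust
use std::mem::ManuallyDrop;

/// Decomposes a Vec into (ptr, len, capacity) and reassembles it unchanged.
fn round_trip(v: Vec<i32>) -> Vec<i32> {
    // Prevent the original Vec from freeing the buffer.
    let mut v = ManuallyDrop::new(v);
    let (ptr, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());
    // SAFETY: all three parts come from a live Vec<i32> that is never used
    // or dropped again, so the documented requirements hold.
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}

fn main() {
    let rebuilt = round_trip(vec![1, 2, 3]);
    println!("{rebuilt:?}");
}
```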
Maybe I'm missing something, but if reserve_exact truly tries to be minimal, doesn't that mean it allocates with requested_capacity * size_of::<T>() and can't make use of any additionally returned capacity without contradicting the documentation?
Unless it first reallocates or makes use of specialized knowledge of the current allocator (to pre-emptively request the minimum the allocator will give back), I guess.
I'm not sure I see the problem. Really this terminology
isn't really defined now, is it? We could declare that it simply refers to the size of the allocation that you-the-user (or the-standard-library-the-user) arbitrarily choose when allocating, in step 3 of what I listed above,
so there's a whole range of valid choices of “the size the pointer was allocated with” (from the requested size to the (possibly larger) reported size). The definite article “the” doesn’t sound like there’s more than one choice, but if we really want a single consistent choice, we could still just ask the user to do their “step 3” early on, choose one length in the range, and then stick to it. That could be enough, no?
Edit: Except, the stable allocation APIs in std::alloc::alloc or GlobalAlloc::alloc don't return any (possibly larger) “actual size”, so maybe it's safer to argue that none of this flexible interpretation applies?
Edit2: No, GlobalAlloc doesn't explicitly guarantee the exactness of the size either, saying in alloc:
Allocate memory as described by the given layout
and the docs of Layout speak of a "minimum size". Perhaps the strictest vibes are in dealloc
layout must be the same layout that was used to allocate that block of memory
where it's probably safe to interpret "used to allocate" to mean "passed to alloc/realloc".
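Under that strict reading, the stable API is used like this: the very same `Layout` value goes to both `alloc` and `dealloc`. A minimal sketch, with `sum_via_raw_alloc` as a hypothetical helper:

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Copies `values` into a raw allocation, sums them, and frees the block
/// with the same layout that was passed to `alloc`.
fn sum_via_raw_alloc(values: &[u32]) -> u32 {
    assert!(!values.is_empty(), "zero-sized allocations are not allowed here");
    let layout = Layout::array::<u32>(values.len()).expect("layout overflow");
    // SAFETY: `layout` has a non-zero size; all accesses stay within the
    // `values.len()` elements of the allocation; `dealloc` receives the same
    // pointer and the same layout that `alloc` was called with.
    unsafe {
        let ptr = alloc(layout) as *mut u32;
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        for (i, &v) in values.iter().enumerate() {
            ptr.add(i).write(v);
        }
        let mut sum = 0u32;
        for i in 0..values.len() {
            sum += ptr.add(i).read();
        }
        dealloc(ptr as *mut u8, layout);
        sum
    }
}

fn main() {
    println!("sum = {}", sum_via_raw_alloc(&[1, 2, 3, 4]));
}
```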
Nonetheless, all of this might be irrelevant to Vec if it doesn't promise anywhere what underlying alloc calls are made to begin with. It should be pretty much equivalent to call a generalized alloc function along the lines of Allocator::allocate that returns a size range you can choose from, or to add a system that converts a reserve_exact size into such a range before calling alloc (think GlobalAlloc::alloc this time) with precise info. The Allocator::allocate approach would mean that, at least for Allocator::allocate, the layout passed to the function call is no longer necessarily automatically considered the “layout that was used to allocate that block of memory”.
I don't see how reserve_exact or from_raw_parts documentation preclude such behavior from happening in reserve_exact.
Yes, same Layout is how I interpret the requirement in the current API.
That's basically the "specialized information about the allocator, I guess" scenario I referred to.
That being said, frankly, I really don't want this level of language-lawyering in my unsafe documentation.[1] It should be clear and explanatory. "Well technically we didn't define 'allocated with' hee hee so it's ok Vec doesn't do what you thought we said it does anymore" sounds like a recipe for programmers viewing the compiler maintainers as actively hostile, IMO. I'd rather Rust get (much) better at supporting those who write unsafe.
Sorry if that comes across as too blunt, I'm not trying to attack your interpretation. What I'm trying to say is: the documentation should be updated to be clear and explanatory so that linguistic gymnastics are not required to explain the intended guarantees, non-guarantees, and safety requirements.
Or phrased differently: if readers weren't supposed to be able to draw conclusions (guarantees of Vec and requirements for safety) from the listed "invariants" in the "safety" section, because something was intentionally left undefined, then they aren't really guarantees of Vec and they aren't actually requirements that can be met.
I'm sure that's not what the lib team meant. I believe they meant to show you how you can use the function safely in cases beyond just reusing the parts of a Vec. So perhaps here's how it should be tuned, keeping in mind the presumably desired wiggle room for Vec's behavior:
If you use the "parts" from a Vec<T>, you can also safely use from_raw_parts
And here's some examples of doing so, including a length/capacity altering one
e.g. by changing T from [i32; 2] to i32
If you meet this list of requirements, you can also safely use from_raw_parts
And here's an example of doing so, where the parts aren't from a Vec
Clarification section that Vec may have more wiggle room than is immediately apparent from the list of requirements, inspired by this thread
Example of what the OP wanted, since if Vec has the wiggle room, they're probably correct that there's no safe way to do it[2]
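The length/capacity-altering example suggested in the list could look roughly like this. It is a sketch of what such a doc example might be, not text from the actual documentation; `flatten_pairs` is a hypothetical name. It relies on `[i32; 2]` and `i32` having the same alignment and on the allocation's size in bytes staying unchanged:

```rust
use std::mem::ManuallyDrop;

/// Reinterprets a Vec<[i32; 2]> as a Vec<i32> without copying.
fn flatten_pairs(pairs: Vec<[i32; 2]>) -> Vec<i32> {
    // Prevent the original Vec from freeing the buffer.
    let mut pairs = ManuallyDrop::new(pairs);
    let ptr = pairs.as_mut_ptr() as *mut i32;
    // Each [i32; 2] holds two i32 values, so both counts double while the
    // byte size (capacity * size_of::<T>()) stays the same.
    let len = pairs.len() * 2;
    let cap = pairs.capacity() * 2;
    // SAFETY: same allocation, same alignment, same size in bytes, and the
    // first `len` i32 values are initialized.
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}

fn main() {
    let flat = flatten_pairs(vec![[1, 2], [3, 4]]);
    println!("{flat:?}");
}
```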
Oh, I'm fully in agreement here. I meant remarks such as "this terminology isn't really defined" more as a critique of the status quo of the documentation there, too; really, it should be made clear what is meant.
Ultimately, I suppose, the current documentation is a result of its history, which is probably: patchwork from multiple authors, from different times, with different focus in mind.
Indeed. And I believe/hope that this argument alone should be sufficient to allow reserve_exact sufficient freedom to choose to create more capacity than requested despite promising to "not deliberately over-allocate", if that change is ever wanted.
And this wording is also in the docs of reserve_exact:
Note that the allocator may give the collection more space than it requests. Therefore, capacity can not be relied upon to be precisely minimal.
It seems parts of the docs like this are already written relatively clearly with the Allocator API (that's able to actually return more memory than requested) in mind.
Now I'm looking at from_raw_parts_in, which actually does this; though that's only the Allocator-using, generalized (and unstable) API that calls out the concept of "fit".
Maybe this could be intentional after all, though? Currently there seems to be no way for GlobalAlloc to report back a larger size. Technically that could be made possible by adding a new (default-implemented) trait method that can be overridden. I wonder if there are discussions on that, too.
Is that the kind of change you're wary might go against (at least implied) documented details? That Vec, when using Global, would always produce exact capacities (unless items are zero-sized), only allowing over-allocation (by choice of the allocator itself) when a custom Allocator is used? I can empathize with this conclusion when looking at the capabilities of GlobalAlloc allocators: because GlobalAlloc doesn't offer any alternatives anyway, if the goal is to "not deliberately over-allocate", then Vec must pass an exact Layout argument, and the capacity can never be larger than requested.
I'm not sure what change you're referring to.[1] But when I wrote this comment, it was based on looking at what is currently available in combination with the from_raw_parts docs, pretty much as this paragraph does. That's where my "painted themselves into a corner?" speculation came from. (The docs link to the alloc and dealloc functions specifically, and call out their "same layout" requirement, which, given the *mut-returning API, one can only conclude means "same Layout value".)
Then I tacked on that "Unless it [...] makes use of specialized knowledge" due to some protest from a brainstorming corner of my mind, in combination with having looked at the Allocator trait in combination with this note:
This function [alloc::alloc::alloc] is expected to be deprecated in favor of the alloc method of the Global type when it and the Allocator trait become stable.
Also I'm not sure what the plans for GlobalAlloc are (I haven't looked). Conceivably the global allocator will have the ability to pass back extra data eventually (with the knowledge fresh in my mind at the moment), either by a new trait method, or eventual deprecation of GlobalAlloc, and/or some blanket implementation between GlobalAlloc and Allocator, etc.
Which brings me to where I'm at now: thinking there's enough wiggle room for Vec to still take advantage of extra allocations in reserve_exact, but strongly feeling some explicit clarification is needed in, at least, the from_raw_parts safety documentation.
(It's not good that there's wiggle room because the docs are so confusing.[2])
Was the change you were referring to: linking to the Allocator "fitting" definition, instead of saying "same layout"? I think that'd be fine... if that's what the lib team means! In which case the docs should explain that while the only "fit" that is possible with alloc/dealloc is "equality", that may not always be the case (for the global allocator / in general).
I think the real stickler in the current docs is
These requirements are always upheld by any ptr that has been allocated via Vec<T>
To me that means you're guaranteeing how Vec allocates, so you have to build any future possibilities into the listed invariants. If locking Vec into specific behavior that's compatible with Allocator "fitting" is the goal, I believe that's still compatible with the guarantee, if the change to use "fitting" instead of "same layout" is made.
Maybe libs is fine with that. However, I suspect that guaranteeing exactly how Vec allocates wasn't intended.
If so, there should probably be a discussion over how feasible it is to roll back that guarantee. If it's considered feasible and desirable to roll it back, that should come with some explanatory blog post IMO.
If it gets rolled back, the from_raw_parts docs could split up the cases:
You met these (possibly stricter than what Vec will some day do) invariants? That will always work
You used what Vec (or String or...) did? That will always work too
But that doesn't mean they are held to exactly the same invariants as above
In which case, the "other allocation source" invariants could say "layout has to be equal" today, and be loosened to "layout has to fit" in the future, without being contradictory.
Some time ago I ran into an issue with my Arena crate where my implementation, interacting with certain allocator behavior, could cause exponential blowup (triple_arena/triple_arena/src/arena/base_arena.rs at 858b461ba1ca955f0d6cc8e4f25807bbd6598abe · AaronKutch/triple_arena · GitHub; note that I later switched to using a custom NonZeroInxVec type, but the virtual capacity fix still has to be used because of the restrictions described in the next paragraph). What I had to do was make my own virtual capacity (in this case simply the length of the Vec). Any special operations will first try to do things within the virtual capacity, and only when the virtual capacity is used up do we expand the virtual capacity into the real capacity. We let the real capacity expand however it wants when it does expand, and keep a predetermined virtual capacity increase.
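A loose sketch of the virtual-capacity idea, under the assumption that the Vec's length plays the role of the virtual capacity as described. This is not the crate's actual code; `Slots`, `STEP` and the method names are hypothetical:

```rust
/// Hypothetical container whose "virtual capacity" is the length of the
/// backing Vec; the real capacity is left to grow however Vec wants.
struct Slots<T> {
    slots: Vec<Option<T>>,
    used: usize,
}

impl<T> Slots<T> {
    /// Predetermined virtual capacity increase (an arbitrary choice here).
    const STEP: usize = 4;

    fn new() -> Self {
        Self { slots: Vec::new(), used: 0 }
    }

    fn virtual_capacity(&self) -> usize {
        self.slots.len()
    }

    fn real_capacity(&self) -> usize {
        self.slots.capacity()
    }

    /// Stores `value` in the next free slot and returns its index.
    fn insert(&mut self, value: T) -> usize {
        // Only when the virtual capacity is used up do we extend it, by a
        // fixed step that does not depend on the real capacity.
        if self.used == self.slots.len() {
            let new_len = self.slots.len() + Self::STEP;
            self.slots.resize_with(new_len, || None);
        }
        let idx = self.used;
        self.slots[idx] = Some(value);
        self.used += 1;
        idx
    }
}

fn main() {
    let mut s = Slots::new();
    for i in 0..5 {
        s.insert(i);
    }
    println!(
        "virtual capacity = {}, real capacity = {}",
        s.virtual_capacity(),
        s.real_capacity()
    );
}
```

Program logic reads only `virtual_capacity()`, so whatever the allocator or Vec decides about the real capacity cannot feed back into it.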
I think the discussion here is missing another piece. I don't think it is actually possible for Vec to implement a "truly exact" capacity, at least not without doing its own extra virtual capacity thing, which will definitely never happen because it will impact the performance of almost everything. The issue is that the deallocation call uses a Layout that has to correspond to the Layout in bytes that the allocator originally returned (allocators are allowed to assume the allocation size is stored in some user type and later relayed back to them). One of the usize numbers in the struct has to keep this layout-in-bytes number without ever changing it between reallocations. This is also what my NonZeroInxVec type ended up doing.