Is it safe to std::ptr::copy a 0 capacity Vec and then use both the old and new object?

From the std::ptr::copy docs:

If T is not Copy, using both the values in the region beginning at *src and the region beginning at *dst can violate memory safety.

Vec is not Copy. However, if T is a Vec and that Vec is empty, then I expect to end up with two 0 capacity Vec objects that are both usable, because the (pointer, capacity, length) triplet should be all zero. Is this assumption safe? If the Vec had positive capacity, then using both objects afterwards would be a problem, since both would free the same memory when dropped. I am only wondering whether I can count on this in the empty case. Also, can I count on it for any of the other std containers?

Edit: originally said length, meant capacity

You would definitely need the Vec to have zero capacity, not just zero length.

Oops, yes you are correct. I will edit my post.

My intuition says that it's safe to do this for any container whose default constructor is a const fn, like Vec::new and LinkedList::new, because calling ptr::copy on the result of such a function should be no different from calling the function twice.

(Note: BTreeMap::new and BTreeSet::new are likely to be const fn in the future.)
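To illustrate the const fn intuition in safe code only: every use of a const item produces a fresh instance of the value, so the two vectors below are independent even though they come from the same constant. This is a minimal sketch; the names EMPTY_VEC, EMPTY_LIST and two_from_const are made up for the example, and it does not show that ptr::copy on an empty container is guaranteed to be sound.

```rust
use std::collections::LinkedList;

// Both constructors are const fn, so they can initialize const items.
// Each use of a const item yields a fresh, independent value.
const EMPTY_VEC: Vec<u32> = Vec::new();
const EMPTY_LIST: LinkedList<u32> = LinkedList::new();

/// Builds two vectors from the same constant and mutates only the first.
fn two_from_const() -> (Vec<u32>, Vec<u32>) {
    let mut a = EMPTY_VEC;
    let b = EMPTY_VEC;
    a.push(1);
    (a, b)
}

fn main() {
    let (a, b) = two_from_const();
    // `a` has allocated and holds one element; `b` is still empty.
    println!("a = {:?}, b = {:?}", a, b);

    let list = EMPTY_LIST;
    println!("list len = {}", list.len());
}
```

Pushing to a does not affect b, which is the behaviour one would hope to get from a bitwise copy of an empty Vec as well.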

Copying a zero-capacity HashMap<K, V> is an interesting case. It doesn't currently cause any memory safety issues that I'm aware of (though I wouldn't count on this), but it does produce two copies of the same RandomState, which could weaken the protection against DoS attacks. There is a const fn constructor for a non-random hasher in the hashbrown crate: HashMap::with_hasher.

Well technically the pointer will not be zero, since Vecs require their pointers to be non-null. Instead it will be dangling - aligned and non-null but it points to zero readable or writable bytes. This is currently implemented as just the alignment of the type cast to a pointer.
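A small sketch of what that looks like from the outside, assuming a non-zero-sized element type such as u64. The helper empty_vec_parts is hypothetical, and the exact address is an implementation detail that should not be relied on; the example only checks that it is non-null and aligned.

```rust
use std::mem::align_of;

/// Returns (address, capacity, length) of a freshly constructed empty Vec<T>.
fn empty_vec_parts<T>() -> (usize, usize, usize) {
    let v: Vec<T> = Vec::new();
    (v.as_ptr() as usize, v.capacity(), v.len())
}

fn main() {
    let (addr, cap, len) = empty_vec_parts::<u64>();
    // The address is non-zero even though nothing has been allocated.
    println!("addr = {:#x}, cap = {}, len = {}", addr, cap, len);
    println!("aligned for u64: {}", addr % align_of::<u64>() == 0);
}
```

So the triplet is (dangling, 0, 0) rather than (0, 0, 0).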

Is there a possibility that this could go away in the future if Box::new becomes a const fn? Or has Rust committed to never doing that, because it would need to return different values for the same arguments (in other words, because it is inherently impure)?

Yes, there is. If Box::new becomes a const fn, I believe it would still be unsound to copy the resulting Box. Although there is no risk of double-frees (since Rust would have to know anyway not to free Box pointers created in a const context), under the current model of Stacked Borrows a Box behaves like an &mut in that its pointer can't be aliased, and copying a Box can potentially alias its pointer.

Why? Assuming things are going in the general direction of allowing more and more computation at compile time, it doesn't seem crazy to imagine people writing a const fn that will run out of memory if boxes leak.

Oh, I was talking about Boxes that have crossed the boundary from compile-time to run-time. For such boxes, their allocation must exist in the program's binary, and so it cannot be freed. For a Box that has been created at compile time and is being used at compile time, copying it comes with both the risk of a double-free and aliasing pointers.

Can you elaborate on what scenario you have that might possibly need to use this?

The general answer to questions like these is essentially, "no, you can't rely on that. If you would like to start relying on it, create a PR to add the guarantee to the documentation and see whether T-libs would be willing to accept it."

nit: reminder that the pointer is never zero, and it's invalid to create a Vec with a zero pointer (via mem::zeroed() or otherwise)

One use of this property is to enable cheaper object construction. Say I have a group of object types that I frequently create together, and I want to be able to produce a new instance of the group quickly. In other words, every time I make an X I also need to make a Y and a Z. If you know the constituents of the group at compile time, you just put them all in a struct and clone/construct when needed. But what if you only know the types going into the group at run time because, say, it is based on a configuration file? You can have a list of function pointers that are constructors to call, but totally dynamic function calls are not particularly fast.

One way to enable this is to have a "template" of bytes saved somewhere that, when you memcpy it, gives you all the objects in the group in a valid state, located contiguously one after the other. This way the only thing that is dynamic is exactly how many bytes you copy. But this can only work when you are sure the types in the template are safe to bitwise copy. Copy can guarantee that, but it is overly restrictive, because some types that can't always be safely copied can still be safely copied if they are in a certain state (e.g. a 0 capacity vector). If there were a guarantee like this for the standard containers I could just count on using them, but as it is, it seems like I should probably create my own unsafe trait to represent this property (DefaultStateIsCopy).
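A rough sketch of what such a trait could look like, for a single type rather than a whole byte template. Everything here except the name DefaultStateIsCopy is an assumption made for illustration: the Template wrapper, the stamp method and the Scratch type are hypothetical. The trait is deliberately not implemented for Vec, since the standard library does not document that guarantee; Scratch is a type of my own whose default state (None) clearly owns nothing.

```rust
use std::ptr;

/// Hypothetical marker: a bitwise copy of `Self::default()` is itself a
/// valid, independent value.
///
/// # Safety
/// The default value must own no resource that a second bitwise copy
/// would also try to release or would alias.
unsafe trait DefaultStateIsCopy: Default {}

/// Holds one default-state value and hands out bitwise copies of it.
/// The inner value is never exposed, so it stays in its default state.
struct Template<T: DefaultStateIsCopy>(T);

impl<T: DefaultStateIsCopy> Template<T> {
    fn new() -> Self {
        Template(T::default())
    }

    fn stamp(&self) -> T {
        // SAFETY: self.0 came from T::default() and has not been changed,
        // so the trait contract says a bitwise copy is a valid value.
        unsafe { ptr::read(&self.0) }
    }
}

/// Example type: not Copy, but its default state holds no allocation.
#[derive(Default, Debug, PartialEq)]
struct Scratch {
    buf: Option<Box<[u8]>>,
    hits: u32,
}

// SAFETY: Scratch::default() is { buf: None, hits: 0 }; None carries no
// pointer, so two copies share nothing.
unsafe impl DefaultStateIsCopy for Scratch {}

fn main() {
    let template = Template::<Scratch>::new();
    let mut a = template.stamp();
    let b = template.stamp();
    a.buf = Some(vec![1, 2, 3].into_boxed_slice());
    a.hits += 1;
    println!("a = {:?}, b = {:?}", a, b);
}
```

Keeping the template value private is what lets stamp be a safe function: callers can never put it into a non-default state before it is copied. A real byte template holding several types back to back would also need to handle alignment and padding, which this sketch leaves out.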