What happens when I clone a Vec<&str>?

I'm doing some CLI programming, and with it comes a lot of string handling.

At one point I had to clone a vec of lines so that I could sort it and pass it along to be added to the output vector. I assumed this would only copy the references to the lines, not the line data itself.

So, does this actually copy the underlying data to another heap memory location, or does it only copy the references?

Thanks in advance

PS: the goal is to keep copying of data to a minimum and only do it when it's necessary.

A vector stores its elements in a heap-allocated buffer. The vector itself is a triplet of pointer, capacity and length. So, if you clone() the vector, its elements are cloned into a newly allocated heap buffer.

In case of a Vec<&str>, the elements of the vector are references to string-slices. So, the actual string data is not stored in the vector's heap-allocated buffer at all; the vector's heap-allocated buffer only contains n references to string-slices. Therefore, when you clone() this kind of vector, then all n references are copied to a new heap memory location; the actual string data is not copied/duplicated.

(The actual string data could live in "static" memory, for example)
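To check this claim, here is a minimal sketch that compares the addresses behind the references before and after clone(). The helper name shares_string_data and the sample text are made up for illustration.

```rust
/// Returns true when every element of `a` and `b` points at the same
/// string bytes (same address and same length), i.e. no text was duplicated.
fn shares_string_data(a: &[&str], b: &[&str]) -> bool {
    a.len() == b.len()
        && a.iter()
            .zip(b.iter())
            .all(|(x, y)| x.as_ptr() == y.as_ptr() && x.len() == y.len())
}

fn main() {
    let text = String::from("banana\napple\ncherry");
    let lines: Vec<&str> = text.lines().collect();
    let cloned = lines.clone();

    // The two vectors own separate heap buffers for their references...
    assert_ne!(lines.as_ptr(), cloned.as_ptr());
    // ...but those references still point into the single `text` allocation.
    assert!(shares_string_data(&lines, &cloned));
    println!("string data shared: {}", shares_string_data(&lines, &cloned));
}
```

Only the buffer of references is duplicated; the bytes owned by text are untouched.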


Yeah, but a Vec<&str> is a vector of pointers to n strs, right?

It copies the &strs.

If you want to get a Vec<String> from your Vec<&str>, then you want something like v.iter().map(|x| x.into()).collect::<Vec<String>>().
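As a sketch of that conversion (the function name to_owned_lines is hypothetical), the following produces owned Strings, and this is the case where the text itself really is copied:

```rust
/// Builds an owned Vec<String> from borrowed string slices.
/// Each non-empty element gets its own freshly allocated copy of the text.
fn to_owned_lines(lines: &[&str]) -> Vec<String> {
    lines.iter().map(|s| (*s).to_owned()).collect()
}

fn main() {
    let borrowed: Vec<&str> = vec!["banana", "apple"];
    let owned = to_owned_lines(&borrowed);

    assert_eq!(owned[0], "banana");
    // The owned String lives in its own allocation, not in static memory.
    assert_ne!(owned[0].as_ptr(), borrowed[0].as_ptr());
    println!("{:?}", owned);
}
```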


So the vector is actually much like a linked list that one could implement in C, for example? Sorting it would only sort the references in the vec, I assume?

I think a Vec is more like an std::vector in C++, because all elements are stored in a single contiguous block of memory. Conversely, a linked list (e.g. std::list) stores each element at a separately allocated memory location, together with a (possibly NULL) pointer to the next element/location.

(A linked list allows for quick insertions/deletions in the middle of the list. For pretty much all other purposes, a vector is generally better/faster, especially because it has better memory locality.)
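The contiguous layout of a Vec can be observed directly. This is a small sketch (is_contiguous is a made-up helper) that checks that neighbouring elements sit exactly one element size apart:

```rust
use std::mem::size_of;

/// Returns true when each element starts exactly size_of::<T>() bytes
/// after the previous one, which is how slices and Vec buffers are laid out.
fn is_contiguous<T>(v: &[T]) -> bool {
    v.windows(2).all(|w| {
        let a = &w[0] as *const T as usize;
        let b = &w[1] as *const T as usize;
        b - a == size_of::<T>()
    })
}

fn main() {
    let lines: Vec<&str> = vec!["banana", "apple", "cherry"];
    // The references are packed back to back in one buffer.
    assert!(is_contiguous(&lines));
    println!("stride: {} bytes", size_of::<&str>());
}
```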


[Diagrams: memory layout of a vector and of a linked list]


The vector stored as values in the HashMap contains &str as elements, which are references to an unsized str. A &str contains not only the base address but also the length information, but not the actual text. So a &str is usually 16 bytes in size (two times usize, which is 8 bytes on 64-bit platforms). Side note: interestingly, a &String is only 8 bytes in size, but accessing the str requires an extra indirection, so I believe &String is generally slower. (Playground)
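The sizes mentioned here can be checked with std::mem::size_of. A minimal sketch follows; ref_sizes is a hypothetical helper, and the results are expressed in multiples of usize so they hold on 32-bit targets too:

```rust
use std::mem::size_of;

/// Returns (size of &str, size of &String) in bytes.
fn ref_sizes() -> (usize, usize) {
    (size_of::<&str>(), size_of::<&String>())
}

fn main() {
    let (str_ref, string_ref) = ref_sizes();
    // &str is a "fat" pointer: address plus length.
    assert_eq!(str_ref, 2 * size_of::<usize>());
    // &String is a "thin" pointer to the String struct.
    assert_eq!(string_ref, size_of::<usize>());
    println!("&str: {} bytes, &String: {} bytes", str_ref, string_ref);
}
```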

When you clone a Vec using Vec::clone, a new Vec will be created and all elements will be cloned as well. However, if you have a Vec<&str>, then the elements are shared references. And shared references are simply cloned by copying them, i.e. copying the 16 bytes (impl Copy for &T).

So what happens if you clone a Vec<&str>? A new Vec will be created and memory allocated for the contents. Then each reference in the original Vec (16 bytes for each reference) is copied to the new Vec.

What happens if you clone a Vec<String> in contrast? A new Vec will be created and memory allocated for the contents. Then each String in the original Vec is cloned (not copied). Cloning a (non-empty) String means that memory is allocated for the contents of the String and then the contents of the original String are copied to the new String. So cloning a Vec<String> will usually be much more expensive than cloning a Vec<&str>.
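For contrast, here is a sketch of the Vec<String> case. The helper any_shared_buffer is made up for illustration, and the strings are assumed to be non-empty so that each one owns a heap allocation:

```rust
/// Returns true if any pair of corresponding Strings shares a heap buffer.
fn any_shared_buffer(a: &[String], b: &[String]) -> bool {
    a.iter().zip(b).any(|(x, y)| x.as_ptr() == y.as_ptr())
}

fn main() {
    let owned = vec![String::from("banana"), String::from("apple")];
    let cloned = owned.clone();

    // The contents are equal...
    assert_eq!(owned, cloned);
    // ...but every cloned String got its own allocation and its own copy of the text.
    assert!(!any_shared_buffer(&owned, &cloned));
    println!("buffers shared: {}", any_shared_buffer(&owned, &cloned));
}
```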

Edit: Forgot to get back to your question. Yes, sorting the Vec<&str> will only sort the references.
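Putting the original use case together, here is a minimal sketch of cloning and then sorting a vec of lines. The function name sorted_copy and the sample text are assumptions, not taken from the original program:

```rust
/// Clones the slice of references and sorts the clone.
/// Only the references move; the string bytes stay where they are.
fn sorted_copy<'a>(lines: &[&'a str]) -> Vec<&'a str> {
    let mut sorted = lines.to_vec();
    sorted.sort();
    sorted
}

fn main() {
    let text = String::from("banana\napple\ncherry");
    let lines: Vec<&str> = text.lines().collect();
    let sorted = sorted_copy(&lines);

    assert_eq!(sorted, vec!["apple", "banana", "cherry"]);
    // The original order is untouched.
    assert_eq!(lines, vec!["banana", "apple", "cherry"]);
    // "apple" in the sorted vec is the very same bytes as in the original.
    assert_eq!(sorted[0].as_ptr(), lines[1].as_ptr());
    println!("{:?}", sorted);
}
```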

