This, imho, should be the start point. What are `Vec<u8>` and `[u8]`?
`[u8]` represents the type for sequences of bytes of any length (that is, the length is a runtime property). In Rust parlance, this is called a slice.

Since the length is only known at runtime (`!Sized`), a slice cannot be inlined into the stack, since stack memory is managed with compile-time (and thus fixed) parameters. This is what prevents us from using `!Sized` stuff directly.

We can circumvent this restriction with indirection: any sequence of bytes, whatever its length may be, once in memory, can be referred to by a reference / pointer to the first element and a second field with the number of elements (we call this a fat pointer). This is the case of, for instance:

- a shared reference to a slice, `&[u8]` (or more generally, `&[T]`),
- a unique reference to a slice, `&mut [u8]` (or more generally, `&mut [T]`),
- and owning references / pointers, such as `Box<[u8]>`, `Rc<[u8]>`, `Arc<[u8]>`.
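To make the fat pointer idea a bit more concrete, here is a small sketch that compares the size of a plain reference with the size of the slice pointers listed above. The helper name `pointer_sizes` is made up for the example, and the "exactly two words" result is what current compilers produce in practice rather than something I would rely on as a hard language guarantee.

```rust
use std::mem::size_of;
use std::rc::Rc;
use std::sync::Arc;

/// Hypothetical helper: size of a thin reference and of a slice reference.
fn pointer_sizes() -> (usize, usize) {
    (size_of::<&u8>(), size_of::<&[u8]>())
}

fn main() {
    let (thin, fat) = pointer_sizes();
    // A slice reference carries a data pointer plus a length.
    assert_eq!(fat, 2 * thin);

    // The other slice pointers have the same (ptr, len) shape.
    assert_eq!(size_of::<&mut [u8]>(), fat);
    assert_eq!(size_of::<Box<[u8]>>(), fat);
    assert_eq!(size_of::<Rc<[u8]>>(), fat);
    assert_eq!(size_of::<Arc<[u8]>>(), fat);

    let bytes: &[u8] = &[1, 2, 3];
    // The length is read from the pointer itself, at runtime.
    assert_eq!(bytes.len(), 3);

    // let inline: [u8] = *bytes; // does not compile: `[u8]` is `!Sized`
}
```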
One way to create an element with variable length (dynamic allocation) is by using the heap. This works in multiple steps:

- We ask the heap allocator for a chunk of memory able to hold `capacity` elements;
- If the allocator succeeds, we get back a pointer to the heap, to the beginning of the allocated (but uninitialised) memory;
- We can then initialise any `len` number of elements, as long as `len <= capacity` (else a reallocation is needed).
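The steps above can be sketched with `Vec::with_capacity` and `push`, which do the allocation and initialisation for us. The helper `fill` is hypothetical, just there to keep the example small; the exact capacity you get back may be larger than what you asked for, so the sketch only checks lower bounds.

```rust
/// Hypothetical helper: reserves room for `capacity` bytes,
/// then initialises `len` of them.
fn fill(capacity: usize, len: usize) -> Vec<u8> {
    // Ask the allocator for room. The memory behind the pointer is
    // allocated but uninitialised. (If the allocation fails, the
    // program panics or aborts instead of returning an error.)
    let mut v: Vec<u8> = Vec::with_capacity(capacity);
    // Initialise `len` elements, one push at a time.
    for i in 0..len {
        v.push(i as u8); // reallocates only when len would exceed capacity
    }
    v
}

fn main() {
    let mut v = fill(8, 3);
    assert_eq!(v.len(), 3);
    assert!(v.capacity() >= 8);

    // Use up the remaining capacity, then push once more:
    // that last push has to reallocate.
    while v.len() < v.capacity() {
        v.push(0);
    }
    let full = v.capacity();
    v.push(0);
    assert!(v.capacity() > full);
    assert!(v.len() <= v.capacity());
}
```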
That's why such a heap-managed structure must have at least these three fields (`ptr, len, capacity`), and this is exactly what a `Vec<u8>` (or more generally, a `Vec<T>`) is.

- a corollary of that is that from a `ptr, len, capacity` tuple, we can choose to keep `ptr, len` only. `ptr`... `len`... This rings a bell... Oh, right, we have successfully managed to have a reference to a slice!
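Here is a rough sketch of that corollary: take the `ptr` and `len` of a `Vec<u8>`, forget about `capacity`, and you get a `&[u8]`. The function `view` is a made-up name for illustration; in real code you would just write `&v[..]` or `v.as_slice()`, which do the same thing without `unsafe`.

```rust
use std::slice;

/// Hypothetical helper: rebuilds a slice from a Vec's `ptr` and `len` only.
fn view(v: &Vec<u8>) -> &[u8] {
    let ptr = v.as_ptr();
    let len = v.len();
    // SAFETY: `ptr` points to `len` initialised bytes owned by `v`,
    // and the returned borrow cannot outlive the borrow of `v`.
    unsafe { slice::from_raw_parts(ptr, len) }
}

fn main() {
    let mut v: Vec<u8> = Vec::with_capacity(16);
    v.extend_from_slice(b"abc");

    let s = view(&v);
    // Only the 3 initialised bytes are visible, whatever the capacity is.
    assert_eq!(s.len(), 3);
    assert_eq!(s, v.as_slice());
}
```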
Ok, ok, but the OP asked about `String` and `str`, what does any of this have to do with them?
Very simple: `String` / `str` is exactly like `Vec<u8>` / `[u8]`, except that the sequence of bytes must uphold a property / invariant: the bytes must be valid UTF-8.
That's why there are trivial conversions (casts) from the former to the latter (`<Vec<u8> as From<String>>::from` and `str::as_bytes`), whereas the other way around requires a runtime-checked cast.
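A small sketch of both directions, assuming nothing beyond the standard library. The helpers `to_bytes` and `to_string_checked` are hypothetical wrappers; the checked direction uses `String::from_utf8`, which validates the bytes and gives back an error if they are not valid UTF-8.

```rust
/// Hypothetical helper: String -> Vec<u8>, no check needed.
fn to_bytes(s: String) -> Vec<u8> {
    Vec::from(s)
}

/// Hypothetical helper: Vec<u8> -> String, validated at runtime.
fn to_string_checked(bytes: Vec<u8>) -> Option<String> {
    String::from_utf8(bytes).ok()
}

fn main() {
    let s = String::from("hi");
    // Borrowed direction: &str -> &[u8] is just a reinterpretation.
    assert_eq!(s.as_bytes(), &[104u8, 105][..]);

    // Owned direction: the same buffer, now seen as plain bytes.
    let bytes = to_bytes(s);
    assert_eq!(bytes, vec![104u8, 105]);

    // Going back requires checking the UTF-8 invariant.
    assert_eq!(to_string_checked(bytes), Some(String::from("hi")));
    assert_eq!(to_string_checked(vec![0xff, 0xfe]), None);
    assert!(std::str::from_utf8(&[0xff]).is_err());
}
```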
cough `lazy_static` cough