Are references to Vec<T> the same as references to [T]?

TL;DR

Why does this code work?

use std::fmt;

fn takeaslice<T: fmt::Debug>(slice: &[T]) -> &[T] {
    dbg!(&slice);
    slice
}

fn main() {
    let v = Vec::from([1,2,3]);
    takeaslice(&v);
}

Output:

cocoa@TheTreehouse /t/tmp.mcEOkAauWM (main)> cargo run                                                     (self) 
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.01s
     Running `target/debug/what`
[src/main.rs:4:5] &slice = [
    1,
    2,
    3,
]

The function takeaslice should only take a slice of a generic type T, but somehow it can take Vec<T>? I can't tell why they're interchangable. One is stack allocated and statically sized, the other is heap allocated and growable.


While trying to understand the difference between Vec::iter and Vec::into_iter, I decided to look at the definitions of the functions. Vec::iter is defined in the impl Deref block and Vec::into_iter is defined in the impl IntoIterator blocks.

The IntoIterator trait defines how a type is to be transformed into an iterator, so Vec::into_iter returns a IntoIter<T, A> struct, which itself implements the Iterator trait (phew). It's implementation in the source code looks like this.

impl<T, A: Allocator> IntoIterator for Vec<T, A> {
    type Item = T;
    type IntoIter = IntoIter<T, A>;

    /// Creates a consuming iterator, that is, one that moves each value out of
    /// the vector (from start to end). The vector cannot be used after calling
    /// this.
    ///
    /// # Examples
    ///
    /// ```
    /// let v = vec!["a".to_string(), "b".to_string()];
    /// let mut v_iter = v.into_iter();
    ///
    /// let first_element: Option<String> = v_iter.next();
    ///
    /// assert_eq!(first_element, Some("a".to_string()));
    /// assert_eq!(v_iter.next(), Some("b".to_string()));
    /// assert_eq!(v_iter.next(), None);
    /// ```
    #[inline]
    fn into_iter(self) -> Self::IntoIter {
        unsafe {
            let me = ManuallyDrop::new(self);
            let alloc = ManuallyDrop::new(ptr::read(me.allocator()));
            let buf = me.buf.non_null();
            let begin = buf.as_ptr();
            let end = if T::IS_ZST {
                begin.wrapping_byte_add(me.len())
            } else {
                begin.add(me.len()) as *const T
            };
            let cap = me.buf.capacity();
            IntoIter { buf, phantom: PhantomData, cap, alloc, ptr: buf, end }
        }
    }
}

This mostly makes sense, without knowing how all of that works, the idea is just to have pointers to the start and end of the block of memory one is iterating over. Then, since IntoIter implements the Iterator trait, we can use it as an iterator (right?).

Now coming to what Vec::iter is doing, when I ctrl+F the source code for the vec module to find fn iter( nothing comes up, but it is listed in the docs, so I clicked the source button in the docs and it sends me to the source code for the slice module(?). Here's what it links me to.

    /// Returns an iterator over the slice.
    ///
    /// The iterator yields all items from start to end.
    ///
    /// # Examples
    ///
    /// ```
    /// let x = &[1, 2, 4];
    /// let mut iterator = x.iter();
    ///
    /// assert_eq!(iterator.next(), Some(&1));
    /// assert_eq!(iterator.next(), Some(&2));
    /// assert_eq!(iterator.next(), Some(&4));
    /// assert_eq!(iterator.next(), None);
    /// ```
    #[stable(feature = "rust1", since = "1.0.0")]
    #[inline]
    #[cfg_attr(not(test), rustc_diagnostic_item = "slice_iter")]
    pub fn iter(&self) -> Iter<'_, T> {
        Iter::new(self)
    }

Okay, then that must mean it's calling Iter::new (somehow. even though it's not in the impl for vec!!!) on the vector. This is what the source for Iter::new looks like.

impl<'a, T> Iter<'a, T> {
    #[inline]
    pub(super) fn new(slice: &'a [T]) -> Self {
        let len = slice.len();
        let ptr: NonNull<T> = NonNull::from(slice).cast();
        // SAFETY: Similar to `IterMut::new`.
        unsafe {
            let end_or_len =
                if T::IS_ZST { without_provenance(len) } else { ptr.as_ptr().add(len) };

            Self { ptr, end_or_len, _marker: PhantomData }
        }
    }

Now I think it's the same or somewhat similar idea here? Pointer to the start to the end or length of the data. The rest I assume the Iter type can handle. My questions are:

  1. How is .iter implemented for Vec even though it's not in any impl block for it? It's in the impl<T> [T] {...} block.
  2. How does the Iter::new method take in a slice when it's a different struct? What qualifies a struct to be treated as a slice?

I didn't find any answers in the internals of rust. The source code for Vec just wraps the data in a RawVec which wraps RawVecInner (and PhantomData which idk what that is).

struct RawVecInner<A: Allocator = Global> {
    ptr: Unique<u8>,
    /// Never used for ZSTs; it's `capacity()`'s responsibility to return usize::MAX in that case.
    ///
    /// # Safety
    ///
    /// `cap` must be in the `0..=isize::MAX` range.
    cap: Cap,
    alloc: A,
}

This just seems like a unique pointer to a bunch of bytes and the capacity of the allocated vector. I even tested it separately if Vec<i32> can be passed to a function that takes a reference to a slice.

  1. Why is there both .iter and .into_iter? Both of them return structs that implement an Iterator trait, the former treating the data as an array slice and the latter yielding references to heap memory. I don't understand when one would implement the former or the latter.

Thank you for reading! Any help is appreciated :heart: :crab:

Deref coercion converts &Vec<T> into a slice (&[T]).

.into_iter returns owned types and consumes the container, while .iter returns references and doesn't consume the container.

4 Likes

It's called "deref coercion": Vec<T>: Deref<Target = [T]>, so &Vec<T> can coerce to &[T]. Similarly, a &String can coerce to a &str and a &Box<T> can coerce to a &T.

That's due to method call resolution. It does something similar to, but not exactly the same as, deref coercion.

(For example it does up to one level of auto-ref, whereas deref coercion happens "beneath" an existing &_ or &mut _.)

It doesn't; the coercion from &Vec<T> to &[T] happens before the call to .iter.

It's about ownership.

Calling .into_iter on a Vec<T> consumes -- takes ownership -- of the Vec<T>, and the iterator returns owned Ts. The Ts get moved out of heap memory in the process. The memory is deallocated when the iterator drops.

.into_iter is a trait method of Vec<T> as IntoIterator.

Calling .iter on a Vec<T> only borrows the &Vec<T>, and the iterator returns borrowed &Ts. They aren't moved and the memory isn't deallocated. The Vec<T> remains borrowed until all the returned &Ts and the iterator itself are no longer in use.

.iter is a convenience method that does the same as <&Vec<T> as IntoIterator>::into_iter and <&[T] as IntoIterator>::into_iter. Note the & -- reference types are their own full-fledged types, distinct from the referent. (Vec<T>, &Vec<T>, and &mut Vec<T> are different types).

Typically collections implement IntoIterator at least three times:[1]

  • Once on the collection itself (e.g. Vec<T>) to transfer ownership of items
  • Once on shared references (e.g. &Vec<T>) to shared-borrow items
  • Once on exclusive references (e.g. &mut Vec<T>) exclusively-borrow items

And by convention, the latter two are also delegated to by inherent methods on the collection called .iter and .iter_mut. (Whereas .into_iter is a method on the trait itself.)

The trait implementations are important because that's how for works, and also for use in generic contexts.


  1. HashMap has more because maybe you want keys, or values, or both ↩︎

3 Likes

Oh, and here's a graphical representation comparing &Vec<T> and &[T].

4 Likes

Wow that's a detailed answer! Tysm! I just have a few follow up questions to ensure I'm understanding these concepts correctly.

Ahhh I see. So since Vec<T> implements the Deref<Target = [T]> trait, dereferencing is allowed to create an unsized, stack allocated slice. If I check the source, Vec::deref is calling Vec::as_slice which looks like this.

    pub const fn as_slice(&self) -> &[T] {
        // SAFETY: `slice::from_raw_parts` requires pointee is a contiguous, aligned buffer of size
        // `len` containing properly-initialized `T`s. Data must not be mutated for the returned
        // lifetime. Further, `len * mem::size_of::<T>` <= `ISIZE::MAX`, and allocation does not
        // "wrap" through overflowing memory addresses.
        //
        // * Vec API guarantees that self.buf:
        //      * contains only properly-initialized items within 0..len
        //      * is aligned, contiguous, and valid for `len` reads
        //      * obeys size and address-wrapping constraints
        //
        // * We only construct `&mut` references to `self.buf` through `&mut self` methods; borrow-
        //   check ensures that it is not possible to mutably alias `self.buf` within the
        //   returned lifetime.
        unsafe { slice::from_raw_parts(self.as_ptr(), self.len) }
    }

The slice::from_raw_parts is part that seems to be moving the data from heap to stack.

I see, so if I understand it correctly, this is the whole process of resolution. First generating the list of receiver types would be.

  • &Vec<i32>
  • &&Vec<i32>
  • &&mut <i32>
  • &[i32; 3] (Dereferencing the Vec but not the borrow(?))
  • &&[i32; 3]
  • &&mut [i32;3
  • &[i32] (Unsized dereferencing now, this is the only type for which the method .iter exists)
  • &&[i32]
  • &&mut [i32]

Then the first time the compiler encounters a visible method is when the unsized dereference happens. Is that correct?

Nitpick: a &[T] doesn't care where the Ts are allocated. They can be in an array on the stack, in an array in a static variable, on the heap in a Vec, VecDeque, etc etc and these are just some examples. All it cares is that there is a contiguous sequence of Ts somewhere in memory.

1 Like

No, slice::from_raw_parts does not copy any data. It creates a reference to existing data.

1 Like

No Ts move when you deref &Vec<T> to get a &[T] (or call as_slice). The quoted code is effectively copying a pointer and a usize (the length), nothing more.

Like @SkiFire13 said, slices don't care where they "live". No type does in fact. Some types promise they will store their contents on the heap, like Vec<T> and Box<T> promise to store Ts on the heap. But this is like BTreeMap promising to keep things in order and the like -- a library-level promise, and not some language-level intrinsic property.

So for example a Vec<i32> stores i32s on the heap (as part of its implementation), where as the Vec<i32> itself -- a pointer, length, and capacity triple -- may be on the stack,[1] e.g. assigned to a local variable. Or you could put it into a Vec<Vec<i32>>, in which case the Vec<i32> triple is now on the heap. Or you could have static V: Vec<i32> = Vec::new() to put it in static memory.... etc etc. The type doesn't care.

For vec.iter() where vec: Vec<i32>, the list of method receiver candidates would be

  • Vec<i32>
    • &Vec<i32>
    • &mut Vec<i32>
  • [i32] (due to deref)
    • &[i32]
    • &mut [i32]
  • (can't deref or unsize[2] any more, so no more method candidates)

And [i32]::iter has a self: &[i32] receiver. That's the first match that method resolution finds (assuming you haven't locally defined or imported a trait that has a iter method which matches something earlier in the list).

If you had vref: &Vec<i32> and called vref.iter() instead, there would be three more candidates at the top...

  • &Vec<i32>
    • &&Vec<i32>
    • &mut &Vec<i32>

...but that doesn't change which method gets found (unless again you have some surprising trait with an iter method imported).


  1. or in registers! ↩︎

  2. side note: the reference fails to point this out, but only array-to-slice unsizing is attempted for method resolution. There is no separate unsizing step in this list because we didn't end up with an array. The [i32] showed up due to Deref, not unsizing. ↩︎

2 Likes