Looking for A Comprehensive IntoIter Tutorial or Good Examples in Standard Library


#1

Hey all!

I’m working on some algorithms and data structures problems–trying to get them down with idiomatic solutions–but I’m having a hard time mentally parsing what I see in std::collections implementations when it comes to converting data structures into Iterators of various kinds such as that in the Linked List implementation. Even the most basic implementation of IntoIter eludes me. D:

The most recent example I found via Google was pretty simplistic if not a tad old, and I’d love something more full and palatable as an introduction, or–with regards to the former–a simpler example from the standard library.

Does anyone have some material they’d suggest?


#2

Data structures are a pretty hardcore topic in rust; and unless you’re building your own from the ground up (in which case you have many problems beyond just IntoIterator), you should seldom need to write your own iterator.

For many common structs, you can create an iterator from the into_iter methods of your members, possibly with the assistance of ::std::iter::once or ::Iterator::chain. There is one tricky part, though: You have to name the output type. (on stable, at least)


use std::iter::{once, Once, Chain};
use std::vec::IntoIter as VecIntoIter;

struct Things {
    a: i32,
    b: Vec<i32>,
}

type IntoIter = Chain<Once<i32>, VecIntoIter<i32>>;

impl IntoIterator for Things {
    type Item = i32;
    type IntoIter = IntoIter;
    fn into_iter(self) -> IntoIter { once(self.a).chain(self.b) }
}

fn main() {
    let things = Things { a: 4, b: vec![5, 6, 7] };
    
    for x in things {
        println!("{}", x);
    }
}

The second thing you linked could similarly have been written to return a Chain<Chain<Once<i8>, Once<i8>>, Once<i8>>.

But this isn’t an option if you need to operate on the data in some way (at least, not on stable). I will try to write more later, but in short, you will need to write:

  • A “iterator” type, to be the return type of into_iter. For a struct living in its own module, this type is usually called IntoIter.
  • An Iterator impl for said iterator type. (this is the important part! the rest is mostly type-system boiler-plate)
  • An IntoIterator impl for the struct (which should just construct and return the iterator)

#3

Okey dokes. So about that LinkedList:

A bird’s-eye view of iteration code in std::linked_list

Here’s a bird’s-eye snapshot of all or most of the code involved in iterating through linked lists in the standard library (ignoring unstable features and #[inline] attributes):

pub struct Iter<'a, T: 'a> {
    head: Option<Shared<Node<T>>>,
    tail: Option<Shared<Node<T>>>,
    len: usize,
    marker: PhantomData<&'a Node<T>>,
}

impl<'a, T> Clone for Iter<'a, T> {
    fn clone(&self) -> Self {
        Iter { ..*self }
    }
}

pub struct IterMut<'a, T: 'a> {
    list: &'a mut LinkedList<T>,
    head: Option<Shared<Node<T>>>,
    tail: Option<Shared<Node<T>>>,
    len: usize,
}

#[derive(Clone)]
pub struct IntoIter<T> {
    list: LinkedList<T>,
}

impl<'a, T> Iterator for Iter<'a, T> { /* 20 lines... */ }

impl<'a, T> DoubleEndedIterator for Iter<'a, T> { /* 13 lines ... */ }

impl<'a, T> ExactSizeIterator for Iter<'a, T> {}

impl<'a, T> Iterator for IterMut<'a, T> { /* 20 lines ... */ }

impl<'a, T> DoubleEndedIterator for IterMut<'a, T> { /* 13 lines ... */ }

impl<'a, T> ExactSizeIterator for IterMut<'a, T> {}

impl<T> Iterator for IntoIter<T> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        self.list.pop_front()
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        (self.list.len, Some(self.list.len))
    }
}

impl<T> DoubleEndedIterator for IntoIter<T> {
    fn next_back(&mut self) -> Option<T> {
        self.list.pop_back()
    }
}

impl<T> ExactSizeIterator for IntoIter<T> {}

impl<T> IntoIterator for LinkedList<T> {
    type Item = T;
    type IntoIter = IntoIter<T>;

    fn into_iter(self) -> IntoIter<T> {
        IntoIter { list: self }
    }
}

impl<'a, T> IntoIterator for &'a LinkedList<T> {
    type Item = &'a T;
    type IntoIter = Iter<'a, T>;

    fn into_iter(self) -> Iter<'a, T> {
        self.iter()
    }
}

impl<'a, T> IntoIterator for &'a mut LinkedList<T> {
    type Item = &'a mut T;
    type IntoIter = IterMut<'a, T>;

    fn into_iter(self) -> IterMut<'a, T> {
        self.iter_mut()
    }
}

That’s a lot of code! Can we focus on what’s important?

In order for for x in my_list { ... } to work, this is the minimum that would work:

pub struct IntoIter<T> {
    list: LinkedList<T>,
}

impl<T> Iterator for IntoIter<T> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        self.list.pop_front()
    }
}

impl<T> IntoIterator for LinkedList<T> {
    type Item = T;
    type IntoIter = IntoIter<T>;

    fn into_iter(self) -> IntoIter<T> {
        IntoIter { list: self }
    }
}

type IntoIter = IntoIter<T>? What?
This line defines the associated type IntoIterator::IntoIter to be our own iterator, which “just so happens” to also be called IntoIter (by convention)

The contents of IntoIter:
One can see that LinkedList’s IntoIter contains a LinkedList. Funny, right? However, this is not true in general for all data structures!

More generally, IntoIter should hold the data in whatever form that will allow it to easily consume elements one-by-one. It just so happens that LinkedList is already a suitable format for iteration! (since it has O(1) retrieval and removal of the first element, via pop_front()).

What’s all that other stuff do, then?

Some of it adds more functionality to the iterator:

Clone: IntoIter had #[derive(Clone)]. This is nice because it allows users to save the current state of an iterator. It will also allow the user to call .cycle() on it.

DoubleEndedIterator: Implementing this on your iterator allows your iterator to be reversed with .rev(). It requires manually implementing a next_back() method:

impl<T> DoubleEndedIterator for IntoIter<T> {
    fn next_back(&mut self) -> Option<T> {
        self.list.pop_back()
    }
}

size_hint: This is a method in Iterator which you can optionally implement if you are able to provide lower/upper bounds on the number of elements. It helps give methods like <Vec as FromIterator>::from_iter an idea of how much memory to allocate, for a small performance boost.

// impl Iterator for IntoIter<T> { ...
    fn size_hint(&self) -> (usize, Option<usize>) {
        (self.list.len, Some(self.list.len))
    }
// }

ExactSizeIterator: This expands on size_hint by making an explicit promise that the lower bound and upper bound are equal. By providing a (trivial) impl of this trait, users will be able to call .len() on the iterator:

impl<T> ExactSizeIterator for IntoIter<T> {}

That’s it for IntoIter.

Iter/IterMut

The rest has to do with iter and iter_mut:

Iter and IterMut are the return types of iter and iter_mut. Like IntoIter, these are iterators, but they borrow or mutate the items instead of consuming them. These are not part of any standard interface, but are often quite useful if they can be provided.

There are also two other IntoIterator impls, for &LinkedList and &mut LinkedList, which return Iter and IterMut. These are simply for convenience; they allow users to write for x in &list as shorthand for for x in list.iter().

What’s with all the unsafe code?

Right. Iter and IterMut have a bunch of unsafe in their implementation.

This is nothing inherent to iteration in general. Rather, it’s like I said: Data structures are a pretty hardcore topic in rust.

Especially linked lists. (seriously, go read that. It’s an emotional roller coaster)


#4

@ExpHP, thank you for the in-detail reply! Sorry it took me so long to reply, I’m traveling quite a bit at the moment.

I suppose it can be said I have a problem, as data structures–rather than a sane, manageable side-project–are how I tend to dive into a language’s internals to familiarize myself. I’m building out Binary Search Trees and Weighted, Directed Graphs, so Linked Lists seemed a sane (saner, at least…) place to start.

I will be going over your replies a few times for sure as I begin to understand them more thoroughly. I will also be diving into the series you linked at the bottom, as both you and others have heavily suggested it and, from what I have read so far, it’s a great read.