Sketch of Vec, Slice and Array -> Iterator: Helpful? Clear?

You might be drawing connections I didn't mean to make. Let's back up a minute.

  1. It's true that .into_iter() on a Vec<T> consumes the Vec and yields Ts, while .iter() and .iter_mut() do not consume and do not yield Ts.
  2. It's also true that .into_iter() is a method of a trait, IntoIterator, while .iter() and .iter_mut() are inherent methods.
  3. Finally, it's true that .into_iter() is implemented on Vec<T> precisely, while .iter() and .iter_mut() are implemented for [T] and used on Vec through Deref and DerefMut.

But these three facts are essentially orthogonal; they have almost nothing to do with each other. You can't connect any of them with the word "because" or "therefore" because they all have different reasons.

  1. is just the definition of how iteration for those items works.
  2. is a consequence of the fact that there is no trait for iter_mut() and iter(). Partly this is because such a trait is inexpressible without GATs, as I mentioned earlier; however, you also don't really need a trait version of those methods because slices (and all well-behaved data structures) just make foo.iter() an alias for (&foo).into_iter() and foo.iter_mut() an alias for (&mut foo).into_iter(). There is no need for Iterable and IterableMut traits to implement on Vec because they would be redundant with the IntoIterator implementations that already exist for &Vec and &mut Vec.
  3. is simply the consequence of two (unrelated) limitations in the Rust language itself. In the first place, it's not possible to implement IntoIterator for [T] without breaking backwards compatibility, although the limitation that originally prevented this from working has since been removed. But it's also not possible to call self-taking methods on non-Copy receivers through a pointer unless the pointer is specifically Box, because there's no DerefMove trait. This is another thing that has been proposed but no solid design for DerefMove has ever been established and it is less popular than GATs, so it probably won't be implemented any time soon.

My point is that there's not a grand unified theory that explains all these things, like you seem to be grasping for. All these facts are true but for different and mostly unrelated reasons. Don't look too hard for patterns or you'll start to see them where they aren't there.

As for other data structures that relate to each other like Vec<T> and [T], consider String and str, PathBuf and Path. The thing to note about these is that none of the "borrowed" types are references; instead, they're the thing being referenced, because that's the only way you can implement Deref. Your suggested system of FullFeatured<T> and DataOnly<'_, T> would not be able to implement Deref correctly (although I think you could do something with trait objects...)

2 Likes

Thank you. You are indeed getting at the heart of what I am trying to accomplish. So, I will lower my expectation (if you will :). What I'm after is a consistent sequence of "how to apply" in a way that is unified in that it is consistent throughout how the functions might be composed, and otherwise to re-use code in a scalable way.

You made several helpful points and demonstrations. When you said:

... in terms of "what comes first, chicken or the egg" kind of thing, are you suggesting implement the logic in (&foo).into_iter(), then use it to implement foo.iter()?

Am I right to relate what you are saying here to the array in C. So, "they're the thing being referenced" is like my using a pointer to access the array itself (the array variable is a pointer in C). &str is not a borrow but how we access the data itself. Yes?

In going from FullFeatured to DataOnly I believe you said that I would not be able to correctly implement Deref. I thought that would have been the case for AsRef but not Deref. I have been modeling deref as a version of my struct that is the serializable version of my struct (i.e., show me the data minus any functions)... also as the properties that are likely to define equality. Is that a flawed view in your opinion?

In the link you posted

Is there any reason Box<[T]> shouldn't implement IntoIter ?

IntoIter is a struct, or a type spec made by the IntoIterator trait. How does this suggestion make sense?

One last question: If you have a data type that hosts a collection of data, generic over T, how do you think about the design of how to iterate over the data?

Would that not undermine the ownership boundary? I.e., let a borrow take ownership is precisely not allowed. True unless working in an “inner mutability” context created using Box?

This makes sense. The chain, zip, eq_cmp etc... all require a second Iterator but does so via IntoIterator. Precisely via U: IntoIter; a spec expressed using a trait alias of IntoIterator with one of the two type parameters specified:
IntoIterator<Item = Self::Item> where IntoIterator has two type parameters Item and IntoIter.

fn chain<U>(self, other: U) -> Chain<Self, U::IntoIter>
    where
        Self: Sized,
        U: IntoIterator<Item = Self::Item>,
    {
        Chain::new(self, other.into_iter())
    }

Similarly, the for loop.

Something was still not fully resonating with me about the design until re-reading the posts this morning. Your intuition in a prior post to talk about why a Collection generally is not the right place to store the state of the Iterator, was on to something.

If we submit that a Collection cannot also be an Iterator AND we want to treat a Collection as Iterator "on demand", how might we accomplish this?

This is a powerful construct. You have two structures that are inherently incompatible with each other, in this case one is an observer of the other (not allowed to be self-aware :)), but using the type system and traits, we create an infrastructure that treats each type as if they were "one in the same".

That's big. I get the idea of unifying types, but I did not see this one before now.

Iterator<Item = T> ~ Collection<T>

// some combination of coercion and dispatch
Collection<T> -> Iterator<Item = T>
Iterator<Item = T> -> Collection<T>

I suspect this might help simplify and convey an "inherent truth" about iterating over Vec, Array and slices (keeping in mind what @trentj mentioned earlier; don't look for something that isn't there).

Do you know where else I see this general "incompatibility"? Types that read from themselves to generate a report/custom view of "self", i.e., a subset of self-referencing data structures. The ones that I would exclude are those where the "view" is an element of the collection used to extend the collection; so truly conceptually inseparable (e.g., a struct that hosts a list of types, where a new type is expressed in terms of the types already in the collection).

No, they all take/require an IntoIterator (but every Iterator is also an IntoIterator)

2 Likes

Thank you! Yes. I misspoke - I'll make the change in the post itself to clarify.

The arrows, as presented anyway, did not get a lot of love. This next version is in a table format.
Better?

This is same in Python, too. There is built-in iter function that returns an iterator for passed object, and then you are calling the next function on the returned iterator for getting elements. The IntoIterator trait was strange for me until I noticed that.

2 Likes

Thanks to, and informed by the feedback provided to-date, I have provided an updated version that describes the "entry-points" for leveraging the Rust iterator infrastructure.

Last updated Feb 13, 2021

While this seems to provide a reasonably complete view/starting point of the "moving parts", whilst touching on the decisions and implementations need to be made for a hypothetical MyCollection, this does not capture the trade-offs of each option, and in context of the move vs borrow semantics.

It doesn't really matter which is implemented in terms of what. That could be changed tomorrow and nothing would be different from a user's perspective. I just meant they do the same thing.

There is a sense in which IntoIterator is more important, because if IntoIterator didn't exist then you wouldn't be able to use it in generics; whereas if .iter() didn't exist you would just have to use (&foo).into_iter() instead and it would be a little bit more verbose and annoying but not really less expressive in any way. But since both exist it doesn't matter which one is implemented in terms of the other.

Hmm, sorry, I know C pretty well but I'm not really sure what this question means. I think you might be confusing &str with str? They are distinct types. String is a pointer to str, &str is another kind of pointer to str. Vec<T> is similarly a pointer to [T].

The problem (?) with FullFeatured<T> and DataOnly<'_, T> is that FullFeatured<T> is not a pointer to DataOnly<'_, T>. You couldn't implement Deref for FullFeatured so that it points to DataOnly (or vice versa). This doesn't mean they're not useful types, they're just not related to each other the way Vec<T> and [T], or String and str are.

Deref is fundamentally a trait that is implemented for pointers, or things that are in some way pointer-like (there's some disagreement around what exactly should be Deref; here's an earlier thread on that subject). Deref doesn't say anything about serialization or equality or what other data the struct might have... it basically just says "this struct points to something". Vec<T> implements Deref because it's a (smart) pointer, and the thing it points to is [T]. Since it implements Deref, you can use . to call methods of [T] on Vec<T>, the same way you can call methods of [T] on &[T]; they're both pointers.

The author meant IntoIterator.

If Collection<T> can implement IntoIterator, do that. If you can implement IntoIterator for &Collection<T> and/or &mut Collection<T>, with the semantics one would expect, you should do that too. If there are other ways of iterating over your collection, write inherent methods, like Vec<T>::drain (which yields Ts by emptying, not consuming, the Vec).

The other "half" of the interface between collections and iterators, which hasn't been mentioned directly in this thread yet, is FromIterator<T>, which is the trait that enables Iterator::collect. You implement FromIterator<T> for Collection<T> to be able to write things like let c: Collection<T> = some_iterator.collect(). I suppose you could say that FromIterator<T> and IntoIterator are what Rust has instead of a high-level Collection interface like in Java.

DerefMove (if it existed) could take self in the same way DerefMut takes a &mut self; for example:

trait DerefMove: DerefMut {
    type Output;
    fn deref_move(self) -> Self::Output;
}

Box does not need this because the knowledge of how to move things out of a Box is actually coded within the compiler itself. You can call .into_iter() on a Box<Vec<T>>. But this isn't possible for other types.

2 Likes

That's a great comment that took me a while to connect with what I'm realizing is a "bigger" learning from all this.

While you make the parallel with what's going on in .Net, does the Rust "infrastructure" accomplish anything noteworthy? (i.e., the use of traits, inherent methods and the dot operator)

On a related note, I was thinking that whatever was accomplished here, could be considered as a more general pattern for unifying two inherently incompatible types. In the case of iterating over MyCollection, I need a separate struct to host the state of the iterator; in that sense MyCollection and MyIterator are "incompatible". More generally, another example might be the challenge often faced with self-referencing data structures. The solution (other than Box) can be to create two structs to accomplish the task. The usual downside is then how to "unify" them for the user.

Am I off base with this train of thought?

I'm not sure what you mean by "noteworthy". Rust's traits are quite close to Haskell's type classes, for example.

Ah, you have gotten too used to Rust! Has the magic of it left you :)) ?

I would include how Rust uses traits despite the fact that it was proved useful in Haskell.

So while the pattern seems to be used in .NET, is the user experience of it (if you will), particularly impressive in your opinion?

For instance, one hassle with a strongly typed system is the ability to reuse logic where overlap exists between types. What is helpful is coercion and other forms of casting. Facilitating where it makes sense makes the type system that much more useful (in practice).

Great point. What I’m thinking about is a way to consider data and mutable views of the data. To be clear, read-only data, only able to mutate some synthesis/transformation of the data.

OO has you think data and methods under the “same roof”. In Rust, the borrow checker makes it challenging to ref self. It’s a contrived need but somehow conceptually compelling.

FP is all about function composition where the syntax is only ergonomic in something like Haskell. JS function chaining works nicely because it abstracts over what I love about Rust.

However, is there a way to otherwise unify the mutable view and data? Perhaps more like how Iterator and IntoIterator? That’s perhaps the best of both? Best way to align the most appealing design, keeping the borrow checker happy with a robust/clean ergonomic syntax.

I think this QotW is a good explanation of it:

The magic of rust is the combination, not really any of the individual things themselves.

I conquer. However, you did not answer my question :)). Part of what I like about Rust is how it’s both not OO nor FP, but can also be OO and FP. Rust isn’t limiting how I think about my abstractions and design, I am! I’m my own worst enemy :))

I suspect/wonder if there is not a pattern, or something in the Iterator and IntoIterator combination that I should be looking at to get me to “think differently”. You said you saw this in .NET. Iteration is fundamental, so at one level “of course” you see it elsewhere. I’m wondering if from your experience if you see anything else that maybe only Rust accomplished. Do you know what I mean?

Updated versions of the charts... A theme that comes to mind is both the amount of type inference made possible and how it's utilized to provide an ergonomic MyCollection/Iterator capacity.

Ideas for the key message/observation are welcome in addition to general impression of usefulness.

last updated Feb 16, 2021

This includes the feedback thus far provided.

Any suggestions on how to make this more accessible?

Here is the final version of the documentation. Thank you to everyone for providing their constructive feedback.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.