Creating a mutable iterator (good practices)

Hi everyone, I'm trying to do a somewhat complex data structure (a node tree) and I'm facing some problems. Italic parts are parts useful for context but nod mandatory for the final question

On my first attempt, each node had a mutable reference on the other nodes connected to its inputs and outputs (while all the nodes were in a vector created in main)
But I discovered that we can't have multiple mutable references to one object, so this attempt is a fail.

On my second attempt, I'm trying to make a structure based on indexes. Each node is still in a vector and each node refers to others via their index in this vector. I didn't find a better solution but I would like to hear some feedback about this.

And here is my problem. I was trying to make a custom iterator on my node data to only list nodes that match a condition (the condition being "is not deleted" because I'm not removing a deleted node from the vector each time as it forces me to change all the indexes referred by some nodes)

This iterator would allow the user to make all sorts of operations on the nodes, and thus be a mutable reference iterator that allows modification of the value in the vector

This is a small reproduction of the faced problem :

struct MyStruct {
    data: Vec<i32>,
    curr: usize,
}

// Iterator of all pair numbers
impl Iterator for MyStruct {
    type Item = i32;
    fn next(&mut self) -> Option<Self::Item> {
        if self.curr >= self.data.len() {
            None
        } else if self.data[self.curr] % 2 == 0 {
            self.curr += 1;
            Some(self.data[self.curr - 1])
        } else {
            self.curr += 1;
            self.next()
        }
    }
}

fn main() {
    let mut my_struct = MyStruct{data: (0..10).collect(), curr: 0};

    for i in &mut my_struct.data {
        *i = 0; // Works
    }
    for i in &mut my_struct {
        *i = 0; // Do not compile
    }
}

I saw and understand the problem here
Then what would be the cleanest approach ?
Creating an unsafe block because I know the operation is valid and do not present any overlap, thus creating something close to the real vector iterator
Or return a vector of valid indexes, which is less user friendly but does not require unsafe blocks

Thanks !

Also I'm new to this forum, if something is not right let me know :smile:

A more general principle: don't build your application data structures out of references, mutable or not. As a beginner, you should think of references as being primarily for the purposes of passing values to and from functions, not for building data structures. When you are starting out, any time you find yourself writing struct MyStruct<'a>, you should conclude that you are making a mistake. There are exceptions, but this should be considered an advanced topic until you are more comfortable with the typical ways to use references.

Instead, data structures should always own their contents.

Don't implement Iterator on your data structures. The only kind of iterator you can sometimes build that way is a consuming iterator — one where the data structure is gone afterward. In order to build an iterator over references, the thing being referred to (the contents of MyStruct) needs to be separate from the iterator itself, so that mutating the iterator doesn't invalidate the references to the items.

Instead you should define separate iterator structs, and then use those to implement IntoIterator for your structure or a reference to it. This is how the mutable iterator would look:

struct MyMutIter<'a> { ... }

impl Iterator for MyMutIter {
    type Item = &'a mut i32;
    fn next(&mut self) ...
}

impl<'a> IntoIterator for &'a mut MyStruct {
    type Item = &'a mut i32;
    type IntoIter = MyMutIter<'a>;
    fn into_iter(self) -> MyMutIter<'a> {
        MyMutIter { ... }
    }
}

// The other two common `IntoIterator` implementations are:
// Consuming
impl IntoIterator for &'a mut MyStruct {
    type Item = i32;
    ...
}
// Immutable reference
impl<'a> IntoIterator for &'a MyStruct {
    type Item = &'a i32;
    ...
}

Then what would be the cleanest approach ?

Since your iterator

  • is based on a Vec, and
  • only steps forward,

you can build on top of the existing mutable iterator for slices, std::slice::IterMut. Start with my_struct.data.iter_mut(), put that in a field of your iterator struct, and call its next() instead of maintaining an index with self.curr += 1;. This way, you can make use of the standard library's existing unsafe code[1] instead of having to write your own.


  1. Technically, it's also possible to build a slice iterator out of purely safe code with careful use of pattern matching. But that's a trick that doesn't generalize to other data structures. ↩︎

12 Likes

Thanks for your reply ! I'll try this.

I have just one more question when you say

Where was this principle broken on my try ?
Thanks !

I did misread your original statement a bit; on reconsideration, perhaps I should have said “own and not borrow” instead. “Don't include any references as part of the internal structure of your data structures.” This:

each node had a mutable reference on the other nodes

will always lead to failure, even if you don't have multiple mutable references involved, and even if the references aren't mutable. You can't (normally) make a data structure that has a reference from one part of it to another part, or rather, if you do, you will find that the data structure becomes unusable. This is because the lifetime system expects the 'a in MyStruct<'a> to refer to data which outlives the structure — which is valid, unmoved, and unmodified for as long as the MyStruct exists. So if you try to force such a lifetime to point into the structure itself, computing the consequences of that leads to the conclusion “this structure is borrowed by itself for the rest of its existence”, which makes it unmovable, immutable, and often not droppable.

6 Likes

Understood, thanks a lot !