How does Vec's iterator return a mutable reference?

How does Vec's iterator manage to return a mutable reference? That ought to be prohibited, right?

Why should it be prohibited?

1 Like

Because if the iterator returns mutable references to the same object more than once, you've violated Rust's single mutable access rule. Vec's iterators do not do that, but if you could write your own mutable-ref-returning iterators, you could create ones that do.

1 Like

Writing a mutable iterator often requires unsafe code for this reason, and then the programmer must ensure that the aliasing requirements are upheld (since the compiler can't do it for you).

For example, in one of my old crates, the immutable iterator is implemented in safe Rust, but the nearly-identical mutable iterator requires an unsafe block.

(As for Vec, all of its iterators are implemented using raw pointers, so they require unsafe anyways.)

7 Likes

No, the elements don't overlap. You can't e.g. .push() onto the end of a vector through a mutable iterator to its elements, you can only mutate individual elements, which are distinct objects. The implementation of the iterator guarantees that it will only ever yield references to distinct elements, so this is fine.

No, that's the point: this is not possible in safe code. To demonstrate that, you could write your own iterator, which doesn't compile for this exact reason:

struct MyIterMut<'a, T> {
    items: &'a mut [T],
}

impl<'a, T> Iterator for MyIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        if let Some((item, next)) = self.items.split_first_mut() {
            self.items = next;
            Some(item)
        } else {
            None
        }
    }
}

And the error is exactly what one would expect:

error[E0495]: cannot infer an appropriate lifetime for autoref due to conflicting requirements
  --> src/lib.rs:9:48
   |
9  |         if let Some((item, next)) = self.items.split_first_mut() {
   |                                                ^^^^^^^^^^^^^^^
   |
note: first, the lifetime cannot outlive the anonymous lifetime defined on the method body at 8:13...
  --> src/lib.rs:8:13
   |
8  |     fn next(&mut self) -> Option<Self::Item> {
   |             ^^^^^^^^^
note: ...so that reference does not outlive borrowed content
  --> src/lib.rs:9:37
   |
9  |         if let Some((item, next)) = self.items.split_first_mut() {
   |                                     ^^^^^^^^^^

So basically, the compiler is saying that it can't prove that the references being returned do not overlap, which is correct. You could circumvent this check using unsafe code, similar to what <[T]>::split_first_mut() does internally.

3 Likes

That's about what I figured. If you try to write a mutable iterator, you create a double mutable borrow, which is detected by the borrow checker. I wondered if there was some trick to do what Vec's iterator does without unsafe code. The answer appears to be no, as expected. Thanks.

This also comes up with chunk_mut, which breaks an array into disjoint chunks. That, too, needs unsafe code.

The problem here is that split_first_mut() is called on self.items which doesn't have an 'a lifetime due to being accessed through the &mut self parameter of next. However you can fix fix this code by first taking out items from self.items (you can replace a &mut [] which is 'static due to static rvalue promotion), splitting it and then putting back only next. For instance, this compiles and works fine:

struct MyIterMut<'a, T> {
    items: &'a mut [T],
}

impl<'a, T> Iterator for MyIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        let items = std::mem::replace(&mut self.items, &mut []);
        if let Some((item, next)) = items.split_first_mut() {
            self.items = next;
            Some(item)
        } else {
            None
        }
    }
}
12 Likes

Also, this is false, <[T]>::split_first_mut() doesn't use any unsafe internally, it uses pattern patching. You can replace if let Some((item, next)) = items.split_first_mut() with if let [item, next @ ..] = items and nothing would change.

2 Likes

Ah that's interesting. However, before that kind of pattern was introduced, it did use unsafe.

I did not know about that feature. I'm amazed that it works for mutable refs. Have to think about the implications of that.

It used to use unsafe when slice patterns weren't a thing.

Another way to remember this is that collect() is valid for all iterators, so any valid implementation of Iterator must allow all the results to coexist with each other. This is trivially true if the iterator is written in entirely safe code, but must be enforced manually if unsafe is involved.

And, because the Iterator trait isn't marked as unsafe, this manual enforcement can't rely on any particular user behavior. i.e. "Don't do that" isn't good enough protection.

1 Like

The problem at hand is, I believe, called Splitting borrows and the Rustonomicon has a chapter on those. It's worth reading, although it only deals with cases where a mutable iterator can actually be implemented safely. Which is not the case for things like Vec.

I'm pretty sure that slice patterns are enough to implement a mutable Vec iterator safely:

struct SliceIter<'a,T> {
    slice: &'a mut [T]
}

impl<'a,T> Iterator for SliceIter<'a,T> {
    type Item = &'a mut T;
    fn next(&mut self)->Option<&'a mut T> {
        let slice = std::mem::replace(&mut self.slice, &mut []);
        match slice {
            [] => None,
            [x, rest @ .. ] => {
                self.slice = rest;
                Some(x)
            }
        }
    }
}

(Rust Playground)

Edit: This turns out to be almost a duplicate of @SkiFire13 's solution above.

2 Likes

Splitting borrows and the Rustonomicon has a chapter on those

Which says: "In order to "teach" borrowck that what we're doing is ok, we need to drop down to unsafe code." Does that pre-date slice patterns going into the language?

In the example above, is it the slice pattern that's allowing the split borrow, or is it std::mem::replace? That function is a raw pointer operation using unsafe code that hides the connection between argument and result from the borrow checker. "Replace can be used to disassociate the original value at that index from self , allowing it to be returned:"
Can you write the above without std::mem::replace?

Both std::mem::replace (or swap) and splitting is necessary to do it. If you lack any one of them, it isn't possible.

1 Like

I'm not sure if this is a fair characterization of replace. Regardless of how it's implemented, it's completely safe to use and doesn't destroy any lifetime information. It's not hiding anything, but rather performing an extra step which is necessary for the borrow check to work.

Fundamentally, what we need to do is construct an &'a mut [T] that contains the unyielded values. To do that, we need an &'a mut [T] that can be split into two parts, but what we actually have is &'_ mut &'a mut [T].

We can't just strip the outer &'_ mut because of aliasing rules. We could save the resulting &'a mut [T] somewhere else, and then we'd have two different exclusive references to the same data: the original one and our copy.

Instead, we destroy the original by replacing it with a reference to the empty slice. At that point, our copy is the only extant reference and we can do what we like with it. In this case, that's splitting off one element and then putting the rest back.

1 Like

It turns out that Option::take() is a viable alternative to the std::mem functions here:

struct SliceIter<'a,T> {
    slice: Option<&'a mut [T]>
}

impl<'a,T> Iterator for SliceIter<'a,T> {
    type Item = &'a mut T;
    fn next(&mut self)->Option<&'a mut T> {
        match self.slice.take() {
            Some([]) | None => None,
            Some([x, rest @ .. ]) => {
                self.slice = Some(rest);
                Some(x)
            }
        }
    }
}

(Rust Playground)

Yes, but Option::take is implemented using std::mem::take, which is implemented using std::mem::replace.

4 Likes

Right. Underneath "take" is unsafe code in std::mem.

let result = ptr::read(dest);
ptr::write(dest, src);
result

After you've done a split like that, have you broken the destructor, as would happen with "forget"?