Implement mutable iterator

This code compiles fine:

struct Iter<'a, T> {
    shards: &'a [T],
    next: usize,
}

impl<'a, T> Iterator for Iter<'a, T> {
    type Item = &'a T;
    fn next(&mut self) -> Option<Self::Item> {
        if self.next < self.shards.len() {
            self.next += 1;
            Some(&self.shards[self.next - 1])
        } else {
            None
        }
    }
}

However if I add mut to &'a [T] (also respectively to &'a T and &self.shards[...]), it won't compile:
Please see this link to rust playground

How can I make the mutable version compile?

Note that this is only a minimal reproducible example of the actual code. The actual code is not doing self.next += 1 but using a different ordering which can jump back and forth (but visits each index at most once). Therefore using Slice::IterMut or things like that is not an option.

The problem is: how can the compiler know that? This is one of those things you generally prove yourself and then use unsafe to implement.

An alternative is to not use Iterator, and instead provide a fn next(&mut self) -> &mut T method that ties the lifetime of the returned reference to self. Note that this won't allow having references to multiple items at the same time.

2 Likes

This is one of those things you generally prove yourself and then use unsafe to implement.

What would the code look like in that case?

One way would be something like this (NB: I don't know if this is applicable to your real use case or not)

struct Iter<'a, T> {
    shards: &'a mut [T],
}

impl<'a, T> Iterator for Iter<'a, T> {
    type Item = &'a mut T;
    fn next(&mut self) -> Option<Self::Item> {
        let shards = std::mem::replace(&mut self.shards, &mut []);
        match shards {
            [next, rest @ ..] => {
                self.shards = rest;
                Some(next)
            },
            _ => None
        }
    }
}
1 Like

If you're willing to use some heap allocation, you could write something like this:

struct Iter<'a, T> {
    shards: Vec<Option<&'a mut T>>,
    next: usize,
}

impl<'a, T> From<&'a mut [T]> for Iter<'a, T> {
    fn from(arr: &'a mut [T])->Iter<'a,T> {
        Iter {
            shards: arr.into_iter().map(Some).collect(),
            next: 0
        }
    }
}

impl<'a, T> Iterator for Iter<'a, T> {
    type Item = &'a mut T;
    fn next(&mut self) -> Option<Self::Item> {
        if self.next < self.shards.len() {
            self.next += 1;
            Some(self.shards[self.next - 1].take().expect("Visited an item twice!"))
        } else {
            None
        }
    }
}

And, for good measure, an unsafe version. You should probably avoid this unless my previous version proves too expensive for your use case.

struct Iter<'a, T> {
    shards: *mut T,
    next: usize,
    len: usize,
    phantom: std::marker::PhantomData<&'a mut [T]>
}

impl<'a, T> From<&'a mut [T]> for Iter<'a, T> {
    fn from(shards: &'a mut [T])->Iter<'a, T> {
        Iter {
            next: 0,
            len: shards.len(),
            shards: shards.as_mut_ptr(),
            phantom: std::marker::PhantomData
        }
    }
}

impl<'a, T> Iterator for Iter<'a, T> {
    type Item = &'a mut T;
    fn next(&mut self) -> Option<Self::Item> {
        if self.next < self.len {
            self.next += 1;
            unsafe {
                // Safety: `next` is always in the range [0,len) and never takes
                //         the same value twice, so the results will never alias
                Some(&mut *self.shards.offset(self.next as isize - 1))
            }
        } else {
            None
        }
    }
}

@2e71828 I took your Vec<Option<>> example and removed the Option part just to see what it would do:

struct Iter<'a, T> {
    shards: Vec<&'a mut T>,
    next: usize,
}

impl<'a, T> Iterator for Iter<'a, T> {
    type Item = &'a mut T;
    fn next(&mut self) -> Option<Self::Item> {
        if self.next < self.shards.len() {
            self.next += 1;
            Some(self.shards[self.next - 1])
        } else {
            None
        }
    }
}

and I instantly got the same error: error[E0495]: cannot infer an appropriate lifetime for lifetime parameter in function call due to conflicting requirements

What is it about the Option that satisfies the borrow checker? I feel like I'm missing something pretty basic.

In your snippet without the Option, you are trying to copy a mutable reference by returning it. Obviously, that's not allowed, because mutable references need to be exclusive. Option::take() moves the inner value out of the vector (by replacing the Some with None) instead of attempting to erroneously duplicate it.

2 Likes

Also note that Option isn’t particularly special here— Vec::remove and Vec::swap_remove will do a similar job, but will also change the indices of the remaining items.

1 Like

It's not necessarily required to use raw pointers here; something like &'a [UnsafeCell<T>] should work fine, too; or perhaps even &'a [Cell<T>] using Cell::as_ptr (the latter is easier to create from a &mut [T], as there's existing safe API Cell::from_mut and as_slice_of_cells).

1 Like

And while we are at it, one could reverse the order of elements in the vector and pop from it as well:

struct Iter<'a, T> {
    shards: Vec<&'a mut T>,
}

impl<'a, T> From<&'a mut [T]> for Iter<'a, T> {
    fn from(arr: &'a mut [T]) -> Iter<'a, T> {
        Iter {
            shards: arr.into_iter().rev().collect(),
        }
    }
}

impl<'a, T> Iterator for Iter<'a, T> {
    type Item = &'a mut T;
    
    fn next(&mut self) -> Option<Self::Item> {
        self.shards.pop()
    }
}

These don't seem to implement Sync, so I wouldn't be able to use them.
Any other suggestions?

Raw pointers aren't Sync, either.

1 Like

Can you point me to a source for that?

My understanding was that if you are using raw pointers you are relying on your own code logic that all the accesses are safe.

See The Rustonomicon § 8.2

In some cases, it is sound to implement Send and/or Sync for your type manually even if it contains non-Send or non-Sync components, but I'm not familiar enough with Rust's thread safety to say what those cases are.

My understanding was that if you are using raw pointers you are relying on your own code logic that all the accesses are safe.

Yes, and Rust is designed to encourage you to take a moment to consider whether your pointer (or UnsafeCell) uses are safe in the presence of threads, and write an unsafe impl Send for MyType {} only if they are. The default assumption is that if a type contains * or UnsafeCell it is not thread safe.

2 Likes

Yes, hence they aren't Sync or Send, so that you have to think twice before manually writing unsafe impl Sync for Foo {} for your raw-pointer wrapping type. This is the same in the case of UnsafeCell, too – raw pointers are also interior mutability primitives.

2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.