Mutable Iterator and Lifetime

Hello,

I am trying to write a mutable iterator over a u8 array. And cannot make It work. I have read that I need unsafe for that here.

Here the immutable version. Was not too hard to achieve. playground

Sub question : If you have ressource on the meaning of the lifetime param over the impl and the Slice I would love to know what they precisely meant.

And here the mutable one


struct SliceOption {
    start  : usize,
    end    : usize,
    stride : usize
}

struct Slice<'a> {
	idx    : usize,
	option : &'a SliceOption,
    data   : &'a Vec<u8>
}

impl<'a> Slice<'a> {
	fn for_array(array: &'a Vec<u8>, option: &'a SliceOption) -> Slice<'a> {
		Slice {
			idx    : option.start,
			option : option,
			data   : array
		}
	}
}

impl<'a> Iterator for Slice<'a> {
    type Item = &'a mut u8;

    fn next(&mut self) -> Option<& mut u8> {

    	if self.idx >= self.option.end { return None }
    	let index = self.idx;
    	self.idx += self.option.stride; 

        Some(&mut self.data[index])
    }

}

fn main() {

	let numbers = vec![0u8,1,2,3,4,5,6,7,8,9];

	let option = SliceOption{
		start:0,
		end:10,
		stride:3
	};

	for i in Slice::for_array(&numbers,&option)
	{
		*i = 0;
	}

	println!("{:?}",numbers );
}

Which does not compile (obviously)

Don't think of it as a mutable reference, or immutable reference. Instead think of it in terms of unique and shared references respectively.

https://limpet.net/mbrubeck/2019/02/07/rust-a-unique-perspective.html

The reason why making an iterator that yields unique references is harder is because it is harder to prove that you are yielding unique references.

let x = iter.next().unwrap();
let y = iter.next().unwrap();
// here x and y can *never* alias

This is very hard to prove in general, and this problem just doesn't exist with shared references, because well, they are shared. It doesn't matter if they alias.

It will be a lot easier to prove if you use actual slices instead of &Vec<u8> because of the nice functions slices provide, for example: split_first_mut.

Also you can never convert a &T to a &mut T, ever. No one is special enough to do this :slight_smile:. (Note: You can go from &Type<T> to &mut Tvia interior mutability, and you must always go through aUnsafeCell`, but that isn't relavent for iterators).

Here is a simple implementation for a slice iterator,

pub struct SliceIterMut<'a, T> {
    slice: &'a mut [T],
}

impl<'a, T> Iterator for SliceIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        let (first, rest) = std::mem::take(&mut self.slice).split_first_mut()?;
        self.slice = rest;
        Some(first)
    }
}

But why not use the std iterator?

fn main() {
	let numbers = vec![0u8,1,2,3,4,5,6,7,8,9];

	for i in numbers[0..10].iter_mut().step_by(3)
	{
		*i = 0;
	}

	println!("{:?}",numbers );
}

Edit: Fixed according to @alice's correction

1 Like

It wont compile if you leave the slice in self.slice. One option is to use mem::take to replace the slice with an empty slice before splitting:

impl<'a, T> Iterator for SliceIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        let (first, rest) = mem::take(&mut self.slice).split_first_mut()?;
        self.slice = rest;
        Some(first)
    }
}

playground

1 Like

Okay, let's do this error-by-error.

First, we have this error:

error[E0495]: cannot infer an appropriate lifetime for lifetime parameter in generic type due to conflicting requirements
  --> src/main.rs:26:5
   |
26 |     fn next(&mut self) -> Option<&mut u8> {
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the method body at 26:5...
  --> src/main.rs:26:5
   |
26 | /     fn next(&mut self) -> Option<&mut u8> {
27 | |         if self.idx >= self.option.end {
28 | |             return None;
29 | |         }
...  |
33 | |         Some(&mut self.data[index])
34 | |     }
   | |_____^
   = note: ...so that the method type is compatible with trait:
           expected fn(&mut Slice<'a>) -> std::option::Option<&mut u8>
              found fn(&mut Slice<'a>) -> std::option::Option<&mut u8>
note: but, the lifetime must be valid for the lifetime `'a` as defined on the impl at 23:6...
  --> src/main.rs:23:6
   |
23 | impl<'a> Iterator for Slice<'a> {
   |      ^^
   = note: ...so that the types are compatible:
           expected std::iter::Iterator
              found std::iter::Iterator

This is kinda verbose (honestly kinda a terrible error message), but it essentially says that our lifetime shouldn't outlive the lifetime of Option<&mut u8>. Hm, doesn't something seem odd about about Option<&mut u8> here. What is the item type again?

    type Item = &'a mut u8;

    fn next(&mut self) -> Option<&mut u8> {

This seems different. Let's correct that.

    type Item = &'a mut u8;

    fn next(&mut self) -> Option<&'a mut u8> {

Then, we get another error.

error[E0596]: cannot borrow `*self.data` as mutable, as it is behind a `&` reference
  --> src/main.rs:33:19
   |
10 |     data: &'a Vec<u8>,
   |           ----------- help: consider changing this to be mutable: `&'a mut Vec<u8>`
...
33 |         Some(&mut self.data[index])
   |                   ^^^^^^^^^ cannot borrow as mutable

Okay, just make it mutable, easy. The compiler even says how to do this.

error[E0308]: mismatched types
  --> src/main.rs:18:19
   |
18 |             data: array,
   |                   ^^^^^ types differ in mutability
   |
   = note: expected type `&mut std::vec::Vec<u8>`
              found type `&'a std::vec::Vec<u8>`

More mutability changes.

error[E0495]: cannot infer an appropriate lifetime for lifetime parameter in function call due to conflicting requirements
  --> src/main.rs:33:19
   |
33 |         Some(&mut self.data[index])
   |                   ^^^^^^^^^^^^^^^^
   |
note: first, the lifetime cannot outlive the anonymous lifetime #1 defined on the method body at 26:5...
  --> src/main.rs:26:5
   |
26 | /     fn next(&mut self) -> Option<&'a mut u8> {
27 | |         if self.idx >= self.option.end {
28 | |             return None;
29 | |         }
...  |
33 | |         Some(&mut self.data[index])
34 | |     }
   | |_____^
note: ...so that reference does not outlive borrowed content
  --> src/main.rs:33:19
   |
33 |         Some(&mut self.data[index])
   |                   ^^^^^^^^^
note: but, the lifetime must be valid for the lifetime `'a` as defined on the impl at 23:6...
  --> src/main.rs:23:6
   |
23 | impl<'a> Iterator for Slice<'a> {
   |      ^^
   = note: ...so that the expression is assignable:
           expected std::option::Option<&'a mut u8>
              found std::option::Option<&mut u8>

This is where things become interesting. You see, &mut references are exclusive. If you hold a mutable reference to something, you cannot have other references to the same memory location. The error message says that method next borrows the entire struct, when we only wanted to borrow 'a.

In fact, the compiler is correct here, if the same index would be passed multiple times, it would violate Rust memory model. In fact, if stride was 0, this would violate the memory model.

How do we avoid borrowing the entire structure? One way out is to move data field instead. If you move, instead of borrowing, you avoid the need to borrow the entire structure. For instance, doing the following could work. This isn't quite correct, but will let us continue further.

Some(&mut std::mem::take(&mut self.data)[index])

Okay, cool, let's compile again.

error[E0277]: the trait bound `&'a mut std::vec::Vec<u8>: std::default::Default` is not satisfied
   --> src/main.rs:33:34
    |
33  |         Some(&mut std::mem::take(&mut self.data)[index])
    |                                  -^^^^^^^^^^^^^
    |                                  |
    |                                  the trait `std::default::Default` is not implemented for `&'a mut std::vec::Vec<u8>`
    |                                  help: consider removing 1 leading `&`-references
    |
    = help: the following implementations were found:
              <std::vec::Vec<T> as std::default::Default>

Hm, yeah, cannot take &mut Vec<u8>. There is no value that could be put back instead. However, we don't really need vectors here, slices should be sufficient, and you can use std::mem::take on &mut [u8]. We can replace our usages of Vec<u8> with [u8].

struct Slice<'a> {
    idx: usize,
    option: &'a SliceOption,
    data: &'a mut [u8],
}

impl<'a> Slice<'a> {
    fn for_array(array: &'a mut [u8], option: &'a SliceOption) -> Slice<'a> {
        Slice {
            idx: option.start,
            option: option,
            data: array,
        }
    }
}

Okay, let's compile this and check if it works.

error[E0308]: mismatched types
  --> src/main.rs:46:31
   |
46 |     for i in Slice::for_array(&numbers, &option) {
   |                               ^^^^^^^^ types differ in mutability
   |
   = note: expected type `&mut [u8]`
              found type `&std::vec::Vec<u8>`

Okay, so we have a shared borrow here, let's make it mutable.

for i in Slice::for_array(&mut numbers, &option) {

Compile again, and...

error[E0596]: cannot borrow `numbers` as mutable, as it is not declared as mutable
  --> src/main.rs:47:31
   |
39 |     let numbers = vec![0u8, 1, 2, 3, 4, 5, 6, 7, 8, 9];
   |         ------- help: consider changing this to be mutable: `mut numbers`
...
47 |     for i in Slice::for_array(&mut numbers, &option) {
   |                               ^^^^^^^^^^^^ cannot borrow as mutable

Sure, let's put mut as compiler says.

let mut numbers = vec![0u8, 1, 2, 3, 4, 5, 6, 7, 8, 9];

Does this compile? Surely there is something else to fix.

Right?

Well, yes, but the program compiles now. The program fails at runtime, as expected, as std::mem::take isn't quite a correct fix, replacing array in the structure with an empty array. Sure, you can get the first element, but after the first element, the program will crash.

thread 'main' panicked at 'index out of bounds: the len is 0 but the index is 3', src/main.rs:34:19
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

But now that we are using slices, can we, say, partially move the slice out? The progression always moves forward, so we only can move part of a slice. Let's check out the documentation at https://doc.rust-lang.org/std/primitive.slice.html for any interesting methods. split_at_mut sounds particularly interesting.

pub fn split_at_mut(&mut self, mid: usize) -> (&mut [T], &mut [T]) { ... }

This returns two references, so we can move one out, and assign the other one back in. Because we need to correct indexes after using split_at_mut, we also want to modify Slice to own SliceOption instead of borrowing it. All fields are copy-able, so there is no real need to borrow in this case.

#[derive(Copy, Clone)]
struct SliceOption {
    start: usize,
    end: usize,
    stride: usize,
}

struct Slice<'a> {
    idx: usize,
    option: SliceOption,
    data: &'a mut [u8],
}

impl<'a> Slice<'a> {
    fn for_array(array: &'a mut [u8], option: &'a SliceOption) -> Slice<'a> {
        Slice {
            idx: option.start,
            option: *option,
            data: array,
        }
    }
}

impl<'a> Iterator for Slice<'a> {
    type Item = &'a mut u8;

    fn next(&mut self) -> Option<&'a mut u8> {
        if self.idx >= self.option.end {
            return None;
        }
        let index = self.idx;
        self.idx += self.option.stride;

        let (a, b) = std::mem::take(&mut self.data).split_at_mut(index + 1);
        self.idx -= a.len();
        self.option.end -= a.len();
        self.data = b;
        Some(&mut a[index])
    }
}

At this point, this program works. Can there be more done. Yes. SliceOption's start and end seem hardly idiomatic, when Rust slices can be also used a represent... well, slices. We can easily remove those.

#[derive(Copy, Clone)]
struct SliceOption {
    stride: usize,
}

struct Slice<'a> {
    idx: usize,
    option: SliceOption,
    data: &'a mut [u8],
}

impl<'a> Slice<'a> {
    fn for_array(array: &'a mut [u8], option: &'a SliceOption) -> Slice<'a> {
        Slice {
            idx: 0,
            option: *option,
            data: array,
        }
    }
}

impl<'a> Iterator for Slice<'a> {
    type Item = &'a mut u8;

    fn next(&mut self) -> Option<&'a mut u8> {
        if self.idx >= self.data.len() {
            return None;
        }
        let index = self.idx;
        self.idx += self.option.stride;

        let (a, b) = std::mem::take(&mut self.data).split_at_mut(index + 1);
        self.idx -= a.len();
        self.data = b;
        Some(&mut a[index])
    }
}

fn main() {
    let mut numbers = vec![0u8, 1, 2, 3, 4, 5, 6, 7, 8, 9];

    let option = SliceOption {
        stride: 3,
    };

    for i in Slice::for_array(&mut numbers[0..10], &option) {
        *i = 0;
    }

    println!("{:?}", numbers);
}

Also, we don't really need SliceOption for a single parameter.

struct Slice<'a> {
    idx: usize,
    stride: usize,
    data: &'a mut [u8],
}

impl<'a> Slice<'a> {
    fn for_array(array: &'a mut [u8], stride: usize) -> Slice<'a> {
        Slice {
            idx: 0,
            stride,
            data: array,
        }
    }
}

impl<'a> Iterator for Slice<'a> {
    type Item = &'a mut u8;

    fn next(&mut self) -> Option<&'a mut u8> {
        if self.idx >= self.data.len() {
            return None;
        }
        let index = self.idx;
        self.idx += self.stride;

        let (a, b) = std::mem::take(&mut self.data).split_at_mut(index + 1);
        self.idx -= a.len();
        self.data = b;
        Some(&mut a[index])
    }
}

fn main() {
    let mut numbers = vec![0u8, 1, 2, 3, 4, 5, 6, 7, 8, 9];

    for i in Slice::for_array(&mut numbers[0..10], 3) {
        *i = 0;
    }

    println!("{:?}", numbers);
}

I don't think idx is necessary either. I mean, it's going to be either 0 or stride - 1 at this point. We can make it so that first position of a slice is always a returned value to make it consistently 0 and therefore make this field useless.

struct Slice<'a> {
    stride: usize,
    data: &'a mut [u8],
}

impl<'a> Slice<'a> {
    fn for_array(array: &'a mut [u8], stride: usize) -> Slice<'a> {
        Slice {
            stride,
            data: array,
        }
    }
}

impl<'a> Iterator for Slice<'a> {
    type Item = &'a mut u8;

    fn next(&mut self) -> Option<&'a mut u8> {
        if self.data.is_empty() {
            return None;
        }
        let (a, b) = std::mem::take(&mut self.data).split_at_mut(1);
        self.data = b.get_mut(self.stride - 1..).unwrap_or(&mut []);
        Some(&mut a[0])
    }
}

As it happens, Rust provides a convenience function called split_first_mut, so no reason to use split_at_mut(1). It returns an optional, so we don't need to check whether self.data is empty, we can simply use ? operator to return None if it returns None.

impl<'a> Iterator for Slice<'a> {
    type Item = &'a mut u8;

    fn next(&mut self) -> Option<&'a mut u8> {
        let (a, b) = std::mem::take(&mut self.data).split_first_mut()?;
        self.data = b.get_mut(self.stride - 1..).unwrap_or(&mut []);
        Some(a)
    }
}

Are we done yet? No, as it happens. We don't need the Slice either, as Rust provides step_by method for its iterators. We could have done it from the start, but then there wouldn't be a learning exercise, would there be?

fn main() {
    let mut numbers = vec![0u8, 1, 2, 3, 4, 5, 6, 7, 8, 9];

    for i in numbers[0..10].iter_mut().step_by(3) {
        *i = 0;
    }

    println!("{:?}", numbers);
}

You may be concerned about performance of step_by, but as it happens, step_by isn't quite naive. It doesn't call next() multiple times, but rather calls nth method of Iterator which for slice iterators (such as one returned by iter_mut method of slice) moves the pointer by n elements, which is really cheap.

5 Likes

Oh I did not see this issue about uniqueness... Thank you for the ressources. Very interesting

My main objective is to learn, so rebuild the wheel is very important to me.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.