Split mutable iterator over composite structure

ettersi · April 12, 2022, 4:34am

Does stable and safe Rust provide some means to split Iterator<Item = &mut (A,B)> into (Iterator<Item = &mut A>, Iterator<Item = &mut B>)?

Basically I want something like this to work:

let mut v: Vec<(u32,u32)> = vec![(6,6),(6,6)];
let (iter0, iter1) = v.iter_mut().some_magic_function();
iter0.for_each(|x| x = 0);
iter1.for_each(|x| x = 1);
assert_eq!(v, vec![(0,1),(0,1)]);

Put differently, what I am asking for is whether there is a way to lift splitting borrows from operating on structs to operating on iterators of structs.

scottmcm · April 12, 2022, 6:02am

No, because iterators are lazy and this can't be done lazily -- in your example, iterating iter0 would have to advance the iterator, at which point iter1 wouldn't be able to look at those items any more.

So either:

you do it with allocations instead, aka https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.unzip
you call iter_mut twice (and restructure code as needed to get that to pass the borrow checker)

ettersi · April 12, 2022, 6:12am

I'm happy to call iter_mut() twice, but I want both iterators to be alive at the same time so I can update the values concurrently. I believe this is not possible using only safe Rust, so my question is whether the standard library already has some carefully checked unsafe code somewhere which I can leverage to get what I want.

alice · April 12, 2022, 7:04am

The standard library doesn't have any unsafe code for this use-case.

H2CO3 · April 12, 2022, 7:30am

It seems like this is a representation problem, not an algorithm one? (As it often is the case.) If your use case calls for updating the tuple elements in parallel as the primary means of speeding it up, then you should be storing them in two separate arrays to begin with, i.e. transpose Vec<(u32, u32)> to (Vec<u32>, Vec<u32>), and it will be easy enough to reconstruct an iterator-of-pairs by zipping them when needed.

ettersi · April 12, 2022, 7:46am

Yeah, I'm aware that reorganising my data structure as you suggested would solve this particular problem. I'm still a bit reluctant to do so, however, because for most other purposes having a vec-of-structs rather than a struct-of-vecs is just much more convenient. In Julia, there's StructArrays.jl to switch back and forth between these representations. Is there a similar crate in Rust?

H2CO3 · April 12, 2022, 7:51am

If you often need to access data in both representations, then you are basically out of luck. No matter which one you choose, some operations will be more convenient and faster, and others will be more laborious and slower. If you are primarily looking for convenience, then you could give the "data frame" abstraction a go (e.g. polars::DataFrame::transpose()).

alice · April 12, 2022, 7:53am

It would be possible to write your own unsafe code to implement the iterators. Are you interested in such a solution?

ettersi · April 12, 2022, 8:01am

Possibly, but for the moment I'm thinking about writing a macro which would transform

#[to_struct_of_vecs]
struct S {
    a: A,
    b: B,
    ...
}

into

struct S {
    a: A,
    b: B,
    ...
}

struct VecOfS {
    a: Vec<A>,
    b: Vec<B>,
    ...
}

impl VecOfS {
     pub fn push(&mut self, s: S) {
         self.a.push(s.a);
         self.b.push(s.b);
         ...
     }

     pub fn get(&self, i: usize) -> S {
         return S { a: self.a[i], b: self.b[i], ...};
     }

     pub fn vecs(&mut self) -> (VecOfMutRefs<A>, VecOfMutRefs<B>, ...) {
        return (
              VecOfMutRefs::new(&mut self.a), 
              VecOfMutRefs::new(&mut self.b), 
              ....
        );
    }

    // Many other methods...
}

Not sure whether I have the time budget for that, though.

2e71828 · April 12, 2022, 8:25am

It's possible without unsafe, if you need it to work for arbitrary iterators¹ :

use std::cell::RefCell;
use std::collections::VecDeque;
use std::rc::Rc;

pub fn split_mut_iter<'it, A: 'it, B: 'it, I>(
    iter: I,
) -> (
    impl Iterator<Item = &'it mut A>,
    impl Iterator<Item = &'it mut B>,
)
where
    I: Iterator<Item = &'it mut (A, B)>,
{
    split_iter(iter.map(|&mut (ref mut a, ref mut b)| (a, b)))
}

pub fn split_iter<A, B, I>(iter: I) -> (impl Iterator<Item = A>, impl Iterator<Item = B>)
where
    I: Iterator<Item = (A, B)>,
{
    let state = Rc::new(RefCell::new(IterState::new(iter)));
    (AIterator(Rc::clone(&state)), BIterator(state))
}

struct IterState<A, B, I: ?Sized> {
    pending_a: VecDeque<A>,
    pending_b: VecDeque<B>,
    source: I,
}

impl<A, B, I> IterState<A, B, I>
where
    I: ?Sized + Iterator<Item = (A, B)>,
{
    pub fn new(iter: I) -> Self
    where
        I: Sized,
    {
        IterState {
            pending_a: VecDeque::default(),
            pending_b: VecDeque::default(),
            source: iter,
        }
    }

    fn pump_source(&mut self) -> Option<()> {
        let (a, b) = self.source.next()?;
        self.pending_a.push_back(a);
        self.pending_b.push_back(b);
        Some(())
    }

    pub fn next_a(&mut self) -> Option<A> {
        if self.pending_a.len() == 0 {
            self.pump_source();
        }
        self.pending_a.pop_front()
    }

    pub fn next_b(&mut self) -> Option<B> {
        if self.pending_b.len() == 0 {
            self.pump_source();
        }
        self.pending_b.pop_front()
    }
}

struct AIterator<I: ?Sized>(Rc<RefCell<I>>);
struct BIterator<I: ?Sized>(Rc<RefCell<I>>);

impl<A, B, I: ?Sized> Iterator for AIterator<IterState<A, B, I>>
where
    I: Iterator<Item = (A, B)>,
{
    type Item = A;
    fn next(&mut self) -> Option<A> {
        self.0.borrow_mut().next_a()
    }
}

impl<A, B, I: ?Sized> Iterator for BIterator<IterState<A, B, I>>
where
    I: Iterator<Item = (A, B)>,
{
    type Item = B;
    fn next(&mut self) -> Option<B> {
        self.0.borrow_mut().next_b()
    }
}

¹ If you're always iterating over Vecs, then an unsafe implementation will be able to avoid the heap allocations of this version.

alice · April 12, 2022, 9:41am

Right, that's similar to how unzip works. The unsafe is only necessary to avoid the allocations in the VecDeques you have.

SkiFire13 · April 12, 2022, 9:58am

Have you considered using the soa_derive crate?

ettersi · April 12, 2022, 10:04am

Precisely what I was looking for!

system · July 11, 2022, 10:04am

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
Mutably iterate over linked list help	9	662	October 23, 2022
[Solved] Iterator borrowing from itself and abstracting over mutability help	3	579	July 9, 2019
Help with iter_mut lifetimes help	11	1263	January 12, 2023
Implement mutable iterator help	18	570	October 20, 2022
Mutable chaining of iterators help	3	543	March 6, 2020

Split mutable iterator over composite structure

Related Topics