Split mutable iterator over composite structure

Does stable and safe Rust provide some means to split Iterator<Item = &mut (A,B)> into (Iterator<Item = &mut A>, Iterator<Item = &mut B>)?

Basically I want something like this to work:

let mut v: Vec<(u32,u32)> = vec![(6,6),(6,6)];
let (iter0, iter1) = v.iter_mut().some_magic_function();
iter0.for_each(|x| x = 0);
iter1.for_each(|x| x = 1);
assert_eq!(v, vec![(0,1),(0,1)]);

Put differently, what I am asking for is whether there is a way to lift splitting borrows from operating on structs to operating on iterators of structs.

No, because iterators are lazy and this can't be done lazily -- in your example, iterating iter0 would have to advance the iterator, at which point iter1 wouldn't be able to look at those items any more.

So either:

  1. you do it with allocations instead, aka https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.unzip
  2. you call iter_mut twice (and restructure code as needed to get that to pass the borrow checker)
1 Like

I'm happy to call iter_mut() twice, but I want both iterators to be alive at the same time so I can update the values concurrently. I believe this is not possible using only safe Rust, so my question is whether the standard library already has some carefully checked unsafe code somewhere which I can leverage to get what I want.

The standard library doesn't have any unsafe code for this use-case.

1 Like

It seems like this is a representation problem, not an algorithm one? (As it often is the case.) If your use case calls for updating the tuple elements in parallel as the primary means of speeding it up, then you should be storing them in two separate arrays to begin with, i.e. transpose Vec<(u32, u32)> to (Vec<u32>, Vec<u32>), and it will be easy enough to reconstruct an iterator-of-pairs by zipping them when needed.

2 Likes

Yeah, I'm aware that reorganising my data structure as you suggested would solve this particular problem. I'm still a bit reluctant to do so, however, because for most other purposes having a vec-of-structs rather than a struct-of-vecs is just much more convenient. In Julia, there's StructArrays.jl to switch back and forth between these representations. Is there a similar crate in Rust?

If you often need to access data in both representations, then you are basically out of luck. No matter which one you choose, some operations will be more convenient and faster, and others will be more laborious and slower. If you are primarily looking for convenience, then you could give the "data frame" abstraction a go (e.g. polars::DataFrame::transpose()).

2 Likes

It would be possible to write your own unsafe code to implement the iterators. Are you interested in such a solution?

1 Like

Possibly, but for the moment I'm thinking about writing a macro which would transform

#[to_struct_of_vecs]
struct S {
    a: A,
    b: B,
    ...
}

into

struct S {
    a: A,
    b: B,
    ...
}

struct VecOfS {
    a: Vec<A>,
    b: Vec<B>,
    ...
}

impl VecOfS {
     pub fn push(&mut self, s: S) {
         self.a.push(s.a);
         self.b.push(s.b);
         ...
     }

     pub fn get(&self, i: usize) -> S {
         return S { a: self.a[i], b: self.b[i], ...};
     }

     pub fn vecs(&mut self) -> (VecOfMutRefs<A>, VecOfMutRefs<B>, ...) {
        return (
              VecOfMutRefs::new(&mut self.a), 
              VecOfMutRefs::new(&mut self.b), 
              ....
        );
    }

    // Many other methods...
}

Not sure whether I have the time budget for that, though.

It's possible without unsafe, if you need it to work for arbitrary iterators¹ :

use std::cell::RefCell;
use std::collections::VecDeque;
use std::rc::Rc;

pub fn split_mut_iter<'it, A: 'it, B: 'it, I>(
    iter: I,
) -> (
    impl Iterator<Item = &'it mut A>,
    impl Iterator<Item = &'it mut B>,
)
where
    I: Iterator<Item = &'it mut (A, B)>,
{
    split_iter(iter.map(|&mut (ref mut a, ref mut b)| (a, b)))
}

pub fn split_iter<A, B, I>(iter: I) -> (impl Iterator<Item = A>, impl Iterator<Item = B>)
where
    I: Iterator<Item = (A, B)>,
{
    let state = Rc::new(RefCell::new(IterState::new(iter)));
    (AIterator(Rc::clone(&state)), BIterator(state))
}

struct IterState<A, B, I: ?Sized> {
    pending_a: VecDeque<A>,
    pending_b: VecDeque<B>,
    source: I,
}

impl<A, B, I> IterState<A, B, I>
where
    I: ?Sized + Iterator<Item = (A, B)>,
{
    pub fn new(iter: I) -> Self
    where
        I: Sized,
    {
        IterState {
            pending_a: VecDeque::default(),
            pending_b: VecDeque::default(),
            source: iter,
        }
    }

    fn pump_source(&mut self) -> Option<()> {
        let (a, b) = self.source.next()?;
        self.pending_a.push_back(a);
        self.pending_b.push_back(b);
        Some(())
    }

    pub fn next_a(&mut self) -> Option<A> {
        if self.pending_a.len() == 0 {
            self.pump_source();
        }
        self.pending_a.pop_front()
    }

    pub fn next_b(&mut self) -> Option<B> {
        if self.pending_b.len() == 0 {
            self.pump_source();
        }
        self.pending_b.pop_front()
    }
}

struct AIterator<I: ?Sized>(Rc<RefCell<I>>);
struct BIterator<I: ?Sized>(Rc<RefCell<I>>);

impl<A, B, I: ?Sized> Iterator for AIterator<IterState<A, B, I>>
where
    I: Iterator<Item = (A, B)>,
{
    type Item = A;
    fn next(&mut self) -> Option<A> {
        self.0.borrow_mut().next_a()
    }
}

impl<A, B, I: ?Sized> Iterator for BIterator<IterState<A, B, I>>
where
    I: Iterator<Item = (A, B)>,
{
    type Item = B;
    fn next(&mut self) -> Option<B> {
        self.0.borrow_mut().next_b()
    }
}

¹ If you're always iterating over Vecs, then an unsafe implementation will be able to avoid the heap allocations of this version.

2 Likes

Right, that's similar to how unzip works. The unsafe is only necessary to avoid the allocations in the VecDeques you have.

1 Like

Have you considered using the soa_derive crate?

1 Like

Precisely what I was looking for! :heart: