Nested iteration within mutable iteration


#1

So I find myself wanting to do something like this fairly frequently:

for object_a in &mut objects {
    for object_b in &objects {
        if object_a == object_b {
            continue
        }

        object_a.mutate(object_b.something);
    }
}

Obviously this doesn’t work since I can’t borrow objects immutably while it’s already borrowed mutably.

So, is there an idiomatic way of:

  1. Iterating mutably over every pair of objects in a vector/slice, or…
  2. Iterating over (&mut object, &[every_other_object])

Or should I just index into objects instead?


#2

Feels to me like an unusual situation to be in; I assume there are no data dependencies between the various mutations that occur? (i.e. the thing that gets modified on object_a is not something, or if it is, then it’s not the same parts that are read from object_b)? Usually in a case like this I’d prefer not to have them together on the same struct.

In Rust, SoA is not just useful for performance, but also for dealing with mutability hazards:

struct Objects {
    something: Vec<Something>,
    something_else: Vec<SomethingElse>,
}

fn foo(objects: &mut Objects) {
    for a in &mut objects.something_else {
        for b in &objects.something {
            a.mutate(b);
        }
    }
}

If that’s not possible, you can do this with indexing; you just need to make effective use of slice’s splitting APIs:

fn split_out_mut(xs: &mut [T], i: usize) -> (&[T], &mut T, &[T]) {
    assert!(i < xs.len());
    let (left, right) = xs.split_at_mut(i).unwrap();
    let (x, right) = right.split_first_mut().unwrap();
    (left, x, right)
}

#3

Ah, SoA is actually the perfect solution for this case, thanks!

That split_out_mut function looks like it could be really useful in other situations too though.


#4

I am trying to do something similar. I’m writing a video game and as part of world generation, I need to iterate over each tile and check if there is a water tile anywhere in the vicinity. Whether there is or isn’t, I’ll still be mutating the items from the outer loop, but the inner loop is immutable. Important note: I’m excluding the current item from the outer loop in my coverage of the inner loop. In this case, splitting out the two arrays is a non-option for me. Any suggestions?


#5

@christopherdumas how is splitting a non-option? A function like split_out_mut (or the existing splitting APIs on slices) can be used as a building block to write a function like e.g.

fn get_mut_and_ref(&mut [T], mut_idx: usize, immut_idx: usize) -> (&mut T, &T);

Though honestly, even in this case, I’d still try to go for an SoA approach. Basically, I think I would try to have two copies of the game world. I would look at one copy (immutably) to write onto the other copy (mutably), and then inbetween ticks I would swap them.

Alternatively, I would only look immutably at the world, build a list of changes that need to occur, and then perform them all at once.

(either of these strategies also naturally solves the issue of “directional bias” that can occur as the result of modifying the neighbor of a tile before the outer loop reaches it)

That said, I don’t write games, and I only know they have very different performance considerations than most code, so maybe one or both of these ideas are performance pitfalls. :man_shrugging:


#6

These are ordinarily good suggestions! The game map is very large, however, since it is also 3D: thus it is 100x100x120 (roughly, this can vary) or about 1.2 million tiles. Since I have to hold this in memory in its entirety (because this is a real-time game and there are many important NPC AIs that need to act off-screen), having two copies would be a rather large memory load, especially since each tile is not just a simple number/enum but a collection of objects of varying sizes concerning biomes, fluids, etc. In addition, since I’m doing this in multiple passes (in fact, this is why this is a multiple-pass generation system), directional bias is not a concern: all water tiles are generated before the pass in question occurs (all this does is choose which biome a stack of tiles would make sense to be a part of, which determines animal and vegetation generation). Splitting doesn’t seem like it would be an option (or at least, I assumed it wasn’t) because of the problems with borrow-checking converting a Vec<RwLock<T>> into a slice of .writed or .readd Ts would create.


#7

Can you elaborate on this? I’m actually not sure how the borrow-checker even comes into play here, since using RwLock is basically lifting the borrow-checker’s job into runtime.


#8

Like I said I only assumed it wouldn’t be in the above comment. I don’t exactly know for sure. I could give it a shot I suppose but I would rather try to find a cleaner solution: doing a split_out_mut on a nested list seems like it could get… ugly.


#9

I might be reading too much into your previous comment, but if you are using RwLocks it seems to me that stuff like split_out_mut shouldn’t even be necessary. Writing to a RwLock only requires an immutable borrow, because that’s how thread-safety is expressed in Rust. (immutable borrows are often called “shared borrows” for this reason, as opposed to &mut which must be unique)


#10

Sorry yeah I just looked at my code again and it’s not directly a RwLock, its a struct with some other fields and an RwLock field is one of them. The non-RwLock field is the one I’m modifying in this pass.


#11

So if I understand correctly:

  • you have a world map, that gets generated in the first pass
  • you want to iterate over subset regions of that map to perform various updates to tile data based on the world nearby
  • there are several kinds of updates, based on several criteria, including proximity-to-water

My thought is to have a separate data structure that describes the regions and can generate the update iterators for you. You populate those region structures by scanning the world (or as you generate it). A simple example might be a HashSet of ordinate tuples. There might be smarter data structures for spatial mapping, but that’s an optimisation.

  • As you scan (or generate) a water tile, you tell the water-region about it.
  • That adds all the tile coordinates in the desired radius to the set. So on for however many other region types you need.
  • Then you iterate over all the tile coordinates in the water region, visiting each one to do the wetness-related updates. Then the next pass for the next region/property.
  • Each tile is only visited once per property, because the HashSet will effectively dedup tiles that are near >1 water tile.

At the end, you can discard the region maps, or keep them, depending on whether they’re needed or useful for later processing.


#12

@dcarosone, That looks like a really interesting suggestion! I might indeed take that path, and it certainly provides advantages (and a few disadvantages) over the setup I’m currently using. It’s late where I am but I’ll certainly look at your suggestions!


#13

A simple optimisation for memory usage, if you have a lot of region maps, might be to just keep a vec of the actual hits in the region (ie, list of water tiles notified), and separate the expansion-by-radius hashset to a finalisation stage as you’re preparing the iterator, so you only expand one region map at a time as you consume them.

But as noted, there are probably other better internal implementations, the key idea being to separate a region map that spans the tiles of interest.

And you might be able to make the update logic clearer, too, by combining sets. If there’s a thing that needs to be done only if the tile is both near-water and near-fire (or whatever), you can use set intersections or unions to visit those, rather then a lot of if-conditions looking at neighbours from each tile.