Help with iterating though variable sized zipped iterators

Hello everyone

I have a question for the following code. It's a reduced variant of the actual code that still replicates the problem.

I am parallelizing a task over a list of particles. The algorithm I use the calculate (get_chunk_size in the code below) can end up producing zero particles for the last thread. Here for example I have 5 particles and 4 threads which ends up distributing the particles like [2, 2, 1, 0].
A different algorithm would rather do [2, 1, 1, 1].
The reason I use the first algorithm is because it is super simple and easy to convert from an index in the particle list to an index in the chunked particle list. Also for the runtime the simulation with typical particle numbers of 1024*1024*1024 it doesn't really matter if the last thread is empty and all other threads have a few more particles.

So the problem I have now is the following. How do I let the last thread just handle 0 particles? I think if I could just produce an empty slice it would be fine. I am trying to do this with zip_longest but I can't produce an empty slice then, because it's lifetime would have to be longer than thread::scope() s lifetime since particles also has this longer lifetime.

Do you have an idea how I could solve this? I am also open to suggestions that just circumvent the problem.

use itertools::Itertools;
use std::thread;

fn main() {
    const MIN: f32 = -0.5;
    const MAX: f32 = 0.5;

    let mut particles = vec![
        [MIN, MIN],
        [MIN, MIN],
        [MAX, MAX],
        [MIN, (MAX + MIN) / 2.],
        [(MAX + MIN) / 2., (MAX + MIN) / 2.],
    ];

    let n_particles = particles.len();
    const N_GRID: usize = 4;
    const N_THREADS: usize = 4;

    // Number of particles per thread.
    let chunk_size: usize = get_chunk_size(n_particles, N_THREADS);
    let mut communicators = vec![0; N_THREADS];

    thread::scope(|s| {
        for either_or_both in communicators
            .iter_mut()
            .zip_longest(particles.chunks_mut(chunk_size))
        {
            let mut v = vec![];
            let (_comm, p_local) = match either_or_both {
                itertools::EitherOrBoth::Both(comm, p_local) => (comm, p_local),
                itertools::EitherOrBoth::Left(comm) => (comm, &mut v[..]),
                itertools::EitherOrBoth::Right(_) => panic!("This shouldn't happen."),
            };

            s.spawn(|| {
                // Do something with p_local.
                p_local;
            });
        }
    });
}

fn get_chunk_size(len: usize, n_chunks: usize) -> usize {
    (len + n_chunks - 1) / n_chunks
}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0597]: `v` does not live long enough
  --> src/main.rs:32:68
   |
24 |       thread::scope(|s| {
   |                      - has type `&'1 Scope<'1, '_>`
...
32 |                   itertools::EitherOrBoth::Left(comm) => (comm, &mut v[..]),
   |                                                                      ^ borrowed value does not live long enough
...
36 | /             s.spawn(|| {
37 | |                 // Do something with p_local.
38 | |                 p_local;
39 | |             });
   | |______________- argument requires that `v` is borrowed for `'1`
40 |           }
   |           - `v` dropped here while still borrowed

For more information about this error, try `rustc --explain E0597`.
error: could not compile `playground` due to previous error

I just now found a possible fix. I can just use zip and chain an empty vec that is defined in the outer scope of thread::scope(). What do you think about that?

use std::iter::once;
use itertools::Itertools;
use std::thread;

fn main() {
    const MIN: f32 = -0.5;
    const MAX: f32 = 0.5;

    let mut particles = vec![
        [MIN, MIN],
        [MIN, MIN],
        [MAX, MAX],
        [MIN, (MAX + MIN) / 2.],
        [(MAX + MIN) / 2., (MAX + MIN) / 2.],
    ];

    let n_particles = particles.len();
    const N_GRID: usize = 4;
    const N_THREADS: usize = 4;

    // Number of particles per thread.
    let chunk_size: usize = get_chunk_size(n_particles, N_THREADS);
    let mut communicators = vec![0; N_THREADS];
    let mut v: Vec<[f32; 2]> = vec![];

    thread::scope(|s| {
        for (_comm, p_local) in communicators
            .iter_mut()
            .zip(particles.chunks_mut(chunk_size).chain(once(&mut v[..])))
        {
            s.spawn(|| {
                // Do something with p_local.
                p_local;
            });
        }
    });
}

fn get_chunk_size(len: usize, n_chunks: usize) -> usize {
    (len + n_chunks - 1) / n_chunks
}

(playground)

Thread::scope() requires that whatever the threads borrow outlives (i.e., practically, is declared before and outside) the scope. If you are creating references to variables inside the scope, then that's no different from thread::spawn()ing with non-'static references: nothing guarantees that the thread terminates before those local references are invalidated!

Why are you using a vector in the first place, anyway? If you need an empty slice, just create a literal empty array, which will be subject to static promotion: Playground.

2 Likes

Ahh, I didn't know one could do that. It failed for me because I tried to do &mut [] which isn't the correct type. I didn't know that one could cast it like that with &mut [] as _.

Mutable (and shared) references to slices implement Default and can be summoned from air to boot (empty slice). (Sorry for the brevity; on mobile.)

I've used such chains occasionally.

The [..] trick works with arrays too, of course.

Oh snap, I didn't even consider that &mut [][..] could be valid syntax. Thanks a lot everyone. So both the zip_longest and the chain solution work with this.

I just had the hunch to add some more [..]s. Interesting that there isn’t a clippy lint against unnecessary [..] on something that’s already a slice.

// compiles; no warnings, no clippy lints
// …
    thread::scope(|s| {
        for either_or_both in communicators
            .iter_mut()
            .zip_longest(particles.chunks_mut(chunk_size))
        {
            let (_comm, p_local) = match either_or_both {
                itertools::EitherOrBoth::Both(comm, p_local) => (comm, p_local),
                itertools::EitherOrBoth::Left(comm) => (
                    comm,
                    &mut [][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..]
                        [..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..]
                        [..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..]
                        [..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..]
                        [..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..]
                        [..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..][..],
                ),
                itertools::EitherOrBoth::Right(_) => panic!("This shouldn't happen."),
            };

            s.spawn(|| {
                // Do something with p_local.
                p_local;
            });
        }
    });

Rust Playground

2 Likes

It's nice to consider this in the big picture. I allude a lot to Rust's compositionality in the type system, but the syntax obeys similar rules, too. An indexing expression is just: any expression, followed by an indexing [...] clause on the RHS. Thus, you can put… whatever on the LHS. If it's an indexable type, it will compile.

In general, you are free to mix expressions in any, potentially wild/non-idiomatic, manner, because the grammar is 100% unambiguous. For instance, this compiles:

[[[||||||{println!("hello world");}]]][0][..][0][..][..][0]()()();

It's not like you should write code like this.

Love that one :rofl:

On a more serious note, such things can also be good to test compiler behavior. E.g. add a few more and it “segfaults” (I would assume due to a stack overflow).