How to implement a scatter-gather data array for threads?

I'm not sure whether "scatter-gather" is the right name for this. What I want to do is allocate an array of length N and then pass each element of that array to its own thread for processing.

The equivalent C implementation is:

#include <stdlib.h>
#include <stdio.h>
#include <pthread.h>

#define N 10
typedef int DATA;

static void * thread_start(void *arg)
{
	DATA *val = (DATA*)arg;
	printf("val = %d\n", *val);
	return NULL;
}

int main() {
	pthread_t *pids = (pthread_t*)malloc(sizeof(pthread_t) * N);
	DATA *a = malloc(sizeof(DATA) * N);

	for (int i = 0; i < N; i++) {
		a[i] = i;
		pthread_create(&pids[i], NULL, thread_start, &a[i]);
	}

	for (int i = 0; i < N; i++) {
		pthread_join(pids[i], NULL);
	}

	free(a);
	free(pids);
	return 0;
}

Obviously, the elements of this array can't cause a race condition, because each one is referenced by exactly one thread. In Rust, however, data that crosses thread boundaries must be Send if it can be safely sent to another thread, and Sync if it can be safely shared between threads.

In my code, the data must be Send because it is sent from the main thread to the child threads, so I cannot use Rc<T> to wrap this custom data. With Arc<T>, however, it seems a lock guard is needed to synchronize access to the contents. Using a lock here seems very redundant for data that is never accessed simultaneously by multiple threads.

This is what I implemented:

use std::thread;
use std::rc::Rc;

#[derive(Debug)]
struct Element {
    data: Rc<i32>,
}

const N: usize = 10;

fn main() {
    let mut a = Vec::new();
    for i in 1..N {
        a.push(Element{data: Rc::new(i as i32)});
    }

    let mut pids = Vec::new();
    for i in 1..N {
        let pid = thread::spawn(move || {
//                ^^^^^^^^^^^^^ `Rc<i32>` cannot be sent between threads safely
            let data = &a[i];
            println!("{:?}", data);
        });
        pids.push(pid);
    }
}

When you look at the signature for IndexMut::index_mut() (the trait method behind the slice[index] syntax), the compiler only sees fn index_mut(&mut self, index: Idx) -> &mut Self::Output. That signature borrows the whole collection mutably, which makes it hard to tell the compiler that different indices refer to disjoint elements.
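A minimal sketch of what that means in practice, using only the standard library. The helper name bump_first_two is made up for illustration, and it assumes the slice has at least two elements. Indexing twice is rejected because each a[i] borrows the whole slice, while split_at_mut hands back two non-overlapping sub-slices that can be mutated independently:

```rust
// Hypothetical helper for illustration; panics if `a` has fewer than 2 elements.
fn bump_first_two(a: &mut [i32]) -> (i32, i32) {
    // The borrow checker rejects the following, because each `a[i]` goes
    // through `index_mut(&mut self, ..)` and so borrows the whole slice:
    //
    //     let x = &mut a[0];
    //     let y = &mut a[1]; // error: second mutable borrow of `*a`
    //     *x += 1;
    //     *y += 1;

    // `split_at_mut` returns two disjoint sub-slices, so both borrows
    // can be alive at the same time.
    let (left, right) = a.split_at_mut(1);
    let x = &mut left[0];
    let y = &mut right[0];
    *x += 1;
    *y += 1;
    (*x, *y)
}

fn main() {
    let mut a = vec![10, 20, 30];
    let pair = bump_first_two(&mut a);
    println!("{:?} {:?}", pair, a);
}
```

Iterators such as iter_mut() and chunks_mut() work on the same principle: they hand out references that the compiler can see are disjoint.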

Unlike in C where the programmer ensures safety by avoiding dodgy operations, Rust requires you to make it impossible to introduce memory problems through any conceivable use of your library as long as the programmer doesn't use unsafe.

The gold standard for this sort of "I want to apply this operation to each element in my array/iterator/whatever in parallel" task is the rayon crate. It handles the unsafe code that shares mutable references to different elements across a thread pool, while exposing an Iterator-like API.
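If pulling in a crate is not an option, a rough standard-library-only sketch of the same idea is possible with std::thread::scope (stable since Rust 1.63). This is not what rayon does internally: it spawns one OS thread per element, like the C version in the question, instead of using a thread pool. The function name process_each and the "double each value" work are placeholders for illustration:

```rust
use std::thread;

const N: usize = 10;

// Hypothetical helper: gives each element to its own scoped thread.
fn process_each(a: &mut [i32]) {
    thread::scope(|s| {
        // `iter_mut` yields one `&mut i32` per element; the references are
        // disjoint, so each can be moved into a different thread without a lock.
        for val in a.iter_mut() {
            s.spawn(move || {
                println!("val = {}", *val);
                *val *= 2; // placeholder for the real per-element work
            });
        }
        // All threads spawned on `s` are joined before `scope` returns.
    });
}

fn main() {
    let mut a: Vec<i32> = (0..N as i32).collect();
    process_each(&mut a);
    println!("{:?}", a);
}
```

No Rc, Arc, or Mutex is needed, because the scope guarantees the threads finish before the borrow of a ends. With rayon, the loop would typically become a.par_iter_mut().for_each(...) after use rayon::prelude::*.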

Change for i in 1..N to for data in &mut a or for data in a.chunks_mut(size), and then scrap it and use rayon 🙂
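As a sketch of the chunks_mut variant, assuming scoped threads from the standard library are acceptable: each thread receives one non-overlapping chunk of the vector instead of a single element. The function name process_in_chunks and the "add one" work are made up for illustration, and size must be non-zero because chunks_mut panics on a chunk size of 0:

```rust
use std::thread;

// Hypothetical helper: one scoped thread per chunk of `size` elements.
// Returns how many threads were spawned.
fn process_in_chunks(a: &mut [i32], size: usize) -> usize {
    thread::scope(|s| {
        let handles: Vec<_> = a
            .chunks_mut(size) // disjoint `&mut [i32]` sub-slices
            .map(|chunk| {
                s.spawn(move || {
                    for val in chunk.iter_mut() {
                        *val += 1; // placeholder for the real work
                    }
                })
            })
            .collect();
        // The scope joins every spawned thread before it returns.
        handles.len()
    })
}

fn main() {
    let mut a: Vec<i32> = (0..10).collect();
    let threads = process_in_chunks(&mut a, 3);
    println!("{} threads, result: {:?}", threads, a);
}
```

Chunking keeps the thread count proportional to the number of chunks, not the number of elements, which is usually the better trade-off once the array is large.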
