SMP Parallelism


Is there a Rust library that can express SMP-parallelism by reusing existing iterator (D-range) -based algorithms in the standard library similar to what is possible in D's std.parallelism?


immutable n = 1_000_000_000; immutable delta = 1.0 / n; real getTerm(int i) { immutable x = ( i - 0.5 ) * delta; return delta / ( 1.0 + x * x ) ; } immutable pi = 4.0 * taskPool.reduce!"a + b"(map!getTerm(n.iota));

Have you looked at rayon?

1 Like

This doesn't scale:

Why all the explicit trait implementations? This limitation in the language is severe. Will it be fixed?

What do you mean by "scale" in this context?

Why all the explicit trait implementations?

This was a little startling to me when I first learned the language, but it's actually very good because most of those are strong Monomorphisms such that they are direct function invocations. To overcome this in many other languages, they use dynamic dispatch, but Rust gives you the choice to use what is the best option at different times based on your requirements. It's like C++ in that way.

This limitation in the language is severe. Will it be fixed?

'Fixed' implies it's a problem :slight_smile: . From what I've seen I think there will be some more options making it less noisy and easier to manage in the future, but I would doubt this type of strong typing will go away (If I understand your complaint here).

1 Like

Nonscalable in this case means that the number of explicit implementations (template overloads in C++/D) needed for a specific traits requires MN number of implementations, where M is number of algorithms and N is the number of types. C++ and D's template system is designed around the idea to categorize the interface between algorithms and containers into different kinds of iterator- (C++) and range- (D) groups (input-, forward-, bidirectional-, randomaccess) to turn this count in M+N. M+N is scalable, MN is not.

I see no reason for having explicit implementations of this trait for all the builtin integral types such as

impl IntoParallelIterator for Range<u8> impl IntoParallelIterator for Range<u16> impl IntoParallelIterator for Range<u32> impl IntoParallelIterator for Range<usize> impl IntoParallelIterator for Range<i8> impl IntoParallelIterator for Range<i16> impl IntoParallelIterator for Range<i32> impl IntoParallelIterator for Range<isize>

D solves this with template restrictions. such as isIntegral in std.traits. Rust must have this. Otherwise something is very wrong with Rust's vision for compact, templated safe code.

Please correct me if I'm wrong. I'm hoping I'm wrong about this issue. :slight_smile:

1 Like

Ah. I believe (though others know better than me) that the reason those ranges are explicitly defined is to support fixed length arrays over those types. I don't know well enough myself why a bound of T: Copy+Sized wouldn't have been enough.

For the general case there are these implementations though:

impl<'data, T: Sync + 'data> IntoParallelIterator for &'data [T]
impl<'data, T: Sync + 'data> IntoParallelIterator for &'data Vec<T>
impl<T: Send> IntoParallelIterator for Vec<T>
impl<T: ParallelIterator> IntoParallelIterator for T

Which I would think would cover most general use cases. That is, in all likely hood you wouldn't need to create a custom trait implementation for your type because those generic definitions are probably enough. There are also impls in there for all the std::collections structs as well it looks like.

1 Like

Rust has traits instead of template restrictions.

Rayon could have implemented impl<I> IntoParallelIterator for I: where I: IntoIterator (implement for all collections at once) however, the developers chose not to do so because there is a faster implementation for vectors/slices/etc (see rayon 0.4.2 - Once specialization stabilizes, they'll be able to add that blanket impl, get rid of the individual implementations, and then specialize the vector/slice/etc implementations.


But is there a (future?) way to add impls for a bunch of integral types at once? Rayon currently does this with a macro, so technically there is no copy-paste, but it isn't an elegant solution.

Ah, I think I've figured the answer. There is an Integer trait which you can use to avoid boilerplate generating macros: But it is not in the standard library yet, hence it might make sense to go with a macro for now.

1 Like

Does the mention of Rayon (GitHub - nikomatsakis/rayon: Rayon: A data parallelism library for Rust) mean that it is the de facto standard? "parallel iterators" may be experimental in Rust, but they have been around for a while in Java and other languages.

I had been going to have a go with Simple_Parallel (GitHub - huonw/simple_parallel: Straight-forward functions and types for basic data parallel operations.) but it has all the appearances of a stalled or deceased project.

I recognise that code. :slight_smile:

1 Like

Yep, rayon is the crate to go with for cilk / ForkJoin style of parallelism. simple_parallel predates rayon, it is less expressive and is not maintained any more. Switching from simple_parallel to rayon should be straight forward if you know about weights API (I didn't Calling Rayon Users: Opinions Wanted - #2 by matklad :laughing: ).

A more low level crate crossbeam may be useful as well.


There is also Jobsteal and some other crates.

The thing Rust is missing is called "integer generics," meaning having integers as arguments to type constructors (read: templates) in addition to the current ability to have types as arguments.

Rust doesn't have an Integer trait in the standard library because its still unclear what the best way to define the numerics trait hierarchy is, given all of the different needs that different users have.

This would help with some issues (impls for arrays, in particular), but not with this range example, where there is a separate implementation for each integer type, not each integer value.

Notably though:

    let n = 1000000000u64;
    let delta = 1.0 / n as f64;
    let total = (0..n as u32).into_par_iter().map(move |i| {
    	let x = (i as f64 - 0.5) * delta;
    	1.0 / (1.0 + x * x)
    let pi = 4.0 * delta * total;

seems to perform really rather well. And it gets the right answer! So Rayon is looking very good.

1 Like

Lovely! Thanks, for now, Russel.

BTW: When will SCons support Rust? :wink:

let n = 1_000_000_000u64 would be a bit more readable :wink:

1 Like

Given Cargo, there is little point in getting SCons into the Rust build game. Chapel is developing Mason based on the ideas of Cargo, so SCons isn't going to win there either. To be honest SCons hasn't been keeping up to date with languages such as Rust, Chapel, Go, and their dependency and build strategies. It remains the best C, C++, Fortran, (D,), and LaTeX build tool though.

1 Like