I originally wrote a multi-threaded twin primes sieve in Nim, then translated it to Rust. The two versions are functionally 1-to-1 equivalent, but the Rust version is significantly slower (slower even than an equivalent D version), which I believe comes down to how it does its multi-threading with Rayon. The timings below are in seconds.
input | Nim (s) | Rust (s)
-------|----------|--------
10^10 | 0.406 | 0.563
10^11 | 4.291 | 5.525
10^12 | 47.343 | 54.785
10^13 | 565.291 | 645.235
Here are links to the gist files for both versions.
Nim
Rust
I received lots of help from this forum to get the Rust version to this point, but the multi-threading in Nim is much simpler, and faster, than in Rust. Below is the threading section of the code for each.
Nim
    parallel:                                   # perform in parallel
      for indx, r_hi in restwins:               # for each twin pair row index
        spawn twins_sieve(indx, r_hi.uint)      # sieve selected twin pair restracks
        stdout.write("\r", (indx + 1), " of ", pairscnt, " threads done")
    sync()                                      # when all the threads finish
Rust
    let (lastwins, cnts): (Vec<_>, Vec<_>) = {
        let counter = RelaxedCounter::new();
        restwins.par_iter()
            .map( |r_hi| {
                let out = twins_sieve(*r_hi, kmin, kmax, kb, start_num, end_num, modpg, s,
                                      &primes.to_vec(), &resinvrs);
                print!("\r{} of {} threads done", counter.increment(), pairscnt);
                out
            }).unzip()
    };
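
To make the shape of the Rust section easier to discuss, here is a minimal, self-contained sketch of the same pattern using only the standard library. It is not the Rayon code above: it swaps in `std::thread::scope` (which needs a newer compiler than Rust 1.40) and spawns one thread per residue instead of using a thread pool. The names `twins_sieve_stub` and `sieve_all` are hypothetical stand-ins, and the stub does no real sieving. The one deliberate difference is that the shared `primes` slice is borrowed by every task rather than copied with `to_vec()` on each call. I have not measured whether that copy explains any of the gap.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Hypothetical stand-in for twins_sieve: returns a (last value, count) pair
// for one residue. It only counts the primes that do not divide r_hi.
fn twins_sieve_stub(r_hi: usize, primes: &[usize]) -> (usize, usize) {
    let cnt = primes.iter().filter(|&&p| r_hi % p != 0).count();
    (r_hi, cnt)
}

// Runs one scoped thread per residue and collects the results in input order.
// `primes` is shared by reference, so no per-task copy is made.
fn sieve_all(restwins: &[usize], primes: &[usize]) -> (Vec<usize>, Vec<usize>) {
    let done = AtomicUsize::new(0); // progress counter shared by all threads
    let done = &done;
    let pairscnt = restwins.len();
    thread::scope(|s| {
        let handles: Vec<_> = restwins
            .iter()
            .map(|&r_hi| {
                s.spawn(move || {
                    let out = twins_sieve_stub(r_hi, primes);
                    let n = done.fetch_add(1, Ordering::Relaxed) + 1;
                    eprint!("\r{} of {} threads done", n, pairscnt);
                    out
                })
            })
            .collect();
        // Join in spawn order, then split the pairs into two vectors.
        handles.into_iter().map(|h| h.join().unwrap()).unzip()
    })
}
```

Calling `sieve_all(&restwins, &primes)` returns the two vectors in the same order as `restwins`, like the `unzip()` in the Rayon version. With many residues, one OS thread per task would be wasteful, so this is only meant to show the data flow, not to replace the thread pool.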
This is my first really serious Rust project, and I assume Rust must be able to run this algorithm faster than I have managed so far. How can I make the multi-threading, and the rest of the code, more performant so that it gets closer to Nim? (Nim actually compiles to C, which is then compiled by gcc.)
Compiled with (current) Nim 1.0.4, gcc 9.2.0, and Rust 1.40, on a 64-bit Linux distro, with an i7 CPU at 2.6 - 3.5 GHz, 4 cores/8 threads, and 16 GB of memory.