Beginner looking for help on idiomatic Rust

I am new to Rust and I wanted to write something that finds the first 100 primes. This is my code. I am sure there is a more idiomatic way to write this in Rust. Any help appreciated.

// primes.rs


fn next_p(divs: &[u64], cand: &u64) -> Option<u64> {
    let root = (*cand as f64).sqrt();
    for d in divs {
        if *d as f64 > root {
            return Some(*cand);
        }
        if cand % d == 0 {
           return None; 
        }
    }
    None
}

fn main () {
    println!("*** Eratosthenes' Sieve ***");
    // Declare a vector to hold 100 primes.
    let mut divs:Vec<u64> = Vec::with_capacity(100);
    // Start it off with 2.
    divs.push(2);
    // First candidate is 3.
    let mut cand = 3u64;

    // Loop until the vector is full.
    while divs.len() < 100 {
        match next_p(divs.as_slice(), &cand) {
            Some(p) => divs.push(p),
            None => (),
        }

        cand += 2;
    }
        println!("{:?}", divs);
}

That does not look like the Sieve of Eratosthenes to me. https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes

Or am I missing a point?

The algorithm you've implemented is not the sieve of Eratosthenes.

Here are some suggestions to improve the code:

  • It's better to pass u64 to a function by value (cand: u64) instead of reference (cand: &u64).
  • The signature of next_p suggests that it can return any u64, but in fact, it can only return None or Some(cand). The real meaning of the function is to check if cand is prime, not to find the next prime as the name would suggest. The function should be renamed to is_prime and return bool. Returning bool will make it more clear that there are only two possible outcomes.
  • The iteration inside next_p can be rewritten using iterators:
    if divs
        .iter()
        .take_while(|d| **d as f64 <= root)
        .all(|d| cand % d != 0)
    {
        return Some(*cand);
    }
    None
  • Instead of converting each item of divs to f64, it's better to convert root back to u64.
  • The 100 constant appears in two places, so it should be deduplicated into a const or an argument to a function.
  • match can be replaced with if let if it only has one arm with the code.
  • Shortened names reduce readability of code. divisors is better than divs; candidate is better than cand.
  • Writing tests is good. Check how your code behaves on different number of requested primes. Write a property test that checks that the length of the produced vector matches the requested number and that every item is a prime.

I also recommend always running clippy on your code.

2 Likes

Avoid doing integer math with floating-point numbers. Why would you convert an integer to f64 and take the sqrt, only to compare against another integer, instead of just squaring the smaller number? Besides being slow, floating-point arithmetic is inexact for large numbers. I'm not sure if that matters in this case, but it's far easier to use integer arithmetic instead and avoid having to reason about what your code will do when passed numbers too big to fit in an f64.

There's a great paper called The Genuine Sieve of Eratosthenes (pdf) that explains exactly how much worse this algorithm (which the author calls "the unfaithful sieve") is than the true sieve and what you can do instead. The code in the paper is Haskell but I found it pretty easy to follow despite having no Haskell experience. It contains an algorithm you can use to generate an infinite sequence of primes using a variation on the true sieve, which is a bit more complex than "all the first 100 primes", but might be fun to translate into Rust.

1 Like

Thank you, that's most helpful.

Thank you! Yes, that's an excellent point about squaring!

Doing a single floating point sqrt and a two casts is more efficient than squaring every element of a potentially very long vector.

However, it's true that floating point operations introduce inexact values, so avoiding them when you can is a good idea. In this case, calculating an integer square root is probably suitable. There is a crate for that. But you need to check if it rounds up or down and update your algorithm accordingly.

1 Like

Probably, if you convert back to u64 outside the loop (as you rightly suggested). I'm not sure the relative cost of a mul vs. converting a large u64 to f64 -- which is probably a couple shifts and something to find the exponent. But yes, fair point. If the mul in the loop is that much of a problem you should at least do some math to prove that the precision loss does not lead to errors (a precaution I have never seen anyone take).

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.