Which is more idiomatic? Functional, imperative or a mix?

Hi

I'm looking for advice / comments on what the preferred idiomatic approach to writing Rust code is. I'm doing the exercises on exercism.io, and as a Python/C/JavaScript programmer, I'm really enjoying learning Rust. I've also programmed in Scheme, FORTH, CoffeeScript, Fortran, and a few other obscure languages. Hence my curiosity: what I thought would be a 'C'-like imperative language actually seems to lean more towards a functional style.

My two examples (probably not that good; I am a naive Rust programmer!) solve the Largest Series Product exercise on exercism.io.

The first is functional:

#[derive(Debug)]
pub enum Error {
    Digit,
    Window,
}


pub fn lsp(digits: &str, size: u32) -> Result<u32, Error> {
    if size == 0 { return Ok(1) };
    digits
        .chars()
        .map(|c| c.to_digit(10).ok_or(Error::Digit))
        .collect::<Result<Vec<u32>, _>>()
        .and_then(|numbers| {
            numbers
                .windows(size as usize)
                .map(|w| w.iter().product())
                .max()
                .ok_or(Error::Window)
        })
}

The second is in an imperative/mutable style:

#[derive(Debug)]
pub enum Error {
    Digit,
    Window,
}


// To calculate the products over the string, we keep an array of the digits in the current window.
// Multiplication is quite quick, so rather than dividing by the digit that drops out of the window
// and then multiplying by the new digit, we'll just recalculate the whole product each time.

pub fn lsp(digits: &str, size: u32) -> Result<u32, Error> {
    if size == 0 { return Ok(1) };
    let mut prod = vec![0u32; size as usize];
    let mut pointer: usize = 0;
    let mut max_value: u32 = 0;
    let mut count: u32 = 0;
    for c in digits.chars() {
        count += 1;
        prod[pointer] = c.to_digit(10).ok_or(Error::Digit)?;
        pointer = (pointer + 1) % (size as usize);
        let mut sum: u32 = 1;
        for p in 0..size as usize {
            sum *= prod[p];
        }
        if sum > max_value { max_value = sum; }
    }
    if count < size { return Err(Error::Window); }
    Ok(max_value)
}

I'm sure there are numerous Rust conventions I'll be breaking, so I wanted to ask:

  1. what are the Rust conventions when it comes to this kind of code? (this presupposes that there (is a|are) convention(s) on the correct approach!).
  2. Also what's your favoured approach?

Thanks very much for your thoughts.

5 Likes

For me, it is clear what the functional version does after one read-through, even if the function was called "foo". It's far from what could be called "naive beginner code" :slight_smile: For example, the "Iterator over Results -> collect into a Result" is a "hidden" gem that many don't find for a long time.

One thing some people might do differently is to let-bind the digits result with a ? instead of using and_then - but both are idiomatic Rust IMO.
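
For reference, a minimal sketch of that let-binding variant (reusing the lsp signature and Error enum from the original post):

pub fn lsp(digits: &str, size: u32) -> Result<u32, Error> {
    if size == 0 { return Ok(1) };
    // Parse all digits up front; `?` propagates Error::Digit on the first bad character.
    let numbers = digits
        .chars()
        .map(|c| c.to_digit(10).ok_or(Error::Digit))
        .collect::<Result<Vec<u32>, _>>()?;
    numbers
        .windows(size as usize)
        .map(|w| w.iter().product())
        .max()
        .ok_or(Error::Window)
}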

Of course, the usual blurb about benchmarking both versions applies - one of Rust's goals is to make the highly abstracted iterator chain as zero-cost as possible, letting the optimizer do the work. But it's not always perfect (yet).

8 Likes

In Rust you use the iterator-chain style or the imperative style depending on many factors. The functional style is slower to compile, and it's slower to run when it becomes complex. But it's often shorter, simpler to understand, and less bug-prone. In this case I prefer your first version.

I prefer a signature like this, which is more descriptive and avoids the casts of "window_size":

fn largest_series_product(digits: &str, window_size: usize) -> Result<u32, Error>;

One alternative way of writing it:

fn largest_series_product(digits: &str, window_size: usize) -> Result<u32, Error> {
    Ok(digits
       .chars()
       .map(|c| c.to_digit(10).ok_or(Error::Digit))
       .collect::<Result<Vec<_>, _>>()?
       .windows(window_size)
       .map(|w| w.iter().product())
       .max()
       .unwrap_or(1))
}
4 Likes

Thank you very much for your thoughts! I only 'found' the 'Iterator over Results -> collect into a Result' feature by banging my head against the compiler :slight_smile: for a while! Google, however, came to the rescue, and I saw it in some code and thought, "oh neat!".

I wondered about using a let and ? just on the digits.

Performance-wise, I did benchmark it; the harder-to-read mutable/imperative version is about 50% faster than the functional one, but that might be solely because the functional one constructs a second list, whereas the other just iterates over the digits. I've blogged about it here, which I hope isn't a faux pas (blogging and then asking the question here).

1 Like

That is much neater. I didn't realise you could use ? in the middle of an expression, nor that windows(...) was defined on the result (or does the ? in the middle do the unwrap() if it's an Ok(...))?

Also, unwrap_or() is new to me, but I don't see how it works here?
Thanks very much!

1 Like

unwrap_or(y) will return x if it is called on a Some(x) [or Ok(x) for the Result version] or y if it is called on a None [or Err(_)]
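
For example (illustrative values only):

assert_eq!(Some(5).unwrap_or(1), 5);
assert_eq!(None.unwrap_or(1), 1);
assert_eq!(Ok::<u32, ()>(5).unwrap_or(1), 5);
assert_eq!(Err::<u32, ()>(()).unwrap_or(1), 1);

In the alternative above, max() returns None when there are no windows at all, so unwrap_or(1) falls back to 1 in that case.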

2 Likes

In the itertools crate there is a chunks() ( itertools::Itertools - Rust ) that works on iterables, but unlike windows() it also returns partially filled windows (why? I don't like this), and its usage is a bit of a mess (because of borrowing issues).
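
Roughly, its usage looks like this (hypothetical data; the awkward part is that the returned adaptor isn't itself an Iterator, so you iterate over a reference to it rather than chaining further):

use itertools::Itertools;

let data = vec![1, 2, 3, 4, 5];
let chunked = data.into_iter().chunks(2);
for chunk in &chunked {
    // prints [1, 2], [3, 4], [5] -- note the partially filled last chunk
    println!("{:?}", chunk.collect::<Vec<_>>());
}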

2 Likes

In my opinion, the answer is almost always that a mix is idiomatic. I would not go to one extreme or the other. For instance, for this problem I would probably do something like this:

pub fn lsp(digits: &str, size: u32) -> Result<u32, Error> {
    if size == 0 {
        return Ok(1);
    }

    if digits.len() < size as usize {
        return Err(Error::Window);
    }

    // We are only concerned with digits, so treat as ASCII.
    // Non-ASCII will be caught when converting to digits.
    let windows = digits.as_bytes().windows(size as usize);
    let mut max = 0;

    // If we take a `map` approach, we find it difficult to return
    // early on errors, so a simple for loop might be better for this
    for window in windows {
        let mut product = 1;
        for &byte in window {
            product *= (byte as char).to_digit(10).ok_or(Error::Digit)?;
        }

        if product > max {
            max = product
        }
    }

    Ok(max)
}

This example performs nearly identically to the imperative example, but you still get some nice abstractions from using things like windows. I know that this is not written the same way as yours (though it is functionally identical), but this is what I would consider idiomatic, in my humble opinion. And personally, I believe this reads better than a fully functional approach (but that's certainly subjective).

9 Likes

Interesting. I kind of wondered if 'mixed' was the idiomatic approach as both approaches are supported by the language; it could've been two competing groups, or a single group who recognised that both have a role to play. Thanks for taking the time to write out the code; it's given me another perspective on how Rust is used.

1 Like

chunks doesn't do overlapping subsequences. In itertools you have only tuple_windows, which is not useful when you need varying window size.
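
For comparison, with the std slice methods (hypothetical values):

let v = [1, 2, 3, 4];
// windows(2) yields overlapping subsequences: [1, 2], [2, 3], [3, 4]
for w in v.windows(2) {
    println!("{:?}", w);
}
// chunks(2) yields non-overlapping pieces: [1, 2], [3, 4]
for c in v.chunks(2) {
    println!("{:?}", c);
}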

2 Likes

One optimization you can make is to iterate over bytes instead of chars. The Bytes iterator implements ExactSizeIterator, so the collect call will be slightly faster. Also, iterating over bytes is always slightly faster than iterating over characters, but not by that much.
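
For example (illustrative only):

let s = "123456";
// Bytes implements ExactSizeIterator, so collect gets an exact size hint
// and can allocate the Vec in one go:
assert_eq!(s.bytes().len(), 6);
// Chars does not: the number of characters isn't known without scanning the UTF-8.
assert_eq!(s.chars().count(), 6);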

You might consider compiling with -C opt-level=3 instead of -O (opt level 2) but this doesn't make much (if any) difference if you've already applied the first optimization.
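
Something like this, assuming a test file called the_test.rs:

rustc -C opt-level=3 --test the_test.rs
./the_test --bench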

On my machine, this reduces the imperative version's lead to 20%.

1 Like

Actually, benchmarking is hard... With judicious use of test::black_box to prevent the compiler from optimizing away too much code, the functional version is at least 50% faster:

#![feature(test)]
extern crate test;

use test::Bencher;

#[derive(Debug)]
pub enum Error {
    Digit,
    Window,
}


pub fn lsp_func(digits: &str, size: u32) -> Result<u32, Error> {
    if size == 0 { return Ok(1) };
    digits.bytes()
        .map(|c| (c as char).to_digit(10).ok_or(Error::Digit))
        .collect::<Result<Vec<u32>, _>>()?
        .windows(size as usize)
        .map(|w| w.iter().product())
        .max()
        .ok_or(Error::Window)
}

pub fn lsp_imp(digits: &str, size: u32) -> Result<u32, Error> {
    if size == 0 { return Ok(1) };
    let mut prod = vec![0u32; size as usize];
    let mut pointer: usize = 0;
    let mut max_value: u32 = 0;
    let mut count: u32 = 0;
    for c in digits.chars() {
        count += 1;
        prod[pointer] = c.to_digit(10).ok_or(Error::Digit)?;
        pointer = (pointer + 1) % (size as usize);
        let mut sum: u32 = 1;
        for p in 0..size as usize {
            sum *= prod[p];
        }
        if sum > max_value { max_value = sum; }
    }
    if count < size { return Err(Error::Window); }
    Ok(max_value)
}

const DIGITS: &'static str = "123456780129384721984310238409213840192834098721395641024301239847109328741928734091827430918732478465764646491928374";

#[bench]
fn bench_func(b: &mut Bencher) {
    b.iter(|| {
        test::black_box(lsp_func(test::black_box(DIGITS), test::black_box(10)).unwrap());
    });
}


#[bench]
fn bench_imp(b: &mut Bencher) {
    b.iter(|| {
        test::black_box(lsp_imp(test::black_box(DIGITS), test::black_box(10)).unwrap());
    });
}
4 Likes

Interesting. I measured that it's 2x as slow on windows of length 3 and 50% faster on windows of length 10 for that snippet.

2 Likes

Did you remember to optimize (rustc -O)? The functional variant is pretty consistently faster on my machine:

rustc -O --test the_test.rs
./the_test --bench
1 Like

@cbreeden, @stebalien - wow, much better benchmarking than I did! I know that the imperative version can be improved:

  1. It calculates the first window-size - 1 products, which will all be 0, so they aren't needed.
  2. It recalculates the whole product every time; a combination of remembering the product, dividing by the digit about to be dropped, and then multiplying by the new one would give it constant performance regardless of window size - but it would then need to handle divide by zero, etc.

It's curious that the functional version is faster! Is indexing v[i], when v is a vector, an expensive op? I thought it would compile down to a simple offset.

I never tried it with rustc -O ...; I will do so.

I was simply using cargo bench, but I still get the same numbers with your method -- imperative is bad for small windows, good for large windows. It's interesting how the relation between the size of the input and the window sizes impacts the benches. The mixed method does pretty well for most small windows, but as you increase the window sizes the imperative method starts to win out (on my machine, for large inputs, when the windows are >= ~11). Is it possible that it could be vectorizing the product? I doubt it would be able to vectorize the mixed variant I gave, because of where the error handling is taking place, but I also wasn't sure whether you could vectorize when the length of the windows is unknown.

If you compiled with cargo build --release, your code should have been optimized.

There's usually some bounds-checking overhead but that's not the issue here (I've tried using unsafe methods to remove the bounds checking and it doesn't make much of a difference). I don't know why the functional version is faster but I can guess why test::black_box affects the imperative version so much: it optimizes better so preventing optimizations will hurt it more.
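
In case it's useful, a sketch of what removing the bounds checking can look like for the inner product (a hypothetical window_product helper, not the exact code I benchmarked):

fn window_product(prod: &[u32]) -> u32 {
    let mut with_unsafe: u32 = 1;
    for p in 0..prod.len() {
        // SAFETY: p < prod.len(), so the index is in bounds.
        with_unsafe *= unsafe { *prod.get_unchecked(p) };
    }
    // The iterator form needs no unsafe and no index at all:
    let with_iter: u32 = prod.iter().product();
    debug_assert_eq!(with_unsafe, with_iter);
    with_iter
}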

With the two versions I presented, as window size increased they converged to roughly the same time (window size ~ 20), which implies the product calculation starts to affect the timings. At small window sizes, I'm guessing the small extra overhead (over millions of iterations) starts to impact it. But for the convenience, it's a very small price to pay. If performance is really an issue, I guess this sort of hand optimisation in individual, profiled functions is the way to go.

I looked up the docs on cargo build --release and, as you said, it's already optimised. Compiled Rust code is very fast!

Thanks again for your thoughts and help!

On further testing, my version overflows. Using u64 instead of u32 works better but doesn't appear to change the performance characteristics.

I also tried optimizing the functional version by keeping a running product but that version is significantly slower:

use std::iter;

pub fn lsp_func2(digits: &str, size: usize) -> Result<u64, Error> {
    if size == 0 { return Ok(1) };
    let digits = digits
        .bytes().map(|c| (c as char).to_digit(10).ok_or(Error::Digit))
        .collect::<Result<Vec<u32>, _>>()?;

    if digits.len() < size {
        return Err(Error::Window);
    }
    
    let mut value: u64 = 1;
    let mut zeros = size;

    Ok(iter::repeat(0)
       .take(size)
       .chain(digits.iter().cloned())
       .zip(digits.iter().cloned())
       .map(|(leave, enter)| {
           if leave == 0 { zeros -= 1 } else { value = value/(leave as u64) }
           if enter == 0 { zeros += 1 } else { value = value*(enter as u64) }
           if zeros == 0 { value } else { 0 }
       })
       .max()
       .unwrap())
}
2 Likes

Long story short: idiomatic Rust is the Rust that runs fastest :stuck_out_tongue:

3 Likes