Functional programming in Rust (code review)


#1

Hi, I’m a Rust beginner and I’m trying to write code in a “functional programming style”. I’d really appreciate it if anyone could review this code. It’s a simple program that reads numbers from a file, maps a function over them, and computes some statistics on the results (see the project page).


#2

Looks pretty cool to me. My only question was with the data and params args:

fn compute_stats(data: Vec<f64>, algo: &str, params: Vec<f64>)

Would these be better as references rather than owned values? I’m still learning, and it isn’t crystal clear to me what happens to an owned param with respect to the stack, but I’m wondering whether the whole Vec is copied on method invocation?


#3

If compute_stats isn’t inlined, then the Vec is moved, but that’s only 24 bytes (on 64-bit): the pointer, capacity and length. The contents of the Vec, living on the heap, aren’t copied (only the pointer to them is).
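To make that concrete, here’s a minimal sketch. The buffer_addr helper is hypothetical (it isn’t part of the project); it only exists to show that the heap buffer has the same address before and after the Vec is moved into a function:

```rust
use std::mem::size_of;

// Hypothetical helper: takes ownership of the Vec and reports
// the address of its heap buffer.
fn buffer_addr(data: Vec<f64>) -> usize {
    data.as_ptr() as usize
}

fn main() {
    // The Vec value itself is just pointer + capacity + length.
    println!("size of Vec<f64>: {} bytes", size_of::<Vec<f64>>());

    let data = vec![1.0, 2.0, 3.0];
    let before = data.as_ptr() as usize;

    // `data` is moved into the function here; only the small Vec
    // header changes hands, the elements stay where they are.
    let after = buffer_addr(data);

    assert_eq!(before, after);
}
```

So passing a Vec by value is cheap; the reason to prefer a reference is flexibility for the caller, not copying cost.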

@juliendehos, what specifically do you want comments on? Just the “functional” aspects or more broadly?


#4

@vitalyd : comments on the Rust aspects, mainly. I think the functional patterns (map, fold, closures, pattern matching…) should be ok but I’m not sure how to implement them in the Rust way.


#5

Here’s a version to consider:

use std::env;
use std::fmt;
use std::fs::File;
use std::io;
use std::io::prelude::*;

#[derive(Debug)]
struct Stats {
    avg: f64,
    var: f64,
    algo: String,
    params: Vec<f64>,
}

fn compute_stats(data: &[f64], algo: &str, params: Vec<f64>) -> Stats {
    let n = data.len() as f64;
    let avg = data.iter().sum::<f64>() / n;
    let var = data.iter().fold(0.0, |acc, x| acc + (x - avg).powf(2.0)) / n;
    Stats {
        avg,
        var,
        algo: algo.into(),
        params,
    }
}

impl fmt::Display for Stats {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        writeln!(f, "average; variance; algo; params")?;
        write!(
            f,
            "{}; {}; {}; {}",
            self.avg,
            self.var,
            self.algo,
            self.params
                .iter()
                .map(ToString::to_string)
                .collect::<Vec<String>>()
                .join("|")
        )
    }
}

fn read<R: Read>(mut io: R) -> io::Result<Vec<f64>> {
    let contents = &mut String::new();
    io.read_to_string(contents)?;
    contents
        .split_whitespace()
        .map(|x| {
            x.parse()
                .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
        })
        .collect()
}

fn create_func(algo: &str, params: &[f64]) -> Result<Box<dyn Fn(&f64) -> f64>, String> {
    Ok(match (algo, params.len()) {
        ("mul2", 0) => Box::new(|x| x * 2.0),
        ("mul", 1) => {
            let k = params[0];
            Box::new(move |x| x * k)
        }
        ("sin", 2) => {
            let a = params[0];
            let b = params[1];
            Box::new(move |x| (x * a + b).sin())
        }
        _ => Err(format!(
            "unrecognized algo or invalid args: {} {:?}",
            algo, params
        ))?,
    })
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let args: Vec<String> = env::args().collect();
    if args.len() < 3 {
        return Err(format!("usage: {} <input> <algo> <params>", args[0]).into());
    }
    let filename = &args[1];
    let algo = &args[2];

    let params = args[3..]
        .iter()
        .map(|x| x.parse())
        .collect::<Result<Vec<f64>, _>>()?;

    let func = create_func(algo, &params)?;
    let data = read(File::open(filename)?)?;
    let data = data.iter().map(|x| func(x)).collect::<Vec<_>>();
    let stats = compute_stats(&data, algo, params);

    println!("{}", stats);
    Ok(())
}

You can throw some generics in there, but they don’t really add any value in this simple case. Mostly, the above removes unwrap()s and the use of process::exit(). I took the liberty of simplifying the create_func code, at the expense of a slightly more generic error message. This can also be made better (e.g. define an enum for the algo + its parameters + its execution), but I punted to keep this simple.

The Display impl can be more efficient (it collects an intermediate Vec<String> before joining), but again, it doesn’t matter here.
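If it ever did matter, one option is to write each parameter straight into the formatter. This is only a sketch: the Params wrapper is a hypothetical name I’m making up for illustration, not something from the code above.

```rust
use std::fmt;

// Hypothetical wrapper that prints a slice as "a|b|c" without
// building an intermediate Vec<String>.
struct Params<'a>(&'a [f64]);

impl<'a> fmt::Display for Params<'a> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for (i, p) in self.0.iter().enumerate() {
            if i > 0 {
                f.write_str("|")?; // separator between items only
            }
            write!(f, "{}", p)?;
        }
        Ok(())
    }
}

fn main() {
    println!("{}", Params(&[1.0, 2.5])); // prints "1|2.5"
}
```

Inside Stats’s fmt you could then pass Params(&self.params) as the last argument to write!.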

I made the read() fn generic over Read so that you could, e.g., substitute in-memory data (such as the bytes of a String) for testing.
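For example, a byte slice implements Read, so a test along these lines should work without touching the filesystem. It’s a sketch: read is repeated here so the snippet is self-contained (with an explicit f64 in parse for clarity), and the input strings are made up.

```rust
use std::io::{self, Read};

// Mirrors the read() fn above so the snippet stands alone.
fn read<R: Read>(mut io: R) -> io::Result<Vec<f64>> {
    let mut contents = String::new();
    io.read_to_string(&mut contents)?;
    contents
        .split_whitespace()
        .map(|x| {
            x.parse::<f64>()
                .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
        })
        .collect()
}

fn main() {
    // In-memory input instead of a File: &[u8] implements Read.
    let input = String::from("1.0 2.5\n-3");
    let data = read(input.as_bytes()).expect("all tokens are numbers");
    assert_eq!(data, vec![1.0, 2.5, -3.0]);

    // A bad token surfaces as an InvalidData error rather than a panic.
    let err = read("1.0 oops".as_bytes()).unwrap_err();
    assert_eq!(err.kind(), io::ErrorKind::InvalidData);
}
```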

For a real CLI, you’d likely want to use the clap and/or structopt crates rather than hand-rolling the argument parsing.

Feel free to ask questions.


#6

Yes, for arguments that are only read, use the most primitive type, as it’s the most flexible for callers. So &[T] is better than Vec<T> in arguments.
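A small sketch of why the slice is more flexible; mean is a hypothetical function, not one from the project:

```rust
// Hypothetical function: it only reads the data, so it borrows a slice.
fn mean(data: &[f64]) -> f64 {
    data.iter().sum::<f64>() / data.len() as f64
}

fn main() {
    let owned: Vec<f64> = vec![1.0, 2.0, 3.0];
    let array = [4.0, 6.0];

    assert_eq!(mean(&owned), 2.0); // &Vec<f64> derefs to &[f64]
    assert_eq!(mean(&array), 5.0); // a borrowed array works too
    assert_eq!(mean(&owned[1..]), 2.5); // and so does a sub-slice

    // `owned` is still usable here because it was only borrowed.
    assert_eq!(owned.len(), 3);
}
```

With a Vec<f64> parameter, the last two calls would need an allocation (to_vec()) first. Taking a Vec by value still makes sense when the function keeps the data, as compute_stats does with params.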

Try clippy, it has suggestions for these kinds of micro best practices:

cargo +nightly install clippy --force
cargo +nightly clippy

#7

Thanks for the comments.

Ok for &[] over Vec in arguments. Clap and structopt are very interesting, indeed. Clippy seems temporarily broken, so I’ll try to install it again later.

Traits and Result make me think of type classes and Either in Haskell. In create_func, the composition of the Err value inside the Ok value is quite impressive…
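For anyone else puzzled by that pattern, here’s my understanding as a stripped-down sketch (halve_even is a made-up function): the ? applied to an Err returns from the whole function immediately, so the surrounding Ok(…) is never actually built in that branch.

```rust
// Made-up example using the same shape as create_func: Ok(match ...)
// with an `Err(...)?` in the fallback arm.
fn halve_even(n: i32) -> Result<i32, String> {
    Ok(match n % 2 {
        0 => n / 2,
        // `?` on an Err value returns early with that error,
        // so control never reaches the outer Ok(...).
        _ => Err(format!("{} is odd", n))?,
    })
}

fn main() {
    assert_eq!(halve_even(10), Ok(5));
    assert_eq!(halve_even(7), Err("7 is odd".to_string()));
}
```

Writing return Err(format!(…)) in that arm should behave the same way here and is arguably easier to read.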


#8

Just for kicks, here’s how you might do it via an enum and avoid boxing:

#[derive(Debug)]
enum Algo {
    Mul2,
    Mul(f64),
    Sin { a: f64, b: f64 },
}

impl fmt::Display for Algo {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            Algo::Mul2 => f.write_str("x * 2.0"),
            Algo::Mul(by) => write!(f, "x * {}", by),
            Algo::Sin { a, b } => write!(f, "sin(x * {} + {})", a, b),
        }
    }
}

impl Algo {
    fn from_args(name: &str, params: &[f64]) -> Result<Self, String> {
        Ok(match (name, params.len()) {
            ("mul2", 0) => Algo::Mul2,
            ("mul", 1) => Algo::Mul(params[0]),
            ("sin", 2) => Algo::Sin {
                a: params[0],
                b: params[1],
            },
            _ => Err(format!(
                "unrecognized algo or invalid args: {} {:?}",
                name, params
            ))?,
        })
    }

    fn apply(&self, input: f64) -> f64 {
        match *self {
            Algo::Mul2 => input * 2.0,
            Algo::Mul(by) => input * by,
            Algo::Sin { a, b } => (input * a + b).sin(),
        }
    }
}

// In main(), in place of the create_func call:
let algo = Algo::from_args(algo, &params)?;
let data = read(File::open(filename)?)?;
let data = data.iter().map(|&x| algo.apply(x)).collect::<Vec<_>>();

Again, doesn’t really matter in this small example but just an FYI for another design option.

And yes, comparing traits and Result to type classes and Either in Haskell is pretty accurate.