[Solved] Help me fight the borrow checker over a nice API to work with samples


#1

I am doing some statistics and working with random samples. Here is some typical auxiliary code:

pub fn gen_log_sample(sample_size: usize, k: usize) -> Vec<f64> {
    let mut sample = frechet(1.1, sample_size);

    sample.order_stat_top(k).normalize().ln();

    sample.unwrap()
}

The frechet(α, n) function returns a Sample, which is just a struct wrapper around Vec<f64>. I do this so I can impl several transformations on the Sample. Each transformation takes &mut self and returns self as a &mut Sample to allow chaining. I took the inspiration for this from the gnuplot library. The final unwrap takes ownership of self, removes the wrapper and returns the raw Vec<f64>.

This is already way nicer than what I had before, but I’m still unhappy about the chaining. I would love to write

pub fn gen_log_sample(sample_size: usize, k: usize) -> Vec<f64> {
    // This code does not compile!
    frechet(1.1, sample_size)
        .order_stat_top(k)
        .normalize()
        .ln()
        .unwrap()
}

This would get me to the point where I wouldn’t even need those ill-named auxiliary functions like gen_log_sample(..) anymore. But there are some problems caused by ownership/borrowing: I can’t combine the first two lines (sampling & transformations), because then no one owns the Sample. Neither can I combine transformations & unwrapping, because unwrap needs to take ownership, and transformations don’t yield ownership.
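To make the problem concrete, here is a minimal sketch of the &mut self style described above. The Sample type and the method bodies are stand-ins I made up for illustration (in particular, "normalize" is assumed to mean dividing by the largest value); only the signatures matter. The commented-out line is the one the compiler rejects.

```rust
// Hypothetical stand-in for the poster's `Sample`; bodies are placeholders.
pub struct Sample(Vec<f64>);

impl Sample {
    // Assumed meaning of "normalize": divide every value by the largest one.
    pub fn normalize(&mut self) -> &mut Self {
        let max = self.0.iter().cloned().fold(f64::MIN, f64::max);
        for x in self.0.iter_mut() {
            *x /= max;
        }
        self
    }

    // Replace every value by its natural logarithm.
    pub fn ln(&mut self) -> &mut Self {
        for x in self.0.iter_mut() {
            *x = x.ln();
        }
        self
    }

    // Consumes the wrapper, so it needs an owned `Sample`.
    pub fn unwrap(self) -> Vec<f64> {
        self.0
    }
}

pub fn chained_by_reference(data: Vec<f64>) -> Vec<f64> {
    // A named binding is required so that something owns the `Sample`.
    let mut sample = Sample(data);

    // Each call hands back a `&mut Sample` borrowed from `sample`.
    sample.normalize().ln();

    // Rejected by the compiler: `ln` yields `&mut Sample`, but `unwrap`
    // takes `self` by value and cannot move out of a borrow.
    // sample.normalize().ln().unwrap()

    sample.unwrap()
}
```

So the chain itself is fine; it is only the owning first step and the consuming last step that cannot be part of it.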

I would love to hear your ideas on how I can improve my API for this. (I feel like there is probably some trait magic to guide me.)


#2

You don’t need to return and pass &mut Sample - just pass the Sample and return the Sample, i.e. move the value into and out of the intermediate functions.


#3

Sample -> Sample type functions is how I would write this as well.

Note there is a tradeoff here, as Sample -> Sample functions have their own set of limitations; for instance, it’s hard to apply them iteratively inside a loop (doing so often requires some trick like storing the value in a mutable Option and .take()-ing and ::std::mem::replace()-ing it).

Apparently I forgot that I already discovered that the option trick isn’t necessary (scroll to bottom)
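For reference, a rough sketch of the two loop situations, under my own assumptions: the Sample type and the shift method are made up for illustration. With an owned local variable, plain reassignment works. When the value is only reachable through a &mut (a struct field, say), std::mem::replace can swap in a cheap placeholder while the real value is being transformed.

```rust
// Hypothetical stand-in type; `shift` is an invented by-value transformation.
pub struct Sample(Vec<f64>);

impl Sample {
    // Sample -> Sample: adds one to every value.
    pub fn shift(self) -> Self {
        Sample(self.0.into_iter().map(|x| x + 1.0).collect())
    }

    pub fn unwrap(self) -> Vec<f64> {
        self.0
    }
}

// Owned local: move the value out and assign the result back each iteration.
pub fn shift_n(sample: Sample, n: usize) -> Sample {
    let mut s = sample;
    for _ in 0..n {
        s = s.shift();
    }
    s
}

// Behind `&mut`: temporarily leave an empty placeholder in the slot,
// transform the owned value, then write the result back.
pub fn shift_in_place(slot: &mut Sample, n: usize) {
    for _ in 0..n {
        let owned = std::mem::replace(slot, Sample(Vec::new()));
        *slot = owned.shift();
    }
}
```

One caveat with the second variant: if the transformation panics, the slot is left holding the empty placeholder.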


#4

I was considering that solution as well for a bit. The drawback seems to be that it makes the first style harder, as I need more let bindings.

pub fn gen_log_sample(sample_size: usize, k: usize) -> Vec<f64> {
    let sample = frechet(1.1, sample_size);

    let sample = sample.order_stat_top(k).normalize().ln();

    sample.unwrap()
}

But then, I’m not sure I’m ever going to do that. I’ll go with your suggestion until it actually becomes a nuisance. Thanks for the encouragement, both of you!


#5

I thought you wanted the chaining style, no? If so, this is what it looks like with the function stubs:

struct Sample(Vec<f64>);

impl Sample {
    fn new() -> Self {
        Sample(vec![])
    }
    
    fn order_stat_top(self, k: usize) -> Self {
        self
    }
    
    fn normalize(self) -> Self {
        self
    }
    
    fn ln(self) -> Self {
        self
    }
    
    fn unwrap(self) -> Vec<f64> {
        self.0
    }
    
}

fn main() {
    // No `mut` needed: every method takes the Sample by value.
    let s = Sample::new();
    let v = s.order_stat_top(1).normalize().ln().unwrap();
}
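With the stubs filled in, the whole of gen_log_sample collapses into one expression. The bodies below are only my guess at what the transformations do (keep the k largest values, divide by the smallest remaining one, take logs), and log_sample takes ready-made data in place of frechet(1.1, sample_size), so treat this as a sketch and not as the original implementation.

```rust
pub struct Sample(Vec<f64>);

impl Sample {
    // Assumption: keep the k largest values, sorted in descending order.
    pub fn order_stat_top(mut self, k: usize) -> Self {
        self.0.sort_by(|a, b| b.total_cmp(a));
        self.0.truncate(k);
        self
    }

    // Assumption: divide every value by the smallest remaining one.
    pub fn normalize(mut self) -> Self {
        let min = self.0.iter().cloned().fold(f64::INFINITY, f64::min);
        for x in self.0.iter_mut() {
            *x /= min;
        }
        self
    }

    // Replace every value by its natural logarithm.
    pub fn ln(mut self) -> Self {
        for x in self.0.iter_mut() {
            *x = x.ln();
        }
        self
    }

    pub fn unwrap(self) -> Vec<f64> {
        self.0
    }
}

// Hypothetical replacement for gen_log_sample: `data` stands in for the
// output of frechet(1.1, sample_size).
pub fn log_sample(data: Vec<f64>, k: usize) -> Vec<f64> {
    Sample(data).order_stat_top(k).normalize().ln().unwrap()
}
```

Because each method takes mut self and returns Self, the temporary produced by the constructor is moved through the chain and no intermediate let binding is needed.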