I'm working through the exercises at the end of Chapter 8 of the book The Rust Programming Language.
I'm stuck on the first one: "Given a list of integers, use a vector and return the mean (the average value), median (when sorted, the value in the middle position), and mode (the value that occurs most often; a hash map will be helpful here) of the list."

Here is my code:

use std::collections::HashMap;

fn main() {
    let number_list = vec![1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10, 8, 12, 8, 1, 2, 3];
    let mut mutable_vec = number_list.clone();

    fn calculate_mean(vec: Vec<i32>) -> f32 {
        let mut sum = 0;
        for i in &vec {
            sum += i
        }
        return (sum as f32) / (vec.len() as f32);
    }

    fn calculate_median(mut_vec: &mut Vec<i32>) -> f32 {
        mut_vec.sort();
        let length = mut_vec.len();
        if length % 2 == 0 {
            // even length: average the two middle elements,
            // which sit at indices length / 2 - 1 and length / 2
            return (mut_vec[length / 2 - 1] as f32 + mut_vec[length / 2] as f32) / 2.0;
        } else {
            // odd length: a single middle element
            return mut_vec[(length - 1) / 2] as f32;
        }
    }

    fn calculate_mode(vec: Vec<i32>) -> Vec<i32> {
        let mut hashmap = HashMap::new();
        let mut biggest_value = 0;
        let mut answer = Vec::new();
        // count occurrences and track the highest count seen so far
        for value in vec {
            let count = hashmap.entry(value).or_insert(0);
            *count += 1;
            if *count > biggest_value {
                biggest_value = *count
            }
        }
        // every value that reaches the highest count is a mode
        for (k, v) in hashmap {
            if v == biggest_value {
                answer.push(k);
            }
        }
        return answer;
    }

    println!("the mean is: {}", calculate_mean(number_list));
    println!("the median is: {}", calculate_median(&mut mutable_vec));
    println!("the mode is: {:?}", calculate_mode(number_list));
}

The above code won't compile. The problem is that the vector number_list is moved into calculate_mean and then used again by calculate_mode. If I comment out either of these two calls, the code compiles.

What is the best way to solve this issue? Is there a way to do it without duplicating the vector number_list?

You don't need to take ownership of the Vec to calculate the mean or mode. For example, here in calculate_mean, you're only using a reference:

for i in &vec {
    sum += i
}

So, try only taking a reference:

fn calculate_mean(vec: &[i32]) -> f32 {
// Or
// fn calculate_mean(vec: &Vec<i32>) -> f32 {
// But for shared slices like this, `&[i32]` is more general and
// more idiomatic

Once you change the function signatures, you'll have to change other parts of your code to match. Give it a shot and feel free to ask follow-up questions.
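For reference, here is a minimal sketch of how the mean and mode functions might look once they borrow a slice instead of taking ownership. The parameter name `values` and the use of iterator adapters are my own choices, not something the original code requires, and the modes are sorted only because HashMap iteration order is unspecified.

```rust
use std::collections::HashMap;

// Borrow the data as a shared slice; the caller keeps ownership.
// Note: an empty slice yields NaN (0.0 / 0.0) in this sketch.
fn calculate_mean(values: &[i32]) -> f32 {
    let sum: i32 = values.iter().sum();
    sum as f32 / values.len() as f32
}

// Count occurrences, then collect every value that reaches the highest count.
fn calculate_mode(values: &[i32]) -> Vec<i32> {
    let mut counts: HashMap<i32, usize> = HashMap::new();
    for &value in values {
        *counts.entry(value).or_insert(0) += 1;
    }
    let highest = counts.values().copied().max().unwrap_or(0);
    let mut modes: Vec<i32> = counts
        .into_iter()
        .filter(|&(_, count)| count == highest)
        .map(|(value, _)| value)
        .collect();
    modes.sort(); // make the result order deterministic
    modes
}

fn main() {
    let number_list = vec![1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10, 8, 12, 8, 1, 2, 3];
    // `&number_list` coerces from `&Vec<i32>` to `&[i32]`, so the vector
    // is only borrowed and can be passed to both functions.
    println!("the mean is: {}", calculate_mean(&number_list));
    println!("the mode is: {:?}", calculate_mode(&number_list));
}
```

Because both calls only borrow number_list, no clone is needed for either of them.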

I try to write clone-free functions, because a hidden clone can be a source of memory bloat for LARGE arrays. I deal with nearly a billion f32s in a single memory slice, so this can be the difference between working and not working. The function has no idea how big the input vector may be.

Basically, I like the principle of least surprise: a function shouldn't silently clone my vector. Let the function signature force the caller to clone instead. For a function that has to sort the data, such as the median, that leaves two options:

A) Take &mut Vec<i32>, because that tells the caller you WILL modify the data. If the caller doesn't want to risk the vector being changed, the caller can clone it first. (In most statistics use cases, the order doesn't matter.) The function can then just call sort directly on the passed-in value.

B) Take x: Vec<i32>, taking ownership of the vector. Here again, you force the caller to EITHER clone the data or make the call to median the LAST operation on that vector.

In both cases, you give the caller the flexibility to make the code significantly more efficient (not running out of RAM, in my case).
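To make options A and B concrete, here is a rough sketch of both signatures. The names `median_in_place`, `median_owned`, and `median_of_sorted` are hypothetical, chosen only for this illustration, and the helper assumes a non-empty input. For an even length, the two middle elements sit at indices n / 2 - 1 and n / 2.

```rust
// Option A: borrow mutably. The signature tells the caller that the
// vector will be modified (here: sorted in place). No clone happens.
fn median_in_place(values: &mut Vec<i32>) -> f32 {
    values.sort();
    median_of_sorted(values.as_slice())
}

// Option B: take ownership. The caller must either clone explicitly
// or give up the vector, so any copy is visible at the call site.
fn median_owned(mut values: Vec<i32>) -> f32 {
    values.sort();
    median_of_sorted(&values)
}

// Hypothetical helper: expects a sorted, non-empty slice
// (it panics on an empty one).
fn median_of_sorted(sorted: &[i32]) -> f32 {
    let n = sorted.len();
    if n % 2 == 0 {
        (sorted[n / 2 - 1] as f32 + sorted[n / 2] as f32) / 2.0
    } else {
        sorted[n / 2] as f32
    }
}

fn main() {
    let mut numbers = vec![1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10, 8, 12, 8, 1, 2, 3];

    // Option A: `numbers` is sorted afterwards, but still usable.
    println!("median (in place): {}", median_in_place(&mut numbers));

    // Option B with an explicit clone: the cost is visible here.
    println!("median (owned, cloned): {}", median_owned(numbers.clone()));

    // Option B as the last operation: the vector is moved, no clone.
    println!("median (owned, moved): {}", median_owned(numbers));
}
```

Either way, the decision to pay for a copy sits with the caller rather than being hidden inside the function.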

Again, this is just a general rule of thumb that I follow.