Is this a good solution to first exercise in chapter 8?

Hi anyone reading this, the title says it all haha. I tried to handle all cases I could think of with median and mode. But I'm not crazy about the readability of my mode function, do any of you have suggestions for how to improve this code? Thank you in advance!

use rand::{thread_rng, Rng};
use std::collections::HashMap;

fn median(list: &Vec<i32>) -> f64 {
    let len = list.len();
    if len % 2 == 0 {
        let a = list[(len / 2) - 1];
        let b = list[len / 2];
        return (a + b) as f64 / 2.0;
    } else {
        return list[len / 2] as f64;
    }
}

fn mode(list: &Vec<i32>) -> Vec<(i32, i32)> {
    let mut map = HashMap::new();
    for &n in list {
        map.entry(n).and_modify(|count| *count += 1).or_insert(1);
    }

    let max_val = map.values().cloned().max().unwrap_or(0);
    map.into_iter().filter(|&(_, val)| val == max_val).collect()
}

fn main() {
    let mut rng = thread_rng();
    let mut list = Vec::new();

    for _ in 0..rng.gen_range(50..=100) {
        list.push(rng.gen_range(-4..=4));
    }

    list.sort();
    let med = median(&list);
    let mode = mode(&list);

    if mode.len() > 1 {
        println!("Median: {med}\nModes: (occur {} times)", mode[0].1);
        for (n, _count) in mode {
            println!("\tMode: {}", n);
        }
    } else {
        println!("Median: {}\nMode: {} (occurs {} times)", med, mode[0].0, mode[0].1);
    }
}```

First off, &Vec<T> as a function argument is bad. It can't give the implementation more capabilities than a slice would, but it requires the caller to have a literal Vec. If the caller only has some other type that would coerce to a slice, you just unnecessarily forced them to allocate a Vec.

Next up, explicit returns are completely unnecessary in median(). Rust is an expression language.

Then, the way you manipulate the HashMap entries in mode can be simplified by inserting a 0 (instead of 1) if the key does not exist. Inserting the identity element of the operation (addition) is better in general, because it lets you perform the operation unconditionally, simplifying the code.

The return type of mode is also non-ideal. You are returning a vector of pairs, but the second field of each pair – the count – is the same; it must be, by definition! The current return type doesn't provide this guarantee, it allows the function to return different modes with different counts, which is nonsensical.

Finally, neither of your functions handle an empty input correctly. The first one will panic, and the second one is brittle (you are using the impossible value of 0 to compare counts against, so I guess it's working correctly now, but it may break if you come back later to modify the function). You should be returning Option in both cases.

The improved solution.

2 Likes

Thank you for taking the time to respond! Your first observation has made me realize that I put up the wrong code. The one I meant to put up had &[i32] as the type annotation for list

yeah i forgot to go back and change or_insert(1) to or_insert(0), I noticed that it would count the occurrences wrong if I left it as 1. But i never got to doing it haha.

thank you for catching the explicit return. I thought I had to use them in ifs

Yeah I see the redundance of the return type of mode. But I don't understand how the function allows for there to be different modes with different counts, isn't map.into_iter().filter(|&(_, val)| val == max_val).collect() only returning the numbers that occur max_val times? Would you elaborate on what you mean a little more?

When I wrote this program, I did so under the assumption that there wouldn't be a case where the input is empty. So from now on, even if the input won't be empty, should I still write code that handles None?

Also, thank you for putting up the improved solution. That's really helpful!

The return type allows it. I.e., it is the guarantee that it's always the same (maximal) count that is not communicated to the caller; so this information is lost.

Of course, the current implementation of the function does always return the same count, but since anything else would be nonsensical, the return type should not even allow the possibility of multiple counts.

A Vec or a slice can be empty. So if you don't have a good reason to assume that it isn't (and by that, I mean an algorithmic guarantee, and emphatically not "people will not pass an empty slice", because they absolutely will), then you should handle such edge cases.

3 Likes

Oh I see. It doesn't make sense for mode's return type to include each mode's occurrence count. Can doing it like I did previously impact performance? I feel like it does when I think of each step taken to return it that way, but just want to confirm.

Gotcha, I'll make sure going forward to consider the edge cases and handle them accordingly.

A couple of side questions. I know this post is about the first problem, but should I post my solutions to the other exercises in chapter 8 here or just a new create a new post?

And, I'm really trying to wrap my head around ownership, the borrow checker, the types that difference methods expect and return, and how to use them. I'm using Brown's interactive version of the book and each exercise and quiz helps, but most of the book is reading with a couple of simply quizzes. Do you know anywhere where I can practice these things?

No idea, and it probably doesn't matter. If it does, start measuring.

Create a new post.

The best way is to start building something real. No amount of reading other people's blogs will give you a practical feeling of the ownership and borrowing model.

2 Likes

Gotcha, thank you for your help dude!