Huffman coding for rosetta code entry #code-review


#1

Hi there,

on my way learning Rust I was digging in some basic algorithm and I found out that the Huffman coding entry in Rosetta code wiki doesn’t include a Rust implementation. So I decided to give it a try:

https://play.rust-lang.org/?gist=b4c8abe345e18e2d4832&version=stable

Since this is my first Rust code with some common sense in it I would like to ask for code review before creating the entry in Rosetta code wiki.

Of course many thanks to all the people that helped me out on #rust-beginners and #rust channels.

Thanks!

Enrico


#2

This looks pretty good to me. One minor change I would make, just to shorten some code, is to replace this:

let mut p = "";

match i {
    0 => p = "0", // left node
    1 => p = "1", // right node
    _ => (),
}

with this:

let p = match i {
    0 => "0", // left node
    1 => "1", // right node
    _ => "",
};

Edit: Also, perhaps instead of a Vec of children, Node::Internal should have left_child and right_child fields.


#3

many thanks @mbrubeck ! that makes totally sense to me.

BTW… https://github.com/Hoverbear/rust-rosetta/blob/master/src/huffman_coding.rs

Why this is not in Rosetta code wiki already? :disappointed:

EDIT: https://github.com/Hoverbear/rust-rosetta/issues/441


#4

I like using a fixed size array when left/right child members come up. This enables both static checking that you have exactly 2 children (of course), but it also makes it possible to generalize some algorithms, but iterating over the array, or using k and 1 - k to access the children (where k is either 0 or 1). It lets you exploit symmetries to avoid repetition, simply. (I learned this from @ http://eternallyconfuzzled.com/ )


#5

@bluss that was actually the idea since I’m planning to transform this to an n-ary algorithm. But maybe for the sake of simplicity we can just use left_child and right_child. Still not sure though.


#6

I would second bluss’s idea of using a fixed size array. My only other concern would be the:

let elements = Vec::from(elements);
let mut priority_queue = BinaryHeap::new();

priority_queue.extend(elements.into_iter());

In the create_tree function. I think the elements are being cloned here, and I don’t think that is needed.


#7

Take also a look at the D entry (that beside being short is also quite efficient, it takes a rather short time to build a tree of a million different symbols).


#8

This will move the elements, not copy (speaking rust level semantics). The into_iter for vec ensures this.