Please review a small exercise


#1

As a learning exercise I wrote a small module that solves the Knight’s tour problem. Please review my code.

this is the module code from knights\mod.rs file:

use std::fmt;

pub struct Desk {
    x: i32,
    y: i32,
    max_step: i32,
    desk: Vec<Vec<i32>>,
}

impl Desk {
    pub fn new(x: i32, y: i32) -> Self {
        Desk {
            x: x,
            y: y,
            max_step: x * y,
            desk: vec![vec![0; x as usize]; y as usize]
        }
    }

    pub fn find_first(&mut self, x: i32, y: i32) -> bool {
        self.find_first_internal(x, y, 1)
    }

    fn find_first_internal(&mut self, x: i32, y: i32, mut step: i32) -> bool {
        if x < 0 || y < 0 || x >= self.x || y >= self.y || self.desk[y as usize][x as usize] > 0 {
            false
        } else {
            self.desk[y as usize][x as usize] = step;
            step += 1;
            let res = step > self.max_step
                || self.find_first_internal(x + 1, y + 2, step)
                || self.find_first_internal(x + 1, y - 2, step)
                || self.find_first_internal(x - 1, y + 2, step)
                || self.find_first_internal(x - 1, y - 2, step)
                || self.find_first_internal(x + 2, y + 1, step)
                || self.find_first_internal(x + 2, y - 1, step)
                || self.find_first_internal(x - 2, y + 1, step)
                || self.find_first_internal(x - 2, y - 1, step)
            ;
            if !res {
                self.desk[y as usize][x as usize] = 0;
            }
            res
        }
    }
}

impl fmt::Display for Desk {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let mut s = String::new();
        for y in 0..self.y {
            for x in 0..self.x {
                s.push_str(&format!("{:02} ", self.desk[y as usize][x as usize]));
            }
            s.push_str("\n\n");
        }
        write!(f, "{}", s)
    }
}

and this is how I use it in the main.rs

mod knights;

use std::time::Instant;
use std::io;

fn main() {
    println!("Enter four numbers: desk dimensions and first step coordinates");

    let mut input = String::new();
    io::stdin().read_line(&mut input).expect("Failed to read line");
    let mut iter = input.split_whitespace();

    let mut desk = {
        let x: i32 = iter.next().unwrap().parse().unwrap();
        let y: i32 = iter.next().unwrap().parse().unwrap();
        knights::Desk::new(x, y)
    };

    let start;
    let res = {
        let x: i32 = iter.next().unwrap().parse().unwrap();
        let y: i32 = iter.next().unwrap().parse().unwrap();
        start = Instant::now();
        desk.find_first(x, y)
    };
    let elapsed = start.elapsed();

    if res {
        println!("\n{}", desk);
    } else {
        println!("Can't knight all desk");
    }

    println!("Working time {:?} milliseconds.", (elapsed.as_secs() * 1_000) + (elapsed.subsec_nanos() / 1_000_000) as u64);
}

Questions:

  1. I see a big performance gap between debug and release builds. I suspect this is mainly because of using vector of vectors instead of two dimensional array or a slice of slices. Is it true? How can I make such an array or a slice of slices with variable size by a one line code? When I tried to init it like &mod vec![&mod vec![0; y as usize]; x as usize] I got a compilation error saying that &mod of Vec doesn’t implement Clone. Is there some other macro that allows to make &mod [&mod [<T>]] ?

  2. There is a warning (or just a suggestion from the intellij-rust plugin) that following two lines of the constructor could be simplified but I don’t understand how.

         x: x,
         y: y,
    

It points to this https://github.com/rust-lang/rfcs/blob/master/text/1682-field-init-shorthand.md

  1. Why Rust doesn’t support function overloading, i.e. having two functions with the same name but with different signatures? I prefer the find_first_internal() function to be named as find_first too.

Any other comments about the code are welcome.

P.S. how to make a preformatted text in this message colored according to Rust or other programming language syntax?


Is it a bug in Rust compiler or in LLVM?
#2

Some more or less general thoughts:

  • When checking for slice index validity, choose my_slice.len() for comparison, as there’s a big change the following access doesn’t require another boundary-check to be inserted by the compiler
  • you might want to switch to u32 or even usize for indexing to avoid unnecessary casts and checks within your inner loops
  • since the knight can only move to at most 8 distinct fields, you might want to precalculate the allowed moves for each field in an [u8; x * y], where each bit stands for an allowed move.
  • to be SSA-aware I’d do something like let next_step = step + 1; - it’s more robust and easier to read (at least for me)
  • I personally prefer to “exit a method early” over “nesting” (this also removes res)
if step > self.max_step return true;
if self.find_first_internal(x + 1, y + 2, next_step) return true;
if self.find_first_internal(x + 1, y - 2, next_step) return true;
if self.find_first_internal(x - 1, y + 2, next_step) return true;
if self.find_first_internal(x - 1, y - 2, next_step) return true;
if self.find_first_internal(x + 2, y + 1, next_step) return true;
if self.find_first_internal(x + 2, y - 1, next_step) return true;
if self.find_first_internal(x - 2, y + 1, next_step) return true;
if self.find_first_internal(x - 2, y - 1, next_step) return true;
self.desk[y as usize][x as usize] = 0;
false;

there are loooooots of other optimizations that can be integrated, but I believe you want to stick with more or less idiomatic Rust.

(formatted code can be enclosed in ```)
Hope this helps a bit.


#3

When checking for slice index validity, choose my_slice.len()

This is good only for square desk but it could be a rectangle, for example 5 x 6.

you might want to switch to u32 or even usize for indexing to avoid unnecessary casts and checks within your inner loops

No I can’t because the calculated value could be negative and program will panic. Can I switch all the x and y value checks into one Result check? I mean use x and y straightforward but receive Result from accessing the vector of vectors (the desk) instead of possible panic.


#4

Just use the variable names:

...
x,
y,
...

You can make find_first take an Option for the step and then default it to 1 if it’s None.


#5

Just use the variable names:

x,
y,

Thanks, I didn’t know about that. Is it always x to x and y to y or order is also significant?

You can make find_first take an Option for the step and then default it to 1 if it’s None.

Ok. But what if the second signature is completely different? Why it was decided not to support function overloading in Rust?


#6

Order doesn’t matter - it matches up by name (ie name of the binding = name of the field).

Don’t know the precise reason but I’m guessing it’s for stylistic reasons (ie same sentiment as requiring widening casts to be explicit). There’s usually a nice way to avoid the need for this in Rust though, given all its other features. In particular, generic functions make different permutations more pleasant, IMO. Is there a good example of where you’d want this with significantly different signatures?


#7

First check for the row index, retrieve the nested slice and perform the same for the column. That’s what the compiler does anyway.

Move the range checks out to the caller and reduce them to the needed minimum. There’s no need to check x + 1 < 0 if x was already valid. Same for the upper bound. If you pre-calculate those boundary checks as mentioned in my previous post you can even eliminate all explicit range-checks.

I found a one-dimensional array to be faster in most cases.

As said before: If speed matters there are lots of tricks at the price of readability.


#8

My goal is not diving into the CPU ticks but learning Rust. I’ve changed my exercise code into following

knights\mod.rs file:

use std::fmt;

pub struct Desk {
    max_step: usize,
    desk: Vec<Vec<usize>>,
}

impl Desk {
    pub fn new(x: usize, y: usize) -> Self {
        Desk {
            max_step: x * y,
            desk: vec![vec![0; x]; y]
        }
    }

    pub fn find_first(&mut self, x: usize, y: usize) -> bool {
        self.find_first_internal(Some(x), Some(y), 1)
    }

    fn find_first_internal(&mut self, x_opt: Option<usize>, y_opt: Option<usize>, mut step: usize) -> bool {
        if y_opt.is_none() || x_opt.is_none() {
            return false
        }

        let x = x_opt.unwrap();
        let y = y_opt.unwrap();
        if y >= self.desk.len() || x >= self.desk[y].len() || self.desk[y][x] > 0 {
            return false
        }

        self.desk[y][x] = step;
        step += 1;

        return step > self.max_step
            || self.find_first_internal(x.checked_add(1), y.checked_add(2), step)
            || self.find_first_internal(x.checked_add(1), y.checked_sub(2), step)
            || self.find_first_internal(x.checked_sub(1), y.checked_add(2), step)
            || self.find_first_internal(x.checked_sub(1), y.checked_sub(2), step)
            || self.find_first_internal(x.checked_add(2), y.checked_add(1), step)
            || self.find_first_internal(x.checked_add(2), y.checked_sub(1), step)
            || self.find_first_internal(x.checked_sub(2), y.checked_add(1), step)
            || self.find_first_internal(x.checked_sub(2), y.checked_sub(1), step)
            || (|| {self.desk[y][x] = 0; false})()
    }
}

impl fmt::Display for Desk {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let mut s = String::new();
        for row in &self.desk {
            for cell in row {
                s.push_str(&format!("{:02} ", cell));
            }
            s.push_str("\n\n");
        }
        write!(f, "{}", s)
    }
}

main.rs file:

mod knights;

use std::time::Instant;
use std::io;

fn main() {
    println!("Enter four numbers: desk dimensions and first step coordinates");

    let mut input = String::new();
    io::stdin().read_line(&mut input).expect("Failed to read line");
    let mut iter = input.split_whitespace();

    let mut desk = {
        let x: usize = iter.next().unwrap().parse().unwrap();
        let y: usize = iter.next().unwrap().parse().unwrap();
        knights::Desk::new(x, y)
    };

    let start;
    let res = {
        let x: usize = iter.next().unwrap().parse().unwrap();
        let y: usize = iter.next().unwrap().parse().unwrap();
        start = Instant::now();
        desk.find_first(x, y)
    };
    let elapsed = start.elapsed();

    if res {
        println!("\n{}", desk);
    } else {
        println!("Can't knight all desk");
    }

    println!("Working time {:?} milliseconds.", (elapsed.as_secs() * 1_000) + (elapsed.subsec_nanos() / 1_000_000) as u64);
}

#9

I don’t think it has any relation to a code style. Function overloading is a useful feature and has a good style when used in the source code. I think the real reason was a reluctance to symbol decorations in the object files (mangled symbols). This is how function overloading is implemented in C++, isn’t it? BTW just found a good article about demangling C++ symbols in Rust by the cpp_demangle crate: http://fitzgeraldnick.com/2017/02/22/cpp-demangle.html

Take a look into the second version of the exercise code in my above reply to fyl2xp1.

Could you show me how to change my code above to use a generic function?


#10

I’d just rename the second one to find or find_next.

Typical reasons for overloading:

  • Optional parameters -> use Option in Rust
  • Call a method with the same parameters but alternate types -> use generics in Rust
  • Misuse the types and number of parameters to modify how the method works -> split into multiple methods with descriptive names (valid for all languages)

Rust uses its exceptionally strong type system to select the proper traits and provide a comfortable type inference at the same type. I believe this collides with overloading which weakens at least the type inference. Overloading complicates the use of variadic functions (which may be implemented in the future). Maybe you should ask this as a separate question.

I find the current code very hard to read.

  • I wouldn’t call find_fist_internal if the caller already knows there’s a None in the parameters.
  • checked_add doesn’t make sense, unless your board fills your entire memory :wink:
  • return can be omitted for the last expression
  • why do you use || for a simple sequence?
  • to me, the last element in this chain is a quite dirty hack for no obvious reason

#11

But it also finds the first knights route thus the find_first_internal has a meaningful name and its _internal part was added just because Rust doesn’t support function overloading.

Not a useful solution since I can’t omit those optional parameters and still need to set them as None. For example:

fn main() {
	//fu(1, 2); // doesn't compile
	fu(1, 2, None);
}

fn fu(_a: i32, _b: i32, _c: Option<i32>) {
	println!("it  works");
}

The code logic for different parameter types could be different. In this case a generic function is not a solution.

It’s not necessarily a misuse. For example I may want to have several versions of a parse() function. The first version will get a string parameter that makes it able to do a multi-pass parsing. The second version will get a stream or a socket and will not be able to do a multi-pass parsing of a big text. The third version could be an asynchronous version of the second and with an additional callback parameter. Of course I can make those functions with different names but this will be a manually done name mangling, i.e. mimic of function overloading.

In Java the varargs feature is just a syntactic sugar of sending the last sub-set of parameters as an array. Is it what may be implemented in Rust in the future? I don’t see how function overloading could complicate the use of this.

The caller can’t check it without making the code less readable.

checked_sub() is the safe way to do a subtraction from usize you suggested to switch into from i32. checked_add() is used just because of checked_sub().

I consider this as a bad practice and think it could be good only in a small closure functions like one I use there. If in a future I will add some new code after such a return without the return keyword I may forget to add this keyword back, this would compile and would introduce a bug. Was in such a situation when programmed in Groovy.

Don’t understand your question. Why I use the OR logic between return values of the function calls? Because it’s logical and because it stops calling them either at the end or after one of them returned true. The same technique is used in UNIX shells to run command B in case command A has failed and to run even command C in case both A and B have failed.

What makes it dirty in your eyes? If all the previous returned false I also return false but do something before that.


#12

I actually don’t have the feeling I can provide you with a single suggestion you would accept. Maybe someone else can.


#13

Maybe you can show how you were writing this code of the Knight’s tour?