Short implementation of Convex Hull algorithm

Hi!

Mechanical engineer here, with mostly Python and C++ experience. Finally starting to write useful code in Rust.

This is an implementation of the "Graham Scan" method of computing plane convex hulls for natural numbers. I realise it's not much and its structured whilst having (by habit) C++ in the back of my head. So I'd love to get some feedback on how it could improve.

As a(n important) side-question: How to extend it for floating point numbers? There is a lot of potential for rounding errors etc., so for a large projects containing these types of algorithms, would you treat floats in-place or would you start the application by (given the desired precision) convert to real numbers and then convert back at the end?

The code is found (along with my stupid ponderings) here: https://github.com/SteffenSunde/rust_notebooks/blob/master/convex_hull.ipynb

Thanks a lot!

3 Likes

Wow, the notebook format works really well for this sort of explanatory/exploratory coding! I used to use it all the time with Python, but didn't know the Rust backend worked this well.


fn convex_hull(points: &Vec<Point>) -> Vec<Point> {
    let mut points = points.clone();
    points.sort();  // How to make sorted AND immutable?

    ...
}

The signature for your convex_hull() function is a bit funny. Usually you'll want to accept a &[T] instead of &Vec<T>, seeing as the only benefits a Vec can provide are when doing mutation.

Another thing to notice is that you're accepting a reference to some points list then immediately cloning it. Usually if your algorithm needs to make potentially expensive copies, you'll give the caller the choice about how this is done. This usually takes the form of accepting a Vec<T> by value, because then the caller can decide whether they want to call my_points.clone() or if they're happy with giving away ownership of the original list so you can reuse the data and allocation (rust API guidelines link).

You could also take the functional path and use the Itertools::sorted() extension. This is kinda like python's sorted() function in that you invoke this method on some iterable thing and get back an iterator that is sorted with Ord.


fn orientation(a: &Point, b: &Point, c: &Point) -> Option<Turn> { ... }

A Point is 64-bits long (2x i32), while a &Point is also 64-bits long and adds a level of indirection. Your Point type implements Copy, so it's actually more efficient (and idiomatic) to pass by value here.


I do a lot of work in the field of CAD and CNC and I can tell you that floats are a real pain to work with. Rust makes this more obvious than most languages by only implementing PartialOrd for floats ("two instances of this type are comparable... most of the time") and not implementing Eq ("you can always check for equality") and Ord ("you can always compare two instances").

When doing computational geometry you'll often try to phrase questions in terms of "is this curve CW or ACW?" because orientation tends to be more resistant to rounding errors.

Another thing you can do is to manually implement equality for Point using some "good enough" epsilon value (i.e. (self.x - other.x).abs() < EPSILON && (self.y - other.y).abs() < EPSILON). It feels like there should be a better way, but this tends to be the best bang for your buck.

I dunno. I guess you could try to use some sort of Rational type, but considering how ubiquitous f32 and f64 are it'll be annoying to integrate with 3rd party libraries. Looking at num::rational::Rational it looks like a Rational<f64> is just implemented with a pair of i64s so there shouldn't be too much memory overhead, but you'll probably notice a performance hit when doing things like trigonometry or square roots. It'd also mean a Point implemented using Rational<i64> would no longer fit into a single register, which can hurt performance.


fn convex_hull(points: &Vec<Point>) -> Vec<Point> {
    ...
    
    // Traverse points to form the upper part of CH
    let mut upper = Vec::new();
    upper.push(points[0]);
    upper.push(points[1]);

    ...
}

Just so you know, you can use the vec![] macro to make this shorter (e.g. vec![points[0], points[1]]).


/// Computes the convex hull of given points in O(nlogn)
fn convex_hull(points: &Vec<Point>) -> Vec<Point> {
    ...
    
    // Traverse points to form the upper part of CH
    let mut upper = Vec::new();
    upper.push(points[0]);
    upper.push(points[1]);
    for i in 2..n {
        let mut pos = 0;
        while upper.len() - pos >= 2 {
            let length = upper.len();
            if let Some(Turn::CW) = orientation(&upper[length-pos-2], &upper[length-pos-1], &points[i]) {
                pos = pos + 1;  // Middle point belongs to to the CH for now
            } else {
                upper.remove(length-pos-1);  // Middle point is not on the CH boundary: remove.
            }
        }
        upper.push(points[i]);
    }

It feels like you could express this more clearly using a functional style (e.g. iterators and combinators) instead of imperative (working with indices and pushing/popping). I may be a bit biased, but it took me a couple seconds to figure out what the algorithm is trying to do.

You should also pull the logic for upper and lower hulls out into a function. The two chunks look directly copy/paste'd and it took a while to realise the lower hull iterates in reverse. You might want to try using iterators, and if the two bits of code differ by a single operation (e.g. a + b instead of a - b) you can accept a closure and trust the compiler will inline it appropriately.


Construction of objects are usually preferred in rust using the builder-pattern, and naming a zero cost constructor using function named new(). The point struct is trivial and so this is not really necessary, but it is nicer than using bracket construction for every point.

I'd say this comment is pretty close to the truth :+1:

90% of the time your types will be really simple (e.g. less than 4 fields) and not need any complicated setup, so providing a static new() method works great as a constructor. It's only when you have things like default values or need to do some validation (e.g. parsing a URL when constructing a HTTP request) that you'll reach for the builder pattern.


I think you sell yourself short. Sure, it's a little more procedural than most Rust, and you aren't using iterators as much as you could (probably because iterators aren't as convenient in C++), but overall it fulfills the intended purpose and looks like code most people would write :slight_smile:

It's also not a bad thing that you were leaning on your C++ experience when putting this together. C++ and Rust are actually quite similar, and a lot of the concepts or techniques you use in one language are equally applicable in the other.

3 Likes

The notebook format works amazingly well here. Somehow this should be shared more widely as "best practices", particularly for nauplii (new Crustaceans).

In the notebook you had

    let mut points = points.clone();
    points.sort();  // How to make sorted AND immutable?

The easy way to sort points and then make it immutable is simply to redeclare points once again with an immutable shadowed version:

    let mut points = points.clone();
    points.sort();  // This instance of points is mutable
    let points = points;  // This instance of points is immutable

Trust the compiler to optimize away the apparent copy.

3 Likes

Wow thank you, Michael-F-Bryan and TomP! Learning a lot here. Will surely look into improving the code.

Also, thanks for appreciating the notebook format. It's very practical for learning/experimenting, and sometimes the aesthetics of working in article-format with markdown, equations etc. is really pleasing.

So I've started looking into handling floats and I'm considering following this idea:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=36c5bd7b035546136ce81113969e5399

I'd really appreciate some insights into whether this is a good way to attack the problem!

Thanks

If suggest to start out by internalizing the properties of NaNs. Specifically, any comparison involving a NaN is false. At first this is counterintuitive, but once you get used to it, you can arrange your comparisons such that they consistently do the right thing. You just need for true to be "good news" result. e.g. when minimizing, you check if a new value is less than the old one, and you automatically avoid NaNs. As I say, it takes a while to get used to, but floating point arithmetic is designed to allow you to efficiently do computations even with the possibility of NaNs and infinities, provided you pay attention to what you're asking of it. And this means you can very often get away with no is_nan checks.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.