Rust optimisation


I have just started to learn Rust and coming from a web developer background it can both be rewarding and frustrating. I learn best by trying out known algorithms in the new language. I have implemented a brute force solution of the Travelling Sales Person algorithm in several other languages and wanted to compare the performance of Rust with the other languages.

Right now Rust performs really bad (factor 10 slower than for instance nodejs) but I can only think that my limited experience is the culprit of the slow program.

I would be happy to receive pointers to where I can improve the performance of my Rust program.

The program can be seen in this gist.


Just to make sure, did you enable optimizations, i.e., compile with rustc -O or cargo build --release?


For Rust I also suggest you to read the manual, because there are parts that are harder to learn by trial and error.


Thanks, Made it a lot better, I had tried with cargo build --release (using cargo for a timing crate) it did not make any difference. So I hadn’t tried it with rustc -O. Now it performs similarly to nodejs.

Is there any of the data structures that are slow?


I have done some small improvements on your code, the performance is the same (compiled with: -C opt-level=3 -C target-cpu=native ):

The default HashMap/HashSet of Rust std library are quite slow. And in general it’s not a good idea to use a hash lookup inside the inner loop of this algorithm. So a better solution is to use just vectors and regular indexing.


I’ve seen that the f64::hypot function is very slow. Can I put your code with small differences in a bug report?


I’ll try to optimise away the hash map and see if there is any difference.

You can use the code in the bug report.

Thanks for the help.


OK, filed:


There’s going to be a lot of difference, of course. Having a hash lookup inside the inner loop is… icky. This is the first version with arrays:

Compiling with -C opt-level=3 -C target-cpu=native the precedent version of mine runs (with 10 cities) in about 1.96 seconds, while this one runs in about 0.18 seconds. And there are simple ways to further speed up this code a little.


Writing a permutation iterator would make the main code nicer to read. Wouldn’t help with performance though.


Have you tried a profiler to spot where is slow?


Or better yet, don’t write it, and use one that was already written. The module provides a documentation explaining crate’s usage (in particular, heap_recursive function sounds useful in this case, author claims that recursive version is faster, but your mileage may vary).


The main performance factor in those permutohedron algorithms is if they copy the array of items (or indexes) at each permutation or not. Currently in Rust not copying the array in a lazy iterator is not easy, I think.


heap_recursive is not technically an iterator, but rather a function accepting a callback function due to iterator limitations (but considering what this program does, it should be fine). Considering it takes &mut [T], and calls a callback with &mut [T], it suggests to me there is no copy involved. In fact, the source code shows it’s a simple pattern of calling a function and doing a swap, nothing more complex than that.


Right, in that crate there is also another way to generate permutations, and it copies the array at each permutation (and I think it’s an iterator).


The next obvious optimization is to pre-compute all pair distances:

With 11 cities the precedent version runs in about 1.90 seconds, this version runs in 0.58 seconds (and my first version is a bit too much slow for this). There are various ways to improve the performance of this code.