As promised, here’s my experience porting my workplace’s official toy numerical code to Rust.
Overall, the Rust syntax had a nice feeling for numerical purposes. For math-heavy code, the conciseness brought by type inference proved to be priceless. In addition, immutability by default and clear “mut” labels made debugging MUCH easier.
Iterators, enums and pattern matching were not as useful as in other kinds of code, mostly because math is not very branchy and does not require very complex data structures, but absolutely loves multi-dimensional arrays, which standard Rust iterators are quite bad at. Still, they did find occasional use, mostly for IO purposes.
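To make the multi-dimensional point concrete, here is a minimal sketch assuming a row-major matrix stored in a flat slice (the `row_sums` and `col_sums` helpers are made up for this illustration, they are not from the original code). Rows map nicely onto `chunks_exact`, while columns need the stride spelled out by hand:

```rust
/// Sums each row of a row-major matrix stored in a flat slice.
/// (Hypothetical helper; assumes `ncols` is non-zero.)
fn row_sums(data: &[f64], ncols: usize) -> Vec<f64> {
    data.chunks_exact(ncols)
        .map(|row| row.iter().sum::<f64>())
        .collect()
}

/// Sums each column; the strided access has to be written out manually.
fn col_sums(data: &[f64], ncols: usize) -> Vec<f64> {
    (0..ncols)
        .map(|j| data.iter().skip(j).step_by(ncols).sum::<f64>())
        .collect()
}

fn main() {
    // A 2x3 matrix: [[1, 2, 3], [4, 5, 6]]
    let data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    println!("rows: {:?}", row_sums(&data, 3));
    println!("cols: {:?}", col_sums(&data, 3));
}
```

Anything beyond two dimensions, or any access pattern that is not a contiguous row, gets progressively more awkward with this approach.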
Rust’s general dislike of global variables made this porting exercise more difficult, since the original code made heavy use of a global random number generator. But considering that the associated refactoring would be the first step of any parallelization process anyhow, I considered this to be fair game.
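As a rough sketch of what that refactoring can look like, the generator becomes an explicit `&mut` parameter threaded through the call chain. The `XorShift64` type and `estimate_pi` function below are hypothetical stand-ins written for this example, not the original code, and a real program would more likely use an RNG crate:

```rust
/// Hypothetical minimal xorshift64 generator, for illustration only.
struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // The state must never be zero, or the generator gets stuck there.
        XorShift64 {
            state: if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed },
        }
    }

    /// Returns a pseudo-random number in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.state ^= self.state << 13;
        self.state ^= self.state >> 7;
        self.state ^= self.state << 17;
        // Keep the top 53 bits so the value fits exactly in an f64 mantissa.
        (self.state >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// The generator is an explicit parameter instead of a global variable.
fn estimate_pi(rng: &mut XorShift64, samples: u32) -> f64 {
    let mut hits = 0u32;
    for _ in 0..samples {
        let (x, y) = (rng.next_f64(), rng.next_f64());
        if x * x + y * y < 1.0 {
            hits += 1;
        }
    }
    4.0 * f64::from(hits) / f64::from(samples)
}

fn main() {
    let mut rng = XorShift64::new(42);
    println!("pi is roughly {}", estimate_pi(&mut rng, 100_000));
}
```

Once the generator is passed around like this, giving each thread its own instance is a small step.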
Although the borrow and initialization checkers proved generally useful in spotting bugs before they occur, their inability to understand arrays was the cause of much pain. It led me to various highly questionable practices, ranging from reimplementing mem::swap for array elements using indices, to initializing an array at creation time only to immediately trash the initial value and fully re-initialize it using a loop.
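Depending on the exact situation, the standard library may cover some of these cases. The sketch below shows `slice::swap`, `split_at_mut` and `std::array::from_fn` (the latter needs Rust 1.63 or newer); the `reverse4` and `squares_table` helpers are invented for the example and only hint at what the original code needed:

```rust
/// Hypothetical helper: reverses a 4-element array in place.
fn reverse4(values: &mut [f64; 4]) {
    // Swap two elements by index, without hand-written temporaries.
    values.swap(0, 3);

    // To hold two mutable references at once, split the slice first:
    // the borrow checker then sees two disjoint halves.
    let (left, right) = values.split_at_mut(2);
    std::mem::swap(&mut left[1], &mut right[0]);
}

/// Hypothetical helper: builds an array from a function of the index,
/// instead of filling it with a dummy value and overwriting it in a loop.
fn squares_table() -> [f64; 4] {
    std::array::from_fn(|i| (i * i) as f64)
}

fn main() {
    let mut values = [1.0, 2.0, 3.0, 4.0];
    reverse4(&mut values);
    println!("{:?} {:?}", values, squares_table());
}
```

This does not remove the underlying limitation (the checkers still treat an array as a single unit), it only hides the workaround behind library calls.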
In general, Rust’s standard array support felt much like an afterthought. But I would expect something like ndarray, which I have not tried yet, to fare better in this department.
Unlike in other programming languages which are strongly biased towards a specific floating-point type, such as doubles, the Rust standard library organization made it remarkably easy to write code which is independent of the kind of floating-point type in use (f32 vs f64). Just put a typedef and an appropriate package renaming somewhere in your code and you’re done.
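A minimal sketch of that trick follows. The names `Real`, `real_consts` and `disc_area` are placeholders chosen for this example; the original code may well have used different ones:

```rust
// Switch the whole program between f32 and f64 by editing these two lines.
type Real = f64;
use std::f64::consts as real_consts;

/// Area of a disc, written without naming a concrete float type.
fn disc_area(radius: Real) -> Real {
    real_consts::PI * radius.powi(2)
}

fn main() {
    let r: Real = 2.0;
    println!("area = {}", disc_area(r));
}
```

Changing both `f64` occurrences to `f32` is then enough to rebuild the program in single precision, as long as the rest of the code only mentions `Real`.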
On the other hand, for scientific computation, Rust’s idea of turning unary math functions into methods of the corresponding floating-point types proved quite questionable. It turns any nontrivial expression into an unholy mixture of regular and RPN notation which is very hard to parse. In addition, it makes translating from the majority of programming languages, which map mathematical functions to free functions, needlessly difficult.
Which of the following expressions do you find easiest to read?
    let norm1 = (x.powi(2) + y.powi(2) + z.powi(2) - t.powi(2)).sqrt();
    let norm2 = sqrt(powi(x, 2) + powi(y, 2) + powi(z, 2) - powi(t, 2));
    let root1 = sqrt(42.);
    let root2 = (42.).sqrt();
And those are really simple compared to some of the expressions I had to deal with.
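One possible workaround, sketched here under the assumption that only `f64` is needed, is to define thin free-function wrappers around the methods. The `sqrt` and `powi` wrappers and the two `norm_*` functions are hypothetical and written only to show that both notations can coexist:

```rust
// Thin free-function wrappers around the float methods (hypothetical helpers).
fn sqrt(x: f64) -> f64 {
    x.sqrt()
}

fn powi(x: f64, n: i32) -> f64 {
    x.powi(n)
}

/// Norm written with method-call syntax.
fn norm_methods(x: f64, y: f64, z: f64, t: f64) -> f64 {
    (x.powi(2) + y.powi(2) + z.powi(2) - t.powi(2)).sqrt()
}

/// The same norm written with the free-function wrappers.
fn norm_free(x: f64, y: f64, z: f64, t: f64) -> f64 {
    sqrt(powi(x, 2) + powi(y, 2) + powi(z, 2) - powi(t, 2))
}

fn main() {
    println!("{}", norm_methods(3.0, 4.0, 12.0, 0.0));
    println!("{}", norm_free(3.0, 4.0, 12.0, 0.0));
    // A bare float literal needs a type suffix before a method call.
    println!("{}", 42_f64.sqrt());
}
```

The obvious cost is one wrapper per function and per float type, unless the wrappers are made generic over a numeric trait from a third-party crate.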
Rust’s default float output format is not well suited for scientific purposes. It deals very poorly with very large and very small numbers. A much better fit would be a format that switches to scientific notation when appropriate, which the Rust standard library does not seem to support even as an option (scientific notation itself is available through the {:e} format, but it is then applied unconditionally).
In general, my experience with std::fmt for scientific purposes was unpleasant. The formatting options that are provided are generally ill-suited to scientific computing (e.g. controlling the number of decimals rather than the number of significant digits), and it is quite difficult to obtain something that is pleasant to read out of them (e.g. there is no way to put an upper bound on the number of digits of a number without being drowned in a sea of zeros).
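A partial workaround can be built on the `{:e}` (LowerExp) format, whose precision counts digits after the decimal point. The sketch below is only that, a sketch: `sci` and `auto_fmt` are made-up helpers, and the thresholds used to switch notation are arbitrary assumptions:

```rust
/// Hypothetical helper: formats `x` in scientific notation with
/// `digits` significant digits.
fn sci(x: f64, digits: usize) -> String {
    // One significant digit sits before the decimal point,
    // so the precision is `digits - 1`.
    format!("{:.prec$e}", x, prec = digits.saturating_sub(1))
}

/// Hypothetical helper: switches to scientific notation for very small or
/// very large magnitudes, and uses the default format otherwise.
/// The 1e-3 and 1e6 thresholds are arbitrary.
fn auto_fmt(x: f64, digits: usize) -> String {
    let magnitude = x.abs();
    if x != 0.0 && (magnitude < 1e-3 || magnitude >= 1e6) {
        sci(x, digits)
    } else {
        format!("{}", x)
    }
}

fn main() {
    println!("{}", sci(1234.5, 3)); // prints 1.23e3
    println!("{}", auto_fmt(0.000000015, 2)); // prints 1.5e-8
    println!("{}", auto_fmt(0.5, 3)); // prints 0.5
}
```

Note that the non-scientific branch still does not limit significant digits, which is exactly the gap described above.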
In contrast, file IO was overall a pleasant experience, mostly thanks to Rust’s great iteration and text manipulation facilities.
To stay reasonably close to the original C++ code’s use of iostreams, I decided to write a couple of file readout wrappers using closures and iterators. This turned out to be quite a powerful combo, although it did leave quite a bit of duplication with respect to the original code, mostly for three reasons:
- Closures are not generic, so I needed one per input/output data type…
- …but they could not share mutable access to the original file or iterator…
- …so I had to separate file iteration from content encoding/decoding, resulting in a more cumbersome and repetitive interface.
The end result was something like this for input…
    let x = as_i32(next_item());
    let y = as_real(next_item());
…and something like this for output…
    write_i32(&mut output_file, "Some label", x);
    write_real(&mut output_file, "Some other label", y);
…at which point it felt like the only way to eliminate the repetition of types and files/iterators without violating Rust’s “one single mutable ref” rule was to make a single struct responsible for each kind of file IO, which was a bigger refactoring and would have, overall, made the code harder to understand.
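For reference, here is roughly what such a struct could look like on the input side. This is a sketch under assumptions: `TokenReader` and its `read` method are invented names, and the input is assumed to be whitespace-separated values, which may not match the real file format:

```rust
use std::fmt::Debug;
use std::str::FromStr;

/// Hypothetical reader owning the token iterator, so that a single generic
/// method can replace the per-type `as_i32` / `as_real` closures.
struct TokenReader<'a> {
    tokens: std::str::SplitWhitespace<'a>,
}

impl<'a> TokenReader<'a> {
    fn new(contents: &'a str) -> Self {
        TokenReader {
            tokens: contents.split_whitespace(),
        }
    }

    /// Parses the next token as any type implementing `FromStr`.
    /// Panics on missing or malformed input, like the unwrap-based original.
    fn read<T>(&mut self) -> T
    where
        T: FromStr,
        T::Err: Debug,
    {
        self.tokens
            .next()
            .expect("unexpected end of input")
            .parse::<T>()
            .expect("malformed input item")
    }
}

fn main() {
    let mut input = TokenReader::new("42 3.5");
    let x: i32 = input.read();
    let y: f64 = input.read();
    println!("x = {}, y = {}", x, y);
}
```

The method is generic where the closures could not be, and the struct is the only holder of the mutable iterator, at the price of the extra type definition mentioned above.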
Similarly, Rust’s error handling style did not mix well with the program’s decision to put all I/O in the main function, and to sidestep any serious error handling. Because the ? operator could not be used in main() at the time of writing, the only option left to keep the complexity equivalent to the original code was to sprinkle unwraps everywhere, encapsulating them in closures whenever possible.
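For readers on newer toolchains: since Rust 1.26, main() may return a Result, which lets ? be used there directly. A minimal sketch follows; the `read_sample_count` helper, the file name and the file contents are all invented for the example:

```rust
use std::error::Error;
use std::fs;
use std::path::Path;

/// Hypothetical helper: reads a sample count from a one-line config file.
fn read_sample_count(path: &Path) -> Result<u32, Box<dyn Error>> {
    let contents = fs::read_to_string(path)?;
    let count = contents.trim().parse::<u32>()?;
    Ok(count)
}

fn main() -> Result<(), Box<dyn Error>> {
    // Create a small config file first so the example is self-contained.
    let path = std::env::temp_dir().join("toy_config.txt");
    fs::write(&path, "1000\n")?;

    // On failure, the error is printed and the process exits with a
    // non-zero status, without any unwrap.
    let samples = read_sample_count(&path)?;
    println!("running with {} samples", samples);
    Ok(())
}
```

This keeps the "just crash if the config file is missing" behaviour while avoiding explicit unwrap calls.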
In general, it felt like Rust’s focus on software engineering best practices could be somewhat detrimental in the application area of numerical codes, where engineering constraints are generally more relaxed (seriously, it’s okay to crash if a config file is not found), and keeping code simple and easy to understand for non-specialists (e.g. physicists) is a paramount concern.
Rust’s standard library proved unusually bare-bones (e.g. no complex support, no way to print the current date/time), but that was pretty well compensated by how easy Cargo makes it to pull in dependencies from crates.io.
I wish cargo was faster at its job though… I found myself waiting for cargo operations (updating repository, compiling dependencies…) much more than for the build process itself.
When it comes to missing features, aside from the aforementioned generally poor array support in Rust’s standard tools (iterators, borrow and init checkers…), I found myself longing for some performance profiling mechanism and for some manual vectorization constructs. But I think I’ve read around here that both of these might be coming reasonably soon.
And so, that’s my experience porting a simple Monte Carlo simulation to Rust. Hope you enjoyed it!