Experience porting a simple MC simulation to Rust


#1

As promised, here’s my experience porting my workplace’s official toy numerical code to Rust.

Overall, the Rust syntax had a nice feeling for numerical purposes. For math-heavy code, the conciseness brought by type inference proved to be priceless. In addition, immutability by default and clear “mut” labels made debugging MUCH easier.

Iterators, enums and pattern matching were not as useful as in other kinds of code, mostly because math is not very branchy and does not require very complex data structures, but absolutely loves multi-dimensional arrays, which standard Rust iterators are quite bad at. Still, they did find occasional use, mostly for IO purposes.

Rust’s general dislike of global variables made this porting exercise more difficult, since the original code made heavy use of a global random number generator. But considering that the associated refactoring would be the first step of any parallelization process anyhow, I considered this to be fair game.
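To give an idea of the target state, the refactoring amounts to threading an RNG handle through the functions that need it, along these lines (the generator and all names here are illustrative, not the real code’s):

```rust
// Sketch: instead of a global generator, pass a mutable RNG handle around.
struct Rng {
    state: u64,
}

impl Rng {
    fn next(&mut self) -> u64 {
        // Classic xorshift64 step, just for illustration
        self.state ^= self.state << 13;
        self.state ^= self.state >> 7;
        self.state ^= self.state << 17;
        self.state
    }
}

fn draw_uniform(rng: &mut Rng) -> f64 {
    // Map the top 53 bits to [0, 1); crude but sufficient for a sketch
    (rng.next() >> 11) as f64 / (1u64 << 53) as f64
}

fn main() {
    let mut rng = Rng { state: 0x1234_5678 };
    println!("{}", draw_uniform(&mut rng));
}
```

Once every caller takes `&mut Rng` explicitly, giving each thread its own generator later becomes a local change rather than a global one.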

Although the borrow and initialization checker proved generally useful in spotting bugs before they occur, their inability to understand arrays was the cause of much pain. It led me to various highly questionable practices, ranging from reimplementing mem::swap for array elements using indices, to initializing an array at creation time only to immediately trash the initial value and fully re-initialize it using a loop.

In general, Rust’s standard array support felt much like an afterthought. But I would expect something like ndarray, which I have not tried yet, to fare better in this department.

Unlike in other programming languages which are strongly biased towards a specific floating-point type, such as doubles, the Rust standard library organization made it remarkably easy to write code which is independent of the kind of floating-point type in use (f32 vs f64). Just put a typedef and an appropriate package renaming somewhere in your code and you’re done.
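Concretely, the setup I mean is just this (a sketch; the alias names are my own convention):

```rust
// One alias plus one renamed module import switch the whole code base
// between precisions: change both lines to f32 / std::f32 and rebuild.
type Real = f64;
use std::f64 as real;

fn main() {
    let x: Real = real::consts::PI;
    println!("{}", x.sqrt());
}
```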

On the other hand, for scientific computation, Rust’s idea of turning unary math functions into methods of the corresponding floating-point types proved quite questionable. It turns any nontrivial expression into an unholy mixture of regular and RPN notation which is very hard to parse. In addition, translating from the majority of programming languages, which map mathematical functions to free functions, is made needlessly difficult.

Which of the following expressions do you find easiest to read?

let norm1 = (x.powi(2) + y.powi(2) + z.powi(2) - t.powi(2)).sqrt();
let norm2 = sqrt(powi(x, 2) + powi(y, 2) + powi(z, 2) - powi(t, 2));

let root1 = sqrt(42.);
let root2 = (42.).sqrt();

And those are really simple compared to some of the expressions I had to deal with.

Rust’s default float output format is not well suited for scientific purposes. It deals very poorly with very large and very small numbers. A much better fit would be to switch to scientific notation only when appropriate (like C’s %g), which the Rust standard library does not seem to support even as an option.

In general, my experience with std::fmt for scientific purposes was unpleasant. The formatting options provided are generally ill-suited to scientific computing (e.g. they control the number of decimal places rather than the number of significant digits), and it is quite difficult to get pleasant-to-read output from them (e.g. there is no way to put an upper bound on the number of digits of a number without being drowned in a sea of zeros).
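A small example of the mismatch (assuming current std::fmt behavior):

```rust
// Precision in `{:.3}` counts decimal places, not significant digits,
// so small numbers lose all their information; `{:.3e}` keeps the digits
// but is unconditional scientific notation.
fn main() {
    let x = 0.000123456_f64;
    println!("{:.3}", x);  // prints "0.000" -- all information lost
    println!("{:.3e}", x); // prints "1.235e-4"
}
```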

In contrast, file IO was overall a pleasant experience, mostly thanks to Rust’s great iteration and text manipulation facilities.

To stay reasonably close to the original C++ code’s use of iostreams, I decided to write a couple of file readout wrappers using closures and iterators. This turned out to be quite a powerful combo, although it did leave quite a bit of duplication with respect to the original code, mostly for three reasons:

  • Closures are not generic, so I needed one per input/output data type…
  • …but they could not share mutable access to the original file or iterator…
  • …so I had to separate file iteration from content encoding/decoding, resulting in a more cumbersome and repetitive interface.

The end result was something like this for input…

let x = as_i32(next_item());
let y = as_real(next_item());

…and something like this for output…

write_i32(&mut output_file, "Some label", x);
write_real(&mut output_file, "Some other label", y);

…at which point it felt like the only way to eliminate the repetition of types and files/iterators without violating Rust’s “one single mutable ref” rule was to make a single struct responsible for each kind of file IO, which was a bigger refactoring and would have, overall, made the code harder to understand.
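For illustration, the struct-based alternative I decided against might have looked something like this (a sketch; the names and the panicking parse policy are mine, not the original code’s):

```rust
use std::str::FromStr;

// One struct owns the line source, and a single generic method replaces the
// per-type closures, at the cost of a bigger departure from the C++ original.
struct InputFile<I: Iterator<Item = String>> {
    lines: I,
}

impl<I: Iterator<Item = String>> InputFile<I> {
    fn next_item<T: FromStr>(&mut self) -> T
    where
        T::Err: std::fmt::Debug,
    {
        self.lines
            .next()
            .expect("unexpected end of input")
            .trim()
            .parse()
            .expect("failed to parse item")
    }
}

fn main() {
    let data = vec!["42".to_string(), "6.25".to_string()];
    let mut input = InputFile { lines: data.into_iter() };
    let x: i32 = input.next_item();
    let y: f64 = input.next_item();
    println!("{} {}", x, y);
}
```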

Similarly, Rust’s error handling style did not mix well with the program’s decision to put all I/O in the main function, and to sidestep any serious error handling. Because the ? operator cannot be used in main(), the only option left to keep the complexity equivalent to the original code was to sprinkle unwraps everywhere, encapsulating them in closures whenever possible.

In general, it felt like Rust’s focus on software engineering best practices could be somewhat detrimental in the application area of numerical codes, where engineering constraints are generally more relaxed (seriously, it’s okay to crash if a config file is not found), and keeping code simple and easy to understand for non-specialists (e.g. physicists) is a paramount concern.

When it comes to missing features, aside from the aforementioned generally poor array support in Rust’s standard tools (iterators, borrow and init checkers…), I found myself longing for a performance profiling mechanism and for manual vectorization constructs. But I think I’ve read around here that both of these might be coming reasonably soon.

And so, that’s my experience porting a simple Monte Carlo simulation to Rust. Hope you enjoyed it!


#2

Have you finished the port or is it still in progress?


#3

The port is feature complete. Now that I know it works, and got results that I can check against, I’d like to try out a couple of second-order refactorings to make the code look nicer, such as porting the linear algebra to one of the Rust BLASes, reviewing the general structure a bit, etc.


#4

One common way to avoid this is to have a separate main function that returns a result, then just unwrap that in the real main function, e.g.

use std::io; // needed for io::Result

fn try_main() -> io::Result<()> {
    foo()?;
    Ok(())
}

fn main() { try_main().unwrap() }

There is discussion around supporting returning Results directly from main, although I feel like the last comment I saw on that was a few weeks ago.


#5

It led me to various highly questionable practices, ranging from reimplementing mem::swap for array elements using indices, to initializing an array at creation time only to immediately trash the initial value and fully re-initialize it using a loop.

I would be interested in seeing the examples here, if you can share them. I recently had to deal with a few thousand lines of code making heavy use of arrays and also ran into some inconveniences. Regarding the second point: if it’s only about initialization, you can do what I did: use std::mem::uninitialized to avoid writing initial values, then initialize with a for loop. Just make sure not to drop the values you overwrite, e.g. by using std::mem::forget on the result of std::mem::replace. Otherwise you’ll run into segfaults. This situation is not ideal; I wonder if there is a nicer way that doesn’t involve unsafe functions.
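In sketch form, the pattern looks roughly like this. Note that I have written it here with MaybeUninit, which fills the same role as mem::uninitialized but is sound to hold and needs no forget/replace dance; sizes and values are illustrative:

```rust
use std::mem::MaybeUninit;

// Loop-based initialization without double writes. MaybeUninit never drops
// its contents, so overwriting uninitialized slots is safe.
fn init_array() -> [f64; 8] {
    let mut data: [MaybeUninit<f64>; 8] =
        unsafe { MaybeUninit::uninit().assume_init() };
    for (i, slot) in data.iter_mut().enumerate() {
        slot.write(0.5 * i as f64);
    }
    // Every element was written above, so reading the array back is sound.
    unsafe { std::mem::transmute::<[MaybeUninit<f64>; 8], [f64; 8]>(data) }
}

fn main() {
    println!("{:?}", init_array());
}
```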


#6

but absolutely loves multi-dimensional arrays,

In practice, for this kind of code, you use libraries to handle matrices, etc.

But considering that the associated refactoring would be the first step of any parallelization process anyhow, I considered this to be fair game.

Also removing global variables makes it simpler to reason about the code, helps you avoid bugs, makes the code more testable, and more.

reimplementing mem::swap for array elements using indices,

Can you show an example? Arrays have a swap() method that takes two indices. And you can also split an array (mutably, too) with this:
https://doc.rust-lang.org/beta/std/primitive.slice.html#method.split_at
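A tiny example combining both:

```rust
// Minimal demonstration of the two facilities mentioned above.
fn main() {
    let mut a = [1.0, 2.0, 3.0, 4.0];
    a.swap(0, 3); // element swap by index, no mem::swap gymnastics needed
    assert_eq!(a, [4.0, 2.0, 3.0, 1.0]);

    // split_at_mut yields two disjoint mutable slices, which satisfies the
    // borrow checker when touching several elements at once.
    let (left, right) = a.split_at_mut(2);
    left[0] += right[0];
    assert_eq!(a[0], 7.0);
}
```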

to initializing an array at creation time only to immediately trash the initial value and fully re-initialize it using a loop.

If you are using a Vec you can often avoid that double initialization with iterators followed by a collect().
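For example (a minimal sketch):

```rust
// collect() builds the Vec directly from an iterator,
// with no placeholder values to overwrite afterwards.
fn main() {
    let squares: Vec<f64> = (0..10).map(|i| (i as f64).powi(2)).collect();
    println!("{:?}", squares);
}
```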

In general, Rust’s standard array support felt much like an afterthought. But I would expect something like ndarray, which I have not tried yet, to fare better in this department.

There’s a lot to be improved in Rust arrays, but I am not seeing much forward movement on them. If you see something missing, file enhancement requests or write enhancement proposals.

Which of the following expressions do you find easiest to read?

The first one. But they are quite similar for me.

A much better fit would be to switch to scientific notation only when appropriate (like C’s %g), which the Rust standard library does not seem to support even as an option.

If that’s true then file an enhancement request.

The end result was something like this for input…

Probably you need to make your code more idiomatic, following Rust idioms more closely and staying less close to the original C++ code.

In general, it felt like Rust’s focus on software engineering best practices could be somewhat detrimental in the application area of numerical codes, where engineering constraints are generally more relaxed (seriously, it’s okay to crash if a config file is not found), and keeping code simple and easy to understand for non-specialists (e.g. physicists) is a paramount concern.

Julia does as you say, and it seems to have some success, but I’ve also read that the Julia ecosystem and libraries are full of bugs and untidy code that will require a lot of work (and time) to be cleaned up. Rust indeed encourages a tidier style of coding (and I am exceptionally happy about this).


#7

Thank you all for all these wonderful tips and comments!

@Nemo157: Looks nice, I’ll try this as a way to avoid frightening my more mathematically inclined colleagues.

@janno: I’ll see what I can do. In my case, I believe std::mem::uninitialized would be perfectly safe, since the only times where the issue comes up is with arrays of floats and complex numbers, both of which are very simple types with no pointers inside. But resorting to unsafe code for something as simple as avoiding double array initialization felt somewhat wrong.

@leonardo:

Absolutely, and one of my next plans is to try out a couple of Rust linear algebra / multi-dimensional array libraries and see how well they fit.

Things which I expect to put the libraries’ designs under stress and make the choice easier, assuming any library handles them particularly well:

  • Lots of very small matrices (the typical BLAS is optimized for larger ones)
  • Arrays of rank 3 tensors (i.e. 4-dimensional arrays)
  • Lorentzian scalar products (i.e. v1 · v2 is not x² + y² + z² + t² but x² + y² + z² − t²)
  • Hermitian matrices (complex matrices such that Mab = conj(Mba); there are at least two ways to represent these, one more time-efficient and one more space-efficient)

Ooooh I missed this one. Thanks! Not as good as mem::swap understanding that I mean no harm by accessing two separate array indices, but will do for my purposes.

I think the price of running dynamic memory allocations on every iteration of a loop which currently takes 0.5µs on average to execute (and which I know can go down further), and of robbing the compiler of its knowledge of the array size, would offset the benefits of this approach. But truly, something like collect() for arrays would be nice…

Will try to collect my thoughts and do so in condensed form where appropriate. The last time I looked at it, the RFC process was a bit complex and went through several stages, was it simplified since? Or would you have a link handy to that page I looked at a year ago which explained how it’s done?

In HEP data processing, we have long given up on using the same programming language for the everyday code of physicist users and full-time software developers, and settled on the combination of C++ code for the core framework and performance-sensitive parts, coupled with Python code for high-level control and simple user extensions. But that dual-language organization comes with its own share of problems:

  • No matter how much people attempt to fit that round peg into a square hole, Python was never designed for compute and is a very poor fit for such purposes. I would expect Julia to perform a lot better in this respect, but haven’t gotten around to trying it yet.
  • We have a lot more physicists than software engineers, and as planned they like to write Python code, so the Python part tends to grow organically over time to be larger than the C++ one, and much less well-organized. Maintaining it is a challenge, especially as many of its developers only stay for a short time (~2 months to 3 years).
  • The C++ part itself shows its age, as a Frankenstein monster of 90s-era object-oriented design patterns tacked together with C-style memory allocation and FORTRAN-style control flow duct tape. A major architectural review and a rewrite in modern C++ would help, but with a codebase of MLOCs, short manpower, and tight deadlines, who has time for that? ^^
  • If you think that either C++ or Python in isolation is a mess, try working on the glue code between them.

The nice thing about numerical simulations is that the codebases tend to be much shorter and independent of one another, so we can safely use these programs as a testbed to try out new technology and rewrite stuff. The main challenge is that the physicist who wrote them expects to still understand them after we’re done, and is the only one who can provide valuable mathematical input on what the old code was doing for optimization and variable renaming purposes. So we must walk a fine line between writing very idiomatic and clean code on one hand, and keeping the code familiar/understandable on the other hand.


#8

assuming any library handles them particularly well:

Probably you will need to implement a lot of that stuff in a Rust crate of your choice. The ecosystem is still young; it’s not comparable to SciPy.

and of robbing the compiler of its knowledge of the array size, would offset the benefits of this approach. But truly, something like collect() for arrays would be nice…

File enhancement requests. There are some alternative solutions to this problem, like making rustc smarter (so it removes the useless first initialization), a macro like Python’s array comprehensions, and so on. Perhaps you can write a safe macro that internally uses uninitialized memory for POD data and takes a function to initialize the array.

The last time I looked at it, the RFC process was a bit complex and went through several stages, was it simplified since?

Right.

In HEP data processing, we have long given up on using the same programming language for code written by physicist users and full-time software developers, and settled on the combination of C++ code for

I agree with their desire to keep some distance from C++ :slight_smile:

If you think that either C++ or Python in isolation is a mess,

Well written Python code often isn’t a mess.

The nice thing about numerical simulations is that the codebases tend to be much shorter and independent of one another, so we can safely use these programs as a testbed to try out new technology and rewrite stuff.

If you want, you can show the code of your MC simulation, so someone could give you some little or big suggestions on the code. Have you run Clippy on it? You can install it on a Nightly and use it very easily through Cargo.

The main challenge is that the physicist who wrote them expects to still understand them after we’re done, and is the only one who can provide valuable mathematical input on what the old code was doing for optimization and variable renaming purposes. So we must walk a fine line between writing very idiomatic and clean code on one hand, and keeping the code familiar/understandable on the other hand.

In theory the solution is to let a capable programmer interact with the physicists while they write code.


#9

Oh, by the way, one thing which I forgot in my initial post: Rust’s standard library proved unusually bare-bones (e.g. no complex support, no way to print the current date/time), but that was more than compensated by how easy Cargo makes to pull in dependencies from crates.io.

I wish cargo was faster at its job though… I found myself waiting for cargo operations (updating repository, compiling dependencies…) much more than for the application build process itself. Yet another area where I could try to offer my help once I manage to save up enough time.


#10

The same can be said of well-written code in pretty much any modern programming language, really. The real usability challenge for programming languages is to make it harder to write messy code in less ideal scenarios where the programmer has less skill, less time, and is generally more careless.

For this, I think Python is not the language that comes best prepared, because of (among other things):

  • The whole python2 vs python3 mess
  • No constants
  • No encapsulation/privacy
  • eval and friends (yay, input(), such a great idea…)
  • __getattribute__/__setattr__, and other ways of messing with live objects
  • TypeError and duck typing in general
  • Monkey patching and plenty of life before main
  • A generally careless attitude towards API design, which leads the standard library to solve the same problem many times, with compatibility forcing us to keep each suboptimal intermediate iteration around (e.g. the Python async story is getting ridiculous).
  • Finally, a true competitor to MATLAB in the quest for the slowest programming language interpreter ever used for numerical computing, forcing people to contort code into unholy shapes before it runs fast. Bonus points for the GIL fiasco.

Of course, this critique should be nuanced by recalling the fact that most of these are features:

  • I’m still waiting for C++ to fix its Unicode handling, or break compatibility with poor past design decisions in general :slight_smile: (Though C++17 starts going in the right direction here by throwing away a lot of STL cruft)
  • People focus less on types and more on algorithms, which is seen by many as a virtue.
  • Dynamic features, when well managed, lend to crazy and wonderful things such as generalized wrapper classes, nice GUI toolkits, good persistence…
  • The standard library evolves quickly, and everyone can fix it when it breaks.
  • If people focus less on performance, they have more time to think about other things such as cool abstractions.

I could do that, but I would rather first finish my own cleanup and have the code reviewed by its author, so that (1) you don’t have to tell me things that I already know and (2) you can enjoy more documentation than I did.

Not yet. In general, I’ve avoided nightly-only Rust features so far.

If only that could happen more frequently… In general, that’s hampered by

  1. Lack of software development manpower
  2. Lack of interest from physicists in software (it’s a means, not an end)
  3. Lack of code review (though we’re working on this one by introducing the PR workflow in larger codebases)

#11

It would of course be very interesting to hear about using ndarray for this, but I can’t expect it to deliver any nicer syntax: it is a bit contorted to fit into the features of Rust as it is today, leaving us wanting both variadic generics and integer generic parameters.

As a library it is still very new and not very ergonomic, but I think it actually errs on the ergonomic side of things, instead of implementing expression templates and more compile-time parameters. I guess expression templates can swing both ways there, though.

nalgebra should handle this better, a lot better than ndarray.


#12

We currently don’t have something akin to C’s %g syntax. {:e} or {:E} is close to what you want, but it’s unconditional. The current default syntax is ultimately decided arbitrarily [1], and we probably need to go through an RFC to get it changed or at least make the current situation clear.
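To illustrate the current situation:

```rust
// The default Display format drowns small numbers in zeros, while {:e}
// applies scientific notation unconditionally, even where %g would not.
fn main() {
    println!("{}", 1.0e-7_f64);   // prints "0.0000001"
    println!("{:e}", 1.0e-7_f64); // prints "1e-7"
    println!("{:e}", 42.0_f64);   // prints "4.2e1"
}
```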

[1] My belief is that the decision traces back to the very first commit introducing float formatting. Note that this was dead wrong anyway (it was supposed to be a hack at the time): it tries to extract the integral part by casting! :smiley:


#13

Ha, that takes me back. Fun fact: I implemented that so I could turn in an assignment for an AI course where the professor said we could use any language.


#14

Note that any method call can be written as a function call:

x.powi(2) => f64::powi(x, 2)

So, you can also write your example like this:

let norm2 = f64::sqrt(f64::powi(x, 2) + f64::powi(y, 2) + f64::powi(z, 2) - f64::powi(t, 2));

However, I don’t think it is possible to use a specific method so that you can get rid of the f64:: prefix (use std::f64::powi is not valid).


#15

So, inspired by your try_main() suggestion, I ended up going for a variant of it.

Similarly to your proposal, I introduced a try_main() middleman, written in normal Rust error-handling style (a custom Error enum representing all the things that can go wrong). In the regular main(), I put a pattern match on the Error which displays some nice error messages explaining what went wrong.

In this way, I got the best of both worlds: simple error handling without ugly panic dumps AND without cluttering the try_main() with lots of error handling code (just a bunch of question marks and a map_err here and there).

If ? was allowed in main(), I would probably have gone for implementing Display (well, Error really) for the error type instead. But this variant suits my needs as well.
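In condensed form, the structure looks roughly like this (the error variants here are placeholders, not my real ones):

```rust
#[derive(Debug)]
enum Error {
    // Placeholder variants standing in for the real failure modes
    ConfigNotFound,
    BadInput(String),
}

fn try_main() -> Result<(), Error> {
    // ...the actual simulation, using ? throughout...
    Ok(())
}

fn main() {
    match try_main() {
        Ok(()) => {}
        Err(Error::ConfigNotFound) => eprintln!("Could not find the configuration file"),
        Err(Error::BadInput(msg)) => eprintln!("Malformed input: {}", msg),
    }
}
```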

Thanks for the help!


#16

I would have understood std::f64::powi as referring to something called “powi” in the module “std::f64”. In order to avoid namespace collisions, a different syntax would probably be needed when importing functions from a built-in type, maybe something like “f64::powi”, “::f64::powi” or “std::builtin::f64::powi”.


#17

So, as you asked, here is a list of all the usability pain points which I’ve encountered with Rust arrays so far. Note that some of these also apply to Vec and other containers.

  1. Because there is no equivalent of collect() for arrays, I cannot initialize an array from an iterator. This is quite a serious handicap considering how omnipresent iterators are in idiomatic Rust.
  2. Because the initialization checker does not understand arrays well, I cannot initialize an array with a loop either (at least without resorting to unsafe code as we discussed).
  3. Replacing arrays with Vec<>, while possible, seems questionable
    • It makes the program less clear by removing useful information about array size
    • It makes it less performant as the compiler cannot use advance knowledge of array size to optimize performance (e.g. good auto-vectorization depends a lot on compilers knowing in advance about loop bounds)
    • It also greatly increases the frequency of dynamic memory allocations, unless the vectors are kept around and reused, which in turn means losing Rust’s safe initialization guarantees.
  4. Because the borrow checker does not understand arrays either, joint manipulation of multiple array elements can be difficult as well (but for my use case of swapping, leonardo correctly pointed out that a special-case stdlib method exists).
  5. Iterators do not cope well with multi-dimensional arrays, at least if they are implemented as arrays of arrays. You end up needing nested iterator chains (flat_map() upon flat_map()) or the deeply nested loop equivalent.
  6. For numerical computation, the fact that default array iteration consumes the original array even though the contents are Copy means that when iterating through an array, one must use iter() and sprinkle * throughout the loop. This adds unnecessary visual complexity to the code.
  7. The limitation that the [x; N] initialization syntax only works if x implements Copy is quite painful in practice. It would be REALLY nice if it worked with Clone types as well, even if less efficiently.
  8. As it is currently not possible to define a generic array type in Rust, array handling is full of special-case code in rustc, which shows through arbitrary limitations (e.g. traits are only implemented for arrays of size <= 32).
  9. Array types could use more vectorized operations (e.g. it would be very nice if x.bitxor(y) was automatically understood as x.iter().zip(y.iter()).map(|(v1, v2)| v1.bitxor(v2)).collect_array()).

#18

Please show a tiny code example.


#19

I assume you mean code similar to

let a = [0; 64];
for i in a.iter() {
    let _ = *i * *i;
}

This could be written a little more succinctly as

for &i in &a[..] {
    let _ = i * i;
}

This takes advantage of two features of Rust,

  1. The provided impl<'a, T> IntoIterator for &'a [T], which gets called automatically on &a[..] to obtain the iterator for the loop (&a[..] is basically casting the array into a slice). This literally just forwards to slice::iter. (By the way, there’s a similar impl IntoIterator that allows using &mut a[..] instead of a.iter_mut().)
  2. Pattern matching to “unwrap” the reference. The items from the iterator are of type &i32 in this case, by matching that against the pattern &i it decides to make i: i32 and works out that since i32 implements Copy it can just copy the value out of the iterator item into i.

If you’re using an iterator chain you can also use the pattern matching to unwrap a reference, e.g. something like

a.iter().map(|&i| i * i)

instead of having to dereference the item at every use.


#20

I agree. Arrays are quite crippled in the language due to the inability to write functions that are polymorphic in the number of elements, so a lot of libraries end up re-inventing the wheel and/or using macros to duplicate definitions for every length. I hope this gets rectified soon by the const generics proposal.

Depending on how complicated your later initialization is, the earlier initialization can get optimized out by the compiler. If it doesn’t get optimized out and this area makes a huge difference, consider writing a wrapper function that combines initialization with some sort of (Index) -> f64 closure.
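Such a wrapper might look roughly like this (names are hypothetical; note the cheap zero-init that the optimizer can often remove):

```rust
// Hypothetical helper combining allocation and closure-driven initialization.
fn array_from_index_fn<F: FnMut(usize) -> f64>(mut f: F) -> [f64; 8] {
    let mut a = [0.0; 8]; // cheap zero-init, frequently optimized away
    for (i, slot) in a.iter_mut().enumerate() {
        *slot = f(i);
    }
    a
}

fn main() {
    let a = array_from_index_fn(|i| (i * i) as f64);
    println!("{:?}", a);
}
```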

IMO initializing things to zero is a good default to have, because I have run into a lot of annoying bugs in my numerical code because of unintentionally uninitialized data.

I’m fairly certain modules and types share the same namespace, so there should not be any ambiguity. I don’t see any reason why use std::f64::powi can’t be made to “just work”.

Personally I’m OK with x.powi(2), because it’s quasi-infix. But x.sin() is just weird.