A year ago I got fed up with runtime compatibility hell from C++/OpenMP in our Python bioinformatics package. This article lays out what I learned by upgrading to Rust:
Along with suggestions about PyO3, file layout, memory allocation, testing, etc., the article gives examples of:
using Rayon/ndarray::parallel while returning all errors
letting users control the number of parallel threads
creating nice errors with thiserror and translating those Rust errors to nice Python errors
translating Python dynamic types to Rust generic functions
Some of this might be of interest to anyone using Rayon for parallel array processing.
The Python-with-Rust package is Bed-Reader, and it is open source. Thank you Rust & Rayon for letting me escape from C++/OpenMP runtime compatibility hell!
To ask you an oddball question: what was the most frustrating part of writing Bed-Reader? Were there any moments when you ran into a problem and had to stop and look out the window for half an hour to figure out how to solve it?
Thanks for the question! Three frustrating parts of creating the Rust extension come to mind:
"nice" Rust errors - It took me a long time to figure out how to get nice errors in Rust. I wanted to return system errors (e.g. file open not finding a file) and custom errors (e.g. file in the wrong format). I also wanted all the errors to have nice error messages. As mentioned in the article, I ended up using the "thiserror" crate and creating an enum called BedErrorPlus. BedErrorPlus includes std::io::Error, ThreadPoolBuildError, and BedError. BedError, in turn, contains all my custom errors. This seems like the way everyone should handle errors, but I only found one resource that explained it (Rust: Structuring and handling errors in 2020 - Nick's Blog and Digital Garden).
par_azip! and returning errors -- ndarray::parallel's par_azip! macro is, I think, the most readable way to data-parallelize array-related code. However, its documentation and examples are very sparse, it arguably belongs in Rayon rather than only in ndarray::parallel, and it has no direct way to return errors. (The article gives a workaround for returning errors.)
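To make the first two points concrete, here is a minimal std-only sketch, not the actual Bed-Reader code. It writes by hand the Display and From impls that thiserror's derive would generate, and it uses std::thread::scope as a stand-in for par_azip!, so it runs without thiserror, ndarray, or Rayon. The BadCode variant, the decode_all function, and the num_threads parameter are hypothetical names for illustration, and the ThreadPoolBuildError variant is left out because it comes from Rayon. Each element gets its own Result slot, and the slots are collected into a single Result once the parallel section ends. This is one plausible workaround and not necessarily the one the article uses.

```rust
use std::fmt;
use std::io;
use std::thread;

/// Custom errors (the role BedError plays in the post). The variant is hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum BedError {
    BadCode { index: usize, code: u8 },
}

impl fmt::Display for BedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BedError::BadCode { index, code } => {
                write!(f, "unexpected code {} at position {}", code, index)
            }
        }
    }
}

impl std::error::Error for BedError {}

/// Wrapper enum (the role BedErrorPlus plays): system errors plus custom errors.
#[derive(Debug)]
enum BedErrorPlus {
    Io(io::Error),
    Bed(BedError),
}

impl fmt::Display for BedErrorPlus {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BedErrorPlus::Io(e) => write!(f, "{}", e),
            BedErrorPlus::Bed(e) => write!(f, "{}", e),
        }
    }
}

impl std::error::Error for BedErrorPlus {}

// These From impls are what let `?` and `.into()` convert into the wrapper.
impl From<io::Error> for BedErrorPlus {
    fn from(e: io::Error) -> Self {
        BedErrorPlus::Io(e)
    }
}

impl From<BedError> for BedErrorPlus {
    fn from(e: BedError) -> Self {
        BedErrorPlus::Bed(e)
    }
}

/// Copies codes 0, 1, 2 into `output` as floats, in parallel.
/// Any other code is recorded as an error in that element's own slot.
fn decode_all(input: &[u8], output: &mut [f64], num_threads: usize) -> Result<(), BedErrorPlus> {
    if input.len() != output.len() {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "length mismatch").into());
    }
    // One Result per element, so no thread needs to return early.
    let mut results: Vec<Result<(), BedError>> = vec![Ok(()); input.len()];
    let threads = num_threads.max(1);
    let chunk = ((input.len() + threads - 1) / threads).max(1);

    thread::scope(|s| {
        let work = input
            .chunks(chunk)
            .zip(output.chunks_mut(chunk))
            .zip(results.chunks_mut(chunk))
            .enumerate();
        for (chunk_index, ((inp, out), res)) in work {
            s.spawn(move || {
                let cells = inp.iter().zip(out.iter_mut()).zip(res.iter_mut());
                for (i, ((&code, value), slot)) in cells.enumerate() {
                    if code <= 2 {
                        *value = f64::from(code);
                    } else {
                        *slot = Err(BedError::BadCode { index: chunk_index * chunk + i, code });
                    }
                }
            });
        }
    });

    // Collapse the per-element results; the first error in index order wins.
    let combined: Result<(), BedError> = results.into_iter().collect();
    combined?; // BedError -> BedErrorPlus via From
    Ok(())
}
```

With par_azip!, the same idea would presumably mean zipping a results array alongside the input and output arrays. The num_threads argument here also hints at how a user-chosen thread count can be passed down from the caller.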
traits for numerics -- it took me a long time to figure out the traits needed for a generic function that covers i8, f32, and f64. After a lot of trial and error, I ended up with Copy + Default + From<i8> + Debug + Sync + Send.
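As a rough illustration of that trait bound, the following std-only sketch shows one generic function serving i8, f32, and f64. The decode_generic and fill functions and the missing parameter are hypothetical, not taken from Bed-Reader, and the two-way split with std::thread::scope only stands in for real data parallelism. Default supplies the initial fill value, From<i8> converts the small integer codes, Copy lets values be duplicated freely, and Sync + Send allow the data to be shared with and written from another thread. Debug is not exercised here; it lets callers print values.

```rust
use std::fmt::Debug;
use std::thread;

/// Writes codes 0..=2 as T and replaces anything else with `missing`.
fn fill<T: Copy + From<i8>>(codes: &[i8], out: &mut [T], missing: &T) {
    for (code, slot) in codes.iter().zip(out.iter_mut()) {
        *slot = match *code {
            0..=2 => T::from(*code),
            _ => *missing,
        };
    }
}

/// One generic function covering i8, f32, and f64 outputs.
fn decode_generic<T>(codes: &[i8], missing: T) -> Vec<T>
where
    T: Copy + Default + From<i8> + Debug + Sync + Send,
{
    // Default + Copy: allocate the output with a placeholder value.
    let mut out = vec![T::default(); codes.len()];
    let half = codes.len() / 2;
    let (left_codes, right_codes) = codes.split_at(half);
    let (left_out, right_out) = out.split_at_mut(half);
    let missing_ref = &missing; // sharing &T across threads needs T: Sync

    thread::scope(|s| {
        // Moving `&mut [T]` into another thread needs T: Send.
        s.spawn(move || fill(left_codes, left_out, missing_ref));
        fill(right_codes, right_out, missing_ref);
    });
    out
}
```

The caller picks the element type, for example `decode_generic::<f32>(&codes, f32::NAN)` or `decode_generic::<i8>(&codes, -127)`, which is roughly the shape a Python dtype dispatch would need on the Rust side.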