Nmath (the R math & stats functions) as a Rust crate

UPDATE: I've finished wrapping all the functions and have published the crate as r_stats.

I did some searching, and although I found some work on statistics in Rust, I could not find implementations of the key statistical functions: the PDF, CDF and quantile functions for the common distributions. They are needed for tasks like hypothesis testing and confidence interval generation. The R programming language provides these functions, and they are implemented in C.

The functions themselves are very complex and designed to handle extreme input, including subnormal floats and infinity, but using them is fairly simple. I converted a number of them to Rust, and the results of this can be seen in the riir branch. However, it would be very easy to make a mistake in a manual port, and a number of the functions use labels/goto, which do not translate directly into Rust.

This crate (provisionally called r_stats) will provide the CDF, PDF and quantile functions for the same distributions as R provides. The hard work is done, specifically building and linking the C library and generating bindings. What is required now is implementing the wrapper functions that hide the unsafe and convert bools into c_ints. I would do this now but I am very tired, so just the normal distribution is implemented for now.
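To give an idea of what one of these wrappers involves, here is a minimal sketch. It is not the crate's actual code: `logistic_cdf` is a hypothetical name, and the `plogis` below is a pure-Rust stand-in with the same shape as the C function (three doubles plus the `lower_tail` and `log_p` int flags), so that the example runs without linking Nmath. I used the logistic distribution rather than the normal only because its CDF is short enough to stub out. In the real crate the stand-in would be replaced by the generated binding.

```rust
use std::os::raw::c_int;

// Stand-in for the generated binding (assumption: the real symbol comes from
// the linked C library). It takes the two flags as C ints, like the C function.
unsafe extern "C" fn plogis(
    x: f64,
    location: f64,
    scale: f64,
    lower_tail: c_int,
    log_p: c_int,
) -> f64 {
    let p = 1.0 / (1.0 + (-(x - location) / scale).exp());
    let p = if lower_tail != 0 { p } else { 1.0 - p };
    if log_p != 0 { p.ln() } else { p }
}

/// Safe wrapper (hypothetical name): hides the `unsafe` call and converts
/// the `bool` flags into `c_int`s (false -> 0, true -> 1).
pub fn logistic_cdf(x: f64, location: f64, scale: f64, lower_tail: bool, log_p: bool) -> f64 {
    unsafe { plogis(x, location, scale, lower_tail as c_int, log_p as c_int) }
}

fn main() {
    // Lower-tail probability at the location parameter.
    let p = logistic_cdf(0.0, 0.0, 1.0, true, false);
    println!("P(X <= 0) = {p}");
}
```

Every wrapper follows the same pattern, which is why the remaining work is mostly mechanical.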

I believe that this work, and the crate that will eventually result, will be useful in the Rust ecosystem, especially when working with data in a Jupyter notebook with evcxr. I just need to get the wrappers written. PRs welcome: if you need a function, feel free to implement it and create a PR.


The list of functions is growing. I think I'm about 50% through them now.


All done now and published. The crate will probably suit my needs now, but it would benefit from some fuzzing/testing.

I have found statrs in the mean time (pun intended). It seems that that crate doesn't have the quantile functions. It would also be interesting to compare the results from both crates. :)

You know you could do that in your tests? You introduce it as a dev-dependency, and then you can confirm that your answer matches theirs to good accuracy. Of course, that doesn't help you if the other crate is worse, but it seems valuable to try.
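Something along these lines is what I have in mind. This is only a sketch: `max_rel_diff` is a made-up helper, and `cdf_a` / `cdf_b` are two stand-in implementations of the same logistic CDF (one via `exp`, one via `tanh`) so that it runs on its own. In your tests the two closures would call your crate and the dev-dependency instead, and the tolerance is something you would have to pick per function.

```rust
/// Largest relative difference between two implementations over some inputs.
/// (Hypothetical helper, not part of either crate.)
fn max_rel_diff(
    ours: impl Fn(f64) -> f64,
    theirs: impl Fn(f64) -> f64,
    inputs: &[f64],
) -> f64 {
    inputs
        .iter()
        .map(|&x| {
            let (a, b) = (ours(x), theirs(x));
            let scale = a.abs().max(b.abs());
            // Both results are exactly zero: treat as a perfect match.
            if scale == 0.0 { 0.0 } else { (a - b).abs() / scale }
        })
        .fold(0.0, f64::max)
}

// Stand-ins for "our crate" and "the other crate": the same logistic CDF
// computed in two different ways.
fn cdf_a(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

fn cdf_b(x: f64) -> f64 {
    0.5 * (1.0 + (0.5 * x).tanh())
}

fn main() {
    // Grid from -5.0 to 5.0 in steps of 0.25.
    let inputs: Vec<f64> = (-20..=20).map(|i| f64::from(i) * 0.25).collect();
    let worst = max_rel_diff(cdf_a, cdf_b, &inputs);
    assert!(worst < 1e-9, "implementations disagree: {worst}");
    println!("worst relative difference: {worst:e}");
}
```

Using a relative rather than absolute difference matters in the tails, where the probabilities are tiny and an absolute tolerance would hide large errors.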

Oops, I never actually uploaded the crate. Done now.