Since I couldn't find a simple yet comprehensive statistics library, I created a small one based on Python's standard `statistics` library. I hope you'll try it out and give me some feedback. Thanks, all!
https://github.com/semmyenator/statsrust
https://deepwiki.com/semmyenator/statsrust
https://crates.io/search?q=statsrust
How does it compare with statrs?
If you're familiar with Python's statistical libraries, statsrust is a tool worth checking out. It gives Rust developers a lightweight, high-performance statistical toolset suited to focused scenarios within the Rust ecosystem. It isn't intended to replace the full functionality of larger toolkits such as statrs in Rust, or NumPy and SciPy in Python.
- Getting an enum variant from a string looks very strange; it's better to use the enum directly (see the first sketch after this list).
- Instead of `Box<dyn Fn(f64) -> f64>`, you can use function pointers, since your closures don't capture anything and can freely coerce to function pointers:
```rust
type KernelFn = fn(f64) -> f64;

impl Kernel {
    /// Returns the kernel function
    fn kernel(&self) -> KernelFn {
        match self {
            Kernel::Normal => |t| (-(t * t) / 2.0).exp() / (2.0 * std::f64::consts::PI).sqrt(),
            // ... other kernel variants
        }
    }
}
```
- Using `ndarray` as a dependency solely for primitive operations on 1D vectors is suboptimal; it would be better to move the necessary logic to a separate module for working with `Vec<f64>` (see the second sketch after this list).
- Using `statrs` as a dependency for just a couple of functions is also expensive.
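To make the enum point concrete, here is a minimal sketch; the `kernel_weight` function and the `Triangular` variant are made up for illustration, not taken from the crate:

```rust
#[derive(Clone, Copy, Debug)]
enum Kernel {
    Normal,
    Triangular,
}

// Hypothetical API sketch: taking the enum directly means there is no
// runtime string parsing and no "unknown kernel name" failure case.
fn kernel_weight(kernel: Kernel, t: f64) -> f64 {
    match kernel {
        Kernel::Normal => (-(t * t) / 2.0).exp() / (2.0 * std::f64::consts::PI).sqrt(),
        Kernel::Triangular => (1.0 - t.abs()).max(0.0),
    }
}

fn main() {
    // The caller names the variant; typos become compile errors, not runtime ones.
    println!("{}", kernel_weight(Kernel::Normal, 0.0));      // ≈ 0.3989
    println!("{}", kernel_weight(Kernel::Triangular, 0.25)); // 0.75
}
```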
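And for the `ndarray` point, a sketch of the sort of slice helpers that could live in a small internal module; the names and signatures here are illustrative, not the crate's API:

```rust
/// Arithmetic mean of a slice; returns None for empty input.
fn mean(data: &[f64]) -> Option<f64> {
    if data.is_empty() {
        return None;
    }
    Some(data.iter().sum::<f64>() / data.len() as f64)
}

/// Sample variance (Bessel's correction, n - 1 denominator);
/// returns None when fewer than two observations are given.
fn sample_variance(data: &[f64]) -> Option<f64> {
    if data.len() < 2 {
        return None;
    }
    let m = mean(data)?;
    let ss: f64 = data.iter().map(|x| (x - m) * (x - m)).sum();
    Some(ss / (data.len() - 1) as f64)
}

fn main() {
    let xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0];
    println!("mean     = {:?}", mean(&xs));            // Some(5.0)
    println!("variance = {:?}", sample_variance(&xs)); // Some(4.5714...)
}
```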
Response to StatsRust Improvement Feedback
Thanks for the actionable suggestions—they directly drove key optimizations: tighter Rust idioms, better performance, and reduced bloat. Here’s how we addressed each point:
- **Replaced string-to-enum conversion with direct enum usage**
  Removed `Kernel::from_name()`. Functions now take `Kernel` enums (e.g., `Kernel::Normal`) directly.
  → Why: Eliminates runtime string parsing, invalid-input risks, and "magic string" ambiguity (e.g., `gauss` vs `normal`).
- **Switched `Box<dyn Fn>` to function pointers**
  Defined `type KernelFn = fn(f64) -> f64;` and updated all kernel methods to return it.
  → Why: Zero-cost abstraction; no heap allocation, faster calls, and inherent `Send + Sync` (a quick check appears after this list).
- **Dropped `ndarray` for `Vec<f64>` logic**
  Replaced all 1D array ops (mean, variance) with direct slice-based implementations.
  → Why: Avoids an overkill dependency for trivial vector math; faster compilation, smaller binary.
- **Removed `statrs` via targeted manual implementations**
  Added minimal internal logic: an Abramowitz-Stegun `erf` approximation and a custom `NormalDist` (PDF/CDF/sampling); a rough sketch follows this list.
  → Why: Cuts a heavy dependency for 2-3 niche functions; full control over numerical stability.
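As a quick check of the `Send + Sync` claim above (illustrative only, not code from the crate): plain `fn` pointers are `Copy`, `Send`, and `Sync`, so the following compiles:

```rust
type KernelFn = fn(f64) -> f64;

// fn pointers are Copy, Send, and Sync, so they can be stored and shared
// across threads without boxing; this compiles only because that holds.
fn assert_send_sync<T: Send + Sync + Copy>(_: T) {}

fn main() {
    let k: KernelFn = |t| (-(t * t) / 2.0).exp() / (2.0 * std::f64::consts::PI).sqrt();
    assert_send_sync(k);
    println!("{}", k(0.0)); // ≈ 0.3989
}
```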
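And a rough sketch of the shape of the `erf`/`NormalDist` item; this is not the actual implementation, just the standard Abramowitz-Stegun formula 7.1.26 with a normal CDF built on top of it:

```rust
/// Abramowitz-Stegun approximation 7.1.26 of the error function.
/// Maximum absolute error is about 1.5e-7, which is fine for typical
/// statistics use but not for demanding numerical work.
fn erf(x: f64) -> f64 {
    // Coefficients from Abramowitz & Stegun, formula 7.1.26.
    const A1: f64 = 0.254829592;
    const A2: f64 = -0.284496736;
    const A3: f64 = 1.421413741;
    const A4: f64 = -1.453152027;
    const A5: f64 = 1.061405429;
    const P: f64 = 0.3275911;

    // erf is odd: erf(-x) = -erf(x), so work with |x| and restore the sign.
    let sign = if x < 0.0 { -1.0 } else { 1.0 };
    let x = x.abs();

    let t = 1.0 / (1.0 + P * x);
    let poly = (((((A5 * t + A4) * t) + A3) * t + A2) * t + A1) * t;
    sign * (1.0 - poly * (-x * x).exp())
}

/// CDF of a normal distribution with the given mean and standard deviation,
/// expressed through `erf`.
fn normal_cdf(x: f64, mean: f64, std_dev: f64) -> f64 {
    0.5 * (1.0 + erf((x - mean) / (std_dev * std::f64::consts::SQRT_2)))
}

fn main() {
    println!("{:.4}", normal_cdf(0.0, 0.0, 1.0)); // 0.5000
    println!("{:.4}", normal_cdf(1.0, 0.0, 1.0)); // ≈ 0.8413
}
```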
Tradeoff Clarification
Your point about "relying on mature libraries for stability" vs. "hand-rolled implementations" is spot-on. We prioritized:
- Control & minimalism over broad-case robustness (e.g., our
erf
targets typical input ranges, not edge casesstatrs
handles). - Performance/scope fit over general-purpose safety (e.g., skipping
ndarray
’s multi-D checks for 1D-only needs). - Dependency hygiene over "free" maintenance (no upstream breakage risks, but we own all logic now).
Final Decision
Given this focused scope, we're hosting this leaner version on GitHub only (not publishing to crates.io). It's optimized for specific use cases, not a general-purpose replacement.