statsrust is a minimal, high-performance statistical analysis library for Rust developers, designed with emphasis on numerical stability, mathematical correctness, and zero-cost abstractions. This lean version focuses on core statistical functionality without heavy dependencies.
- Direct enum usage instead of string-to-enum conversion
- Function pointers (`fn(f64) -> f64`) instead of trait objects (`Box<dyn Fn>`)
- Pure Vec implementation without ndarray dependency
- Hand-rolled core algorithms instead of statrs dependency
- GitHub-only distribution (not published to crates.io) for specific use cases
- Descriptive Statistics: Measures of central tendency, position, and variability
- Kernel Density Estimation: Multiple kernel functions with efficient implementation
- Normal Distribution Model: Complete algebraic operations on normal distributions
- Numerical Stability: Carefully designed algorithms to prevent overflow, underflow, and cancellation errors
- Comprehensive Error Handling: Detailed error messages for edge cases
- Zero-cost Abstractions: Function pointers and direct enum usage for optimal performance
This lean version is designed for direct integration via GitHub rather than crates.io:

    [dependencies]
    statsrust = { git = "https://github.com/semmyenator/statsrust" }

A basic usage example:

    use statsrust::*;

    fn main() -> Result<(), StatError> {
        let data = vec![1.0, 2.0, 3.0, 4.0, 5.0];

        // Basic descriptive statistics
        let mean = mean(&data)?;
        let median = median(&data)?;
        let variance = variance(&data, None)?;
        println!("Mean: {:.2}, Median: {:.2}, Variance: {:.2}", mean, median, variance);

        // Kernel Density Estimation (using enum directly)
        let kde = kde(&data, 0.5, Kernel::Normal, false)?;
        println!("Density at 3.0: {:.4}", kde(3.0));

        // Normal distribution operations
        let dist = NormalDist::from_samples(&data)?;
        println!("Distribution: N(μ={:.2}, σ={:.2})", dist.mean(), dist.stdev());

        Ok(())
    }

- Direct enum usage eliminates runtime string parsing
- Function pointers (`fn(f64) -> f64`) replace trait objects for zero allocation overhead
- No "magic strings": kernels are specified via `Kernel::Normal` enum variants
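To make the enum-plus-function-pointer idea concrete, here is a minimal sketch of how a kernel selector could resolve to a plain `fn(f64) -> f64`. The names `KernelKind`, `as_fn`, and the three helper functions are hypothetical and chosen for illustration; they are not taken from statsrust's source, whose internals may differ.

```rust
/// Hypothetical kernel selector; the variant names are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum KernelKind {
    Normal,
    Epanechnikov,
    Triangular,
}

/// Standard normal density.
fn normal(u: f64) -> f64 {
    (-0.5 * u * u).exp() / (2.0 * std::f64::consts::PI).sqrt()
}

/// Epanechnikov kernel, zero outside [-1, 1].
fn epanechnikov(u: f64) -> f64 {
    if u.abs() <= 1.0 { 0.75 * (1.0 - u * u) } else { 0.0 }
}

/// Triangular kernel, zero outside [-1, 1].
fn triangular(u: f64) -> f64 {
    if u.abs() <= 1.0 { 1.0 - u.abs() } else { 0.0 }
}

impl KernelKind {
    /// Resolve the variant to a plain function pointer.
    /// No heap allocation and no string parsing is involved.
    pub fn as_fn(self) -> fn(f64) -> f64 {
        match self {
            KernelKind::Normal => normal,
            KernelKind::Epanechnikov => epanechnikov,
            KernelKind::Triangular => triangular,
        }
    }
}
```

Because the selector is an enum, a misspelled kernel name is a compile-time error rather than a runtime parse failure, and the returned pointer can be stored or passed around without boxing.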
- Geometric mean uses logarithmic transformation to prevent overflow
- Variance calculation uses centered data to avoid catastrophic cancellation
- Inverse CDF approximations maintain precision while balancing performance
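The first two points can be illustrated with a short sketch. The functions `geometric_mean` and `sample_variance` below are hypothetical stand-ins written for this example; they return `Option` for brevity and are not statsrust's actual signatures or error types.

```rust
/// Geometric mean computed as exp(mean(ln x)).
/// Summing logarithms avoids the overflow a direct product can hit.
/// Returns None for empty input or any value <= 0.
pub fn geometric_mean(data: &[f64]) -> Option<f64> {
    if data.is_empty() || data.iter().any(|&x| x <= 0.0) {
        return None;
    }
    let log_sum: f64 = data.iter().map(|x| x.ln()).sum();
    Some((log_sum / data.len() as f64).exp())
}

/// Two-pass sample variance: compute the mean first, then sum squared
/// deviations from it. This avoids subtracting two large, nearly equal
/// quantities, as the one-pass formula sum(x^2) - n * mean^2 does.
/// Returns None when fewer than two values are given.
pub fn sample_variance(data: &[f64]) -> Option<f64> {
    let n = data.len();
    if n < 2 {
        return None;
    }
    let mean = data.iter().sum::<f64>() / n as f64;
    let sum_sq: f64 = data.iter().map(|x| (x - mean).powi(2)).sum();
    Some(sum_sq / (n as f64 - 1.0))
}
```

For instance, the product of three values near 1e200 overflows `f64`, while the sum of their logarithms stays small; likewise, data clustered around 1e9 with unit spacing keeps its variance when centered first.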
- Efficient implementation with multiple kernel functions:
- Gaussian, Epanechnikov, Triangular, Quartic, Triweight, and more
- Optimized for bounded kernels using binary search
- Direct function pointer implementation for minimal overhead
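One way the binary-search optimization for bounded kernels might work is sketched below. It assumes the samples are sorted ascending, the bandwidth `h` is positive, and the kernel is zero outside [-1, 1]. The function `kde_at` is a hypothetical helper for this example, not statsrust's `kde` function.

```rust
/// Epanechnikov kernel, zero outside [-1, 1].
fn epanechnikov(u: f64) -> f64 {
    if u.abs() <= 1.0 { 0.75 * (1.0 - u * u) } else { 0.0 }
}

/// Density estimate at `x`.
/// Assumptions: `sorted` is non-empty and ascending, `h > 0`, and
/// `kernel` vanishes outside [-1, 1].
pub fn kde_at(sorted: &[f64], h: f64, kernel: fn(f64) -> f64, x: f64) -> f64 {
    // Only samples within [x - h, x + h] can contribute, so locate that
    // window with two binary searches instead of scanning every sample.
    let lo = sorted.partition_point(|&s| s < x - h);
    let hi = sorted.partition_point(|&s| s <= x + h);
    let sum: f64 = sorted[lo..hi].iter().map(|&s| kernel((x - s) / h)).sum();
    sum / (sorted.len() as f64 * h)
}
```

With n samples, of which k fall inside the window, each evaluation costs O(log n + k) rather than O(n). Kernels with unbounded support, such as the Gaussian, cannot use this shortcut without truncation.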
- Complete algebraic operations:

      let dist1 = NormalDist::new(0.0, 1.0)?;
      let dist2 = NormalDist::new(1.0, 2.0)?;

      // Distribution operations
      let sum_dist = dist1 + dist2;         // N(1.0, √5)
      let scaled_dist = dist1 * 2.0;        // N(0.0, 2.0)
      let overlap = dist1.overlap(&dist2);  // Calculate distribution overlap
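The results noted in the comments above follow from two standard facts: the sum of independent normal variables has mean μ₁ + μ₂ and standard deviation √(σ₁² + σ₂²), and scaling by a constant c gives mean cμ and standard deviation |c|σ. The sketch below shows how such operators could be written. The `Normal` struct is a hypothetical stand-in, not the library's `NormalDist`, and it assumes the two operands are independent.

```rust
use std::ops::{Add, Mul};

/// Hypothetical stand-in for a normal distribution type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Normal {
    pub mu: f64,
    pub sigma: f64,
}

impl Add for Normal {
    type Output = Normal;

    /// Sum of two independent normal variables: means add, variances add.
    fn add(self, rhs: Normal) -> Normal {
        Normal {
            mu: self.mu + rhs.mu,
            // hypot computes sqrt(a^2 + b^2) without intermediate overflow.
            sigma: self.sigma.hypot(rhs.sigma),
        }
    }
}

impl Mul<f64> for Normal {
    type Output = Normal;

    /// Scaling by a constant: the mean scales by c, the spread by |c|.
    fn mul(self, c: f64) -> Normal {
        Normal {
            mu: self.mu * c,
            sigma: self.sigma * c.abs(),
        }
    }
}
```

Deriving `Copy` here is what allows a value to be reused after it appears in an arithmetic expression; whether the library's own type does the same is not stated in this document.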
For comprehensive documentation:
This leaner version of statsrust is intentionally minimalistic:
- It's optimized for specific use cases rather than being a general-purpose statistical library
- It prioritizes control and performance over robustness across a broad range of use cases
- It's hosted on GitHub only (not published to crates.io)
- It's designed as a specialized tool, not a replacement for comprehensive statistical libraries
We welcome contributions! Please see our Contribution Guidelines for details on how to report bugs, suggest enhancements, or submit pull requests.
This project is dual-licensed under:
Documentation content is licensed under:
This document is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). Original author: statsrust Authors