linfa (Italian) / sap (English):
The vital circulating fluid of a plant.
linfa aims to provide a comprehensive toolkit to build Machine Learning applications with Rust.
Kin in spirit to Python's scikit-learn, it focuses on common preprocessing tasks and classical ML algorithms for your everyday ML tasks.
Where does linfa stand right now? Are we learning yet?
linfa currently provides sub-packages with the following algorithms:
Name | Purpose | Status | Category | Notes |
---|---|---|---|---|
clustering | Data clustering | Tested / Benchmarked | Unsupervised learning | Clustering of unlabeled data; contains K-Means, Gaussian-Mixture-Model, DBSCAN and OPTICS |
kernel | Kernel methods for data transformation | Tested | Pre-processing | Maps feature vector into higher-dimensional space |
linear | Linear regression | Tested | Partial fit | Contains Ordinary Least Squares (OLS), Generalized Linear Models (GLM) |
elasticnet | Elastic Net | Tested | Supervised learning | Linear regression with elastic net constraints |
logistic | Logistic regression | Tested | Partial fit | Builds two-class logistic regression models |
reduction | Dimensionality reduction | Tested | Pre-processing | Diffusion mapping and Principal Component Analysis (PCA) |
trees | Decision trees | Tested / Benchmarked | Supervised learning | Linear decision trees |
svm | Support Vector Machines | Tested | Supervised learning | Classification or regression analysis of labeled datasets |
hierarchical | Agglomerative hierarchical clustering | Tested | Unsupervised learning | Cluster and build hierarchy of clusters |
bayes | Naive Bayes | Tested | Supervised learning | Contains Gaussian Naive Bayes |
ica | Independent component analysis | Tested | Unsupervised learning | Contains FastICA implementation |
pls | Partial Least Squares | Tested | Supervised learning | Contains PLS estimators for dimensionality reduction and regression |
tsne | Dimensionality reduction | Tested | Unsupervised learning | Contains exact and Barnes-Hut approximate t-SNE |
preprocessing | Normalization & Vectorization | Tested / Benchmarked | Pre-processing | Contains data normalization/whitening and count vectorization/tf-idf |
nn | Nearest Neighbours & Distances | Tested / Benchmarked | Pre-processing | Spatial index structures and distance functions |
ftrl | Follow The Regularized Leader - proximal | Tested / Benchmarked | Partial fit | Contains L1 and L2 regularization. Possible incremental update |
We believe that only a significant community effort can nurture, build, and sustain a machine learning ecosystem in Rust - there is no other way forward.
If this strikes a chord with you, please take a look at the roadmap and get involved!
At the moment you can choose between the following BLAS/LAPACK backends: openblas, netlib or intel-mkl.
Backend | Linux | Windows | macOS |
---|---|---|---|
OpenBLAS | ✔️ | - | - |
Netlib | ✔️ | - | - |
Intel MKL | ✔️ | ✔️ | ✔️ |
For example, to use the system Intel MKL library for the PCA example, pass the corresponding feature:
cd linfa-reduction && cargo run --release --example pca --features linfa/intel-mkl-system
This selects the intel-mkl system library as the BLAS/LAPACK backend. If you would rather compile the library and link against the generated artifacts, pass intel-mkl-static instead.
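The static variant can be sketched the same way. This is a minimal dry run, assuming the feature is addressed as linfa/intel-mkl-static (mirroring the system-library command above); the variable names BACKEND_FEATURE and BUILD_CMD are illustrative only and not part of the project.

```shell
# Assumption: the static backend feature uses the same `linfa/` prefix
# as the system-library command shown earlier.
BACKEND_FEATURE="linfa/intel-mkl-static"
BUILD_CMD="cargo run --release --example pca --features ${BACKEND_FEATURE}"

# Print the command first so the selected backend can be checked
# before starting a potentially long build.
echo "${BUILD_CMD}"

# To actually run it from a checkout of the repository:
# (cd linfa-reduction && ${BUILD_CMD})
```

Keeping the feature in one variable makes it easy to swap in another backend feature when trying a different example.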
Dual-licensed to be compatible with the Rust project.
Licensed under the Apache License, Version 2.0 http://www.apache.org/licenses/LICENSE-2.0 or the MIT license http://opensource.org/licenses/MIT, at your option. This file may not be copied, modified, or distributed except according to those terms.