sliced: scikit-learn compatible sufficient dimension reduction

Primary language: Python. License: MIT.

sliced

sliced is a Python package offering several sufficient dimension reduction (SDR) techniques commonly applied to high-dimensional datasets with a supervised target. It is compatible with scikit-learn.

Algorithms supported:

  • Sliced Inverse Regression (SIR) [1]
  • Sliced Average Variance Estimation (SAVE) [2]

Documentation / Website: https://joshloyal.github.io/sliced/

Example

The following example learns a one-dimensional subspace from a dataset with ten features:

from sliced.datasets import make_cubic
from sliced import SlicedInverseRegression

# load the 10-dimensional dataset
X, y = make_cubic(random_state=123)

# Set the options for SIR
sir = SlicedInverseRegression(n_directions=1)

# fit the model
sir.fit(X, y)

# transform into the new subspace
X_sir = sir.transform(X)
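
For intuition about what SIR estimates, the procedure described in [1] can be sketched in plain NumPy. This is only an illustrative sketch and not the implementation used by sliced: the function name sir_directions, its n_slices parameter, and the toy dataset are assumptions made for this example. It whitens the features, slices the samples by the sorted target, and takes the leading eigenvectors of the covariance of the slice means.

```python
import numpy as np


def sir_directions(X, y, n_directions=1, n_slices=10):
    """Illustrative SIR sketch (hypothetical helper, not sliced's own code)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_samples, n_features = X.shape

    # 1. Whiten the features: Z = (X - mean) @ Sigma^{-1/2}
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt_cov = evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T
    Z = X_centered @ inv_sqrt_cov

    # 2. Sort the samples by y and split them into slices of similar size
    slices = np.array_split(np.argsort(y), n_slices)

    # 3. Weighted covariance of the within-slice means of Z
    M = np.zeros((n_features, n_features))
    for idx in slices:
        z_mean = Z[idx].mean(axis=0)
        M += (len(idx) / n_samples) * np.outer(z_mean, z_mean)

    # 4. Leading eigenvectors of M, mapped back to the original feature scale
    _, vecs = np.linalg.eigh(M)  # eigenvalues come back in ascending order
    top = vecs[:, ::-1][:, :n_directions]
    directions = inv_sqrt_cov @ top
    directions /= np.linalg.norm(directions, axis=0)
    return directions.T  # shape (n_directions, n_features)


# Toy data (an assumption for this sketch): y depends on X only through x0 + x1
rng = np.random.RandomState(0)
X = rng.normal(size=(500, 10))
y = (X[:, 0] + X[:, 1]) ** 3 + 0.1 * rng.normal(size=500)

directions = sir_directions(X, y, n_directions=1)
X_sir = X @ directions.T  # project onto the estimated one-dimensional subspace
```

On data like this, the estimated direction should put most of its weight on the first two features, up to sign.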

Installation

Dependencies

sliced requires:

  • Python (>= 2.7 or >= 3.4)
  • NumPy (>= 1.8.2)
  • SciPy (>= 0.13.3)
  • Scikit-learn (>= 0.17)

Additionally, to run the examples, you need matplotlib (>= 2.0.0).

Installation

Installing sliced requires a working installation of NumPy and SciPy. With those in place, the easiest way to install sliced is with pip:

pip install -U sliced

If you prefer, you can clone the repository and install from source. Use the following commands to get a copy from GitHub and install the package along with its dependencies:

git clone https://github.com/joshloyal/sliced.git
cd sliced
pip install .

Alternatively, install directly from GitHub with pip:

pip install -U git+https://github.com/joshloyal/sliced.git

Testing

After installation, you can use pytest to run the test suite via setup.py:

python setup.py test

References:

[1]: Li, K. C. (1991). "Sliced Inverse Regression for Dimension Reduction (with discussion)", Journal of the American Statistical Association, 86, 316-342.
[2]: Shao, Y., Cook, R. D. and Weisberg, S. (2007). "Marginal Tests with Sliced Average Variance Estimation", Biometrika, 94, 285-296.