Statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics and estimation and inference for statistical models.
The documentation for the latest release is at
The documentation for the development version is at
Recent improvements are highlighted in the release notes
Backups of documentation are available at http://statsmodels.github.io/stable/ and http://statsmodels.github.io/dev/.
- Linear regression models:
  - Ordinary least squares
  - Generalized least squares
  - Weighted least squares
  - Least squares with autoregressive errors
  - Quantile regression
- Mixed Linear Model with mixed effects and variance components
- GLM: Generalized linear models with support for all of the one-parameter exponential family distributions
- GEE: Generalized Estimating Equations for one-way clustered or longitudinal data
- Discrete models:
  - Logit and Probit
  - Multinomial logit (MNLogit)
  - Poisson regression
  - Negative Binomial regression
- RLM: Robust linear models with support for several M-estimators.
- Time Series Analysis: models for time series analysis
  - Complete StateSpace modeling framework
    - Seasonal ARIMA and ARIMAX models
    - VARMA and VARMAX models
    - Dynamic Factor models
  - Markov switching models (MSAR), also known as Hidden Markov Models (HMM)
  - Univariate time series analysis: AR, ARIMA
  - Vector autoregressive models, VAR and structural VAR
  - Hypothesis tests for time series: unit root, cointegration and others
  - Descriptive statistics and process models for time series analysis
- Survival analysis:
  - Proportional hazards regression (Cox models)
  - Survivor function estimation (Kaplan-Meier)
  - Cumulative incidence function estimation
- Nonparametric statistics: (Univariate) kernel density estimators
- Datasets: Datasets used for examples and in testing
- Statistics: a wide range of statistical tests
  - diagnostics and specification tests
  - goodness-of-fit and normality tests
  - functions for multiple testing
  - various additional statistical tests
- Imputation with MICE and regression on order statistic
- Mediation analysis
- Principal Component Analysis with missing data
- I/O
  - Tools for reading Stata .dta files into numpy arrays.
  - Table output to ascii, latex, and html
- Miscellaneous models
- Sandbox: statsmodels contains a sandbox folder with code in various stages of development and testing which is not considered "production ready". This covers, among others:
  - Generalized method of moments (GMM) estimators
  - Kernel regression
  - Various extensions to scipy.stats.distributions
  - Panel data models
  - Information theoretic measures
The master branch on GitHub has the most up-to-date code
Source downloads of release tags are available on GitHub
Binaries and source distributions are available from PyPI
Binaries can be installed in Anaconda
conda install statsmodels
Development snapshots are also available in Anaconda (infrequently updated)
conda install -c https://conda.binstar.org/statsmodels statsmodels
See INSTALL.txt for requirements or see the documentation
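
After installing, one way to confirm which version is present is to query the package metadata. This is only a sketch using the Python standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical and not part of statsmodels.

```python
from importlib import metadata


def installed_version(package="statsmodels"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(version if version else "statsmodels is not installed")
```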
Modified BSD (3-clause)
Discussions take place on our mailing list.
We are very interested in feedback about usability and suggestions for improvements.
Bug reports can be submitted to the issue tracker at