/linmix

Python port of B. Kelly's LINMIX_ERR IDL package

Primary LanguageIDLBSD 2-Clause "Simplified" LicenseBSD-2-Clause

linmix

A Bayesian approach to linear regression with errors in both X and Y.

Python port of B. Kelly's LINMIX_ERR IDL package (Kelly2007, arXiv:0705.2774). Paraphrasing from the LINMIX_ERR.pro IDL routine:

Perform linear regression of y on x when there are measurement errors in both variables. The regression assumes:

eta = alpha + beta * xi + epsilon

x = xi + xerr

y = eta + yerr

Here, (alpha, beta) are the regression coefficients, epsilon is the intrinsic random scatter about the regression, xerr is the measurement error in x, and yerr is the measurement error in y. epsilon is assumed to be normally-distributed with mean zero and variance sigsqr. xerr and yerr are assumed to be normally-distributed with means equal to zero, variances xsig^2 and ysig^2, respectively, and covariance xycov. The distribution of xi is modeled as a mixture of normals, with group proportions pi, means mu, and variances tausqr. The following graphical model illustrates, well..., the model...

linmix PGM

Bayesian inference is employed, and a Markov chain containing random draws from the posterior is developed. Convergence of the MCMC to the posterior is monitored using the potential scale reduction factor (RHAT, Gelman et al. 2004). In general, when RHAT < 1.1 then approximate convergence is reached.

Documentation

More detailed documentation can be found at http://linmix.readthedocs.org/en/latest/. In particular, the API is listed at http://linmix.readthedocs.org/en/latest/src/linmix.html, and a worked example (the same as in Kelly (2007) (arXiv:0705.2774)), is at http://linmix.readthedocs.org/en/latest/example.html.

Usage

import linmix
lm = linmix.LinMix(x, y, xsig=xsig, ysig=ysig, xycov=xycov, delta=delta, K=K, nchains=nchains)
lm.run_mcmc(miniter=miniter, maxiter=maxiter, silent=silent)
print("{}, {}".format(lm.chain['alpha'].mean(), lm.chain['alpha'].std()))
print("{}, {}".format(lm.chain['beta'].mean(), lm.chain['beta'].std()))
print("{}, {}".format(lm.chain['sigsqr'].mean(), lm.chain['sigsqr'].std()))

Installation

Currently, the best way to get linmix for python is to clone from github and build using normal setup.py facilities. (see http://linmix.readthedocs.org/en/latest/install.html) In the future, I hope to add linmix to PyPI.

License

This repo is licensed under the 2-line BSD license. See the LICENSE doc for more details.