roxy: | Regression and Optimisation with X and Y errors |
---|---|
Authors: | Deaglan J. Bartlett and Harry Desmond |
Homepage: | https://github.com/DeaglanBartlett/roxy |
Documentation: | https://roxy.readthedocs.io/en/latest/ |
roxy (Regression and Optimisation with X and Y errors) is a Python package for
fitting a function to data (regression) where the data have both x and y errors, using Markov Chain Monte Carlo (MCMC).
The common approach for this problem is to use a
Gaussian likelihood with a mean given by f(x_{\rm obs}, \theta) and a variance
\sigma_y^2 + f^\prime(x_{\rm obs}, \theta)^2 \sigma_x^2, but this ignores the underlying
distribution of true x values and thus gives biased results. Instead, this package allows
one to use the MNR (Marginalised Normal Regression) method which does not exhibit such
biases.
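To make the common approach concrete, the following is a minimal sketch of that Gaussian likelihood, written with numpy rather than with roxy itself. The function names, the straight-line model and the data values are illustrative assumptions and are not part of the roxy API; the derivative is supplied by hand here.

```python
import numpy as np


def naive_log_likelihood(theta, x_obs, y_obs, sigma_x, sigma_y, f, dfdx):
    """Gaussian log-likelihood with x errors propagated to first order.

    mean     = f(x_obs, theta)
    variance = sigma_y^2 + f'(x_obs, theta)^2 * sigma_x^2
    """
    mean = f(x_obs, theta)
    var = sigma_y**2 + dfdx(x_obs, theta) ** 2 * sigma_x**2
    resid = y_obs - mean
    return -0.5 * np.sum(resid**2 / var + np.log(2.0 * np.pi * var))


# Hypothetical model for illustration: a straight line f(x) = a * x + b.
def line(x, theta):
    a, b = theta
    return a * x + b


def line_gradient(x, theta):
    a, _ = theta
    return a * np.ones_like(x)


# Made-up data, used only to exercise the function.
x_obs = np.array([0.0, 1.0, 2.0, 3.0])
y_obs = np.array([0.1, 2.1, 3.9, 6.2])
sigma_x = np.full_like(x_obs, 0.1)
sigma_y = np.full_like(x_obs, 0.2)

logl_true = naive_log_likelihood(
    (2.0, 0.0), x_obs, y_obs, sigma_x, sigma_y, line, line_gradient
)
logl_bad = naive_log_likelihood(
    (0.5, 0.0), x_obs, y_obs, sigma_x, sigma_y, line, line_gradient
)
print(logl_true > logl_bad)
```

Note that nothing in this expression describes how the true x values are distributed, which is the omission that MNR addresses.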
A range of likelihoods is available in roxy; however, we recommend the following choices in
these situations:
- If you have x and y errors and DO wish to infer intrinsic scatter: use method='mnr'.
- If you have x and y errors and DO NOT wish to infer intrinsic scatter: use method='prof'.
- If you only have y errors: use method='prof' or method='unif' (they are identical in this case).
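The recommendations above can be written as a small decision helper. This is only a sketch: `recommended_method` is a hypothetical function that is not part of roxy, and it simply returns the string one would pass as the `method` argument.

```python
def recommended_method(has_x_errors: bool, infer_intrinsic_scatter: bool = False) -> str:
    """Return the likelihood name recommended in the list above."""
    if has_x_errors and infer_intrinsic_scatter:
        return "mnr"
    # With x errors but no intrinsic scatter, or with y errors only,
    # the recommendation is 'prof' ('unif' is identical for y errors only).
    return "prof"


print(recommended_method(has_x_errors=True, infer_intrinsic_scatter=True))
print(recommended_method(has_x_errors=True, infer_intrinsic_scatter=False))
print(recommended_method(has_x_errors=False))
```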
The code uses automatic differentiation enabled by jax to both sample the
likelihood using Hamiltonian Monte Carlo and to compute the derivatives
required for the likelihood. We employ the NUTS method implemented in numpyro
for fast sampling. For the galaxy cluster example in the MNR paper
(which contains over 250 data points), a single chain run on a laptop performs
approximately 3500 iterations per second, such that a chain with 700 warm-up
steps and 5000 samples takes approximately 1.6 seconds to sample, and gives
over 3500 effective samples for each of the parameters, with Gelman-Rubin statistics
within 0.01 of unity. Given its efficiency and ease of use (one
need only define the function of interest, the parameters to sample and their
prior ranges), we advocate its use not just in the presence of x errors,
but also in their absence.
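For readers unfamiliar with the Gelman-Rubin statistic quoted above, here is a minimal numpy sketch of its basic (non-split) form for a single parameter. It assumes several independent chains of equal length; the function name and the mock chains are illustrative, and the diagnostics built into a sampler may use a refined variant.

```python
import numpy as np


def gelman_rubin(chains):
    """Basic Gelman-Rubin statistic for one parameter.

    chains: array of shape (n_chains, n_samples).
    """
    chains = np.asarray(chains, dtype=float)
    n_chains, n_samples = chains.shape
    if n_chains < 2:
        raise ValueError("at least two chains are needed")
    # W: mean of the within-chain variances.
    within = chains.var(axis=1, ddof=1).mean()
    # B: n_samples times the variance of the chain means.
    between = n_samples * chains.mean(axis=1).var(ddof=1)
    # Pooled estimate of the posterior variance.
    pooled = (n_samples - 1) / n_samples * within + between / n_samples
    return float(np.sqrt(pooled / within))


# Mock "chains": independent draws from one distribution, so the
# statistic should be very close to unity.
rng = np.random.default_rng(0)
mock_chains = rng.normal(size=(4, 5000))
r_hat = gelman_rubin(mock_chains)
print(f"R-hat = {r_hat:.4f}")
```

Values close to unity indicate that the chains agree with each other; chains stuck in different regions give values noticeably above one.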
As well as returning posterior samples and allowing likelihood computations
(which can be integrated into the user's larger code), roxy is interfaced with
arviz to produce trace plots, corner and getdist to make two-dimensional
posterior plots, and fgivenx for posterior predictive plots. See below for
the citations required when using these modules.
Since roxy is a Python package, the user will need python3 installed.
We have tested roxy using python3.11, so we suggest that the user also uses
this version.
The plotting functions supplied with roxy require LaTeX to be installed, due to the
requirements of matplotlib. If this is not already installed, check out the
matplotlib documentation for more information. For Ubuntu, one simply needs to run
sudo apt-get update && sudo apt-get install texlive texlive-publishers texlive-science latexmk cm-super dvipng
to obtain the required dependencies.
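Before calling the plotting functions, one can check whether the LaTeX executables are visible on the PATH. The snippet below is a minimal sketch using only the Python standard library; `missing_latex_tools` is a hypothetical helper, and the choice of `latex` and `dvipng` as the tools to look for is an assumption based on the packages installed by the command above.

```python
import shutil


def missing_latex_tools(tools=("latex", "dvipng")):
    """Return the executables from `tools` that are not found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_latex_tools()
if missing:
    print("Missing executables:", ", ".join(missing))
else:
    print("LaTeX toolchain found")
```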
If one wants to reproduce the results of the roxy paper,
one needs to run the example given in roxy.examples.bias_example.py, which requires
mpi4py.
This is not installed by default (see below), but if you do want to install it, you will need
MPI to be installed beforehand. This can be done by running the following:
macOS:
brew install open-mpi
Ubuntu:
sudo apt-get install openmpi-bin libopenmpi-dev
To install roxy and its dependencies in a new virtual environment, run
python3 -m venv roxy_env
source roxy_env/bin/activate
git clone git@github.com:DeaglanBartlett/roxy.git
pip install -e roxy
These dependencies are:
- numpy
- jax
- jaxlib
- scipy
- numpyro
- matplotlib
- corner
- getdist
- arviz
- fgivenx
- sphinx>=5.0
- myst-parser
- sphinx-rtd-theme
- scikit-learn
- jaxopt
- prettytable
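After installation, a quick way to confirm that the runtime dependencies are importable is to look each one up without importing it. This is a sketch using the standard library only; `missing_modules` and `ROXY_RUNTIME_MODULES` are hypothetical names, the documentation-only packages (sphinx, myst-parser, sphinx-rtd-theme) are left out, and the module names are the usual import names of the packages listed above.

```python
from importlib.util import find_spec

# Import names for the runtime dependencies listed above
# (scikit-learn is imported as "sklearn").
ROXY_RUNTIME_MODULES = [
    "numpy", "jax", "jaxlib", "scipy", "numpyro", "matplotlib", "corner",
    "getdist", "arviz", "fgivenx", "sklearn", "jaxopt", "prettytable",
]


def missing_modules(module_names):
    """Return the top-level modules that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]


missing = missing_modules(ROXY_RUNTIME_MODULES)
print("Missing modules:", ", ".join(missing) if missing else "none")
```

The same helper can be called with `["mpi4py"]` to check for the optional dependency.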
If you are unable to clone the repo with the above, try the HTTPS version instead:
git clone https://github.com/DeaglanBartlett/roxy.git
To run the script roxy.examples.bias_example.py, you will need to install mpi4py.
This can be done alongside installing roxy by replacing the pip install
instruction above with
pip install -e "roxy[all]"
Users are required to cite the roxy paper, for which the following BibTeX entry can be used:
@ARTICLE{roxy,
author = {{Bartlett}, D.~J. and {Desmond}, H.},
title = "{Marginalised Normal Regression: unbiased curve fitting in the presence of x-errors}",
journal = {arXiv e-prints},
keywords = {Astrophysics - Cosmology and Nongalactic Astrophysics},
year = 2023,
month = sep,
eid = {arXiv:2309.00948},
pages = {arXiv:2309.00948},
doi = {10.48550/arXiv.2309.00948},
archivePrefix = {arXiv},
eprint = {2309.00948},
primaryClass = {astro-ph.CO},
url = {https://arxiv.org/abs/2309.00948},
}
and are encouraged to cite the numpyro papers
@ARTICLE{numpyro1,
title={Composable Effects for Flexible and Accelerated Probabilistic Programming in NumPyro},
author={Phan, Du and Pradhan, Neeraj and Jankowiak, Martin},
journal={arXiv preprint arXiv:1912.11554},
year={2019}
}
@ARTICLE{numpyro2,
author = {Eli Bingham and
Jonathan P. Chen and
Martin Jankowiak and
Fritz Obermeyer and
Neeraj Pradhan and
Theofanis Karaletsos and
Rohit Singh and
Paul A. Szerlip and
Paul Horsfall and
Noah D. Goodman},
title = {Pyro: Deep Universal Probabilistic Programming},
journal = {J. Mach. Learn. Res.},
volume = {20},
pages = {28:1--28:6},
year = {2019},
url = {http://jmlr.org/papers/v20/18-403.html}
}
Additionally, if you use the function roxy.plotting.posterior_predictive_plot, then,
as this uses the fgivenx package, you must cite
@article{fgivenx,
doi = {10.21105/joss.00849},
url = {http://dx.doi.org/10.21105/joss.00849},
year = {2018},
month = {Aug},
publisher = {The Open Journal},
volume = {3},
number = {28},
author = {Will Handley},
title = {fgivenx: Functional Posterior Plotter},
journal = {The Journal of Open Source Software}
}
We also provide simple routines to plot posterior distributions with roxy.plotting.triangle_plot.
If you use module="corner" with this function, please cite
@article{corner,
doi = {10.21105/joss.00024},
url = {https://doi.org/10.21105/joss.00024},
year = {2016},
month = {jun},
publisher = {The Open Journal},
volume = {1},
number = {2},
pages = {24},
author = {Daniel Foreman-Mackey},
title = {corner.py: Scatterplot matrices in Python},
journal = {The Journal of Open Source Software}
}
and if you use module="getdist", please cite
@article{getdist,
author = "Lewis, Antony",
title = "{GetDist: a Python package for analysing Monte Carlo
samples}",
year = "2019",
eprint = "1910.13970",
archivePrefix = "arXiv",
primaryClass = "astro-ph.IM",
SLACcitation = "%%CITATION = ARXIV:1910.13970;%%",
url = "https://getdist.readthedocs.io"
}
MIT License
Copyright (c) 2023 Deaglan John Bartlett
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Below is a list of contributors to this repository.
Deaglan Bartlett (CNRS & Sorbonne Université, Institut d’Astrophysique de Paris and Astrophysics)
Harry Desmond (Institute of Cosmology & Gravitation, University of Portsmouth)
The documentation for this project can be found at https://roxy.readthedocs.io/en/latest/
DJB is supported by the Simons Collaboration on "Learning the Universe."
HD is supported by a Royal Society University Research Fellowship (grant no. 211046).