regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.
It is developed and supported by a team of scientists at the Regeneron Genetics Center.
The method has the following properties:
- It works on quantitative and binary traits, including binary traits with unbalanced case-control ratios
- It can handle population structure and relatedness
- It can process multiple phenotypes at once efficiently
- It is fast and memory efficient 🔥
- For binary traits it supports Firth logistic regression and an SPA test
- It can perform gene/region-based tests, interaction tests and conditional analyses
- It supports the BGEN, PLINK bed/bim/fam and PLINK2 pgen/pvar/psam genetic data formats
- It is ideally suited for implementation in Apache Spark (see GLOW)
- It can be installed with Conda
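
As a rough illustration of how the supported input formats and binary-trait options map onto a command line, the sketch below assembles the argument list for a regenie run without executing anything. It uses regenie's `--step`, `--bed`/`--pgen`/`--bgen`, `--phenoFile`, `--covarFile`, `--bsize`, `--bt`, `--firth`, `--approx`, `--spa`, `--pred` and `--out` options. The helper `build_regenie_command` and all file names are hypothetical and exist only for this example, so check the full documentation before relying on any particular combination.

```python
import shlex

# Genotype formats from the feature list, mapped to regenie's input options.
# --bed and --pgen take a file prefix; --bgen takes the full file name.
GENOTYPE_FLAGS = {"bed": "--bed", "pgen": "--pgen", "bgen": "--bgen"}


def build_regenie_command(step, geno_format, geno_path, pheno_file, out_prefix,
                          covar_file=None, binary_trait=False, block_size=1000,
                          pred_list=None, bt_test=None):
    """Return the argument list for one regenie run (hypothetical helper).

    Nothing is executed; the list can be passed to subprocess.run by the caller.
    """
    if step not in (1, 2):
        raise ValueError("step must be 1 or 2")
    if geno_format not in GENOTYPE_FLAGS:
        raise ValueError(f"unsupported genotype format: {geno_format}")
    if bt_test not in (None, "firth", "spa"):
        raise ValueError("bt_test must be None, 'firth' or 'spa'")

    cmd = ["regenie", "--step", str(step),
           GENOTYPE_FLAGS[geno_format], geno_path,
           "--phenoFile", pheno_file,
           "--bsize", str(block_size),
           "--out", out_prefix]
    if covar_file is not None:
        cmd += ["--covarFile", covar_file]
    if binary_trait:
        cmd.append("--bt")
        # Firth and SPA corrections only apply to Step 2 association tests.
        if step == 2 and bt_test == "firth":
            cmd += ["--firth", "--approx"]
        elif step == 2 and bt_test == "spa":
            cmd.append("--spa")
    if step == 2 and pred_list is not None:
        # Step 1 writes a list of prediction files that Step 2 reads back in.
        cmd += ["--pred", pred_list]
    return cmd


if __name__ == "__main__":
    # Placeholder file names, assumed for illustration only.
    step1 = build_regenie_command(1, "bed", "example", "phenotype_bin.txt",
                                  "fit_bin_out", binary_trait=True,
                                  block_size=100)
    step2 = build_regenie_command(2, "bgen", "example.bgen", "phenotype_bin.txt",
                                  "test_bin_out", binary_trait=True,
                                  block_size=200,
                                  pred_list="fit_bin_out_pred.list",
                                  bt_test="firth")
    print(shlex.join(step1))
    print(shlex.join(step2))
```

Building the command as a list rather than a single string keeps paths with spaces intact and lets the caller decide whether to print it, log it, or hand it to `subprocess.run`.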
Full documentation for regenie can be found here.
Mbatchou, J., Barnard, L., Backman, J. et al. Computationally efficient whole-genome regression for quantitative and binary traits. Nat Genet 53, 1097–1103 (2021). https://doi.org/10.1038/s41588-021-00870-7
regenie is distributed under an MIT license.
If you have any questions about regenie please contact
If you want to submit an issue concerning the software, please do so using the regenie GitHub repository.
Version 3.0.3 (Skip BTs where null model fit failed; Bug fix for BURDEN-ACAT; Bug fix when nan/inf values are in phenotype/covariate file)
Version 3.0.1 (Improve ridge logistic regression in Step 1; Add compilation with Cmake)
Version 3.0 (New gene-based tests: SKAT, SKATO, ACATV, ACATO and NNLS [Non-Negative Least Squares test]; New GxE and GxG interaction testing functionality; New conditional analysis functionality; see release page for minor additions)
For past releases, see here.