/regenie

regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.

Primary LanguageC++OtherNOASSERTION

build GitHub release (latest by date) Regenie Github All Releases License: MIT

regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.

It is developed and supported by a team of scientists at the Regeneron Genetics Center.

The method has the following properties

  • It works on quantitative and binary traits, including binary traits with unbalanced case-control ratios
  • It can handle population structure and relatedness
  • It can process multiple phenotypes at once efficiently
  • It is fast and memory efficient 🔥
  • For binary traits it supports Firth logistic regression and an SPA test
  • It can perform gene/region-based tests, interaction tests and conditional analyses
  • It supports the BGEN, PLINK bed/bim/fam and PLINK2 pgen/pvar/psam genetic data formats
  • It is ideally suited for implementation in Apache Spark (see GLOW)
  • It can be installed with Conda

Full documentation for the regenie can be found here.

Citation

Mbatchou, J., Barnard, L., Backman, J. et al. Computationally efficient whole-genome regression for quantitative and binary traits. Nat Genet 53, 1097–1103 (2021). https://doi.org/10.1038/s41588-021-00870-7

License

regenie is distributed under an MIT license.

Contact

If you have any questions about regenie please contact

If you want to submit a issue concerning the software please do so using the regenie Github repository.

Version history

Version 3.1 (Fixed bug in SKAT/SKATO tests when applying Firth/SPA correction; Improved SPA implementation by computing both tail probabilities; New option --set-singletons to specify variants to consider as singletons for burden masks; New option --l1-phenoList to run level 1 models in Step 1 in parallel across phenotypes; Several bug fixes)

Version 3.0.3 (Skip BTs where null model fit failed; Bug fix for BURDEN-ACAT; Bug fix when nan/inf values are in phenotype/covariate file)

Version 3.0.1 (Improve ridge logistic regression in Step 1; Add compilation with Cmake)

Version 3.0 (New gene-based tests: SKAT, SKATO, ACATV, ACATO and NNLS [Non-Negative Least Square test]; New GxE and GxG interaction testing functionality; New conditional analysis functionality; see release page for minor additions)

For past releases, see here.