Hail is a framework for scalable genetic data analysis. It is pre-alpha software under active development. Hail is written mostly in Scala and builds on Apache Spark and other projects in the Apache Hadoop ecosystem. If you are interested in getting involved in Hail development, email hail@broadinstitute.org.
The documentation covers the following topics:
- Building
- Representation
- Hail's Expression Language
- Importing
- Splitting Multiallelic Variants
- Renaming Samples
- Annotating Variants
- Annotating Samples
- Annotating Global
- Quality Control
- PCA
- Annotating with the Variant Effect Predictor
- Filtering
- Querying using SQL
- Linear Regression
- Mendel Errors
- Exporting to TSV
- Exporting to VCF
- Exporting to PLINK
- Persist
Here is a rough list of features currently planned or under development:
- generalized query language
- better interoperability with other Hadoop projects
- kinship estimation from the genetic relationship matrix (GRM)
- linear mixed models (LMM)
- burden tests, SKAT
- logistic regression
- dosage
- posterior (PP)
- LD pruning
- sex check
- transmission disequilibrium test (TDT)
- BGEN
- Kaitlin Samocha's de novo caller