The R package HSS
implements two methods LD score regression
and
XPASS
to estimate heritability only using summary statistics data.
You can install the development version of HSS from GitHub with:
# install.packages("devtools")
devtools::install_github("statwangz/HSS")
Here is a basic example:
# Download data
# Download BMI summary statistics from https://atlas.ctglab.nl/
# Download HapMap 3 data from Broad Institute
# wget https://data.broadinstitute.org/alkesgroup/LDSCORE/eur_w_ld_chr.tar.bz2
# Download LD score data from Broad Institute
# wget https://data.broadinstitute.org/alkesgroup/LDSCORE/w_hm3.snplist.bz2
# Download 1000 Genomes data from https://www.cog-genomics.org/plink/2.0/resources
library(HSS)
file_sumstats <- "f.23104.0.0_res.EUR.sumstats.MACfilt.txt"
# read in HapMap 3 data
w_hm3.snplist <- readr::read_delim("w_hm3.snplist.bz2", "\t",
escape_double = F, trim_ws = T, progress = T)
# load MHC region SNPs, CHR = 6 & BP > 28000000 & BP < 34000000
data(snps_mhc)
# format BMI summary statistics data
BMI_sumstats <- format_sumstats(file_sumstats,
snps_merge = w_hm3.snplist,
snps_mhc = snps_mhc,
snp_col = "SNPID_UKB",
beta_col = "BETA",
se_col = "SE",
freq_col = "MAF_UKB",
a1_col = "A1",
a2_col = "A2",
p_col = "P",
n_col = "NMISS",
info_col = "INFO_UKB")
head(BMI_sumstats, n = 5)
# format LD score data
file_ldsc <- "eur_w_ld_chr"
ldsc <- format_ldsc(file_ldsc)
head(ldsc, n = 5)
# LD score regression
ldsc_fit(BMI_sumstats, ldsc)$h2
# format reference data
file_ref <- "1000G.EUR.QC.hm3.ind"
xpass_data <- format_ref(file_ref, BMI_sumstats)
z <- xpass_data$z
X <- xpass_data$X
# XPASS
xpass(z = pull(z, Z), X = X, n = pull(z, N))$h2
Please see here for more examples which analyze other traits.
The HSS
package is developed by Zhiwei Wang
(zhiwei.wang@connect.ust.hk).
Please feel free to contact Zhiwei Wang (zhiwei.wang@connect.ust.hk) if any inquiries.
Bulik-Sullivan, B.K., Loh, P.-R., Finucane, H.K., Ripke, S., Yang, J., Patterson, N., Daly, M.J., Price, A.L., Neale, B.M., 2015. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet 47, 291–295. https://doi.org/10.1038/ng.3211
Cai, M., Xiao, J., Zhang, S., Wan, X., Zhao, H., Chen, G., Yang, C., 2021. A unified framework for cross-population trait prediction by leveraging the genetic correlation of polygenic traits. The American Journal of Human Genetics 108, 632–655. https://doi.org/10.1016/j.ajhg.2021.03.002
Hu, X., Zhao, J., Lin, Z., Wang, Y., Peng, H., Zhao, H., Wan, X., Yang, C., 2021. MR-APSS: a unified approach to Mendelian Randomization accounting for pleiotropy and sample structure using genome-wide summary statistics. https://doi.org/10.1101/2021.03.11.434915