Integrating disease and drug-related phenotypes for improved identification of pharmacogenomic variants
Paper published in Pharmacogenomics (https://www.futuremedicine.com/doi/abs/10.2217/pgs-2020-0130)
Links to the data used for discovery and replication:
Pharmacogenomics Research Network
wget https://www.pgrn.org/uploads/1/0/7/8/107807723/wenc_eur_sumstats_a2.txt_1.zip
wget https://www.pgrn.org/uploads/1/0/7/8/107807723/wheeler_et_al_ccr_2017_sum_stats.txt_5.zip
wget https://www.pgrn.org/uploads/1/0/7/8/107807723/meta_w_h_b_n1.tbl_1.zip
wget https://www.pgrn.org/uploads/1/0/7/8/107807723/adv_adno_assoc.assoc.txt.zip
[ Files in order of (1) allopurinol response, (2) cisplatin ototoxicity, (3) anti-hypertensive induced new-onset diabetes, and (4) celecoxib prevention of colorectal adenoma ]
UK Biobank
wget "https://www.dropbox.com/s/k5o6xn6uw1hvwpm/K50.gwas.imputed_v3.both_sexes.tsv.bgz?dl=0" -O K50.gwas.imputed_v3.both_sexes.tsv.bgz
wget "https://www.dropbox.com/s/x060zthr9agkp2h/J33.gwas.imputed_v3.both_sexes.tsv.bgz?dl=0" -O J33.gwas.imputed_v3.both_sexes.tsv.bgz
wget "https://www.dropbox.com/s/p697in77wsizlvz/R31.gwas.imputed_v3.both_sexes.tsv.bgz?dl=0" -O R31.gwas.imputed_v3.both_sexes.tsv.bgz
wget "https://www.dropbox.com/s/7l26kh3kfduu7gh/M10.gwas.imputed_v3.both_sexes.tsv.bgz?dl=0" -O M10.gwas.imputed_v3.both_sexes.tsv.bgz
[ Files in order of (1) Crohn's disease (K50), (2) nasal polyps (J33), (3) unspecified hematuria (R31), and (4) ]
EBI-GWAS Catalog
wget ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/KottgenA_23263486_GCST001791/GUGC_MetaAnalysis_Results_UA.csv.zip
wget ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/KottgenA_23263486_GCST001790/GUGC_MetaAnalysis_Results_Gout.csv.zip
wget ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/deLangeKM_28067908_GCST004131/ibd_build37_59957_20161107.txt.gz
wget ftp://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/deLangeKM_28067908_GCST004132/cd_build37_40266_20161107.txt.gz
[ Files in order of (1) serum urate Kottgen, (2) gout Kottgen, (3) inflammatory bowel disease de Lange, and (4) Crohn's disease de Lange ]
Step One
- Every GWAS summary statistics file must contain SNP, PVAL, CHR, POS, and N columns.
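
A quick header check can catch a missing column before MultiZoom is run. The sketch below is not part of MultiZoom: the function name missing_columns is hypothetical, and it assumes tab-delimited files whose first line is the header.

```shell
# Hypothetical helper (not part of MultiZoom): print the required columns
# that are absent from the header of a summary statistics file.
# Assumes a tab-delimited file with the column names on the first line.
missing_columns() {
    local file="$1" header col missing=""
    case "$file" in
        # gzip can also read bgzip-compressed (.bgz) files
        *.gz|*.bgz) header=$(gzip -cd < "$file" | head -n 1) ;;
        *)          header=$(head -n 1 "$file") ;;
    esac
    for col in SNP PVAL CHR POS N; do
        # Split the header into one name per line and match whole names only
        if ! printf '%s\n' "$header" | tr '\t' '\n' | grep -qx "$col"; then
            missing="$missing $col"
        fi
    done
    # Empty output means every required column was found
    printf '%s\n' "${missing# }"
}

# Usage (file name is a placeholder):
#   missing_columns my_gwas_sumstats.txt
```

Files that use other column names or a different delimiter would need to be renamed or converted first.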
Step Two
- Prepare an information (INFO) file with the columns FILENAME, STUDY, FILTER, CHR, POS_MIN, POS_MAX, TYPE, SDY, N_CASE, S
- FILENAME is the name of the GWAS summary statistic file
- STUDY is a simpler name for the study
- FILTER is used to group specific GWAS together for analysis; base the grouping on the SNPs shared across all GWAS being compared
- CHR is the chromosome on which to run the colocalization analysis
- POS_MIN and POS_MAX define the region in which to examine colocalization; we recommend +/-100 kb around the top SNP shared by a group of traits
- TYPE must be either "quant" or "cc" to indicate study type; required for colocalization
- SDY is the population standard deviation of the quantitative trait; NA if TYPE is "cc" or if the value is unavailable
- N_CASE is the number of cases for binary/case-control traits; NA if TYPE is "quant"
- S is the proportion of samples that are cases (i.e. N_CASE / total N); required if TYPE is "cc", otherwise NA
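
As an illustration of the columns described above, the following sketch writes a two-row INFO file. Every value in it is a placeholder (file names, study names, the FILTER label, chromosome, position, and sample sizes are invented for illustration and are not taken from the paper), and the tab-delimited layout is an assumption based on the .tsv extension used in the example command.

```shell
# All values below are placeholders for illustration only.
top_snp_pos=1000000   # hypothetical position of the top shared SNP
window=100000         # +/-100 kb around that SNP
pos_min=$(( top_snp_pos - window ))
pos_max=$(( top_snp_pos + window ))

# S = N_CASE / total N for the case-control study
n_case=2000
n_total=10000
s=$(awk -v c="$n_case" -v n="$n_total" 'BEGIN { printf "%.4f", c / n }')

# One header row, one quantitative trait, one case-control trait
{
    printf 'FILENAME\tSTUDY\tFILTER\tCHR\tPOS_MIN\tPOS_MAX\tTYPE\tSDY\tN_CASE\tS\n'
    printf 'trait_quant.txt\tQuantTrait\tgroup1\t4\t%s\t%s\tquant\tNA\tNA\tNA\n' "$pos_min" "$pos_max"
    printf 'trait_cc.txt\tBinaryTrait\tgroup1\t4\t%s\t%s\tcc\tNA\t%s\t%s\n' "$pos_min" "$pos_max" "$n_case" "$s"
} > INFOFILE.tsv
```

Both rows share the same FILTER label, CHR, and position window so that the two traits are analyzed together over one region.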
Step Three
- Run MultiZoom from the command line as follows:
Rscript MultiZoom.R [~/Directory/of/GWAS/Data] [~/Link/to/INFO/File] [hg19 or hg38] [~/Output/Directory]
Example:
Rscript MultiZoom.R "~/Desktop" "~/Desktop/INFOFILE.tsv" hg19 "~/Desktop/OutputFolder"
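
Before launching a run, it can help to confirm that every file named in the INFO file is present in the GWAS data directory. The sketch below is a hypothetical pre-flight check, not part of MultiZoom; it assumes a tab-delimited INFO file with FILENAME as the first column and file names given relative to the data directory.

```shell
# Hypothetical pre-flight check (not part of MultiZoom): print each FILENAME
# entry from the INFO file that cannot be found in the GWAS data directory.
# Assumes a tab-delimited INFO file whose first column is FILENAME.
list_missing_files() {
    local data_dir="$1" info_file="$2" name
    # Skip the header row, keep the first column, and test each entry;
    # the extra -n test handles a final line without a trailing newline
    tail -n +2 "$info_file" | cut -f 1 | while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue
        [ -e "$data_dir/$name" ] || printf '%s\n' "$name"
    done
}

# Usage, with the paths from the example above:
#   list_missing_files "$HOME/Desktop" "$HOME/Desktop/INFOFILE.tsv"
```

No output means that every listed file was found.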
Having trouble? Please contact t.ouellette@mail.utoronto.ca