Trans-ethnic fine-mapping of the MHC region using GWAS summary statistics

This repository contains the source code used in our study of trans-ethnic MHC fine-mapping of Parkinson's disease risk.

Publication/Citation

The study is described in the following paper.

Naito, T. et al. Trans‐Ethnic Fine‐Mapping of the Major Histocompatibility Complex Region Linked to Parkinson's Disease. Mov. Disord. (2021). doi.org/10.1002/mds.28583

Please cite this paper if you use any material in this repository.

Requirements

Python 3.x (3.7.4) with the following modules.
- Numpy (1.17.2)
- Pandas (0.25.1)
- Scipy (1.3.1)
- Argparse (1.4.0)
  
  Our scripts were tested with the versions in parentheses; we do not guarantee that they will work with other versions.
R
Plink [1]
DISH [2]
GCTA-COJO [3]

Installation

Just clone this repository as follows.

git clone https://github.com/tatsuhikonaito/Trans-ethnic_MHC_finemapping_SS
cd ./Trans-ethnic_MHC_finemapping_SS

Process

0. Preparation

Access to the GWAS summary statistics and HLA imputation reference panels (Plink binary format) used in our study is described in the paper.

1. Fine-mapping in the MHC region using summary statistics

Run DISH to perform MHC fine-mapping for post-QC summary statistics data of each study.

$ Rscript DISH.r <summary filename> T hg19 EUR 0.005 T <output filename> 0.05

Inputs

<summary filename>

Filename of the GWAS summary statistics to impute.

Outputs

<output filename>

Filename of DISH output of HLA imputation.

2. Sample size-based meta-analysis of Z-score

Perform sample size-based meta-analysis of Z-scores using the imputed summary statistics files as follows.

$ python zcore_metaanalysis.py --dishfile-list DISHFILE_LIST.txt

Inputs

DISHFILE_LIST.txt

The first column is the DISH output filename of each study; the second and third columns are its numbers of cases and controls.

e.g.)

study_1.dish.txt 1000 10000

study_2.dish.txt 2000 10000

Outputs

metaanalysis_result.txt

The file lists the Z-scores of the individual studies, followed by the Z-score and P-value of the meta-analysis in the second-to-last and last columns, respectively.
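The idea behind a sample size-based meta-analysis of Z-scores can be sketched as below. This is a minimal illustration, not the code of the meta-analysis script: the function names are hypothetical, and weighting each study by the square root of its effective sample size, 4 / (1/cases + 1/controls), is an assumption (one common choice for case-control studies). The weights used by the script may differ.

```python
import math


def effective_sample_size(n_cases, n_controls):
    """Effective sample size of a case-control study (assumed weighting basis)."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


def meta_analyze_z(z_scores, sample_sizes):
    """Combine per-study Z-scores with weights proportional to sqrt(sample size)."""
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z_meta = numerator / denominator
    # Two-sided P-value under the standard normal distribution.
    p_meta = math.erfc(abs(z_meta) / math.sqrt(2.0))
    return z_meta, p_meta


# Case/control counts follow the DISHFILE_LIST.txt example above;
# the two Z-scores are made-up values for a single variant.
n_eff = [effective_sample_size(1000, 10000), effective_sample_size(2000, 10000)]
z_meta, p_meta = meta_analyze_z([2.0, 3.0], n_eff)
print(z_meta, p_meta)
```

Only Z-scores and sample sizes are needed, so studies can be combined even when effect sizes are not on a comparable scale.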

3. Summary statistics conditional analysis

Perform conditional analysis of the imputed summary statistics of each study using GCTA-COJO as follows.

$ python ss_conditional_analysis.py --dishfile-list DISHFILE_LIST.txt --ref-list REF_LIST.txt --allele-list ALLELE_LIST.txt

Inputs

DISHFILE_LIST.txt

Described above.

REF_LIST.txt

The filenames (prefix) of HLA imputation reference panels corresponding to individual studies in the same order as DISHFILE_LIST.txt.

e.g.)

T1DGC_REF

PAN-Asian_REF

ALLELE_LIST.txt

The first column lists the names of the alleles to condition on.

e.g.)

AA_DRB1_13_32660109_R AA_DRB1_13_32660109_H

Outputs

GCTA-COJO output files

Prefix of filename of each study + ".cma.cojo".
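Because DISHFILE_LIST.txt and REF_LIST.txt are matched by line order, a quick consistency check before running the conditional analysis can catch misaligned lists. The sketch below is a hypothetical helper, not part of this repository; it assumes whitespace-separated columns and ignores blank lines.

```python
def parse_dishfile_list(lines):
    """Parse lines of the form '<DISH file> <cases> <controls>'."""
    studies = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 3:
            raise ValueError("expected 3 columns, got: %r" % line)
        filename, n_cases, n_controls = fields
        studies.append((filename, int(n_cases), int(n_controls)))
    return studies


def pair_with_reference_panels(dish_lines, ref_lines):
    """Pair each study with its reference panel prefix by line order."""
    studies = parse_dishfile_list(dish_lines)
    refs = [line.strip() for line in ref_lines if line.strip()]
    if len(studies) != len(refs):
        raise ValueError(
            "%d studies but %d reference panels" % (len(studies), len(refs))
        )
    return [(study[0], ref) for study, ref in zip(studies, refs)]


# Contents taken from the examples above.
pairs = pair_with_reference_panels(
    ["study_1.dish.txt 1000 10000", "study_2.dish.txt 2000 10000"],
    ["T1DGC_REF", "PAN-Asian_REF"],
)
for dish_file, ref_prefix in pairs:
    print(dish_file, "->", ref_prefix)
```

In practice the two line lists would come from reading the list files, for example with `open("DISHFILE_LIST.txt").read().splitlines()`.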

Conversion of file formats

Convert GCTA-COJO output file format to DISH output file format.

This enables an iterative process of Z-score meta-analysis and conditional analysis.

$ python cojo_to_dish.py --cojofile-list COJOFILE_LIST.txt --ref-list REF_LIST.txt

Inputs

COJOFILE_LIST.txt

The filenames of the GCTA-COJO output files of individual studies, in the same order as DISHFILE_LIST.txt.

e.g.)

study_1.cma.cojo

study_2.cma.cojo

REF_LIST.txt

Described above.

Outputs

DISH output files

Prefix of filename of each study + ".dish.txt".
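One round of the iterative process can be sketched as a sequence of the three commands shown in this README. The helper names below are hypothetical, and the sketch only assembles and prints the commands. It assumes that, between rounds, ALLELE_LIST.txt is updated by hand with the allele(s) to condition on and that the next round's DISHFILE_LIST.txt points to the converted ".dish.txt" files; how those choices are made is not covered here.

```python
import subprocess


def round_commands(dishfile_list, ref_list, allele_list, cojofile_list):
    """Return the commands of one round: meta-analysis, conditioning, conversion."""
    return [
        ["python", "zcore_metaanalysis.py", "--dishfile-list", dishfile_list],
        ["python", "ss_conditional_analysis.py",
         "--dishfile-list", dishfile_list,
         "--ref-list", ref_list,
         "--allele-list", allele_list],
        ["python", "cojo_to_dish.py",
         "--cojofile-list", cojofile_list,
         "--ref-list", ref_list],
    ]


def run_round(dishfile_list, ref_list, allele_list, cojofile_list):
    """Run one round, stopping at the first command that fails."""
    for command in round_commands(dishfile_list, ref_list, allele_list, cojofile_list):
        subprocess.run(command, check=True)


# Print the commands instead of running them.
for command in round_commands("DISHFILE_LIST.txt", "REF_LIST.txt",
                              "ALLELE_LIST.txt", "COJOFILE_LIST.txt"):
    print(" ".join(command))
```

In a real run the meta-analysis result would be inspected after the first command, before deciding which alleles to add to ALLELE_LIST.txt, so running the three steps one at a time may be preferable to `run_round`.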

References

[1] Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007;81(3):559–575.

[2] Lim J, Bae S-C, Kim K. Understanding HLA associations from SNP summary association statistics. Sci. Rep. 2019;9(1):1337.

[3] Yang J, Ferreira T, Morris AP, et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 2012;44(4):369–375.

Contact

For any questions, please contact Tatsuhiko Naito (tnaito@sg.med.osaka-u.ac.jp).
