Scripts and data used in analyses for "An amino acid motif in HLA-DRβ1 distinguishes patients with uveitis in juvenile idiopathic arthritis." (Preprint: https://www.biorxiv.org/content/early/2017/08/14/140954)
This repository contains:
Summary-level datasets are available here:
Genome-wide data
- Phase 1 genome-wide association summary-level results: https://doi.org/10.5281/zenodo.1048977
- Phase 2 genome-wide association summary-level results: https://doi.org/10.5281/zenodo.1048979
MHC-specific data
- MHC-wide association results comparing uveitis-JIA cases to non-uveitis JIA samples: https://doi.org/10.5281/zenodo.1049020
- MHC-wide association results (uveitis-JIA cases vs. non-uveitis JIA samples), conditioning on aspartic acid (D) at position 11 in HLA-DRB1: https://doi.org/10.5281/zenodo.1049023
- MHC-wide association results (uveitis-JIA cases vs. non-uveitis JIA samples), conditioning on serine (S) at position 11 in HLA-DRB1: https://doi.org/10.5281/zenodo.1049025
MHC-specific data, split by sex
- MHC-wide association results (uveitis-JIA cases vs. non-uveitis JIA samples) in female samples only: https://doi.org/10.5281/zenodo.1049027
- MHC-wide association results (uveitis-JIA cases vs. non-uveitis JIA samples) in male samples only: https://doi.org/10.5281/zenodo.1049031
This text file can be useful for splitting up data into 5Mb windows or running imputation/GWAS in 5Mb chunks. The file contains:
- The chromosome number
- The starting position (in Mb) on that chromosome
- The ending position (in Mb) on that chromosome
- The total number of 5Mb chunks on that chromosome.
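As a rough illustration of the four columns listed above, the sketch below prints rows in that layout for a single chromosome. It is not the script used to build the file in this repository: the function name `make_chunks`, the 0 Mb starting coordinate, the space-separated output, and the capping of the last window at the chromosome length are all assumptions made for the example.

```shell
# Print one row per window: chromosome, start (Mb), end (Mb), total windows.
# Assumptions (not taken from the repository): windows start at 0 Mb, columns
# are space-separated, and the chromosome length is given in whole Mb.
make_chunks() {
    chrom=$1
    length_mb=$2
    size_mb=${3:-5}   # window size in Mb; defaults to 5

    # Ceiling division: number of windows needed to cover the chromosome.
    total=$(( (length_mb + size_mb - 1) / size_mb ))

    i=0
    while [ "$i" -lt "$total" ]; do
        start=$(( i * size_mb ))
        end=$(( start + size_mb ))
        # Cap the final window at the end of the chromosome.
        if [ "$end" -gt "$length_mb" ]; then end=$length_mb; fi
        echo "$chrom $start $end $total"
        i=$(( i + 1 ))
    done
}

# Example: a hypothetical 12 Mb chromosome labelled 22 is covered by three windows.
make_chunks 22 12
```

Check the real file's delimiter and coordinate convention before relying on this layout.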
A parameters file to run a meta-analysis using METAL (https://genome.sph.umich.edu/wiki/METAL).
Usage:
/path/to/metal metal.mhc.txt
A bash script containing a series of PLINK commands used to run sample and variant QC.
Please see the Afib-Stroke-Overlap repository for this script. It uses EIGENSTRAT (https://www.hsph.harvard.edu/alkes-price/eigensoft-frequently-asked-questions/) to run principal component analysis (PCA), either with a reference dataset or with your own samples only.
Usage:
./run_smartPCA.sh data_prefix
A bash script to call SHAPEIT, which will phase your data before imputation.
Usage:
./prephase.sh chromosome
A bash script to call IMPUTE2, which will impute your phased data (produced by SHAPEIT)
Usage:
./impute.sh chromosome window_start window_stop
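Because the imputation script takes one window at a time, a small wrapper can help when there are many windows. The sketch below is a dry run that only prints the `./impute.sh` calls for each row of a chunk file. It assumes the file has whitespace-separated columns in the order chromosome, start, end, total, and that these coordinates are in the units `impute.sh` expects. Neither assumption is confirmed by this repository, and `print_impute_commands` is a hypothetical helper name.

```shell
# Dry run: print one impute.sh call per row of a chunk file.
# Assumptions (not confirmed by the repository): columns are whitespace-separated
# in the order chromosome, start, end, total, and the start/end values are in the
# units that impute.sh expects.
print_impute_commands() {
    while read -r chrom start end total; do
        # Skip blank lines.
        [ -z "$chrom" ] && continue
        echo "./impute.sh $chrom $start $end"
    done < "$1"
}

# Usage (the file name is a placeholder):
#   print_impute_commands /path/to/chunk_file.txt
```

Printing the commands first makes it easy to inspect them, or to hand them to a job scheduler, before any imputation is actually run.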
A bash script to call IMPUTE2, which will impute your phased data (produced by SHAPEIT) specifically for the X chromosome
Usage:
./impute.sh chromosome window_start window_stop
A bash script that points PLINK to the imputed data and runs a genome-wide association study.
Usage:
./run_gwas.sh chromosome phenotype window_start window_stop
A script for running the SNP2HLA imputation pipeline. Script authored by Xiaoming Jia.
Usage:
./SNP2HLA.csh DATA (.bed/.bim/.fam) REFERENCE (.bgl.phased/.markers) OUTPUT plink {optional: java_max_memory[mb] marker_window_size}
DATA: bim/bed/fam files of your data
REFERENCE: your reference dataset (here, provided by the Type 1 Diabetes Genetics Consortium)
OUTPUT: name of your output files
A bash script to run the HLA imputation.
Usage:
./run_imputeHLA.sh input reference_data
A script to run association testing (using PLINK) on the imputed MHC data.
Code used for calculating likelihood ratio tests in the uveitis data (requires individual-level data).
Code used for testing the interaction of sex and the HLA-DRB1 serine-11 genotype (requires individual-level data) and for plotting association results in male vs. female samples (requires summary-level data; please see above for links).