Anaconda Contributors Forks Stargazers Issues MIT License LinkedIn


GWAS GENEIE (GWAS Gene-Integrated Explorer)

Colocalize eQTL and GWAS hits from any study in openGWAS.



Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. License
  5. Contact
  6. Acknowledgments

About The Project

GWAS GENEIE allows researchers to colocalize eQTL hits from eQTLGen with GWAS hits from any study in openGWAS. It operates under the assumption that some GWAS and eQTL hits in close proximity may be the same signal, which links phenotypes to gene expression and enables the development of transcriptional risk scores (TRS) and polygenic-predicted transcriptional risk scores (PP-TRS) for these phenotypes. As input, the project takes any study ID from openGWAS together with eQTL summary statistics from eQTLGen, and runs the coloc package's coloc.abf() function on them. The project has been implemented as an app using whole-blood eQTL summary statistics, allowing users to download a list of GWAS and eQTL hits found to be the same signal. It was originally developed as a class project for BIOL8803 at Georgia Tech in Fall 2022.
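
To make the "same signal" idea concrete, the following is a minimal Python sketch of the approximate Bayes factor (ABF) calculation that coloc.abf() is based on. It is an illustration only, not the project's code and not the coloc implementation: the function names, the toy effect sizes, and the prior standard deviations (0.2 for the GWAS, 0.15 for the eQTL) are assumptions chosen for the example, and the prior probabilities p1, p2, and p12 are set to the values documented as coloc's defaults. The five hypotheses are H0 (no association), H1 (GWAS only), H2 (eQTL only), H3 (both traits, different causal variants), and H4 (both traits, one shared causal variant).

```python
import math


def log_sum(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_diff(a, b):
    """Numerically stable log(exp(a) - exp(b)); requires a > b."""
    return a + math.log1p(-math.exp(b - a))


def log_abf(beta, se, prior_sd):
    """Wakefield's log approximate Bayes factor for a single SNP."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)  # shrinkage factor
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] for one region.

    labf1 and labf2 are per-SNP log ABFs for the two traits, in the same
    SNP order. At least two SNPs are needed, otherwise H3 is undefined.
    """
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = log_sum(labf1), log_sum(labf2), log_sum(both)
    log_h = [
        0.0,                                                    # H0
        math.log(p1) + s1,                                      # H1
        math.log(p2) + s2,                                      # H2
        math.log(p1) + math.log(p2) + log_diff(s1 + s2, s12),   # H3
        math.log(p12) + s12,                                    # H4
    ]
    total = log_sum(log_h)
    return [math.exp(h - total) for h in log_h]


def region_labfs(betas, se, prior_sd):
    """Log ABFs for a list of effect sizes sharing one standard error."""
    return [log_abf(b, se, prior_sd) for b in betas]


# Toy data (hypothetical): five SNPs in one region.
gwas_shared = region_labfs([0.01, 0.02, 0.12, 0.03, 0.00], 0.02, 0.2)
eqtl_shared = region_labfs([0.02, 0.01, 0.20, 0.04, 0.01], 0.03, 0.15)
shared = coloc_posteriors(gwas_shared, eqtl_shared)

# Same eQTL strength, but the GWAS and eQTL peaks sit on different SNPs.
gwas_other = region_labfs([0.01, 0.12, 0.02, 0.01, 0.00], 0.02, 0.2)
eqtl_other = region_labfs([0.01, 0.02, 0.03, 0.20, 0.01], 0.03, 0.15)
distinct = coloc_posteriors(gwas_other, eqtl_other)

for name, pp in (("shared peak", shared), ("distinct peaks", distinct)):
    print(name, [round(p, 4) for p in pp])
```

In the first toy region both traits peak at the same SNP, so almost all posterior mass should fall on H4; in the second the peaks are on different SNPs, so H3 should dominate. The real coloc.abf() additionally handles trait types, sample sizes, and allele frequencies, which this sketch leaves out.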


Getting Started

Prerequisites

  • anaconda/mamba (tested only on Ubuntu 22.04.1 LTS under WSL2; behavior may differ on Windows, macOS, or native Linux)

Installation

  1. Clone the repo
    git clone https://github.com/knishiura3/BIOL8803_Transcriptional_Risk_Score
    
  2. Create a conda environment using the yaml file provided
    conda env create -f /path/to/<env yml file>
    
  3. Open R within the conda environment and install the R packages (this includes the packages needed to use the R kernel in Jupyter notebooks)
  • R
    install.packages(c("devtools", "shiny", "shinythemes", "shinycssloaders", "DT", "slickR", 
                   "duckdb", "fs", "tidyverse", "DBI", "glue", "dplyr", "coloc", "ggplot2", "httpgd"))
    devtools::install_github("jrs95/gassocplot")
    devtools::install_github("mrcieu/gwasglue")
    devtools::install_github("IRkernel/IRkernel")
    IRkernel::installspec()
    


Usage

  1. Navigate to the cloned GWAS GENEIE directory.
  2. Download the LD reference panel and rename its PLINK files to the EUR prefix.
    wget http://fileserve.mrcieu.ac.uk/ld/data_maf0.01_rs_ref.tgz -P ld; tar -zxvf ld/* -C ld/; rm ld/*tgz
    mv ld/data_maf0.01_rs_ref.bed ld/EUR.bed
    mv ld/data_maf0.01_rs_ref.bim ld/EUR.bim
    mv ld/data_maf0.01_rs_ref.fam ld/EUR.fam
    
  3. Download the MAF information and build its parquet.
    wget https://molgenis26.gcc.rug.nl/downloads/eqtlgen/cis-eqtl/2018-07-18_SNP_AF_for_AlleleB_combined_allele_counts_and_MAF_pos_added.txt.gz -P data/eqtl_MAF; gunzip data/eqtl_MAF/*
    Rscript MAF_build_parquet.R
    
  4. Download the eQTLGen summary statistics and build their parquet.
    wget https://molgenis26.gcc.rug.nl/downloads/eqtlgen/cis-eqtl/2019-12-11-cis-eQTLsFDR-ProbeLevel-CohortInfoRemoved-BonferroniAdded.txt.gz; gunzip 2019-12-11-cis-eQTLsFDR-ProbeLevel-CohortInfoRemoved-BonferroniAdded.txt.gz
    python3 eQTL_build_parquet.py 2019-12-11-cis-eQTLsFDR-ProbeLevel-CohortInfoRemoved-BonferroniAdded.txt
    rm ./2019-12-11-cis-eQTLsFDR-ProbeLevel-CohortInfoRemoved-BonferroniAdded.txt
    
  5. Query the parquet using the desired GWAS ID from openGWAS. Refer to https://github.com/knishiura3/BIOL8803_Transcriptional_Risk_Score/blob/main/demo.ipynb for usage.

A Shiny app implementation is also available at https://genapp2022.biosci.gatech.edu/team1/.

Output

  • Text (.txt) file of combined summary statistics for GWAS and eQTL hits found to be the same signal. Refer to eQTLGen and openGWAS for column descriptions.
  • PNG files of combined Manhattan and LD plots of the regions surrounding GWAS and eQTL hits found to be the same signal.


Contact

Kenji Nishiura - kenji@gatech.edu

Colin Naughton - Naughtoncolin@gmail.com


Acknowledgments

  • Many thanks to Kenji Gerhardt for his assistance with coding best practices and optimization

Team Members

  • Andy Chea
  • Colin Naughton
  • Kenji Nishiura
  • Jasmyn Pellebon
