SpliceAI-lookup

Website for checking SpliceAI and Pangolin scores: spliceailookup.broadinstitute.org


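This repo contains the front-end code for spliceailookup.broadinstitute.org and the server-side code for the SpliceAI and Pangolin APIs.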

The SpliceAI and Pangolin APIs are available at the following URLs:

https://spliceai-37-xwkwwwxdwq-uc.a.run.app - SpliceAI for variants on GRCh37
https://spliceai-38-xwkwwwxdwq-uc.a.run.app - SpliceAI for variants on GRCh38
https://pangolin-37-xwkwwwxdwq-uc.a.run.app - Pangolin for variants on GRCh37
https://pangolin-38-xwkwwwxdwq-uc.a.run.app - Pangolin for variants on GRCh38

WARNING: The APIs are intended for interactive use only and do not support more than several requests per user per minute. To process many variants in batch, please install and run the underlying models directly on your local infrastructure. Their source code is available at https://github.com/bw2/SpliceAI and https://github.com/bw2/Pangolin.
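
For scripts that make occasional calls, a small client-side throttle can help stay within the request limit described above. The sketch below is one possible approach, not part of this repo. The class name `MinIntervalThrottle` and the 20-second spacing in the usage note are assumptions, since the exact limit is not specified here.

```python
import time


class MinIntervalThrottle:
    """Keep consecutive calls at least `min_interval` seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_call = None   # time of the previous call, if any

    def wait(self):
        """Sleep if the previous call was too recent. Returns the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self.min_interval - (now - self._last_call))
        if delay > 0:
            self._sleep(delay)
        self._last_call = now + delay
        return delay
```

Typical use would be to create `throttle = MinIntervalThrottle(20.0)` once (roughly three requests per minute, an assumed value) and call `throttle.wait()` before each API request.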

To query the API, select the appropriate base URL above, and then use the following endpoints and arguments:

/spliceai/?hg=38&distance=50&variant=chr8-140300616-T-G

Get SpliceAI scores for the given variant.

  • variant (required) a variant in the format "chrom-pos-ref-alt"
  • hg (required) can be 37 or 38
  • distance (optional) distance parameter of SpliceAI model (default: 50)
  • mask (optional) either 0 for raw scores or 1 for masked scores (default: 0). Splicing changes that strengthen annotated splice sites or weaken unannotated splice sites are typically much less pathogenic than changes that weaken annotated splice sites or strengthen unannotated ones. When mask is 1, the delta scores of the former kind of splicing changes are set to 0. The SpliceAI developers recommend raw scores (0) for alternative splicing analysis and masked scores (1) for variant interpretation.
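
The masking rule described for the mask parameter can be illustrated with a short sketch. This only restates the rule in code and is not the SpliceAI implementation. The function name `mask_delta_score` and its arguments are hypothetical.

```python
def mask_delta_score(delta_score, change, site_is_annotated):
    """Apply the masking rule to a single delta score (illustration only).

    change: "gain" if the splice site is strengthened, "loss" if it is weakened.
    site_is_annotated: True if the affected position is an annotated splice site.
    """
    if change not in ("gain", "loss"):
        raise ValueError("change must be 'gain' or 'loss'")

    strengthens_annotated = change == "gain" and site_is_annotated
    weakens_unannotated = change == "loss" and not site_is_annotated

    # These two kinds of changes are the ones that masking sets to 0.
    if strengthens_annotated or weakens_unannotated:
        return 0.0
    return delta_score
```

With masking applied, a loss at an annotated site or a gain at an unannotated site keeps its score, while the other two combinations become 0.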

/pangolin/?hg=38&distance=50&variant=chr8-140300616-T-G

Get Pangolin scores for the given variant.

  • variant (required) a variant in the format "chrom-pos-ref-alt"
  • hg (required) can be 37 or 38
  • distance (optional) distance parameter of Pangolin model (default: 50)
  • mask (optional) either 0 for raw scores or 1 for masked scores (default: 0). Splicing changes that strengthen annotated splice sites or weaken unannotated splice sites are typically much less pathogenic than changes that weaken annotated splice sites or strengthen unannotated ones. When mask is 1, the delta scores of the former kind of splicing changes are set to 0. The SpliceAI developers recommend raw scores (0) for alternative splicing analysis and masked scores (1) for variant interpretation.
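
Putting the base URLs, endpoints, and arguments together, a request URL could be assembled as in the sketch below. The helper names `build_query_url` and `fetch_scores` are hypothetical. The response format is not described in this README, so `fetch_scores` returns the raw response text rather than assuming a structure.

```python
import urllib.parse
import urllib.request

# Base URLs as listed in this README, keyed by (tool, genome build).
BASE_URLS = {
    ("spliceai", 37): "https://spliceai-37-xwkwwwxdwq-uc.a.run.app",
    ("spliceai", 38): "https://spliceai-38-xwkwwwxdwq-uc.a.run.app",
    ("pangolin", 37): "https://pangolin-37-xwkwwwxdwq-uc.a.run.app",
    ("pangolin", 38): "https://pangolin-38-xwkwwwxdwq-uc.a.run.app",
}


def build_query_url(tool, variant, hg, distance=50, mask=0):
    """Build the request URL for one variant in "chrom-pos-ref-alt" format."""
    if (tool, hg) not in BASE_URLS:
        raise ValueError("tool must be 'spliceai' or 'pangolin' and hg must be 37 or 38")
    if mask not in (0, 1):
        raise ValueError("mask must be 0 (raw) or 1 (masked)")
    query = urllib.parse.urlencode(
        {"hg": hg, "distance": distance, "mask": mask, "variant": variant}
    )
    return f"{BASE_URLS[(tool, hg)]}/{tool}/?{query}"


def fetch_scores(tool, variant, hg, distance=50, mask=0, timeout=60):
    """Send a single request and return the raw response body as text."""
    url = build_query_url(tool, variant, hg, distance=distance, mask=mask)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `build_query_url("spliceai", "chr8-140300616-T-G", 38)` produces a URL equivalent to the SpliceAI example above with `mask=0` made explicit. Because the APIs are for interactive use only, `fetch_scores` is meant for single lookups and not for loops over many variants.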

Local Install

The steps below describe how to install the API server on your local infrastructure. The details will vary depending on your OS, etc. If you run into issues, please submit them to the issue tracker.

  1. Install PyTorch as described in the Pangolin installation docs
  2. Install the modified versions of SpliceAI and Pangolin from https://github.com/bw2/SpliceAI and https://github.com/bw2/Pangolin
  3. Install and start a Redis server. It caches previously computed API server responses so that they don't have to be computed again.
  4. Download the reference FASTA files: hg19.fa and hg38.fa
  5. Generate annotation files using the steps in the annotations README.
  6. Start the API server on localhost port 8080. To modify server options, edit the start_local_server.sh script:
     $ git clone git@github.com:broadinstitute/SpliceAI-lookup.git  # clone this repo
     $ cd SpliceAI-lookup
     $ python3 -m pip install -r requirements.txt  # install python dependencies
     $ ./start_local_server.sh

The server uses ~1.5 GB of RAM per server thread.
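
The role of the Redis cache in step 3 can be sketched as a simple "check the cache, else compute and store" pattern. This is an illustration under stated assumptions, not the code in server.py. The function `cached_response` and the stand-in class `InMemoryCache` are hypothetical. The sketch relies only on `get(key)` and `set(key, value)`, which a `redis.Redis` client also provides.

```python
import json


def cached_response(cache, key, compute):
    """Return the cached result for `key`, computing and storing it on a miss.

    cache: any object with get(key) and set(key, value), such as a redis.Redis client.
    compute: zero-argument function that produces a JSON-serializable result.
    """
    cached = cache.get(key)
    if cached is not None:
        # Redis returns bytes; json.loads accepts both bytes and str.
        return json.loads(cached)
    result = compute()
    cache.set(key, json.dumps(result))
    return result


class InMemoryCache:
    """Dictionary-backed stand-in for trying the pattern without a Redis server."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value
```

A cache key would need to include every argument that affects the result (for example tool, hg, distance, mask, and variant) so that different queries never share an entry.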


For Developers

The spliceailookup.broadinstitute.org front-end is contained within index.html. It uses ES6 JavaScript with Semantic UI and jQuery, along with a custom version of igv.js that includes new track types for visualizing the SpliceAI and Pangolin scores.

The original server-side code is in server.py and uses the Flask library. It is designed to run on a plain Linux or macOS machine.

The new server-side code is in the google_cloud_run_services/ subdirectory and includes Dockerfiles and scripts for deploying the SpliceAI and Pangolin API services to Google Cloud Run.