MiSplice Supplemental Scripts

Reyka Jayasinghe (reyka@wustl.edu)
Last Edit: October 15th, 2018

Final Filtering

This script takes an intermediate output of MiSplice (https://github.com/ding-lab/misplice) and does the following:

  1. Filters out a subset of genes
  2. Combines samples that have the same mutation in the same cancer type into a single line entry
  3. Filters out SCM events for which more than 5% of controls have at least one read supporting the same SCM event
  4. Requires a minimum of 20 controls
  5. Combines mutations that are linked to the same SCM event and writes them to a separate file: CANCER.rgSCM.multiplemutations
  6. Annotates samples with genomic context information and splice score, and saves the result to: CANCER.rgSCM.filtered.txt

USAGE: python Final_Filtering.py novel.splice.scores.rc.key.combined.noHLA.vaf.highexp ACC
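
The control-based filters (steps 3 and 4) and the sample merging (step 2) can be sketched as follows. This is only an illustration of the logic described above, not the code in Final_Filtering.py: the function names, argument names, and the tuple layout of the records are hypothetical, and the real script reads these values from the MiSplice intermediate file.

```python
def passes_control_filter(n_controls, n_controls_with_event,
                          max_fraction=0.05, min_controls=20):
    """Return True if an SCM event survives the control-based filters.

    All names here are hypothetical; the thresholds mirror steps 3 and 4.
    """
    # Step 4: require a minimum number of controls.
    if n_controls < min_controls:
        return False
    # Step 3: drop the event if more than 5% of controls have
    # at least one read supporting the same SCM event.
    return n_controls_with_event / n_controls <= max_fraction


def combine_samples(records):
    """Group samples that share a mutation within a cancer type (step 2).

    `records` is an assumed iterable of (cancer, mutation, sample) tuples.
    """
    combined = {}
    for cancer, mutation, sample in records:
        combined.setdefault((cancer, mutation), []).append(sample)
    return combined


if __name__ == "__main__":
    print(passes_control_filter(100, 4))   # below the 5% cutoff
    print(passes_control_filter(100, 6))   # above the 5% cutoff
    print(passes_control_filter(10, 0))    # too few controls
    print(combine_samples([
        ("ACC", "mutA", "sample1"),
        ("ACC", "mutA", "sample2"),
        ("ACC", "mutB", "sample3"),
    ]))
```

Checking the control count first means that events with too few controls are rejected regardless of how clean those controls are.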

TransVar Annotation

Annotates MiSplice post-filtered results with TransVar. Two columns are appended to the end of each line of the input file, containing the canonical transcript results and the results for all alternative transcripts.

Download TransVar

sudo pip install transvar                                              # install TransVar
transvar config --download_ref --refversion hg19                       # download the hg19 reference
transvar config --download_anno --refversion hg19                      # download and set up the annotation databases
transvar config -k reference -v [path_to_hg19.fa] --refversion hg19    # alternatively, link an existing hg19 reference

Annotation of Genomic Coordinates

- Error output file: error.transvar
- Final output: CANCER.rgSCM.filtered.txt.transvar

USAGE: python TransVar_Annotation.py CANCER.rgSCM.filtered
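
As a rough illustration of what appending the two annotation columns could look like, the sketch below builds a genomic-coordinate identifier for a single-nucleotide change and adds two columns to one line of a results file. It does not reproduce TransVar_Annotation.py: the function names, the tab delimiter, the ";" separator for alternative transcripts, and the "NA" placeholder are all assumptions made for the example, and the annotation strings would in practice come from TransVar.

```python
def genomic_id(chrom, pos, ref, alt):
    """Build a gDNA-level identifier for a substitution, e.g. 'chr3:g.178936091G>A'."""
    return f"{chrom}:g.{pos}{ref}>{alt}"


def append_annotation(line, canonical, alternatives):
    """Append two columns (canonical and alternative transcript results) to a line.

    Assumes tab-delimited input; missing annotations become 'NA' (an assumption).
    """
    fields = line.rstrip("\n").split("\t")
    fields.append(canonical if canonical else "NA")
    fields.append(";".join(alternatives) if alternatives else "NA")
    return "\t".join(fields)


if __name__ == "__main__":
    print(genomic_id("chr3", 178936091, "G", "A"))
    # Placeholder strings stand in for real TransVar output.
    print(append_annotation("GENE\tchr3\t178936091\n",
                            "canonical_result",
                            ["alt_result_1", "alt_result_2"]))
```

Keeping every input line and filling missing annotations with a placeholder preserves a fixed column count, which makes the annotated file easier to parse downstream.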