
Molecular Oncology Almanac, a clinical interpretation algorithm paired with a novel underlying knowledge base for precision cancer medicine

Primary language: HTML. License: Apache License 2.0 (Apache-2.0).

Molecular Oncology Almanac

Molecular Oncology Almanac is a clinical interpretation algorithm for cancer genomics to annotate and evaluate whole-exome and transcriptome genomic data from individual patient samples. Specifically, Molecular Oncology Almanac can:

  • Identify mutations and genomic features related to therapeutic sensitivity and resistance and of prognostic relevance.
  • Annotate somatic and germline variants based on their presence in several datasources.
  • Sort and evaluate somatic mutations from single nucleotide variants, insertions and deletions, copy number alterations, and fusions based on clinical and biological relevance.
  • Integrate data types to observe which genes have been altered in both the somatic and germline setting.
  • Extract and evaluate germline mutations relevant to adult and hereditary cancers.
  • Identify overlap between somatic variants observed from both DNA and RNA, or any other source of validation sequencing.
  • Identify somatic and germline variants that may be related to microsatellite stability.
  • Calculate coding mutational burden and compare your patient to TCGA.
  • Calculate the contribution of known COSMIC mutational signatures with deconstructSigs.
  • Identify genomic features that may be related to one another.
  • Create portable web-based actionability reports, summarizing clinically relevant findings.
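
As an illustration of the mutational burden item in the list above, here is a minimal sketch of the arithmetic, assuming burden is expressed as coding mutations per megabase of callable sequence. The function names and the reference cohort values are hypothetical placeholders, not the tool's own code or real TCGA data, and the tool's exact mutation filters may differ.

```python
def coding_mutational_burden(n_coding_mutations, bases_covered):
    """Return coding mutations per megabase of callable sequence."""
    if bases_covered <= 0:
        raise ValueError("bases_covered must be positive")
    return n_coding_mutations / (bases_covered / 1_000_000)


def percentile_rank(value, cohort):
    """Return the percentage of cohort values strictly below `value`."""
    if not cohort:
        raise ValueError("cohort must not be empty")
    below = sum(1 for v in cohort if v < value)
    return 100.0 * below / len(cohort)


# Hypothetical patient: 150 coding mutations over 30 Mb of callable bases.
burden = coding_mutational_burden(150, 30_000_000)
print(burden)  # 5.0 mutations per Mb

# Made-up reference values standing in for a comparison cohort.
reference = [1.2, 2.5, 4.0, 8.1, 12.3]
print(percentile_rank(burden, reference))  # 60.0
```

The denominator corresponds to the kind of value supplied through a bases-covered input file; how ties and filters are handled in practice is left open here.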

Getting Molecular Oncology Almanac

The codebase is available for download through this GitHub repository, Docker Hub, and Terra. The method can also be run on FireCloud through our portal, without having to use FireCloud directly. Accessing Molecular Oncology Almanac through GitHub requires building some of the datasources, but they are also included in the Docker container.

Installation

Molecular Oncology Almanac is a Python 3.6 application that also uses R to run deconstructSigs as a subprocess. The application, its datasources, and all dependencies are packaged in a Docker image, which can be downloaded with the command

docker pull vanallenlab/moalmanac

Alternatively, the package can be built from this GitHub repository. To download it from GitHub,

git clone https://github.com/vanallenlab/moalmanac.git

We recommend using a virtual environment and running Python with either Anaconda or Miniconda. After installing Anaconda or Miniconda, you can set up by running

conda create -n moalmanac python=3.6 -y
source activate moalmanac
pip install -r requirements.txt

You can install deconstructSigs after installing R with the following commands

Rscript -e 'install.packages("RCurl", repos = "http://cran.rstudio.com/")' \
    && Rscript -e 'source("http://bioconductor.org/biocLite.R"); biocLite("BSgenome"); biocLite("BSgenome.Hsapiens.UCSC.hg19"); biocLite("GenomeInfoDb")' \
    && Rscript -e 'install.packages("reshape2", repos = "http://cran.rstudio.com/")' \
    && Rscript -e 'install.packages("deconstructSigs", repos = "http://cran.rstudio.com/")'
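
Before running the application from a source checkout, it can help to confirm that the prerequisites named above are in place. The following is a small sketch, not part of the repository; the function name check_prerequisites and the choice of executables to look for are assumptions made for illustration.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 6), executables=("Rscript", "git")):
    """Report whether the interpreter is new enough and tools are on PATH."""
    report = {"python_ok": sys.version_info[:2] >= tuple(min_python)}
    for name in executables:
        # shutil.which returns None when the executable cannot be found.
        report[name] = shutil.which(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_prerequisites().items():
        print(f"{item}: {'found' if ok else 'missing'}")
```

This only checks that Rscript is reachable; it does not verify that deconstructSigs or the Bioconductor packages were installed successfully.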

Usage

Molecular Oncology Almanac runs on any combination of input data but requires a few inputs: patient_id, config.ini, and colnames.ini.

Required arguments:

    --patient_id            <string>    patient identifier
    --config_ini            <string>    configuration file, default=config.ini
    --colnames_ini          <string>    configuration file for input column names and strings, default=colnames.ini

Optional arguments:

    --tumor_type            <string>    tumor ontology, default=Unknown
    --stage                 <string>    tumor stage, default=Unknown
    --snv_handle            <string>    handle for MAF file of somatic single nucleotide variants
    --indel_handle          <string>    handle for MAF file of somatic insertions and deletions
    --bases_covered_handle  <string>    handle for text file which contains the number of callable somatic bases
    --cnv_handle            <string>    handle for annotated seg file for somatic copy number
    --fusion_handle         <string>    handle for STAR fusion output, .final.abridged
    --germline_handle       <string>    handle for MAF file of germline single nucleotide variants and insertions and deletions
    --validation_handle     <string>    handle for MAF file of somatic single nucleotide variants called from validation sequencing
    --ms_status             <string>    microsatellite status as determined by MSIsensor, MSI or MSS, default=Unknown
    --purity                <float>     tumor purity
    --ploidy                <float>     tumor ploidy
    --wgd                   <boolean>   specify the occurrence of whole genome duplication
    --disable_matchmaking   <boolean>   remove patient-to-cell line matchmaking from report
    --description           <string>    description of patient
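
The arguments above can be assembled programmatically, for example when running many patients in a loop. The sketch below is one possible way to do that. The entry-point name moalmanac.py, the helper build_command, and the example file paths are assumptions for illustration, and boolean options are assumed to be bare switches; check the repository for the actual script name and flag behavior.

```python
import shlex


def build_command(patient_id, script="moalmanac.py",
                  config_ini="config.ini", colnames_ini="colnames.ini",
                  **optional):
    """Build an argument list from the required and optional inputs."""
    cmd = ["python", script,
           "--patient_id", patient_id,
           "--config_ini", config_ini,
           "--colnames_ini", colnames_ini]
    for name, value in optional.items():
        if value is None or value is False:
            continue  # omit inputs that were not provided
        if value is True:
            cmd.append(f"--{name}")  # assumed: booleans are bare switches
        else:
            cmd += [f"--{name}", str(value)]
    return cmd


# Hypothetical inputs; the paths are placeholders, not files in the repository.
cmd = build_command(
    "example_patient",
    tumor_type="Melanoma",
    snv_handle="path/to/somatic_snvs.maf",
    purity=0.85,
    fusion_handle=None,  # not available for this patient, so it is skipped
)
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to subprocess.run; keeping it as a list avoids shell quoting problems with paths that contain spaces.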

Example

A file run_example.py was packaged with this application to run Molecular Oncology Almanac on data found in the folder example_data. The outputs produced are the same as those hosted in the folder example_output. From the application's folder, run

python run_example.py

Citation

If you find this tool or any code herein useful, please cite:

Reardon, B. et al. (2020). Clinical interpretation of integrative molecular profiles to guide precision cancer medicine. bioRxiv 2020.09.22.308833 doi:10.1101/2020.09.22.308833

Disclaimer - For research use only

DIAGNOSTIC AND CLINICAL USE PROHIBITED. DANA-FARBER CANCER INSTITUTE (DFCI) and THE BROAD INSTITUTE (Broad) MAKE NO REPRESENTATIONS OR WARRANTIES OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT OR VALIDITY OF ANY INTELLECTUAL PROPERTY RIGHTS OR CLAIMS, WHETHER ISSUED OR PENDING, AND THE ABSENCE OF LATENT OR OTHER DEFECTS, WHETHER OR NOT DISCOVERABLE.

In no event shall DFCI or Broad or their Trustees, Directors, Officers, Employees, Students, Affiliates, Core Faculty, Associate Faculty and Contractors, be liable for incidental, punitive, consequential or special damages, including economic damages or injury to persons or property or lost profits, regardless of whether the party was advised, had other reason to know or in fact knew of the possibility of the foregoing, regardless of fault, and regardless of legal theory or basis. You may not download or use any portion of this program for any non-research use not expressly authorized by DFCI or Broad. You further agree that the program shall not be used as the basis of a commercial product and that the program shall not be rewritten or otherwise adapted to circumvent the need for obtaining permission for use of the program other than as specified herein.