Pipeline for calling somatic variants from multi-sample genomic sequencing data of tumor specimens. This repository extends the deprecated pipeline VAP (https://github.com/cancersysbio/VAP).
- Examine the mapping features surrounding any genomic coordinates in the raw read alignment files (.bam).
- Sensitive SSNV classification: based on these extracted features and on multiple related samples (e.g., multi-region or multi-stage sampling), increase the sensitivity for calling low-frequency variants.
- SCNA calling and tumor purity estimation: achieved with TitanCNA, which jointly estimates local copy number (CN) and tumor purity.
Tumor multi-region sequencing (MRS) is becoming a valuable resource for inspecting intra-tumor heterogeneity, which reflects growth dynamics during the expansion after tumor initiation. However, such information is buried in subclonal variants, which can be at low frequency even if the tumor sample is relatively pure. Detection of these events can be further complicated by uneven read depth of coverage, caused by variable exome capture efficiency, sampling or amplification bias in current WES experiments, and copy number changes in different genomic segments. Here we extract mapping features surrounding each genomic coordinate of interest, leveraging information across MRS samples, to strike a balance between sensitivity and accuracy in variant detection.
We recommend using an HPC cluster or a powerful workstation where multiple threads are available for executing the pipeline. The resource requirements are listed in our STAR Protocols paper.
Make sure you have conda or miniconda installed (if not, please check how to install miniconda), then run the following commands to install the required packages:
git clone https://github.com/SunPathLab/ith.Variant.git && cd ith.Variant
sh install_conda_env.sh
Once installed, activate the ith.variant conda environment using:
conda activate ith.variant
- cpan modules:
Statistics::Basic
Math::CDF
Parallel::ForkManager
Text::NSP::Measures::2D::Fisher::right
- R libs:
TitanCNA (included in folder pkgs/)
HMMcopy
caTools
KernSmooth
RColorBrewer
doMC
- gcc (5.4.0 tested)
- boost (1.54.0 tested)
- zlib (1.2.11 tested)
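To see which of the Perl modules listed above are already available, a small check along the following lines may help. This is a minimal sketch and not part of the repository: the function name `check_perl_modules` is hypothetical, and it only tests whether `perl` can load each module.

```shell
# Hypothetical helper (not shipped with ith.Variant).
# Prints whether each named Perl module can be loaded and
# returns non-zero if at least one of them is missing.
check_perl_modules() {
    rc=0
    for mod in "$@"; do
        # "perl -MName -e 1" exits with 0 only if the module loads
        if perl -M"$mod" -e 1 2>/dev/null; then
            echo "found:   $mod"
        else
            echo "missing: $mod"
            rc=1
        fi
    done
    return $rc
}

# Example usage with the modules named in the list above:
# check_perl_modules Statistics::Basic Math::CDF Parallel::ForkManager \
#     Text::NSP::Measures::2D::Fisher::right
```

Modules reported as missing can then be installed through cpan or the conda environment before running the pipeline.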
- The necessary annotation files are specified in the confs/config.tsv file. These include prebuilt genome indexes for the mappers bwa and bowtie, other region-based genome annotations (such as repetitive regions), and known polymorphisms/variants downloaded from dbSNP.
- We share the annotation files based on UCSC hg38 (GRCh38.p13, GenBank assembly accession: GCA_000001405.28): Precompiled UCSC hg38 annotation files
Install Bamtools (https://github.com/pezmaster31/bamtools). Also make sure you have zlib (version 1.2.11 tested) and boost (version 1.54.0 tested) installed.
Run make in the following way to install:
make BAMTOOLS_ROOT=/bamtools_directory/ ZLIB_ROOT=/zlib_directory/ BOOST_ROOT=/boost_directory/
The binaries will be built at bin/. Each xxx_directory is where the lib/ and include/ sub-directories of xxx (bamtools, zlib and boost) are located.
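Before running make, it can be useful to confirm that each root directory has the layout described above. The following is a minimal sketch, not part of the repository: the helper name `check_root` is hypothetical and the paths in the usage comment are the same placeholders used in the make command.

```shell
# Hypothetical helper (not shipped with ith.Variant).
# Succeeds only when both lib/ and include/ exist under the given root.
check_root() {
    root="$1"
    if [ -d "$root/lib" ] && [ -d "$root/include" ]; then
        echo "ok: $root"
        return 0
    fi
    echo "missing lib/ or include/ under: $root" >&2
    return 1
}

# Example usage (placeholder paths, adjust to your system):
# check_root /bamtools_directory && \
# check_root /zlib_directory && \
# check_root /boost_directory
```

If any of the checks fails, the corresponding *_ROOT value passed to make most likely points at the wrong directory level.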
ith.Variant can be run from a UNIX command-line interface.
Getting help message
$ perl ith.Variant/bin/DTrace.pl -h (or --help)
ith.Variant provides example scripts for running the pipeline in the Slurm job queueing system.
Getting help message for submitting jobs in Slurm
$ perl ith.Variant/pipeline/exome/submit_slurm.pl -h (or --help)
Pre-compiled annotation files (hg38)
A detailed protocol is under review in STAR Protocols.
To provide an alternative way to run ith.Variant, we have precompiled a Docker container of the current version of the pipeline (1.0). To run the containerized version of ith.Variant, first pull the Docker image:
## Pull the Docker image
docker pull asntech/ith.variant:v1.0
## Run it as an interactive container
docker run -it asntech/ith.variant:v1.0
You can mount a local directory into the container using the -v argument, for example: -v /path/to/local/dir:/path/to/mapped/dir
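Combining the -v argument with the interactive run command could look like the sketch below. The helper `docker_mount_cmd` is hypothetical; it only prints the full command so the paths can be reviewed before anything is executed, and both paths are placeholders.

```shell
# Hypothetical helper (not shipped with ith.Variant).
# Prints the docker command for an interactive session with one mounted
# directory; it does not run docker itself.
#   $1 = local directory, $2 = mount point inside the container
docker_mount_cmd() {
    printf 'docker run -it -v %s:%s asntech/ith.variant:v1.0\n' "$1" "$2"
}

# Example with placeholder paths:
docker_mount_cmd /path/to/local/dir /path/to/mapped/dir
```

The printed line can be copied into the terminal once the paths are correct. Paths containing spaces would need additional quoting, which this sketch does not handle.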
Once you're inside the container you can test the pipeline using:
perl /opt/ith.Variant/bin/DTrace.pl --help
OR
perl /opt/ith.Variant/pipeline/exome/submit_slurm.pl -h (or --help)
You can also run the scripts/pipeline using Singularity:
## Pull as Singularity image
singularity pull --name ith.variant.sif docker://asntech/ith.variant:v1.0
## The following may be needed if the tmp folder is not large enough, and users would like to pull the container into a specified directory
export SINGULARITY_TMPDIR=DIR_NAME_FOR_TMP_FILES
singularity pull --disable-cache --name ith.variant.sif --dir DEFINE_YOUR_DIRNAME_FOR_THE_DOCKER_IMAGE docker://asntech/ith.variant:v1.0
## Run Singularity image
singularity run ith.variant.sif DTrace.pl --help
## or
singularity run --cleanenv DEFINE_YOUR_DIRNAME_FOR_THE_DOCKER_IMAGE/ith.variant.sif DTrace.pl -h
Note that the Slurm job-submission scripts (under folder pipeline/) need to be modified to reflect execution from the container path /opt/ith.Variant/pipeline/ instead of directly from the local environment.
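One possible way to approach this modification is to prefix each pipeline command with a call into the image. The sketch below is an assumption about how that could be done and does not describe the actual structure of the submission scripts: the helper `in_container` is hypothetical, the image name is the one used in the pull example, and the function only prints the resulting command for inspection.

```shell
# Hypothetical helper (not shipped with ith.Variant).
# Prints a command line that would run the given command inside the
# Singularity image; nothing is executed here.
#   $1 = path to the .sif image, remaining arguments = command to wrap
in_container() {
    sif="$1"
    shift
    echo singularity exec --cleanenv "$sif" "$@"
}

# Example: a local call to DTrace.pl rewritten to use the container path
in_container ith.variant.sif perl /opt/ith.Variant/bin/DTrace.pl --help
```

Input and output directories on the host would additionally need to be made visible inside the container, which depends on the cluster configuration and is not covered by this sketch.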
Sun Ruping
Current Affiliation: Department of Laboratory Medicine and Pathology, University of Minnesota, MN, USA.
Email: ruping@umn.edu