Annotate vcf file using user supplied data

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

annotateVcf

This project hosts scripts to annotate VCF files using user-defined driver genes and mutations.

Design

Uses bcftools, tabix and bgzip, which must be available in the user's PATH. tabix and bgzip ship with htslib; bcftools is installed separately.
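Because the tools are looked up on the PATH, it can help to confirm they are visible before running an annotation. The following is a minimal sketch, not part of annotateVcf itself; the helper name `missing_tools` is hypothetical and only the Python standard library is used.

```python
import shutil

# External tools named in the Design section.
REQUIRED_TOOLS = ("bcftools", "tabix", "bgzip")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of tools that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools found.")
```

An empty list means all three executables were found; anything else names what still needs installing.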

Tools

annotateVcf has multiple command line options, listed with annotateVcf --help.

annotateVcf

Takes a VCF file as input, along with driver gene information and an optional unmatched normal panel VCF, and outputs a VCF with an added DRV INFO field.

Various exceptions can occur for malformed input files.

inputFormat

  • input_vcf.gz: SNV or indel VCF file annotated using VAGrENT
  • normal_panel.vcf.gz: normal panel used to tag germline variants
  • lof_genes.txt: list of known loss of function [LoF] genes along with their previous gene symbols (to make sure all gene synonyms are matched against the input VCF)
  • cpg_variants.tsv.gz: list of variants in cancer predisposition genes, used to tag germline predisposition variants
  • filters.json: filters to be applied during driver annotation (see the default file filters.json in the config folder)
  • driver_mutations.tsv.gz: tab-separated driver mutations along with consequence type
  • info.header: VCF header INFO line showing driver and cancer predisposition annotations...
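Several of the inputs above are compressed, tab-separated files. As a rough sketch of how such a file could be inspected before use, the snippet below reads one with the standard library; it relies on bgzip output being readable as ordinary gzip. The function name `read_tsv_gz` is hypothetical, and no column layout is assumed because the source does not specify one.

```python
import csv
import gzip


def read_tsv_gz(path):
    """Yield rows (lists of strings) from a gzip- or bgzip-compressed TSV.

    Empty lines and lines starting with '#' are skipped.
    """
    with gzip.open(path, "rt", newline="") as handle:
        # QUOTE_NONE treats quote characters as ordinary data.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if row and not row[0].startswith("#"):
                yield row
```

For example, `for row in read_tsv_gz("driver_mutations.tsv.gz"): print(row)` would list each record as a list of column values.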

outputFormat

  • <input>_drv.vcf.gz: output VCF file with the DRV INFO field and, if known, the consequence type (LoF when the variant was annotated using the LoF gene list). A CPV INFO field is added if variants in cancer predisposition genes are provided.
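To check whether a record in the output carries the DRV or CPV annotation, the INFO column (the eighth tab-separated column of a VCF data line) can be split into key/value pairs. This is a minimal sketch with a hypothetical helper `parse_info`; the example record and its DRV value are made up for illustration and are not taken from real annotateVcf output.

```python
def parse_info(vcf_line):
    """Return the INFO column of a VCF data line as a dict; flags map to True."""
    columns = vcf_line.rstrip("\n").split("\t")
    info = {}
    for entry in columns[7].split(";"):
        if entry in ("", "."):
            continue  # empty INFO column or stray separator
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


# Hypothetical record: the DRV value here is illustrative only.
line = "chr1\t12345\t.\tA\tT\t.\tPASS\tVT=Sub;DRV=missense"
info = parse_info(line)
print("DRV" in info, "CPV" in info)
```

Records without a DRV key were not tagged as drivers; the same lookup works for CPV.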

INSTALL

Install with pip, passing the path to the compiled .whl file downloaded from the release page:

pip install annotateVcf.X.X.X-py3-none-any.whl

Release .whl files are generated as part of the release process and can be found on the release page.

Development environment

This project uses git pre-commit hooks. As these will execute on your system it is entirely up to you if you activate them.

If you want tests, coverage reports and linting to execute automatically before a commit, you can activate them by running:

git config core.hooksPath git-hooks

Only a test failure will block a commit; linting is not enforced (but please consider following the guidance).

You can run the same checks manually without a commit by executing the following in the base of the clone:

./run_tests.sh

Development Dependencies

  • pytest
  • radon
  • pytest-cov

Setup VirtualEnv

cd $PROJECTROOT
hash virtualenv || pip3 install virtualenv
virtualenv -p python3 env
source env/bin/activate
python setup.py develop # so bin scripts can find module

For testing/coverage (./run_tests.sh)

source env/bin/activate # if not already in env
pip install pytest
pip install radon
pip install pytest-cov

Also see Package Dependencies

Cutting a release

Make sure the version is incremented in ./setup.py.
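The version check can be done by eye, but a small script makes it harder to forget. The sketch below is an assumption-laden illustration: it assumes setup.py stores the version as a quoted literal (for example `version='1.2.0'` or `'version': '1.2.0'`) and that versions are purely numeric and dot-separated. The helper names are hypothetical, and the previous release number has to be supplied by hand.

```python
import re


def read_version(setup_text):
    """Return the first quoted version literal in setup.py-style text, or None."""
    match = re.search(r"""version['"]?\s*[=:]\s*['"]([^'"]+)['"]""", setup_text)
    return match.group(1) if match else None


def is_incremented(old, new):
    """Compare dotted numeric versions such as '1.2.3' component by component."""
    def as_tuple(version):
        return tuple(int(part) for part in version.split("."))
    return as_tuple(new) > as_tuple(old)
```

In a clone, `read_version(open("setup.py").read())` would give the candidate version, which can then be passed to `is_incremented` together with the last released version.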

Install via .whl (wheel)

Generate .whl

source env/bin/activate # if not already
python setup.py bdist_wheel -d dist

Install .whl

# the generated wheel archive can be copied to a deployment location, e.g.
scp dist/annotateVcf.X.X.X-py3-none-any.whl user@host:~/wheels
# then, on the host, install from that directory
pip install --find-links=~/wheels annotateVcf

Reference