igv-reports

Python application to generate self-contained igv.js pages that can be opened within a browser with "file" protocol.

Primary language: Python. License: MIT.

Installation

Prerequisites

igv-reports requires Python 3.6 or greater and pip. As with all Python projects, use of a virtual environment is recommended. Instructions for creating a virtual environment using conda are given in the section "Creating a virtual environment".
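A quick way to confirm that the running interpreter meets this requirement is sketched below. The helper name meets_minimum_python is hypothetical and is not part of igv-reports.

```python
import sys

MIN_VERSION = (3, 6)  # minimum version stated in the prerequisites


def meets_minimum_python(version_info=None, minimum=MIN_VERSION):
    """Return True if the given (or running) interpreter is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); tuple comparison is element-wise.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not meets_minimum_python():
        sys.exit("Python %d.%d or greater is required" % MIN_VERSION)
    print("Python version OK")
```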

Installing igv-reports

pip install igv-reports

igv-reports requires the package pysam, which should be installed automatically. However, on OS X this sometimes fails due to missing dependent libraries. This can be fixed by following the procedure below, from the pysam docs:
"The recommended way to install pysam is through conda/bioconda. This will install pysam from the bioconda channel and automatically makes sure that dependencies are installed. Also, compilation flags will be set automatically, which will potentially save a lot of trouble on OS X."

conda config --add channels r
conda config --add channels bioconda
conda install pysam

Creating a report

A report consists of a table of sites or regions and an associated igv.js view for each site. Reports are created with the command line script create_report. Command line arguments are described below. Although --tracks is optional, a typical report will include at least an alignment track (a BAM or CRAM file) from which the variants were called.

Arguments:

  • Required
    • sites vcf or bed file of genomic sites
    • fasta reference fasta file, must be indexed
  • Optional
    • --tracks space-delimited list of track files, see below for supported formats
    • --ideogram ideogram file in UCSC cytoIdeo format
    • --template html template file
    • --output output file name default="igvjs_viewer.html"
    • --info-columns space delimited list of VCF info field names to include in variant table
    • --info-columns-prefixes space delimited list of prefixes of VCF info field names to include in variant table
    • --sample-columns space delimited list of VCF sample/format field names to include in variant table
    • --flanking genomic region to include either side of variant, default=1000
    • --standalone embed all javascript referenced via <script> tags in the page
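To illustrate what the --flanking option means, the sketch below computes the window shown around a site. It assumes the flanking value is subtracted from the start, added to the end, and clamped at zero. This is an illustration of the idea only, not the igv-reports implementation, and flanking_window is a hypothetical helper.

```python
def flanking_window(chrom, start, end=None, flanking=1000):
    """Return (chrom, window_start, window_end) padded by `flanking` bases.

    Assumption: padding is symmetric and the window start is clamped at 0.
    """
    if end is None:
        end = start  # a single-position site
    if flanking < 0:
        raise ValueError("flanking must be non-negative")
    return chrom, max(0, start - flanking), end + flanking


print(flanking_window("chr8", 127738322))
print(flanking_window("chr1", 500, 600, flanking=1000))
```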

Track file formats:

Currently supported track file formats are BAM, CRAM, VCF, BED, GFF3, and GTF. FASTA, BAM, CRAM, and VCF files must be indexed. Tabix is supported for other file types, and it is recommended that all large files be indexed.
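Because a missing index is an easy mistake to make, a small pre-flight check can help before running create_report. The sketch below assumes the common index naming conventions (.bai for BAM, .crai for CRAM, .tbi or .csi for bgzipped VCF, .fai for FASTA) with the index stored next to the data file. These suffixes and the helper names are assumptions, not something igv-reports defines.

```python
import os

# Assumed conventional index suffixes, keyed by data file suffix.
INDEX_SUFFIXES = {
    ".bam": [".bai"],
    ".cram": [".crai"],
    ".vcf.gz": [".tbi", ".csi"],
    ".fa": [".fai"],
    ".fasta": [".fai"],
}


def find_index(path):
    """Return the path of an existing index for `path`, or None.

    Only looks for '<path><suffix>', for example 'reads.bam.bai'.
    """
    for data_suffix, index_suffixes in INDEX_SUFFIXES.items():
        if path.endswith(data_suffix):
            for index_suffix in index_suffixes:
                candidate = path + index_suffix
                if os.path.exists(candidate):
                    return candidate
    return None


def missing_indexes(paths):
    """Return the subset of `paths` that need an index but have none."""
    needs_index = tuple(INDEX_SUFFIXES)
    return [p for p in paths if p.endswith(needs_index) and find_index(p) is None]
```

For example, missing_indexes(["examples/variants/recalibrated.bam"]) returns an empty list when recalibrated.bam.bai sits next to the BAM file. Files with other suffixes, such as plain BED, are ignored by this check.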

Examples

Data for the examples are available for download.

Creating a variant report from a VCF file:

create_report examples/variants/variants.vcf.gz \
https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa \
--ideogram examples/variants/cytoBandIdeo.txt \
--flanking 1000 \
--info-columns GENE TISSUE TUMOR COSMIC_ID GENE SOMATIC \
--tracks examples/variants/variants.vcf.gz examples/variants/recalibrated.bam examples/variants/refGene.sort.bed.gz \
--output igvjs_viewer.html
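When reports are generated from a pipeline, the same kind of invocation can be assembled in Python and passed to subprocess. The sketch below uses only options listed in the argument table. build_create_report_command is a hypothetical helper, not an igv-reports API.

```python
import shlex


def build_create_report_command(sites, fasta, tracks=(), info_columns=(),
                                ideogram=None, flanking=None, output=None):
    """Assemble a create_report argument list from the documented options."""
    cmd = ["create_report", sites, fasta]
    if ideogram:
        cmd += ["--ideogram", ideogram]
    if flanking is not None:
        cmd += ["--flanking", str(flanking)]
    if info_columns:
        cmd += ["--info-columns", *info_columns]
    if tracks:
        cmd += ["--tracks", *tracks]
    if output:
        cmd += ["--output", output]
    return cmd


cmd = build_create_report_command(
    "examples/variants/variants.vcf.gz",
    "https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa",
    tracks=["examples/variants/variants.vcf.gz",
            "examples/variants/recalibrated.bam"],
    info_columns=["GENE", "TISSUE"],
    flanking=1000,
    output="igvjs_viewer.html",
)
# Print a shell-safe rendering of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
# To execute it (requires igv-reports to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.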

Converting genomic files to data URIs for use in igv.js

The script create_datauri converts the contents of a file to a data URI for use in igv.js. The data URI will be printed to stdout.
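For readers unfamiliar with data URIs, the sketch below shows the general idea: the file bytes are base64-encoded and prefixed with a media type. The media type and any compression that create_datauri itself applies are not described here, so application/octet-stream is only a placeholder, and both function names are hypothetical.

```python
import base64


def bytes_to_data_uri(data, media_type="application/octet-stream"):
    """Encode raw bytes as a base64 data URI (RFC 2397 form)."""
    encoded = base64.b64encode(data).decode("ascii")
    return "data:{};base64,{}".format(media_type, encoded)


def file_to_data_uri(path, media_type="application/octet-stream"):
    """Read a file in binary mode and return its contents as a data URI."""
    with open(path, "rb") as handle:
        return bytes_to_data_uri(handle.read(), media_type)


# A tiny BED-like record encoded as a text data URI.
print(bytes_to_data_uri(b"chr1\t100\t200\n", "text/plain"))
```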

Convert a gzipped VCF file to a data URI.

create_datauri examples/variants/variants.vcf.gz

Convert a slice of a remote CRAM file to a data URI.

create_datauri \
--region 8:127,738,322-127,738,508 \
https://s3.amazonaws.com/1000genomes/data/HG00096/alignment/HG00096.alt_bwamem_GRCh38DH.20150718.GBR.low_coverage.cram 
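The --region value above uses the chrom:start-end form with thousands separators. How create_datauri parses it internally is not shown here. The sketch below is one way to validate such a string before passing it on, and parse_region is a hypothetical helper.

```python
import re

# chrom, then ':', then start and end separated by '-'; commas allowed in numbers.
REGION_RE = re.compile(r"^(?P<chrom>[^:]+):(?P<start>[\d,]+)-(?P<end>[\d,]+)$")


def parse_region(region):
    """Parse 'chrom:start-end' (commas allowed in numbers) into a tuple."""
    match = REGION_RE.match(region.strip())
    if match is None:
        raise ValueError("expected 'chrom:start-end', got %r" % region)
    start = int(match.group("start").replace(",", ""))
    end = int(match.group("end").replace(",", ""))
    if start > end:
        raise ValueError("region start is greater than end: %r" % region)
    return match.group("chrom"), start, end


print(parse_region("8:127,738,322-127,738,508"))
```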

Creating a virtual environment

Instructions for creating a virtual environment using conda follow.

2. Create a virtual environment

conda create -n myenv python=3.7.1
conda install -n myenv pip
conda activate myenv
conda install pip