/mummer2circos

Circular bacterial genome plots based on BLAST or NUCMER/PROMER alignments

Primary LanguagePython

Description

Generate circular bacterial genome plots based on BLAST or NUCMER/PROMER alignments. Generate SVG and PNG images with circos (http://circos.ca/).

Installation

Method 1: Installation with conda

# create the environment
conda env create -f env.yaml
# activate the environment
conda activate mummer2circos

Method 2: Singularity container

Dependency: Singularity

build the image

singularity build mummer2circos.simg docker://metagenlab/mummer2circos:1.2

example

singularity exec mummer2circos.simg mummer2circos -r <reference.fna> -q <query.fna>  -l

Alignment method

  • the whole genome alignments can be done with three different methods: megablast, nucmer or promer
  • use the parameter -a to indicate which method to use. Nucmer is the default option.

mummer2circos -l -a promer ...

Simple plot

  • -r reference fasta
  • -q other fasta with to compare with the reference fasta
  • -l mendatory option to build circular plots
  • genome tracks are ordered based on the order of the input query fasta files
mummer2circos -l -r genomes/NZ_CP008827.fna -q genomes/*fna

Simple plot

Condensed tracks

mummer2circos -l -c -r genomes/NZ_CP008827.fna -q genomes/*fna

Simple plot

With gene tracks

  • the header of the reference fasta file chromosome (and eventual plasmids) should be the same as the locus accession of the genbank file. See example file NZ_CP008828.fna.

LOCUS NZ_CP008828 15096 bp DNA CON 16-AUG-2015

mummer2circos -l -r genomes/NZ_CP008827.fna -q genomes/*.fna -gb GCF_000281535_merged.gbk

Simple plot

Label specific genes

  • given a fasta file of protein of interest, label the BBH of each amino acid sequence on the circular plot
  • the fasta headers are used as labels (see example file VF.faa)
mummer2circos -l -r genomes/NZ_CP008827.fna -q genomes/*.fna -gb GCF_000281535_merged.gbk -b VF.faa 

Simple plot

Show mapping depth along the chromosome (and plasmids)

  • depth files can be generated from bam file using samtools depth
  • the labels used in the .depth file should be the same as the fasta header (see example files)
  • regions with depth higher than 2 times the median are croped to that limit and coloured in green (deal with highly repeated sequences).
  • regions with depth lower than half of the median depth are coloured in red.
mummer2circos -l -r genomes/NZ_CP008827.fna -q genomes/*.fna -gb GCF_000281535_merged.gbk -b VF.faa -s GCF_000281535.depth

Simple plot

Add labels based on coordinate file

  • structure: LOCUS start stop label (see labels.txt)
  • labels can not include spaces
mummer2circos -l -r genomes/NZ_CP008827.fna -q genomes/NZ_FO834906.fna -gb GCF_000281535_merged.gbk -b VF.faa -s GCF_000281535.depth -lf labels.txt

Simple plot

show links between two genomes

mummer2circos -r genomes/NZ_CP012745.fna -q genomes/*.fna -gb GCF_000281535_merged.gbk -b VF.faa -s GCF_000281535.depth -lf labels.txt

Simple plot

Highlight specific ranges based on coordinate file

  • overlapping ranges will overlap on the figure
  • TODO: add the possibility to input multiple range files that would be displayed on different tracks