/VirusHunterGatherer

Snakemake pipeline for running Virushunter and Virusgatherer

Primary LanguagePerl

Summary

This is a two-stage computational workflow for data-driven virus discovery from sequencing data from the Sequence Read Archive or your own data. Stage 1 (Virushunter) searches the raw reads using profile Hidden Markov Models. Stage 2 (Virusgatherer) perform a seed-based, iterative viral genome assembly that specifically targets the sequences identified in the first stage.

Software dependencies

  • EMBOSS
  • seqtk
  • fastp
  • NCBI blast
  • NCBI SRA toolkit
  • HMMer
  • Genseed-HMM
  • CAP3
  • newbler
  • Bowtie 2
  • snakemake
  • vsearch >=2.15.2 <2.20.0

Blast databases

You need to install the following Blast databases and specify their file paths and names in the config.yaml:

NOTE: to download only RdRp-encoding RNA viruses, the following command can be used: esearch -db nucleotide -query "txid2559587[Organism:exp] AND refseq[filter] NOT txid2732397[Organism:exp]" | efetch -format fasta > riboviria.no_pararnavirae.genomic.fna

  • filter (see subfolder 4_databases; use makeblastdb command to create a BLAST database)

Note: For detailed instructions on downloading Blast databases, please refer to our GitHub Wiki.

Support

For questions or support, email chris.lauber at twincore.de

License

GPLv3

References

Lauber C*, Seitz S*, Mattei S, Suh A, Beck J, Herstein J, Börold J, Salzburger W, Kaderali L, Briggs JAG, Bartenschlager R. Deciphering the Origin and Evolution of Hepatitis B Viruses by Means of a Family of Non-enveloped Fish Viruses. Cell Host Microbe. 2017 Sep 13;22(3):387-399.e6. doi: 10.1016/j.chom.2017.07.019.

Lauber C, Vaas J, Klingler F, Mutz, M, Gorbalenya AE, Bartenschlager F, Seitz S. Deep mining of the Sequence Read Archive reveals bipartite coronavirus genomes and inter-family Spike glycoprotein recombination. bioRxiv. doi: https://doi.org/10.1101/2021.10.20.465146

Lauber C, Chong LC. Viroid-like RNA-dependent RNA polymerase-encoding ambiviruses are abundant in complex fungi. Frontiers Microbiology. 2023 May 12; Volume 14. https://doi.org/10.3389/fmicb.2023.1144003

* equal contribution