Input: assembly file NAME.fasta
1. Filtering contigs
Filter contigs by length threshold in kb (default: 5kb).
Input: NAME.fasta
Output: NAME_filt500bp.fasta
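The filtering step can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's own script: the function names `read_fasta` and `filter_contigs` are hypothetical, the default follows the 5 kb stated above, and treating the threshold as inclusive is an assumption.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_contigs(handle, min_length_kb=5.0):
    """Keep contigs whose length is at least min_length_kb (assumed inclusive)."""
    threshold_bp = min_length_kb * 1000
    for header, seq in read_fasta(handle):
        if len(seq) >= threshold_bp:
            yield header, seq


# Toy input: one 4 bp contig and one 6000 bp contig.
example = io.StringIO(">short\nACGT\n>long\n" + "A" * 6000 + "\n")
kept_names = [name for name, _ in filter_contigs(example, min_length_kb=5)]
```

With the toy input, only the contig named `long` passes the filter.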
2.1. VirSorter
Mines viral signal from microbial genomic data. The tool generates the folder Predicted_viral_sequences; the relevant files are VIRSorter_cat-[123].fasta and VIRSorter_prophages_cat-[45].fasta.
Input: NAME_filt500bp.fasta
Output: Predicted_viral_sequences
2.2. VirFinder
R package for identifying viral sequences from metagenomic data using sequence signatures.
Input: NAME_filt500bp.fasta
Output: VirFinder_output.tsv
3. Parsing virus files
Based on the results of the previous steps, the script generates the High_confidence, Low_confidence and Prophages files. Some of these output files may be missing.
Input:
- NAME_filt500bp.fasta
- VirFinder_output.tsv
- Predicted_viral_sequences
Output:
- High_confidence.fna
- Low_confidence.fna
- Prophages.fna
4. Prodigal
The tool predicts proteins for each input FASTA file.
Input: output files of step #3
- High_confidence.fna
- Low_confidence.fna
- Prophages.fna
Output:
- High_confidence_prodigal.faa
- Low_confidence_prodigal.faa
- Prophages_prodigal.faa
5. HMMSCAN
HMMSCAN is used to search protein sequences against collections of protein profiles.
Input: output files of step #4
- High_confidence_prodigal.faa
- Low_confidence_prodigal.faa
- Prophages_prodigal.faa
Output:
- High_confidence_prodigal_hmmscan.tbl
- Low_confidence_prodigal_hmmscan.tbl
- Prophages_prodigal_hmmscan.tbl
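Steps 4 and 5 amount to two command-line calls per viral prediction file. The sketch below only builds the argument lists; the exact options the pipeline passes are not stated here, so the use of `-p meta` for Prodigal and `--domtblout` for HMMSCAN is an assumption, and `viphog.hmm` is a hypothetical database path.

```python
import subprocess


def prodigal_cmd(fna, faa):
    """Prodigal call writing protein translations (-a); metagenomic mode is assumed."""
    return ["prodigal", "-i", fna, "-a", faa, "-p", "meta"]


def hmmscan_cmd(faa, tbl, hmm_db):
    """hmmscan call writing a per-domain table; usage is hmmscan [options] <hmmdb> <seqfile>."""
    return ["hmmscan", "--domtblout", tbl, hmm_db, faa]


def run_steps(prefix, hmm_db="viphog.hmm"):
    """Run both tools for one prediction file, e.g. prefix='High_confidence'."""
    faa = f"{prefix}_prodigal.faa"
    subprocess.run(prodigal_cmd(f"{prefix}.fna", faa), check=True)
    subprocess.run(hmmscan_cmd(faa, f"{prefix}_prodigal_hmmscan.tbl", hmm_db), check=True)
```

Calling `run_steps("High_confidence")` requires both tools on the PATH and a pressed HMM database.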
6. Table(s) processing
The scripts add column headers and separate the columns with tabs.
Input:
- High_confidence_prodigal_hmmscan.tbl
- Low_confidence_prodigal_hmmscan.tbl
- Prophages_prodigal_hmmscan.tbl
Output:
- High_confidence_prodigal_hmmscan_modified.faa
- Low_confidence_prodigal_hmmscan_modified.faa
- Prophages_prodigal_hmmscan_modified.faa
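A rough sketch of this conversion, assuming the .tbl files are in HMMER's per-domain table layout (22 whitespace-separated fields followed by a free-text description). The column names below are illustrative labels, not necessarily the headers the pipeline writes.

```python
# Illustrative names for the 23 columns of a HMMER per-domain table.
DOMTBL_COLUMNS = [
    "target_name", "target_accession", "tlen",
    "query_name", "query_accession", "qlen",
    "full_evalue", "full_score", "full_bias",
    "domain_number", "domain_total",
    "c_evalue", "i_evalue", "domain_score", "domain_bias",
    "hmm_from", "hmm_to", "ali_from", "ali_to", "env_from", "env_to",
    "acc", "description",
]


def tbl_to_tsv(lines):
    """Yield a header line, then each data row re-joined with tabs."""
    n_fixed = len(DOMTBL_COLUMNS) - 1  # the description may contain spaces
    yield "\t".join(DOMTBL_COLUMNS)
    for line in lines:
        line = line.rstrip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HMMER comment lines
        yield "\t".join(line.split(None, n_fixed))
```

Limiting the number of splits keeps a multi-word description in a single final column.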
7. Ratio evalue table
Generates a tabular file (File_informative_ViPhOG.tsv) listing results per protein, including the proportion of the target profile covered by the alignment and the absolute value of the total E-value.
Input:
- High_confidence_prodigal_hmmscan_modified.faa
- Low_confidence_prodigal_hmmscan_modified.faa
- Prophages_prodigal_hmmscan_modified.faa
Output:
- High_confidence_prodigal_hmmscan_modified_informative.tsv
- Low_confidence_prodigal_hmmscan_modified_informative.tsv
- Prophages_prodigal_hmmscan_modified_informative.tsv
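One possible reading of the two per-protein values, as a sketch only. It assumes the ratio is the aligned profile span divided by the profile length, and that the "absolute value" is taken of the log10 of the full-sequence E-value; neither formula is confirmed by this document, and the dictionary keys are hypothetical column names.

```python
import math


def informative_values(row):
    """Compute an aligned-profile ratio and |log10(E-value)| for one hit.

    `row` is a dict with hypothetical keys: tlen, hmm_from, hmm_to, full_evalue.
    """
    profile_length = int(row["tlen"])
    aligned_span = int(row["hmm_to"]) - int(row["hmm_from"]) + 1
    ratio = aligned_span / profile_length

    evalue = float(row["full_evalue"])
    # log10(0) is undefined; an E-value of 0 is reported as infinity here (assumption).
    abs_log_evalue = math.inf if evalue == 0 else abs(math.log10(evalue))
    return ratio, abs_log_evalue


hit = {"tlen": "200", "hmm_from": "11", "hmm_to": "110", "full_evalue": "1e-20"}
ratio, abs_log_evalue = informative_values(hit)
```

For the toy hit, 100 of the 200 profile positions are aligned, so the ratio is 0.5.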
8. Annotation
For each viral prediction file, the script generates a tabular output summarizing the ViPhOG annotations of all the corresponding predicted proteins.
Input:
- High_confidence.fna
- High_confidence_prodigal_hmmscan_modified_informative.tsv
- Low_confidence.fna
- Low_confidence_prodigal_hmmscan_modified_informative.tsv
- Prophages.fna
- Prophages_prodigal_hmmscan_modified_informative.tsv
Output:
- High_confidence_prodigal_hmmscan_modified_informative_prot_ann_table.tsv
- Low_confidence_prodigal_hmmscan_modified_informative_prot_ann_table.tsv
- Prophages_prodigal_hmmscan_modified_informative_prot_ann_table.tsv
9.1. Mapping
The script creates an output directory for each viral prediction file and generates a contig map in PDF format for each viral contig; the maps are stored in that output directory.
Input:
- High_confidence_prodigal_hmmscan_modified_informative_prot_ann_table.tsv
- Low_confidence_prodigal_hmmscan_modified_informative_prot_ann_table.tsv
- Prophages_prodigal_hmmscan_modified_informative_prot_ann_table.tsv
Output:
- High_confidence_mapping_results
- Low_confidence_mapping_results
- Prophages_mapping_results
9.2. Assign taxonomy
The script generates a tabular file with the taxonomic assignment of viral contigs based on ViPhOG annotations.
Input:
- High_confidence_prodigal_hmmscan_modified_informative_prot_ann_table.tsv
- Low_confidence_prodigal_hmmscan_modified_informative_prot_ann_table.tsv
- Prophages_prodigal_hmmscan_modified_informative_prot_ann_table.tsv
Output:
- High_confidence_prodigal_hmmscan_modified_informative_prot_ann_table_tax_assign.tsv
- Low_confidence_prodigal_hmmscan_modified_informative_prot_ann_table_tax_assign.tsv
- Prophages_prodigal_hmmscan_modified_informative_prot_ann_table_tax_assign.tsv
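The file names in steps 3 to 9.2 follow a simple pattern: each step appends a suffix to the previous step's name. The helper below (the name `expected_outputs` and the step labels are hypothetical) reproduces the names listed above for a given prefix, which can be handy for checking which outputs exist, since some may be missing.

```python
# (step label, suffix appended to the running stem, file extension)
SUFFIX_CHAIN = [
    ("prodigal", "_prodigal", ".faa"),
    ("hmmscan", "_hmmscan", ".tbl"),
    ("table_processing", "_modified", ".faa"),
    ("ratio_evalue", "_informative", ".tsv"),
    ("annotation", "_prot_ann_table", ".tsv"),
    ("assign_taxonomy", "_tax_assign", ".tsv"),
]


def expected_outputs(prefix):
    """Map each step to its output name for High_confidence, Low_confidence or Prophages."""
    outputs = {"parsing": f"{prefix}.fna"}
    stem = prefix
    for step, suffix, extension in SUFFIX_CHAIN:
        stem += suffix
        outputs[step] = stem + extension
    # The mapping step writes a directory named after the prefix only.
    outputs["mapping"] = f"{prefix}_mapping_results"
    return outputs


names = expected_outputs("Prophages")
```

Combining this with `os.path.exists` gives a quick report of which steps produced output for each prediction file.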
          Assembly
             |
        Length filter
          |       \
          |        \
     VirFinder   VirSorter
          |        /
          |       /
     Parsing virus files
             |
             |
          Prodigal         -- S
           |    \             u
        HMMscan  \            b
           |      \           W
      Modification |          o
           |      /           r
           |     /            k
        Annotation            F
           |    \             l
           |     \            o
       Mapping  Assign     -- w