First, the pipeline trims the isolate sequencing reads and assembles them using the Shovill wrapper around SPAdes. Kraken2 is used to detect species in the isolate sequencing data, and QUAST is used to compute assembly metrics for the draft genome.
The draft genome is then annotated with several tools:
- Virulence genes are detected using ABRicate with the VFDB database
- Resistance genes are detected using AMRFinderPlus
- MLST is assessed using mlst
- General annotation is performed using Prokka
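
The annotation steps above could be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the function name `annotation_commands`, the output layout, and the choice of options are assumptions. It only builds the command lines and does not run the tools.

```python
from pathlib import Path


def annotation_commands(contigs, outdir, sample):
    """Build one command line per annotation tool for a draft assembly.

    Hypothetical helper: names and output layout are illustrative only.
    """
    contigs = str(contigs)
    outdir = Path(outdir)
    return {
        # ABRicate screens the contigs against the VFDB database (prints TSV to stdout).
        "virulence": ["abricate", "--db", "vfdb", contigs],
        # AMRFinderPlus on nucleotide input (-n).
        "resistance": ["amrfinder", "-n", contigs],
        # mlst auto-detects the scheme and prints one line per input file.
        "mlst": ["mlst", contigs],
        # Prokka writes its annotation files into its own directory.
        "annotation": [
            "prokka",
            "--outdir", str(outdir / "prokka"),
            "--prefix", sample,
            contigs,
        ],
    }
```

Each command could then be passed to `subprocess.run(cmd, check=True)`. ABRicate and mlst print their results to standard output, so that output would need to be captured or redirected to a file.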
Next, the trimmed metagenomic reads are mapped onto the draft genome using Snippy. Reads that could not be mapped are written to a separate file and checked for species using Kraken2.
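
As a rough sketch of this mapping step, the two commands might be assembled as below. Several things here are assumptions rather than facts about the pipeline: that the reads are paired-end, that Snippy's `--unmapped` option is what writes the unmapped reads, and the helper name `mapping_commands`. The unmapped read file names are passed in by the caller because the source does not state them.

```python
def mapping_commands(draft_genome, r1, r2, outdir, kraken_db,
                     unmapped_r1, unmapped_r2, cpus=4):
    """Build Snippy and Kraken2 command lines (hypothetical helper)."""
    snippy = [
        "snippy",
        "--cpus", str(cpus),
        "--outdir", outdir,
        "--ref", draft_genome,   # the draft assembly serves as the reference
        "--R1", r1,
        "--R2", r2,
        "--unmapped",            # assumed: keep unmapped reads and write them out
    ]
    kraken2 = [
        "kraken2",
        "--db", kraken_db,
        "--threads", str(cpus),
        "--paired",
        "--report", f"{outdir}/unmapped.kraken2.report",  # illustrative file name
        "--output", f"{outdir}/unmapped.kraken2.out",     # illustrative file name
        unmapped_r1,
        unmapped_r2,
    ]
    return {"snippy": snippy, "kraken2": kraken2}
```

Keeping the two commands in one helper makes the dependency explicit: Kraken2 can only run once Snippy has produced the unmapped read files.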
Results are collected and summarised using basic bash/awk scripts.
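
The pipeline does this with bash/awk. For illustration only, a Python equivalent of one such summary, extracting the most abundant species from a Kraken2 report, might look like the sketch below. It assumes the standard six-column, tab-separated Kraken2 report layout (percentage, clade read count, direct read count, rank code, taxid, indented name). The function name `top_species` is hypothetical.

```python
def top_species(report_text, n=3):
    """Return up to n (name, clade_reads, percent) tuples for species-rank rows."""
    rows = []
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        percent, clade_reads, _direct, rank, _taxid, name = fields[:6]
        if rank.strip() == "S":  # species rank only, not sub-ranks such as S1
            rows.append((name.strip(), int(clade_reads), float(percent)))
    # Most abundant species first, by number of reads in the clade.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:n]
```

Applied to the report for the unmapped reads, this would list the species most likely to account for the non-isolate fraction of the metagenomic sample.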