
Mining and annotating plasmid contigs and AMR genes from metagenome datasets

plasmids-inspect: Mining and annotating plasmids and AMR genes from metagenome datasets.

1. Introduction / Workflow Summary

The clean metagenomic reads were assembled into contigs using metaSPAdes with the parameters "--meta --threads 40 -k 21,33,55,77,99,127". MetaProdigal was then used to predict genes on the assembled contigs.
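The assembly and gene-prediction step can be sketched as two command lines. The Python snippet below only builds and prints the commands; it is a minimal sketch, not the project's own wrapper. The file names (`sample_R1.fastq.gz`, `assembly/`, `genes.fna`, and so on) are hypothetical, and paired-end input is an assumption. The flags are standard `spades.py` and `prodigal` options.

```python
import shlex


def metaspades_cmd(r1, r2, outdir, threads=40):
    """Build a metaSPAdes command using the k-mer sizes quoted in the text."""
    return [
        "spades.py", "--meta",
        "-1", r1, "-2", r2,           # paired-end reads (assumed layout)
        "-k", "21,33,55,77,99,127",
        "--threads", str(threads),
        "-o", outdir,
    ]


def prodigal_cmd(contigs, genes_fna, proteins_faa, gff):
    """Build a Prodigal command in metagenomic mode (-p meta)."""
    return [
        "prodigal",
        "-i", contigs,
        "-p", "meta",
        "-d", genes_fna,      # nucleotide sequences of predicted genes
        "-a", proteins_faa,   # protein translations
        "-o", gff, "-f", "gff",
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(shlex.join(metaspades_cmd("sample_R1.fastq.gz", "sample_R2.fastq.gz", "assembly")))
    print(shlex.join(prodigal_cmd("assembly/contigs.fasta", "genes.fna", "proteins.faa", "genes.gff")))
```

The printed lines can be pasted into a shell or passed to `subprocess.run` once the paths match your data.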

We used PlasForest [12], a homology-based random-forest classifier, and PlasClass, a k-mer-based logistic regression classifier (parameters: score ≥ 0.99 and minimum contig length ≥ 500 bp), to identify plasmid sequences among the assembled contigs. The plasmid contigs were aligned to the NCBI RefSeq [14] plasmid database with BLASTN (version 2.10.1) [15] to identify their taxonomic origin.
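The PlasClass thresholds above (score ≥ 0.99, contig length ≥ 500 bp) amount to a simple filter. The sketch below assumes the PlasClass output is a tab-separated file with the sequence header in the first column and the score in the second, and matches contigs by the first whitespace-delimited token of the header. The function names are hypothetical. How the PlasClass and PlasForest calls are combined is not stated here, so this example covers the PlasClass filter only.

```python
def fasta_lengths(lines):
    """Return {contig_id: sequence length} from FASTA-formatted lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]   # ID = first token of the header
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def parse_scores(lines):
    """Return {contig_id: score} from 'header<TAB>score' lines (assumed format)."""
    scores = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        header, score = line.split("\t")[:2]
        scores[header.split()[0]] = float(score)
    return scores


def select_plasmid_contigs(scores, lengths, min_score=0.99, min_length=500):
    """Keep contigs that pass both the score and the length threshold."""
    return sorted(
        cid for cid, score in scores.items()
        if score >= min_score and lengths.get(cid, 0) >= min_length
    )


if __name__ == "__main__":
    # Toy data, for illustration only.
    fasta = [">c1 example", "ACGT" * 200, ">c2", "ACGT" * 50, ">c3", "A" * 500]
    table = ["c1\t0.995", "c2\t0.999", "c3\t0.98"]
    print(select_plasmid_contigs(parse_scores(table), fasta_lengths(fasta)))
```

In the toy data only `c1` passes: `c2` is too short and `c3` scores below the cutoff.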

All clean reads were aligned to the assembled contigs and predicted ORFs with Bowtie2 (parameters: --end-to-end --sensitive -I 200 -X 400). ORF abundance was quantified as transcripts per million (TPM), calculated as:

$$\mathrm{TPM}_g = \frac{N_g / L_g}{\sum_j N_j / L_j} \times 10^6$$

where N_g is the number of reads mapped to gene g and L_g is the length of gene g. The index g denotes a particular gene, and the index j runs over all predicted genes in the sample.
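The TPM definition can be illustrated with a short Python function. This is a minimal sketch, not the pipeline's own code: the function name `tpm` and the dictionary inputs are hypothetical, and it assumes the standard form TPM_g = (N_g / L_g) / Σ_j (N_j / L_j) × 10^6 with N and L as defined above.

```python
def tpm(read_counts, gene_lengths):
    """Compute TPM per gene.

    read_counts:  {gene_id: N_g}, reads mapped to each gene
    gene_lengths: {gene_id: L_g}, gene length in bp
    """
    rates = {}
    for gene, n_reads in read_counts.items():
        length = gene_lengths[gene]
        if length <= 0:
            raise ValueError(f"gene {gene!r} has non-positive length")
        rates[gene] = n_reads / length          # N_g / L_g

    total = sum(rates.values())                 # sum over all genes j
    if total == 0:
        # No mapped reads at all: report zero abundance everywhere.
        return {gene: 0.0 for gene in rates}
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


if __name__ == "__main__":
    # Toy counts and lengths, for illustration only.
    counts = {"geneA": 100, "geneB": 300}
    lengths = {"geneA": 1000, "geneB": 1500}
    for gene, value in tpm(counts, lengths).items():
        print(gene, round(value, 2))
```

Because every gene is divided by the same sample-wide sum, the TPM values of one sample always add up to one million, which makes them comparable across samples with different sequencing depth.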

CoverM was used to quantify contig abundance.

2. Tool Dependencies

#software     version  link
FastQC        0.11.9   https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Trimmomatic   0.39     http://www.usadellab.org/cms/?page=trimmomatic
BMTagger      3.102    ftp://ftp.ncbi.nlm.nih.gov/pub/agarwala/bmtagger/
metaSPAdes    3.15.3   https://github.com/ablab/spades
MetaProdigal  2.6.3    https://github.com/hyattpd/Prodigal
Bowtie2       2.4.4    https://github.com/BenLangmead/bowtie2
CoverM        0.6.1    https://github.com/wwood/CoverM
Kallisto      0.46.2   https://github.com/pachterlab/kallisto
MMseqs2       r13      https://github.com/soedinglab/MMseqs2
PlasForest    1.3      https://github.com/leaemiliepradier/PlasForest
seqtk         0.1      https://github.com/lh3/seqtk
tabtk         0.1      https://github.com/lh3/tabtk
csv2tsv       2.2.0    https://github.com/eBay/tsv-utils
R             4.2.1    https://www.r-project.org/

3. References

