MIDAS2

Metagenomic Intra-Species Diversity Analysis 2

Primary language: Python. License: MIT.

Metagenomic Intra-Species Diversity Analysis (MIDAS) is an integrated pipeline for profiling strain-level genomic variation in shotgun metagenomic data. The standard MIDAS workflow uses a reference database of 5,926 species extracted from 30,000 genomes (MIDAS DB v1.2). MIDAS2 uses the same analysis workflow as the original MIDAS tool, but is engineered to work with more comprehensive MIDAS Reference Databases (MIDASDBs) and to run on collections of thousands of samples in a fast and scalable manner.

For MIDAS2, we have already built two MIDASDBs from large, public, microbial genome databases: UHGG 1.0 and GTDB r202.

The publication is available in Bioinformatics, and the user manual is available on ReadTheDocs.

The performance of a read-mapping-based metagenotyping pipeline depends on, among other factors, (1) how closely related the DB reference genomes are to the strains in the samples being genotyped, and (2) the post-alignment filter options. Pitfalls of genotyping microbial communities with rapidly growing genome collections can be found here.
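
To make the interaction between these two factors concrete, here is a minimal Python sketch of a generic post-alignment filter. It is not MIDAS2 code: the `Alignment` record, the function names, and the threshold values are all hypothetical and chosen only for illustration. The point it shows is that a read from a strain that has diverged from the reference genome carries more mismatches, so a strict identity cutoff discards it even when it maps uniquely.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Alignment:
    """Hypothetical, simplified alignment record (not a MIDAS2 data structure)."""
    read_id: str
    mapq: int         # mapping quality reported by the aligner
    read_len: int     # full read length in bases
    aligned_len: int  # number of read bases covered by the alignment
    mismatches: int   # mismatched bases within the aligned region


def passes_filters(
    aln: Alignment,
    min_mapq: int = 20,                  # illustrative threshold, an assumption
    min_identity: float = 0.94,          # illustrative threshold, an assumption
    min_aligned_fraction: float = 0.75,  # illustrative threshold, an assumption
) -> bool:
    """Return True if the alignment meets all three thresholds."""
    if aln.read_len == 0 or aln.aligned_len == 0:
        return False
    identity = 1.0 - aln.mismatches / aln.aligned_len
    aligned_fraction = aln.aligned_len / aln.read_len
    return (
        aln.mapq >= min_mapq
        and identity >= min_identity
        and aligned_fraction >= min_aligned_fraction
    )


def filter_alignments(alignments: Iterable[Alignment], **thresholds) -> List[Alignment]:
    """Keep only the alignments that pass every filter."""
    return [a for a in alignments if passes_filters(a, **thresholds)]


# Made-up example reads, one per failure mode.
EXAMPLE = [
    Alignment("r1", mapq=42, read_len=100, aligned_len=100, mismatches=1),   # close to the reference
    Alignment("r2", mapq=42, read_len=100, aligned_len=100, mismatches=10),  # divergent strain
    Alignment("r3", mapq=3, read_len=100, aligned_len=100, mismatches=0),    # ambiguous mapping
    Alignment("r4", mapq=42, read_len=100, aligned_len=50, mismatches=0),    # partial alignment
]

strict = [a.read_id for a in filter_alignments(EXAMPLE)]
relaxed = [a.read_id for a in filter_alignments(EXAMPLE, min_identity=0.85)]
print(strict, relaxed)
```

In this toy input, relaxing the identity cutoff recovers the read from the divergent strain (`r2`) but leaves the ambiguously mapped and partially aligned reads filtered out. Real filter settings should be taken from the user manual, not from this sketch.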