dRep is a Python program for rapidly comparing large numbers of genomes. dRep can also "de-replicate" a genome set by identifying groups of highly similar genomes and choosing the best representative genome for each group.
Manual, installation instructions, and API are available at ReadTheDocs
Publication is available at ISMEJ
Open source pre-print publication is available at bioRxiv
$ pip install drep
$ dRep compare output_directory -g path/to/genomes/*.fasta
$ dRep dereplicate output_directory -g path/to/genomes/*.fasta
$ dRep check_dependencies
- Mash is used to rapidly compare all genomes in a pair-wise manner
- MUMmer is used to perform more accurate comparisons between genomes that Mash shows to be similar
- CheckM is used to determine the contamination and completeness of genomes (used during de-replication)
- gANI (aka ANIcalculator) is an optional alternative to MUMmer
- Prodigal is a dependency of both CheckM and gANI
- Centrifuge can be used to perform rough taxonomic assignment of bins
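The two-stage idea described above (a fast all-vs-all comparison, followed by a more accurate comparison only among genomes that already look similar) can be sketched in Python. This is a minimal illustration under stated assumptions, not dRep's implementation: the function names, the toy similarity values, and the 0.90 / 0.99 thresholds are all hypothetical, and simple single-linkage grouping is swapped in for brevity, which is not necessarily how dRep clusters.

```python
from itertools import combinations


def single_linkage(items, similar):
    """Group items so that any pair with similar(a, b) == True shares a cluster."""
    parent = {x: x for x in items}

    def find(x):
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(items, 2):
        if similar(a, b):
            parent[find(a)] = find(b)

    clusters = {}
    for x in items:
        clusters.setdefault(find(x), []).append(x)
    return sorted(sorted(c) for c in clusters.values())


def two_stage_clusters(genomes, fast_ani, accurate_ani, primary=0.90, secondary=0.99):
    """Cluster with a cheap comparison first, then refine each cluster
    with an expensive comparison (thresholds are illustrative only)."""
    result = []
    for group in single_linkage(genomes, lambda a, b: fast_ani(a, b) >= primary):
        # The expensive comparison is only ever called within a primary cluster.
        result.extend(
            single_linkage(group, lambda a, b: accurate_ani(a, b) >= secondary)
        )
    return result


# Toy similarity values (made up) standing in for Mash and MUMmer results.
toy_fast = {frozenset("AB"): 0.97, frozenset("AC"): 0.95, frozenset("BC"): 0.96}
toy_accurate = {frozenset("AB"): 0.995, frozenset("AC"): 0.96, frozenset("BC"): 0.965}


def fast_ani(a, b):
    # Pairs not listed are treated as dissimilar.
    return toy_fast.get(frozenset((a, b)), 0.70)


def accurate_ani(a, b):
    # Raises KeyError if called for a pair outside a primary cluster.
    return toy_accurate[frozenset((a, b))]


clusters = two_stage_clusters(["A", "B", "C", "D"], fast_ani, accurate_ani)
print(clusters)
```

In this toy run, genome D never reaches the expensive comparison because the fast stage already separates it from the others, which is the point of running the cheap comparison first.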
- Mash - Makes primary clusters (v1.1.1 confirmed works)
- MUMmer - Performs default ANIm comparison method (v3.23 confirmed works)
- fastANI - A fast secondary clustering algorithm
- CheckM - Determines contamination and completeness of genomes (v1.0.7 confirmed works)
- gANI (aka ANIcalculator) - Performs gANI comparison method (v1.0 confirmed works)
- Prodigal - Used by both CheckM and gANI (v2.6.3 confirmed works)
- NSimScan - Only needed for goANI algorithm (open source version of gANI)
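As a rough complement to the list above, the following Python sketch reports which of these tools can be found on your `PATH`. The executable names in `ASSUMED_EXECUTABLES` are assumptions about how each tool is typically installed and may differ on your system; the helper `locate_tools` is hypothetical and is not part of dRep.

```python
import shutil

# Assumed executable names for each dependency; adjust to match your installation.
ASSUMED_EXECUTABLES = {
    "Mash": "mash",
    "MUMmer": "nucmer",
    "fastANI": "fastANI",
    "CheckM": "checkm",
    "gANI": "ANIcalculator",
    "Prodigal": "prodigal",
    "NSimScan": "nsimscan",
}


def locate_tools(executables):
    """Map each tool name to the full path of its executable, or None if not on PATH."""
    return {tool: shutil.which(exe) for tool, exe in executables.items()}


if __name__ == "__main__":
    for tool, path in locate_tools(ASSUMED_EXECUTABLES).items():
        print(f"{tool:10s} {path or 'not found on PATH'}")
```

This only checks that an executable exists; it does not verify versions, so the "confirmed works" versions listed above still need to be checked by hand.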