A collection of papers and tools that are helpful for bioinformatic and biostatistical analysis.
Category | Name | Description | Link |
---|---|---|---|
Training collection | SIB | A curated list of bioinformatics training material | 861 |
Tutorial | Python Tutorial | Python Tutorial | 862 |
Category | Name | Description | Link |
---|---|---|---|
Dockerfile | Singularity in Docker | The resulting Docker image can be used on any system with Docker to build Singularity images | 710 |
Tutorial | Singularity | Containerization | 711 |
Hub | SingularityHub | Encapsulation of Environments with Containers | 712 |
Category | Name | Description | Link |
---|---|---|---|
Graph Platform | neo4j | is a graph database management system | 476, 477 |
LBD | SemNet | provides an adoptable method for efficient Literature-Based-Discovery (LBD) of PubMed that extends beyond omics-only relationships to true multi-scalar connections that can provide actionable insight for predictive medicine, research prioritization, and clinical care | 478 |
Graph Database | Het | Heterogeneous Network Edge Prediction: A Data Integration Approach to Prioritize Disease-Associated Genes | 479 |
Graph Database | BioGraph | an online service and a graph DB for querying and analyzing bioinformatics resources | 481, 482 |
Graph Database | edge2vec | Learning Node Representation Using Edge Semantics | 483, 484 |
Graph Database | NGLY1 Deficiency Knowledge Graph | NGLY1 Deficiency Knowledge Graph, the reasoning context to support hypothesis discovery for NGLY1 Deficiency-CDDG | 485, 486, 487 |
Graph Database | StarPepDB | is a Neo4j graph database resulting from an integration process by which data from a large variety of bioactive peptide databases are cleaned, standardized, and merged so that it can be released into an organized collection | 488, 489 |
Knowledgebase | NeXtProt | is an integrative resource providing both data on human protein and the tools to explore these | 557, 558 |
Graph Database | Cayley | is an open-source database for Linked Data. It is inspired by the graph database behind Google's Knowledge Graph (formerly Freebase) | 559, 560 |
Tutorial | Neo4j | Importing CSV Files in Neo4j | 791 |
Tutorial | Neo4j | Getting Started with Graph Embeddings in Neo4j | 792 |
Category | Name | Description | Link |
---|---|---|---|
WGS | GTDB-Tk | GTDB-Tk is a software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes. It is computationally efficient and designed to work with recent advances that allow hundreds or thousands of metagenome-assembled genomes (MAGs) to be obtained directly from environmental samples. It can also be applied to isolate and single-cell genomes. | 10, 11 112 |
16S | pbHITdb | PharmaBiome manually curated HITdb | 12 |
HumanMicrobiome R Data | curatedMetagenomicData | Dataset that can be loaded into R which contains human microbiome data from several body sites | 35 229 230 |
HMP 16S | HMP16SData | R/Bioconductor package to simplify access to and analysis of HMP 16S data | 63 64 |
tRNA | GtRNAdb | The genomic tRNA database contains tRNA gene predictions made by tRNAscan-SE on complete or nearly complete genomes. Unless otherwise noted, all annotation is automated, and has not been inspected for agreement with published literature. | 75 |
16S Database | EzBioCloud 16S | Unlike other public databases, EzBioCloud’s 16S database can be used for species-level identification of OTUs and is freely available for academic, not-for-profit purposes | 90 91 |
WGS core gene database | UBCG | UBCG stands for the Up-to-date Bacterial Core Gene. It is a method and software tool for inferring phylogenetic relationship using bacterial core gene set that is defined by up-to-date bacterial genome database. | 94 95 |
Mouse gut gene catalog | iMGMC | integrated Mouse Gut Metagenomic Catalog | 98 99 |
WGS | Paper | 737 WGS from high-throughput culturomics | 109 110 |
Database | MicrobiomeDB | A data-mining platform for interrogating microbiome experiments | 113 |
Database | MGnify | Public Datasets of Metagenomic samples and 16S data of various clinical studies (UHGG,UHGP) | 114, 115, 495, 496, 497, 603 |
Database | dbBact | Microorganisms Knowledge Database | 117 |
Database | BIGSI | BIGSI can search a collection of raw (fastq/bam), contigs or assembly for genes, variant alleles and arbitrary sequence. It can scale to millions of bacterial genomes requiring ~3MB of disk per sample while maintaining millisecond kmer queries in the collection | 124 125 126 |
Database | GutCyc | GutCyc is a publicly-available and licenced resource and portal providing pathway annotation data for environmental metagenomic samples derived from the metagenomic studies of the human gut. | 134 135 136 |
Database | miBC | This collection includes all cultivable bacterial strains isolated from the intestine of mice (Mus musculus) that are publicly available to date. | 141 142 |
Ortholog Database | OrthoMCL DB | is a genome-scale algorithm for grouping orthologous protein sequences. It provides not only groups shared by two or more species/genomes, but also groups representing species-specific gene expansion families. | 148 |
Database | KiMoSys | Data repository for KInetic MOdels of biological SYstems | 150 |
Database | BiGG Database | BiGG Models is a knowledgebase of genome-scale metabolic network reconstructions. BiGG Models integrates more than 70 published genome-scale metabolic networks into a single database with a set of standardized identifiers called BiGG IDs. Genes in the BiGG models are mapped to NCBI genome annotations, and metabolites are linked to many external databases (KEGG, PubChem, and many more). | 151 |
Database | embl_gems | This is a collection of genome-scale models built for all reference and representative bacterial genomes of NCBI RefSeq (release 84) using CarveMe | 160 |
Database | BioCyc | BioCyc is a collection of 14560 Pathway/Genome Databases (PGDBs), plus software tools for exploring them | 168 |
Database | MetaCyc | MetaCyc is a curated database of experimentally elucidated metabolic pathways from all domains of life. MetaCyc contains 2666 pathways from 2960 different organisms | 169 |
Database | KEGG | A set of annotation maps for KEGG assembled using data from KEGG | 171 172 |
Database | VMH | The VMH database captures information on human and gut microbial metabolism and links this information to hundreds of diseases and nutritional data | 176 177 |
Database | ggkbase | an online database that offers users several options for retrieving data of interest: by projects, names, description, by genome completion or class | 189 |
Database Cohorts | IGGdb | integrated genomes from the gut microbiome and other environments | 192 193 |
Database | Genome Properties | Genome properties is an annotation system whereby functional attributes can be assigned to a genome, based on the presence of a defined set of protein signatures within that genome | 215 216 217 218 |
Database | YANA | a software tool for analyzing flux modes, gene-expression and enzyme activities | 219 |
Database | Clinical Trials | is a database of privately and publicly funded clinical studies conducted around the world | 222 |
Database | microcontax | R package of microclass: The consensus taxonomy for prokaryotes is a package of data sets designed to be the best possible for training taxonomic classifiers based on 16S rRNA sequence data | 231 232 |
Database | ExperimentHub | ExperimentHub provides a central location where curated data from experiments, publications or training courses can be accessed. | 252 253 |
Database | curated MetagenomicData | Bioconductor package with thousands of curated metagenome datasets based on the ExperimentHub publication | 257, 258 |
Database | Knomics-Biota | Online service for exploratory analysis of human gut metagenomes | 265 266 |
Database | Terra | Terra is a cloud-native platform for biomedical researchers to access data, run analysis tools, and collaborate. | 267 |
Knowledgebase | Grakn | Grakn is an intelligent database: a knowledge graph engine to organise complex networks of data and make it queryable | 268 269 270 271 |
Database | HiMapDB | HiMAP database contains more unique species and strains than any major database | 272 |
Database | HGTree | an explicit evolutionary approach that is generally considered to be a reliable way to detect HGT | 276 277 |
Database | Pfam | a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs) | 281 |
Database | ISFinder | provides a list of insertion sequences (IS) isolated from bacteria and archaea (MGEs) | 313 |
Database | ICEberg2.0 | an updated database of bacterial integrative and conjugative elements | 318 319 |
Database | microscope | Microbial Genome Annotation & Analysis Platform | 329 |
Database | CARD | Comprehensive Antibiotic Resistance Database that is used to identify resistance genes (used in seres patent) | 335 |
Database | Raes Reference Genomes | Reference genomes from HMP project but filtered and assembled by Raes lab as new resource | 358 |
Database | proGenomes | Curated database by Sunagawa with genomes and very good functional annotation of bacteria and archaea | 371, 372 |
Database | PATRIC | the Pathosystems Resource Integration Center, provides integrated data and analysis tools to support biomedical research on bacterial infectious diseases | 416 |
Database | FARMEDB | is a database of DNA and protein sequences derived exclusively from environment sequences showing AR in laboratory experiments. The Functional Antibiotic Resistant Metagenomic Element (FARME) database is a compilation of publicly available DNA sequences, predicted protein sequences conferring antibiotic resistance and additional regulatory and mobile genetic elements and predicted proteins flanking the antibiotic resistant genes | 442, 443 |
Database | VMH | The VMH database captures information on human and gut microbial metabolism and links this information to hundreds of diseases and nutritional data | 474 |
Database | MetaNetX | Automated Model Construction and Genome Annotation for Large-Scale Metabolic Networks | 475 |
Database | Microbiome Database (old Integrated Gene Catalogue) | Microbiome database involves the sequencing resource and metadata of ecological community samples of microorganisms, including both host-associated or environmental microbes. This database provides detailed and accurate metadata of these metagenomics samples, as well as gene catalogs for host-associated microbiome, and moreover, well-characterized isolated strains can be found in our database too | 490, 491 |
Database | Human Gut metabolic Models | Human curated database by Raes lab to link pathway identifiers to metabolic functions which can be used for metagenomic samples to get metabolic functions | 510 |
Database | CAZy | The Carbohydrate-Active enZYmes (CAZy) database describes the families of structurally-related catalytic and carbohydrate-binding modules (or functional domains) of enzymes that degrade, modify, or create glycosidic bonds. | 18, 19 |
Database | ImmeDB | Intestinal microbiome mobile element is a database dedicated to the collection, classification, and annotation of mobile genetic elements (MGEs) from gut microbiome | 595, 596 |
LIMS | openBIS | open source Laboratory Notebook & Inventory manager | 707 |
Database | probeBase | probeBase is a curated database of rRNA-targeted oligonucleotide probes and primers | 724 |
Database | bugsigdb | A Comprehensive Database of Published Microbial Signatures | 766, 767 |
Webapp | GMGC | Global Microbial Gene Catalog | 772, 773 |
Webapp | MAP | The Microbe Atlas Project aims to shed new light on the ecology of these elusive microbes by leveraging the large amounts of sequenced microbial communities | 821 |
Database | proGenomes2 | an improved database for accurate and consistent habitat, taxonomic and functional annotations of prokaryotic genomes | 851 |
Category | Name | Description | Link |
---|---|---|---|
Bioinformatic Tools | OmicTools | Collection of many tools that can be useful for some bioinformatic analyses | 4 |
16S pipeline | Gloor Lab dada2 pipeline | This pipeline will take your paired fastq reads (from Illumina MiSeq or HiSeq) and generate an OTU counts table with an approximate taxonomy assignment. The reads have to have been generated using Gloor Lab Illumina SOP so that the reads are paired, overlapping, and contain the barcode and primer information (have not been demultiplexed or had primers or barcodes removed). | 8 |
Metagenomics | SingleM | SingleM is a tool to find the abundances of discrete operational taxonomic units (OTUs) directly from shotgun metagenome data, without heavy reliance on reference sequence databases. It is able to differentiate closely related species even if those species are from lineages new to science. | 13 |
Gene annotation | Pulpy | An automated, reproducible and scalable prediction of Polysaccharide Utilisation Loci (PUL) in 5414 public Bacteroidetes genomes. The predictions are fully open and can be accessed and used by any researcher, commercial or otherwise. | 17, 18, 19; preprint 20 |
16S pipeline | mare | The mare R package is an easy-to-use pipeline for microbiota analysis based on 16S-amplicon reads. It takes the raw reads, creates taxonomic tables, visualises the results, and finally identifies organisms significantly associated with variables of interest. For read processing, OTU clustering, and taxonomic annotation | 32 |
WGS assembly pipeline | pgap | The official bacterial whole genome assembly pipeline of NCBI | 33, 674 |
r-package | picante | Phylocom integration, community analyses, null-models, traits and evolution in R | 39 |
tree-modeling | iq-tree | Fast and effective stochastic algorithm to reconstruct phylogenetic trees by maximum likelihood. IQ-TREE compares favorably to RAxML and PhyML in terms of likelihood while requiring similar amount of computing time | 45 |
modeling | PartitionFinder2 | PartitionFinder2 is a program for selecting best-fit partitioning schemes and models of evolution for nucleotide, amino acid, and morphology alignments. | 47 |
Function Prediction | PICRUST | Predicts functions of total genomes based on 16S sequences | 49 |
Function Prediction | Tax4Fun | Predicts functions of total genomes based on 16S sequences | 50 |
ML-classifier | MicroPheno | is a reference- and alignment-free approach for predicting the environment or host phenotype from microbial community samples based on k-mer distributions in shallow sub-samples of 16S rRNA data. | 54, 55 |
OTU-generator | DiTaxa | alignment- and reference- free subsequence based 16S rRNA data analysis, as a new paradigm for microbiome phenotype and biomarker detection | 56 |
OTU-generator | HmmUFOtu | An HMM and phylogenetic placement based ultra-fast taxonomic assignment and OTU picking tool for microbiome amplicon sequencing studies | 58 |
OTU-generator | otu2ot | Oligotyping for R | 59 |
Microbiomics SOP | Microbiome_helper | Microbiome Helper is a repository that contains several resources to help researchers working with microbial sequencing data | 62 |
16S Pipeline | SeekDeep | is a single command-line program bundling several programs that together make up the SeekDeep targeted sequencing analysis pipeline | 67, 68 |
R Package - ShinyApp | FastqCleaner | An interactive web application for quality control, filtering and trimming of FASTQ files. | 81, 82 |
Preprocessing tool | fastp | A tool designed to provide fast all-in-one preprocessing for FastQ files mainly used to correct R1 and R2 reads for better merging | 83, 84 |
Python tool | ncbi-genome-download | Scripts to download bacterial and fungal genomes from NCBI, written after NCBI restructured its FTP site. | 85 |
Pipeline | phyloFlash | phyloFlash is a pipeline to rapidly reconstruct the SSU rRNAs and explore phylogenetic composition of an Illumina (meta)genomic or transcriptomic dataset. | 86 |
Tool | DUDE-Seq | DUDE-Seq: Fast, flexible, and robust denoising of nucleotide sequences | 92, 93 |
Python tool | RAMBL | A tool for the assembly of full-length 16S genes in metagenomic shotgun data | 100, 101 |
Classification tool | CAMITAX | Taxonomic assignment workflow based on multiapproach | 105, 106 |
Docker container | speciesprimer | The SpeciesPrimer pipeline is intended to help researchers finding specific primer pairs for the detection and quantification of bacterial species in complex ecosystems | 111 |
tool | EnaBrowserTools | enaBrowserTools is a set of scripts that interface with the ENA web services to download data from ENA easily, without any knowledge of scripting required | 116 |
Toolkit | NCBI Toolkit | NCBI C++ Toolkit provides free, portable, public domain libraries with no restrictions on use, for Unix, MS Windows, and Mac OS platforms | 119 |
tool | FastANI | Fast alignment-free computation of whole-genome Average Nucleotide Identity (ANI) | 120, 121 |
toolbox | EzBio tools | OrthoANI, UBCG and other useful tools for WGS analyses | 122 |
data wrangling | Bioinformatics one-liners | Bash one-liners useful for bioinformatics | 133 |
web-workbench | imngs | Integrated Microbial NGS platform | 143, 144 |
Pipeline | Roary | Roary is a high speed stand alone pan genome pipeline, which takes annotated assemblies in GFF3 format (produced by Prokka (Seemann, 2014)) and calculates the pan genome. | 147 |
Tool | OrthoFinder | It finds orthogroups and orthologs, infers rooted gene trees for all orthogroups and identifies all of the gene duplication events in those gene trees. | 149 |
Tool | CarveMe | CarveMe is a python-based tool for genome-scale metabolic model reconstruction. | 152, 153, 154 |
Tool | SMETANA | Species METabolic interaction ANAlysis is a python-based command line tool to analyse microbial communities | 155, 156 |
Tool | FRAMED | a python package for analysis and simulation of metabolic models. The main focus is to provide support for different modeling approaches | 157 158 162 |
Tool | cobrapy | COBRA methods are widely used for genome-scale modeling of metabolic networks in both prokaryotes and eukaryotes. cobrapy is a constraint-based modeling package that is designed to accommodate the biological complexity of the next generation of COBRA models and provides access to commonly used COBRA methods, such as flux balance analysis, flux variability analysis, and gene deletion analyses | 159 |
Tool | GPRTransform | It contains an implementation of the method that transforms an SBML model by integrating the GPR associations directly into the stoichiometric matrix. This enables gene-based analysis using several constraint-based methods | 163 164 |
Tool | eggnog-mapper | a tool for fast functional annotation of novel sequences (genes or proteins) using precomputed eggNOG-based orthology assignments | 165 166 |
Pipeline | miQTL-cookbook | This is the cookbook for performing the GWAS analysis of microbial abundance based on analysis of 16S rRNA sequencing dataset | 167 |
Tool | DuctApe | The final purpose of the program is to combine the genomic information (encoded as KEGG pathways) with the results of phenomic experiments (Phenotype Microarrays) and highlight the genes that may be responsible for phenotypic variations | 170 |
Tool | VFFVA | FVA is the workhorse of metabolic modeling. It characterizes the boundaries of the solution space of a metabolic model and delineates the bounds for reaction rates | 174 175 |
Pipeline | BACTpipe | Automatic Assembly and Annotation from raw reads in a very clean implemented nextflow pipeline | 178 |
Pipeline | MAG core | Automatic assembly and annotation from raw reads of metagenomic data implemented in nextflow pipeline | 179 |
Pipeline | Tychus Nextflow | Automatic whole genome assembly and annotation of isolate strain. Uses multiple assemblers and takes consensus | 180 |
Pipeline | IMP | Reference-independent metagenomic and metatranscriptomic bacterial assembly | 182, 183 |
Tool | DESMAN | de novo extraction of strains from metagenomes, enables strain inference from frequency counts on contigs across multiple samples | 184 185 |
SOP | MicroBiome Quality Control (MBQC) | MBQC is a collaborative effort to comprehensively evaluate methods for measuring the human microbiome | 187 |
Pipeline | MIDAS | an integrated pipeline that leverages >30,000 reference genomes to estimate bacterial species abundance and strain-level genomic variation, including gene content and SNPs, from shotgun metagenomes | 196 197 |
Tool | MAGpurify | algorithms to identify contamination in metagenome-assembled genomes (MAGs) | 198 |
Tool | MicrobeCensus | a fast and easy to use pipeline for estimating the average genome size (AGS) of a microbial community from metagenomic data | 199 |
Tool | IGGsearch | it accurately quantifies species presence-absence and species abundance by mapping reads to a database of species-specific marker genes | 200 |
Tool | MIDAS-strains | Estimate strains from reads mapped to pan-genomes from the MIDAS database | 201 |
Tool | AssemblyEvaluator | Evaluate the completeness and precision of a (meta)genomic assembly by mapping contigs to a complete reference genome | 202 |
Tools | Biobakery Workflows | Set of tools by Huttenhower that can be fairly easily executed with pre-defined workflows, useful for metagenomics and metatranscriptomics | 204 |
Tools | Anvi'o | Anvi’o is an open-source, community-driven analysis and visualization platform for ‘omics data | 208 209 210 211 |
Tool | WAFFLE | the Workflow to Annotate Assemblies and Find Lateral Gene Transfer (LGT) Events | 212 |
Tool | AUTOGRAPH | AUtomatic Transfer by Orthology of Gene Reaction Associations for Pathway Heuristics, is a semi-automatic approach to accelerate the process of genome-scale metabolic network reconstruction by taking full advantage of already manually curated networks | 214 |
Tool | pyTARG | a library that contains functions to work with Genome Scale Metabolic Models with the goal of finding drug targets against cancer | 223 224 |
Assembler | Unicycler | An assembler for short and long read hybrid assembly, works with SPADES and then something else for long reads. | 227 |
R package | microclass | an R-package for 16S taxonomy classification | 231 232 |
Tool | Prodigal | Fast, reliable protein-coding gene prediction for prokaryotic genomes | 233 234 |
Tool | STAMP | a graphical software package that provides statistical hypothesis tests and exploratory plots for analysing taxonomic and functional profiles | 235 236 |
Tool | CheckM | an automated method for assessing the quality of a genome using a broader set of marker genes specific to the position of a genome within a reference genome tree and information about the collocation of these genes | 237 238 |
R script | consenTRAIT | Phylogenetic conservatism of functional traits in microorganisms. a phylogenetic metric that estimates the clade depth where organisms share a trait | 239 240 |
NIH Tools | NIH Genome Informatics Section | Tools for various bioinformatic tasks, assembly, Mash, metagenomes, Krona, MUMmer alignment | 242 |
R package | mmgenome | Tools for extracting individual genomes from metagenomes | 243 244 |
Tool | SPAdes | St. Petersburg genome assembler – is an assembly toolkit containing various assembly pipelines | 254,365 |
Tool | SqueezeMeta | a fully automated metagenomics pipeline, from reads to bins | 261 262 |
Tool | MetaWRAP | a flexible pipeline for genome-resolved metagenomic data analysis | 263 264 |
R Package | HiMap | High-resolution Microbial Analysis Pipeline to Strain level with dada2 and curated HiMapDB | 273 274 |
Research Group | van nimwegenlab | a range of software tools, web-services, and databases in regulatory and comparative genomics for WGS | 275 |
Tool | Rnammer | predicts 5s/8s, 16s/18s, and 23s/28s ribosomal RNA in full genome sequences | 278 |
Tool | RANGER-DTL | Rapid ANalysis of Gene family Evolution using ReconciliationDTL is a software package for inferring gene family evolution by speciation, gene duplication, horizontal gene transfer, and gene loss | 279 |
Tool | Darkhorse | a bioinformatic method for rapid, automated identification and ranking of phylogenetically atypical proteins on a genome-wide basis | 280 |
Tool | ABRicate | Mass screening of contigs for antimicrobial resistance or virulence genes. It comes bundled with multiple databases: Resfinder, CARD, ARG-ANNOT, NCBI BARRGD, NCBI, EcOH, PlasmidFinder, Ecoli_VF and VFDB | 286 334 |
Tool | MetaCompare | MetaCompare is a computational pipeline for prioritizing resistome risk by estimating the potential for ARGs to be disseminated into human pathogens from a given environmental sample based on metagenomic sequencing data | 287 |
Tool | DeepARG | DeepARG is a machine learning solution that uses deep learning to characterize and annotate antibiotic resistance genes in metagenomes | 288 |
Tool | SSTAR | Sequence Search Tool for Antimicrobial Resistance combines a locally executed BLASTN search against a customizable database with an intuitive graphical user interface for identifying antimicrobial resistance (AR) genes from genomic data | 289 290 |
Tool | ProtCNN ProtENN | Predicting the function of a protein from its raw amino acid sequence is the critical step for understanding the relationship between genotype and phenotype | 295 |
Benchmarking | Long-read-assembler-comparison | Benchmarking of long-read assembly tools for bacterial whole genomes | 298 |
conda | bioconvert | is a collaborative project to facilitate the interconversion of life science data from one format to another | 299 |
Tool | bin3C | Extract metagenome-assembled genomes (MAGs) from metagenomic data using Hi-C | 303 304 |
Tool | MAGpy | Snakemake pipeline for downstream analysis of metagenome-assembled genomes (MAGs) (pronounced mag-pie) | 305 306 |
Tool | graftM | a tool for scalable, phylogenetically informed classification of genes within metagenomes | 307 308 |
Tool | GFinisher | a tool for refinement and finalization of prokaryotic genomes assemblies using the bias of GC Skew to identify assembly errors and organizes the contigs/scaffolds with genomes references | 311 312 |
Tool | Autometa | automated extraction of microbial genomes from individual shotgun metagenomes | 314 315 |
Tool | iMGEins | detecting novel mobile genetic elements inserted in individual genomes (MGEs) | 316 317 |
Tool | McClintock | an Integrated Pipeline for Detecting Transposable Element Insertions in Whole-Genome Shotgun Sequencing Data (MGEs) | 320 321 |
Webtool | PHASTER | a better, faster version of the PHAST phage search tool | 322 323 |
Tool | ISQuest | identifies bacterial ISs and their sequence elements—inverted and direct repeats—in raw read data or contigs using flexible search parameters (MGEs) | 324 325 |
Tool | VirSorter | mining viral signal from microbial genomic data | 326 327 |
Tool | RAST | (Rapid Annotation using Subsystem Technology) is a fully-automated service for annotating complete or nearly complete bacterial and archaeal genomes | 329 330 |
Tool | ShortBRED | Tool by Huttenhower group that identifies protein families in metagenomic samples. Useful for protein profiling?? | 336 |
Tool & R package | GSEA | Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes) | 337 338 |
Tool, Database | GMMs Omixer | Tool with curated database by raes lab that links metagenomic samples to functions and metabolic capabilities | 342, 343, 344, 523 |
Tool | GRASP2 | fast and memory-efficient gene-centric assembly and homolog search for metagenomic sequencing data | 345, 346 |
Tool | Picrust2 | a software for predicting functional abundances based only on marker gene sequences | 347, 348 |
Pipeline | Antimicrobial Resistance Finder | Nextflow pipeline to identify antimicrobial resistance protein sequences, looks simple to use | 350 |
Tool | Geptop2 | a gene essentiality prediction tool for complete-genome based on orthology and phylogeny | 351, 352 |
Tool | Asgan | [As]sembly [G]raphs [An]alyzer – is a tool for analysis of assembly graphs | 353 |
Tool | PopCOGenT | Identifying microbial populations using networks of horizontal gene transfer | 355 |
Tool | PhiSpy | a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies | 356, 357 |
Tool | MetaCurator | Software for curating reference sequence databases used in barcoding, metabarcoding and metagenomics | 359, 360 |
Tutorial | astrobiomike | This site aims to be a useful resource for bioinformatics beginners | 361,362 |
Tool | (sour)Mash | fast genome and metagenome distance estimation using MinHash | 363,364 |
Tool | (meta)plasmidSPAdes | for plasmid assembly in metagenomic data sets that reduced the false positive rate of plasmid detection compared with the state-of-the-art approaches | 364,365 |
Tool | IslandViewer4 | integrates four different genomic island prediction methods: IslandPick, IslandPath-DIMOB, SIGI-HMM, and Islander | 366,367 |
Tool, Server | Specl | Web server (but also stand-alone tool) to determine species classification of whole genome based on ~40 universal single copy marker genes. | 370 |
Tool | iRep | is a method for determining replication rates for bacteria from single time point metagenomics sequencing and draft-quality genomes | 374,375 |
Tool | antiSMASH | allows the rapid genome-wide identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genomes | 376,377,378 |
Tool | NeuRiPP | a neural network framework designed for classifying peptide sequences as putative precursor peptide sequences for RiPP biosynthetic gene clusters | 379,380 |
Tool | PhyloMagnet | Pipeline for screening metagenomes, looking for arbitrary lineages, using gene-centric assembly methods and phylogenetics | 381,382 |
Tool | KrakenUnique | Kraken based tool for classifying metagenomic reads with an additional algorithm that checks for unique Kmer matches - maybe similar to cosmosID approach | 383 |
Tool | Mash | Tool for classifying metagenomic reads similar to kraken which uses min Hash to identify species | 384 |
Tool | RefSeq_mash | Tool for checking what NCBI reference genomes raw reads match to or overall which reference genome fits the best, should be very fast. | 385 |
Pipeline | Hybrid Assembler | Hybrid assembly pipeline in Nextflow that is coupled with plasmIDent, which identifies plasmids and resistance genes | 390, 391 |
Tool | RMI | Comprehensive antimicrobial resistance (AMR) gene finder tool online for quick analysis of genome sequences | 392 |
Pipeline | SqueezeMeta | A full automatic pipeline for metagenomics/metatranscriptomics, covering all steps of the analysis | 394, 395 |
Review | Identifying repeats and transposable elements | Nice Nature review that describes various software for finding these things, but a bit outdated | 395 |
Tool | ARDaP | (Antimicrobial Resistance Detection and Prediction) is a genomics pipeline for the comprehensive identification of antibiotic resistance markers from whole-genome sequencing data | 399 |
Tool | Flye | New long read assembler that's faster and often better than others, published by UCSD | 400 |
Tool | Ra | Overlap-layout-consensus based DNA assembler of long uncorrected reads (short for Rapid Assembler) | 403, 404 |
Tool | Metagenomics-Index-Correction | This repository contains scripts used to prepare, compare and analyse metagenomic classifications using custom index databases, either based on default NCBI or GTDB taxonomic systems | 405, 406 |
Tool | drep | a python program for rapidly comparing large numbers of genomes. dRep can also "de-replicate" a genome set by identifying groups of highly similar genomes and choosing the best representative genome for each genome set | 407 |
Tool | strainProfiler | Program to analyze strain-level diversity within a population | 408 |
Tool | seqtk | Seqtk is a fast and lightweight tool for processing sequences in the FASTA or FASTQ format | 409, 410 |
Tool | anvio to bandage tools | converts output from Anvi'o, a MAG binning tool, to the coloring scheme preferred by Bandage, an assembly visualization tool, to improve binning especially for mobile genes (transposons, recently horizontally transferred, etc.) | 413 |
Tool | OPERA-MS | OPERA-MS is a hybrid metagenomic assembler which combines the advantages of short and long-read | 414, 415 |
Tool | traitar | Traitar is a software for characterizing microbial samples from nucleotide or protein sequences. It can accurately phenotype 67 diverse traits. | 418, 419, 420 |
Tool | PhyloRank | PhyloRank provides functionality for calculating the relative evolutionary divergence (RED) of taxa in a tree and for finding the best placement of taxonomic labels in a tree. | 421 |
Tool | AnnoTree | is a web tool for visualization of genome annotations across large phylogenetic trees. | 422, 423, 424 |
Tool | AMRfinderPlus | Antibiotic resistance gene finder from NCBI | 425, 426, 678 |
Tool | nanotext | This library enables the use of embedding vectors generated from a large corpus of protein domains to search for similar genomes, where similar is the cosine similarity between one genome's vector and another's. Think about protein domains as words, genomes as documents, and search as a form of document retrieval based on the notion of topic. | 427, 428, 453 |
Tool | biomartr | Download genomes from NCBI or other databases by specifying species or group name automatically in R | 429 |
Tool | staramr | Tool in Bioconda that scans through PlasmidFinder, ResFinder and PointFinder and then produces nice summary files with the results | 430 |
Tool | TRF | Tandem Repeat Finder and Tandem Repeats Database (TRDB) | 432, 433 |
Tool | MIST | a tool for rapid in silico generation of molecular data from bacterial genome sequences | 434, 435 |
Tool | mummer | Visualization of correct alignment between genomes | 436, 887, 888, 889 |
Tool | Dot2dot | accurate whole-genome tandem repeats discovery | 437, 438 |
Tool | miComplete | An "easy" to use tool to quickly assess the completeness and quality of new genome assemblies, kind of like CheckM but with some tweaks | 439 |
Tool, Database | ARO | Antibiotic resistance ontology database and webserver to quickly get phenotype information based on gene IDs | 440, 441 |
Webapp | LINbase | a database designed for the purpose of accelerating and simplifying the description of Earth's microbial diversity at a precision that includes, but also goes beyond, named species | 447, 448 |
R package | RbioRXN | facilitate retrieving and processing biochemical reaction data such as Rhea, MetaCyc, KEGG and Unipathway, the package provides the functions to download and parse data, instantiate generic reaction and check mass-balance. The package aims to construct an integrated metabolic network and genome-scale metabolic model | 450 |
Tool | Mumame | Mutation Mapping in Metagenomes is a software tool that allows mapping of shotgun metagenomic reads to point mutations. Designed for Antibiotic Resistance mutations | 451, 452 |
Tool | Cobra | Constraint-based reconstruction and analysis (COBRA) provides a molecular mechanistic framework for integrative analysis of experimental molecular systems biology data and quantitative prediction of physicochemically and biochemically feasible phenotypic states | 460, 461, 462, 467 |
Tool | METABOLIC | (METabolic And BiogeOchemistry anaLyses In miCrobes), a scalable high-throughput metabolic and biogeochemical functional trait profiler based on microbial genomes | 463, 464 |
Tool | PhenotypeSeeker | Identify phenotype-specific k-mers and predict phenotype using sequenced bacterial strains | 465, 466 |
R-package | MetaboAnalystR | An R Package for Comprehensive Analysis of Metabolomics Data | 468, 472, 473 |
Shiny-App | MetaboShiny | a novel R and RShiny based metabolomics data analysis package | 469, 470, 471 |
Tool | micom | micom is a Python package for metabolic modeling of microbial communities | 492, 493, 494 |
Tool | Struo | a pipeline for building custom databases for common metagenome profilers | 498, 499 |
Tool | µbialSim | µbialSim (pronounced microbialsim) is a dynamic Flux-Balance-Analysis-based simulator for complex microbial communities. Batch and chemostat operation can be simulated | 500, 501 |
Tool | ConFindr | to find bacterial intra-species contamination in raw Illumina data. It does this by looking for multiple alleles of core, single copy genes. | 507, 508, 722 |
Tool | MetaSanity | a wrapper-script for genome/metagenome evaluation tasks. This script will run common evaluation and annotation programs and create a BioMetaDB project with the integrated results | 509 |
Tool | REAPR | From Sanger institute, it maps paired-end reads to de-novo assembly to check for assembly errors and can break up wrong scaffolds | 511 |
Tool | Kaiju | Metagenomic read classification based on Amino acid sequences. Suggested by Gabi that it works well | 512 |
Tool | mOTU2 | The mOTUs profiler is a computational tool that estimates relative abundance of known and currently unknown microbial community members using metagenomic shotgun sequencing data. | 513, 514 |
Tool | fetchMG | extracts the 40 marker genes (MGs) from genomes and metagenomes in an easy and accurate manner. | 515 |
Tool | Metage2Metabo | is a Python3 (Python >= 3.6) tool to perform graph-based metabolic analysis starting from annotated genomes (reference genomes or metagenome-assembled genomes). It uses Pathway Tools in an automatic and parallel way to reconstruct metabolic networks for a large number of genomes | 518, 519 |
R package | AMR | simplify the analysis and prediction of Antimicrobial Resistance (AMR) | 520, 521, 878 |
Tool | GRASE | Genome Relative Abundance to Sequencing Effort (GRASE) | 522 |
Tool | FMAP | Functional Mapping and Analysis Pipeline for metagenomics and metatranscriptomics studies | 524, 525, 526 |
Tool | ResPipe | A Nextflow pipeline for interrogating metagenomes for Antimicrobial Resistance Genes (CARD-based), Insertion Sequences and Enterobacteriaceae Plasmids | 527, 528 |
Tool | epa-ng | A tool to place a sequence among an already calculated tree such as SILVA. Similar to pplacer | 535 |
Tool | ngs-less | A toolbox for metagenomics analyses by Peer Bork at EMBL. Has MOCAT integrated with mOTUs and functional profiling | 536 |
R package | Castor | Useful for calculating the relative evolutionary divergence (RED) of nodes in a tree with get_reds | 537, 538 |
R package | themetagenomics | themetagenomics provides functions to explore topics generated from 16S rRNA sequencing information on both the abundance and functional levels. It also provides an R implementation of PICRUSt and wraps Tax4fun, giving users a choice for their functional prediction strategy | 543, 544 |
Tool | prokka2kegg | This script is used to assign KO entries (K numbers in KEGG annotation) according to UniProtKB ID in the .gbk file generated by Prokka | 546 |
Toolset | PAGIT | From Wellcome Sanger Institute a set of tools to polish draft genomes and correct annotation | 547 |
Tool | DFAST | a flexible and customizable pipeline for prokaryotic genome annotation as well as data submission to the INSDC | 552, 553 |
Tool | DeepVariant | is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data | 556 |
Tool | Apollo | Apollo is an assembly polishing algorithm that attempts to correct the errors in an assembly. It can take multiple sets of reads in a single run and polish the assemblies of genomes of any size | 563, 564 |
Tool | Minipolish | A tool for Racon polishing of miniasm assemblies | 566 |
Tool | AMON | A command line tool for predicting the compounds produced by microbes and the host | 567 |
Tool | Coinfinder | A tool for the identification of coincident (associating and dissociating) genes in pangenomes | 568, 569, 570 |
Tool | wtdbg2 | Wtdbg2 is a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT) | 571, 572 |
Tool | freebayes | a haplotype-based variant detector | 573, 574, 578 |
Tool | qualimap | to facilitate the quality control of alignment sequencing data and its derivatives like feature counts; like FastQC for WGS and MAGs | 579, 580 |
Tool | picard | A set of Java command line tools for manipulating high-throughput sequencing (HTS) data and formats from broadinstitute | 581, 582 |
Tool | Diamond | is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data | 583, 584 |
Tool | vcftools | a set of tools for working with the variant call format (VCF) and binary variant call format (BCF) | 585, 586, 587 |
Tool | Gretel | An algorithm for recovering haplotypes from metagenomes | 589, 590 |
Tool | Hansel | Computational haplotype recovery and long-read validation identifies novel isoforms of industrially relevant enzymes from natural microbial communities | 591, 592 |
Tool | metabolisHMM | a tool for exploration of microbial phylogenies and metabolic pathways | 593, 594 |
Tool | ConjScan | MacSyFinder-based detection of Conjugative elements using systems modelling and similarity search | 597 |
Tool | MacSyFinder | A Program to Mine Genomes for Molecular Systems with an Application to CRISPR-Cas Systems | 598, 599, 600 |
Tool | LEMON | Software that makes use of existing shotgun NGS datasets to detect HGT breakpoints, identify the transferred genome segments, and reconstruct the inserted local strain | 601, 602 |
Tool | MMseqs2 | Many-against-Many sequence searching is a software suite to search and cluster huge protein and nucleotide sequence sets | 604, 605, 606 |
Pipeline | MicrobiomeBestPracticeReview | Current Challenges and Best Practice Protocols for Microbiome Analysis using Amplicon and Metagenomic Sequencing | 607, 608 |
Tool | Medaka | is a tool to create a consensus sequence using neural networks from nanopore sequencing data | 609, 610 |
Software | ARB | a graphically oriented package comprising various tools for sequence database handling and data analysis | 611 |
Tool | Piphillin | a software package that predicts functional metagenomic content based on the frequency of detected 16S rRNA gene sequences corresponding to genomes in regularly updated, functionally annotated genome databases | 613, 614 |
Tool | BlastFrost | a highly efficient method for querying 100,000s of genome assemblies. BlastFrost builds on the recently developed Bifrost, which generates a dynamic data structure for compacted and colored de Bruijn graphs from bacterial genomes | 617, 618 |
Tool | BioNode | Command line tool for handy NGS data procedures, searching NCBI, downloading SRA stuff or handling fasta files. | 622 |
Tool | Biopieces | Command line tool for a lot of NGS data procedures, fastq files, mapping, SNPs, etc. but has some dependencies... | 623 |
Tool | GrabSeqs | Command line tool to download sequence files from SRA, iMicrobes, MG-rast easily | 626 |
Tool | fARGene | (Fragmented Antibiotic Resistance Gene iENntifiEr) is a tool that takes either fragmented metagenomic data or longer sequences as input and predicts and delivers full-length antibiotic resistance genes as output | 627, 628 |
Tool | GTDBTk-Script | various useful scripts related to GTDB | 629 |
Tool | Cello | the code is parsed to generate a truth table, and logic synthesis produces a circuit diagram with the genetically available gate types to implement the truth table. The gates in the circuit are assigned using experimentally characterized genetic gates. | 633,634,635 |
Tool | URMAP | The Ultra-fast Read Mapper (URMAP) is a fast, accurate read mapper with highly compressed output. It is ~10x faster than BWA and Bowtie with comparable accuracy on benchmark tests | 636, 637 |
Tool | Artemis | The Artemis Software is a set of software tools for genome browsing and annotation | 640 |
Tool | EDGAR 2.0 | "Efficient Database framework for comparative Genome Analyses using BLAST score Ratios" is an enhanced software platform for comparative gene content analyses | 641, 642, 643 |
Tool | ASA3P | an automatic and scalable assembly, annotation and analysis pipeline for closely related bacterial genomes | 644, 645, 646 |
Tool | BIGSdb | a software designed to store and analyse sequence data for bacterial isolates | 647, 648, 649, 650 |
Tool | OrthoVenn2 | is a web platform for comparison and annotation of orthologous gene clusters among multiple species | 651, 652 |
Tool | genomeribbon | easy to use website to assess a genome assembly with raw reads, long reads and short reads | 653 |
R package | FindMyFriends | Fast alignment-free pangenome creation and exploration | 654, 655 |
R package | dadasnake | is a Snakemake workflow to process amplicon sequencing data, from raw fastq-files to taxonomically assigned "OTU" tables, based on the DADA2 method | 660, 661 |
Tool | AMRtime | Metagenomic AMR detection using hierarchical machine learning models | 662 |
Tool | panaroo | An updated pipeline for pangenome investigation | 663, 664 |
Pipeline | TORMES | An automated pipeline for whole bacterial genome analysis of genomes and/or raw Illumina paired-end sequencing data, regardless the number, origin or species | 665, 666 |
Pipeline | ASA3P | Automatic Bacterial Isolate Assembly, Annotation and Analyses Pipeline | 667, 668 |
Pipeline | nullarbor | Pipeline to generate complete public health microbiology reports from sequenced isolates | 669 |
Pipeline | Bactopia | Bactopia is a flexible pipeline for complete analysis of bacterial genomes | 670, 671 |
Pipeline | Common Workflow Language | an open standard for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments | 673 |
Metric | bacterialEvolutionMetrics | Consistent Metagenome-Derived Metrics Verify and Delineate Bacterial Species Boundaries | 675, 676 |
Tool | NGSpeciesID | is a tool for clustering and consensus forming of targeted ONT reads | 677, 678 |
Catalogue | long-read-tools | A catalogue of long-read sequencing data analysis tools | 681 |
Tool | fARGene | Fragmented Antibiotic Resistance Gene iENntifiEr | 682, 683 |
Pipeline | PathoFac | a pipeline for the prediction of virulence factors and antimicrobial resistance genes in metagenomic data | 684, 685 |
Tool | MFEprimer | a functional primer quality control program for checking non-specific amplicons, dimers, hairpins and other parameters | 686, 687, 688 |
Pipeline | STRONG | STRONG resolves strains on assembly graphs by resolving variants on core COGs using co-occurrence across multiple samples | 689, 690, 691,704 |
Tool | NanoClust | De novo clustering and consensus building for ONT 16S sequencing data | 694 |
Tool | mVIRs | a tool that locates integration sites of inducible prophages in bacterial genomes | 697 |
Tool | Metagenome-Atlas | an easy-to-use metagenomic pipeline based on Snakemake. It handles all steps from QC, Assembly, Binning, to Annotation | 698, 699, 700, 701 |
Tool | VIRify | a recently developed pipeline for the detection, annotation, and taxonomic classification of viral contigs in metagenomic and metatranscriptomic assemblies | 702 |
Platform | BioContainers | is a community-driven project that provides the infrastructure and basic guidelines to create, manage and distribute bioinformatics packages (e.g. conda) and containers (e.g. docker, singularity) | 705, 706 |
Tool | DeepMAsED | deep learning-based evaluation of the quality of metagenomic assemblies | 708, 709 |
Tool | minMLST | a machine-learning based methodology for identifying a minimal subset of genes that preserves high discrimination among bacterial strains | 713, 714 |
Tool | hAMRonization | CLI parser tools combine the outputs of disparate antimicrobial resistance gene detection tools into a single unified format | 715 |
Tool | PPanGGOLiN | Depicting microbial species diversity via a Partitioned PanGenome Graph Of Linked Neighbors | 717, 718 |
Webtool | OGB | OpenGenomeBrowser is a dynamic and scalable web platform for comparative genomics | 719, 720 |
Pipeline | Bakta | a tool for the rapid & standardized annotation of bacterial genomes & plasmids | 721 |
Tool | MentaLiST | The MLST pipeline developed by the PathOGiST research group | 725, 726 |
Webapp | TyphiNET | The TyphiNET dashboard collates antimicrobial resistance (AMR) and genotype (lineage) information extracted from whole genome sequence (WGS) data from the bacterial pathogen Salmonella Typhi, the agent of typhoid fever. | 727 |
Webapp | Pathogenwatch | provides species and taxonomy prediction for over 60,000 variants of bacteria, viruses, and fungi. MLST prediction is available for over 100 species using schemes from PubMLST, Pasteur, and Enterobase | 728 |
Tool | mlst | Scan contig files against traditional PubMLST typing schemes | 729 |
Tool | snippy | Rapid haploid variant calling and core genome alignment | 733 |
Tool | MUFFIN | a hybrid assembly and differential binning workflow for metagenomics, transcriptomics and pathway analysis. | 734, 735, 736 |
Tool | Pandora | a tool for bacterial genome analysis using a pangenome reference graph (PanRG) | 738, 739, 740 |
Tool | cgmlst | Fork of Torsten Seemann's excellent mlst tool, modified for cgMLST | 741 |
Tool | Phandango | a fully interactive tool to allow visualisation of a phylogenetic tree, associated metadata and genomic information such as recombination blocks, pan-genome contents or GWAS results | 741, 742 |
R package | Enriched heatmap | is a special type of heatmap which visualizes the enrichment of genomic signals on specific target regions | 747, 748 |
R package | Pagoo | is an encapsulated, object-oriented class system for analyzing bacterial pangenomes | 752, 753, 754, 834 |
R package | simurg | Simulate a Bacterial Pangenome in R | 754, 755 |
Nextflow | Porefile | a Nextflow full-length 16S profiling pipeline for ONT reads | 757 |
Tool | MLSTar | R package allows you to easily determine the Multi Locus Sequence Type (MLST) of your genomes | 758, 759 |
Tool | MOB-suite | for clustering, reconstruction and typing of plasmids from draft assemblies | 760, 761 |
Tool | PlasForest | a random forest classifier of contigs to identify contigs of plasmid origin in contig and scaffold genomes | 763, 764 |
Tool | GMGC-mapper | Command line tool to query the Global Microbial Gene Catalog (GMGC) | 774 |
Tool | MetaGraph | Ultra Scalable Framework for DNA Search, Alignment, Assembly of bacterial sequences | 775, 776, 777, 778 |
Tool | MIND | Microbial Interaction Network Database | 786 |
Pipeline | microPIPE | a pipeline for high-quality bacterial genome construction using ONT and Illumina sequencing | 787 |
Tool | giraffe | variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods | 795, 796 |
Tool | SquiggleKit | A toolkit for accessing and manipulating nanopore signal data | 798, 799, 800 |
Tool | FlowerPlot | A Python 3.9+ function that makes flower plots for pangenomics | 804 |
Tool | Poppunk | a tool for clustering genomes | 807 |
Tool | PATO | an R package designed to analyze pangenomes (sets of genomes) within or between species | 810, 811 |
Tool | PanX | is a software package for comprehensive analysis, interactive visualization and dynamic exploration of bacterial pan-genomes | 812 |
Tool | 3mcor | Metabolome-Microbiome-Metadata-Correlation Analysis | 814 |
Tool | GenAPI | a program for gene presence absence table generation for series of closely related bacterial genomes from annotated GFF files | 829, 830 |
Tool | bammix | Summarise nucleotide counts at a set of positions in a BAM file to search for mixtures | 835 |
Tool | Wolka | (Web of Life Toolkit App), is a bioinformatics package for shotgun metagenome data analysis | 836, 837 |
Tool | ECTyper | is a standalone versatile serotyping module for Escherichia coli | 838, 839 |
Tool | Serotypefinder | is a serotyping module for Escherichia coli | 840, 841 |
Tool | SRST2 | Short Read Sequence Typing for Bacterial Pathogens | 842, 843 |
Tool | KEMET | a python tool for KEGG Module evaluation and microbial genome annotation expansion (Metabolic) | 844, 845 |
Tool | SIAMCAT | Statistical Inference of Associations between Microbial Communities And host phenoTypes | 846, 847 |
Collection | EMBL | Microbiome Analysis Tools Developed at EMBL | 848 |
Tool | BacDist | Snakemake pipeline for bacterial SNP distance, recombination and phylogenetic analysis | 849 |
Tool | PacTyper | Snakemake pipeline for continuous clone type prediction for WGS sequenced bacterial isolates based on their core genome | 850 |
Pipeline | CulebrONT | a streamlined long reads multi-assembler pipeline for prokaryotic and eukaryotic genomes | 857, 858 |
Tool | gapseq | Informed prediction and analysis of bacterial metabolic pathways and genome-scale networks | 859, 860 |
Tool | MicrobiomeAnalysis | This package provides common methods for microbiome analysis | 863, also see 852 |
Tool | MiMiC | proposes minimal microbial consortia from the functional potential of a given metagenomic sample | 864, 865 |
Tool | PIRATE | identifies and classifies orthologous gene families in bacterial pangenomes over a wide range of sequence similarity thresholds | 867, 868 |
Tool | bacterial_strain_definition | Contains the code and workflow for the bacterial strain definition paper with Kostas Konstantinidis | 869, 870 |
Tool | CheckM2 | Rapid assessment of genome bin quality using machine learning | 876 |
Tool | Gubbins | Genealogies Unbiased By recomBinations In Nucleotide Sequences | 879, 880 |
Tool | SKA | a toolkit for prokaryotic DNA sequence analysis (phylogeny) using split kmers | 881, 882 |
Tool | Mashtree | a rapid comparison of whole genome sequence files | 883, 884 |
Pipeline | mGEMS | Bacterial sequencing data binning on strain-level based on probabilistic taxonomic classification | 885, 886 |
Tool | D-GENIES | Dot plot large Genomes in an Interactive, Efficient and Simple way | 893, 894, 895 |
Tool | nanotimeparse | parses an Oxford Nanopore fastq file on read sequencing start times found in the fastq headers | 897 |
Tool | ClonalFrameML | package that performs efficient inference of recombination in bacterial genomes | 899, 900 |
Tool | minidot | Quickly produce pretty dotplots from minimap mappings using R/ggplot2 | 903 |
Category | Name | Description | Link |
---|---|---|---|
Data-Types | Microbiome Datasets Are Compositional: And This Is Not Optional | Why OTU tables need to be handled more carefully - They are compositional! | 1 |
Compositional approach | CoDa | This directory contains the readings, materials, and examples for a workshop originally offered at the Exploring Human Host-Microbiome Interactions in Health and Disease 2016 conference. | 6; wiki 7 |
Compositional approach | Frontiers_supplement.Rmd | The document is the supplement and companion to the "Microbiome datasets are compositional: and this is not optional." review article. | 9 |
R package | CoDaSeq | This is the ongoing work to put together a complete suite of functions for CoDa analysis of microbiome, transcriptome and metagenome data | 16 |
Compositional approach | PhILR | PhILR is short for “Phylogenetic Isometric Log-Ratio Transform”. This R package provides functions for the analysis of compositional data (e.g., data representing proportions of different variables/parts). | 25 26 |
R package | PathoStat | The purpose of this package is to perform Statistical Microbiome Analysis on metagenomics results from sequencing data samples. In particular, it supports analyses on the PathoScope generated report files. | 28 |
R package | microbiome | Tools for microbiome analysis; with multiple example data sets from published studies; extending the phyloseq class. The package is in Bioconductor and aims to provide a comprehensive collection of tools and tutorials, with a particular focus on amplicon sequencing data. | 29 |
R package | phyloseq | phyloseq provides a set of classes and tools to facilitate the import, storage, analysis, and graphical display of microbiome census data. | 30 31 |
Stat Comparing | DA test | Package to check various statistical methods to find "spike-ins" in 16S microbiome data | 36 |
R Package | Mare | Promising easy microbiome analysis - find out what taxa correlate with certain metadata (Not validated yet) | 41 |
R Package | PCAexplorer | Package to make interactive PCA plots in browser, originally for RNA-seq but maybe adaptable | 43 |
R Package | Glimma | Interactive visualization of DEseq2 results, might be very helpful in exploration | 44 |
R Package | CoDaSeq | Compositional Data Analysis Package written by Greg Gloor | 53 |
dimensionality reduction | Adaptive gPCA | A method for structured dimensionality reduction | 61 |
R Package | theseus | Add-on for phyloseq | 62 |
R Package | decontam | implements a statistical classification procedure that identifies contaminants in MGS data based on two widely reproduced patterns: contaminants appear at higher frequencies in low-concentration samples and are often found in negative controls | 65 |
Analysis Tutorial | Workflow by Holmes Lab | A nice tutorial with a suggested workflow for microbiome analysis by the Holmes lab | 69 |
R Package | phyloseqGraphTest | Convenient and easy-to-use package for graphical testing with phyloseq objects | 70 |
R Package | ccrepe | Compositionality Corrected by PErmutation and REnormalization (ccrepe) is a package for analysis of sparse compositional data. Specifically, it determines the significance of association between features in a composition, using any similarity measure (e.g. Pearson correlation, Spearman correlation, etc.) | 77,78 |
Network Analysis | NetShift | To visualize community shufflings in microbial association networks between healthy and diseased states and identify 'driver' nodes observed between the states. | 79,80 |
R Markdown | Differential Abundance tests Microbiome | Fairly well documented implementations of many different Differential Abundance tools, useful as a source of functions to reuse. | 87 |
Statistics Approach | Percentile-normalization method | A novel & easy way to deal with batch effects when comparing multiple experiments | 88, 89 |
Software | Latent Variable Modeling for the Microbiome | Although probabilistic latent variable models are a cornerstone of modern unsupervised learning, they are rarely applied in the context of microbiome data analysis, in spite of the evolutionary, temporal, and count structure that could be directly incorporated through such models | 107 108 |
Python | HAllA | Hierarchical All-against-All association testing (HAllA) is a computational method to find multi-resolution associations in high-dimensional, heterogeneous datasets | 117 |
tutorial | Transformation vs Standardization | Data Standardization and Transformation | 127 |
R package | BioCor | Calculates functional similarities based on the pathways described on KEGG and REACTOME or in gene sets. These similarities can be calculated for pathways or gene sets, genes, or clusters and combined with other similarities | 130 131 |
R package | Phylofactor | The package phylofactor will help you break apart the phylogeny with a variety of contrasts & objective functions, summarize the splits, and visualize the tree. | 137 138 139 |
R package | themetagenomics | provides functions to explore topics generated from 16S rRNA sequencing information on both the abundance and functional levels. It also provides an R implementation of PICRUSt and wraps Tax4fun, giving users a choice for their functional prediction strategy. | 145 146 |
R package | selbal | an R package for selection of balances in microbiome compositional data. It implements a forward-selection method for the identification of two groups of taxa whose relative abundance, or balance, is associated with the response variable of interest | 173 |
R package | microPop | a dynamic model based on a functional representation of different microbiota | 225 226 |
Article | Networks for Microbiota Analysis | A nice summary of a lot of network theory and how it is used for microbiota analysis and what the open questions are | 255 |
R package | metamicrobiomeR | implements Generalized Additive Model for Location, Scale and Shape (GAMLSS) with zero inflated beta (BEZI) family for analysis of microbiome relative abundance data (with various options for data transformation/normalization to address compositional effects) and random effect meta-analysis models for meta-analysis pooling estimates across microbiome studies | 282 283 |
R package | metagenomeSeq | is designed to determine features (be it Operational Taxonomic Unit (OTU), species, etc.) that are differentially abundant between two or more groups of multiple samples | 292 |
R package | PIME | a package for discovery of novel differences among microbial communities | 300 301 |
R package | GLMM-MiRKAT | A Distance-Based Kernel Association Test Based on the Generalized Linear Mixed Model for Correlated Microbiome Studies | 309 310 |
R package | MIMOSA | Model-based Integration of Metabolite Observations and Species Abundances | 339, 340 |
Tool | new mmvec (old:rhapsody) | Neural networks for estimating microbe-metabolite co-occurrence probabilities | 354 |
R package | BacArena | an open source software for simulating cellular communities. It combines agent-based modeling, flux balance analysis, and statistical analysis | 503, 504, 542 |
Tool | BOFdat | is a three step workflow that allows modellers to generate a complete biomass objective function de novo from experimental data: Obtain stoichiometric coefficients for major macromolecules and calculate maintenance cost; Find coenzymes and inorganic ions; Find metabolic end goals | 505, 506 |
R package | Corncob | beta-binomial regression on covariates - might be a nice statistical test on abundance data and variables of interest | 531 |
R package | rtsne | T-Distributed Stochastic Neighbor Embedding (t-SNE) using a Barnes-Hut Implementation | 539, 540, 541 |
R package | microbiomeDASim | A toolkit for simulating differential microbiome data designed for longitudinal analyses. Several functional forms may be specified for the mean trend | 548 |
R package | MMUPHin | an R package for meta-analysis tasks of microbiome cohorts. It has function interfaces for: a) covariate-controlled batch- and cohort effect adjustment, b) meta-analysis differential abundance testing, c) meta-analysis unsupervised discrete structure (clustering) discovery, and d) meta-analysis unsupervised continuous structure discovery | 549 |
R package | ReactomeGSA | uses Reactome's online analysis service to perform a multi-omics gene set analysis | 550 |
R package | LinkHD | a general R software to integrate heterogeneous datasets focusing on microbial communities | 554, 555 |
R, Python | Rest API | Fast Scalable Machine Learning API | 576, 577 |
R package, Webapp | Metaboanalyst | a user-friendly, web-based analytical pipeline for high-throughput metabolomics studies | 618, 619 |
R package | SIAMCAT | R package for easy microbiome analysis - confounder analysis - phenotype prediction - Zeller group | 620 |
R package | Breakaway | R package with functions for alpha diversity measurements | 621 |
R package | seqgroup | The seqgroup R package offers a collection of functions that support the analysis of microbial sequencing data with a group structure | 631, 632 |
R package | ranomaly | R package for statistical analyses and visualization of 16S data | 656, 657 |
R package | RioNorm2 | A Novel Normalization and Differential Abundance Test Framework for Microbiome Data | 658, 659 |
R package | phylosmith | A conglomeration of functions that I have written, that I find useful, for analyzing phyloseq objects. Phyloseq objects are a great data-standard for microbiome and gene-expression data | 692, 762 |
R package | MicEco | Various functions for analysis for microbial community data | 693 |
R package | MaAsLin2 | A comprehensive R package for efficiently determining multivariable association between phenotypes, environments, exposures, covariates and microbial meta’omic features | 730, 731 |
R package | mikropml | User-Friendly R Package for Supervised Machine Learning Pipelines | 749, 750, 751 |
R package | shinyML | Compare Supervised Machine Learning Models Using Shiny App | 789 |
R package | UMAP | Uniform Manifold Approximation and Projection for Dimension Reduction | 793, 794 |
R package | MIMOSA2 | summarizes paired microbiome-metabolome datasets to support mechanistic interpretation and hypothesis generation | 813 |
R package | microViz | for analysis and visualization of microbiome sequencing data | 825, 826 |
R package | mia | implements tools for microbiome analysis based on the SummarizedExperiment | 852 |
R package | CARlasso | Conditional Auto-Regressive LASSO in R | 853, 854 |
R package | microPopGut | R package for simulating microbial populations in the human colon | 871 |
Category | Name | Description | Link |
---|---|---|---|
Analysis tool | Calour | an Interactive, Microbe-Centric Analysis Tool | 102 103 104 |
R package | KEGGgraph | graph approach to KEGG PATHWAY in R and Bioconductor | 128 |
R package | pathview | Pathview is a tool set for pathway based data integration and visualization based on KEGG data | 129 |
R package | annotate | Annotation for microarrays and GOs | 132 |
Tool | SegmentalDuplicationsCircos | plots circular genomes | 186 |
Tool | Keanu | A tool for viewing the contents of metagenomic samples | 194 195 |
R package | ampvis2 | An R package to visualise amplicon data | 245 |
Python Tool | Bokeh | Creating interactive low-level visualizations with Python, kind of like ggplotly | 246 |
Tool | icarus | Icarus is a novel genome visualizer for accurate assessment and analysis of genomic draft assemblies, which is based on QUAST genome quality assessment tool | 247 |
Tool | metaQuast | MetaQUAST evaluates and compares metagenome assemblies based on alignments to close references. It is based on QUAST genome quality assessment tool, but addresses features specific for metagenome datasets | 248 249 |
Web-App | ITOL | Interactive Tree Of Life | 259 |
R package | magick | The new magick package is an ambitious effort to modernize and simplify high-quality image processing in R | 285 |
App | Lucid Align | A modern sequence alignment viewer | 297 |
R package | HTML Widgets | Very nice packages to create more interactive visualizations like plots and tables in HTML Rmd output | 302 |
R package | ggpubr | an excellent and flexible package for elegant data visualization in R and publication ready figures | 396 |
R package | metacoder | parsing, plotting, and manipulating large taxonomic data sets | 397 |
Tool | Krona | Visualization tool to show hierarchical datasets such as metagenomic samples. Used by cosmosID and other services. Created in Excel or dedicated import tools | 444 |
Shiny Webapp | PlotTwist | a web app for plotting and annotating time series data | 445, 446 |
R package | KEGGREST | A package that provides a client interface to the KEGGREST server | 516 |
Webapp | iPath | Interactive Pathways Explorer (iPath) is a web-based tool for the visualization, analysis and customization of various pathway maps. Covers microbial metabolism in diverse environments | 533, 534 |
R package | Cowplot | The cowplot package provides various features that help with creating publication-quality figures, such as a set of themes, functions to align plots and arrange them into complex compound figures, and functions that make it easy to annotate plots and or mix plots with images | 561 |
R package | patchwork | The goal of patchwork is to make it ridiculously simple to combine separate ggplots into the same graphic | 562 |
R Package | karyoploteR | R package to visualize genomic features on genomes - can plot anything that has genomic coordinates - maybe read depth of sequencing too | 565 |
R tutorial | kateto | Network visualization with R | 575 |
R package | rayshader | is an open source package for producing 2D and 3D data visualizations in R | 638, 639 |
Webapp | biorender | a webapp for scientific illustrations with template icons to use | 672 |
App | SnapGeneViewer | SnapGene Viewer includes the same rich visualization, annotation, and sharing capabilities as the fully enabled SnapGene software | 679 |
R script | AnnVis | Tutorial to visualize prokka output using gggenes package | 680 |
R package | ggseqlogo | a versatile R package for drawing sequence logos | 695, 696 |
R Markdown | webpage | Creating websites in R | 716 |
App | TreeViewer | Flexible, modular software to visualise and manipulate phylogenetic trees | 723 |
Software | Graphia | a powerful open source visual analytics application developed to aid the interpretation of large and complex datasets | 732 |
R package | ComplexHeatmap | provides a highly flexible way to arrange multiple heatmaps and supports self-defined annotation graphics | 744, 856 |
R package | circlize | circular visualization in R and circular heatmaps | 745, 746, 823, 824 |
R package | ggsci | Scientific Journal and Sci-Fi Themed Color Palettes for ggplot2 | 768 |
R package | colorblindr | Simulate colorblindness in production-ready R figures | 769 |
R package | scico | 17 colorblind safe palettes | 770, 771 |
R package | plumbertableau | Integrating Dynamic R and Python Models in Tableau Using plumbertableau | 784, 785 |
R package | Boruta | Feature selection with the Boruta algorithm | 788 |
R package | camcorder | to track and record the ggplots that are created across one or multiple sessions with the eventual goal of creating a gif showing all the plots created sequentially | 790 |
R package | ggiraph | a tool that allows you to create dynamic ggplot graphs | 797 |
R package | ggsvg | is an extension to ggplot to use arbitrary SVG as points | 817 |
R package | gtsummary | provides an elegant and flexible way to create publication-ready analytical and summary tables using the R programming language | 819 |
Webapp | Datawrapper | lets you show your data as beautiful charts, maps or tables with a few clicks | 820 |
R package | mmtable2 | Create and combine tables with a ggplot2/patchwork syntax | 822 |
Webapp | Lucidchart | is the intelligent diagramming application that brings teams together to make better decisions and build the future | 833 |
R Package | ampvis2 | an R-package to conveniently visualise and analyse 16S rRNA amplicon data in different ways from phyloseq data | 831, 832 |
Webpage | From Data to Viz | is a classification of chart types based on input data format | 855 |
Cheat Sheet | Graphics Principles | Cheat Sheet for correct graphics visualization | 867 |
R Package | GenoVi | generates circular genome representations for complete or draft bacterial and archaeal genomes | 872, 873 |
R Package | ggcoverage | Visualize and annotate genome coverage with ggplot2 | 874, 875 |
R package | ggside | to enable users to add metadata to their ggplots with ease | 877 |
R package | dotplotly | Create an interactive or static dot plot from mummer output OR PAF format | 890 |
R package | ganttrify | nice-looking Gantt charts | 901, 902 |
R package | fastbaps | The fast BAPS algorithm is based on applying the hierarchical Bayesian clustering (BHC) algorithm to the problem of clustering genetic sequences using the same likelihood as BAPS | 906, 907 |
Category | Name | Description | Link |
---|---|---|---|
R package | targets | Managing bioinformatics pipelines with R | 779, 780, 781 |
Tutorial | bioinformatics-workflows | Reproducible, scalable, and shareable analysis pipelines with bioinformatics workflow managers | 783 |
Category | Name | Description | Link |
---|---|---|---|
Article/Paper | Butyrate & Propionate Pathways | Paper by Flint group that describes the pathways that lead to butyrate and propionate production and cross-feeding in anaerobic bacteria | 228 |
Tool | curveball | Predicting competition results from growth curves | 368,369 |
Database | Virtual Metabolic Human | Genome wide metabolic models for 800+ different type strains from the human gut ready for extension | 449 |
Tool | MICOM | Python package using COBRApy for microbial community modelling - sent by lacroix/tomas from recent publication | 615 Publication |
Category | Name | Description | Link |
---|---|---|---|
Statistical Methods Explanations | GUSTA ME | Website with intuitive explanations of why to use some methods and how to use them | 2 Publication: 3 |
Helpful R Scripts | DECIPHER | A website where the author describes several helpful bioinformatic analyses & how to implement them | 5 |
Primer Design | RUCS | RUCS - Rapid Identification of PCR Primers Pairs for Unique Core Sequences | 14; webapp: 15 |
DB for Bac. Genome Annotation | The SEED | DB curated by experts to annotate the genome features in bacteria. Hopefully useful to quickly scan what pathways our bacteria have or don't have. | 331 332 333 |
DB of Reference Genomes | HumanMicrobiomeProject | Collection of many bacterial genomes sequenced up to 'draft-quality' and some up to 'gold-standard', probably helpful to analyze gene content of microbiomes and compare with PB | Catalog: 21, DataBrowser: 22 |
Classical Microbiome Pipeline | Applied Bioinformatics Book | An open-source book on applied bioinformatics - it has a great chapter on classical diversity analysis (UniFrac etc.) | Diversity Chapter: 23, Whole Book: 24 |
PhyloSeq Extension | MetagMisc | R package to export phyloSeq object easily into dataframes, etc. | 27 |
Download NCBI Genomes | ncbi-genome-download | Script to download bacterial and fungal genomes from NCBI after they restructured their FTP a while ago. | 34 |
Phylo Trees | Randi Griffin Blog | Great blog to show some examples on how to create useful phylo trees and heatmaps etc. | 37 |
R-package | biomartr | The Biological Sequence Retrieval package allows users to retrieve biological sequences in a very simple and intuitive way. Using biomartr, users can retrieve either genomes, proteomes, CDS, RNA, GFF, and genome assembly statistics data using the specialized functions | 38 |
Rmd Templates | rticles | A package that includes templates for many journal articles | 40 |
Tutorial | DEseq2 for microbiome | DEseq2 analysis tutorial with PhyloSeq by Susan Holmes! | 42 284 291 293 294 |
Data | MicrobiomeHD | Human Microbiome Data from healthy and diseased people by MIT lab - Eric Alm | 45 |
Datasets | Google Datasets Search | Nice way to search for available datasets | 51 |
Survey Statistics | MultiTable Data Analysis for Microbiome | Survey of methods in multi table statistics from Holmes Lab | 52 |
Datasets | Qiita | open-source microbial study management platform. It allows users to keep track of multiple studies with multiple ‘omics data | 71,72,73 |
Workflow | Holmes Microbiome Workflow | Complete workflow from raw fastq files to fancy multivariate statistics workflow with dada2, DESeq2, etc. with code! | 74 |
R Package | ampvis2 | useful tool for nice visualization of amplicon data. Easy & nice ordinations! | 76 |
R-Markdown | Workshop | OPEN & REPRODUCIBLE MICROBIOME DATA ANALYSIS SPRING SCHOOL 2018 | 96 97 |
SOPs | IMMSA | The International Metagenomics and Microbiome Standards Alliance (IMMSA) is a non-hierarchical association of microbiome-focused researchers from industry, academia, and government | 123 |
CNGBdb | China National GeneBank DataBase | Archive of a lot of Chinese sequencing projects with very nice search function | 140 |
Collection | nf-Core nextflow pipeline | A collection of high quality pipelines for bioinformatic analyses built with nextflow | 181 |
Collection | Awesome Nextflow Pipelines | A collection of a bunch of bioinformatic pipelines in nextflow: 16S, assembly, etc. | 188 |
Competition, SOP | Critical Assessment of Metagenome Interpretation | Competition where tools are tested on accuracy for strain level binning and assembly (CAMI) | 189 |
Tools | Sanger Pathogen Tools | A collection of tools made by Sanger institute for pathogen/antimicrobial resistance screening, visualization, assembly, annotation | 190 |
Tool | Melonnpan by Biobakery Huttenhower | Method to predict metabolites from metagenomic reads, should be pre-trained but can also be tried with standard model | 205 |
Tool | ARepA Huttenhower | Tool to download information from specific data repositories: gene interaction, functional association | 206 |
Tool | PysraDB | Python library to quickly and systematically download data from NCBI Sequence Read Archive | 207 |
Journal Article | Pangenome & Metagenome | Nice article from Meren Lab describing how Anvi'o is used to create pangenomes and analyze core genes vs. accessory genes | 213 |
Tools | Chiron | Docker images and pipelines for metagenomic processing developed for HMP project workshops, includes Huttenhower software like humann2, strainphlan, qiime2 | 220 |
Tool | PANDA | Quick prediction of GO term annotation from Amino Acid sequence - only online service so far | 221 |
Review | What is good genome assembler | A nice comparison of several genome assemblers for de-novo assembly, hybrid, short and long reads are all compared | 241 |
Tool | NCBI Downloader | Command line tool to download genomes from NCBI and specify by all kinds of metadata | 256 |
Collection | Microbiome_notes | A continually expanding collection of microbiome analysis tools | 260 |
Blog | GoogleComputeEngineR | Blog with a lot of tutorials related to using R and google cloud instances | 296 |
Tool | KOMODO | Online tool to predict on what media a bacterial strain will grow. Based on DSMZ databases and gene predictions | 328 |
CheatSheet | Stanford Machine learning Cheatsheet | Cheat sheet that covers all basics and advanced methods in machine learning - summary of stanford course | 341 |
Blog | Genomics Tools List | List of tools that are installed on a bioinformatics clusters, could have some interesting tools in there | 349 |
SOPs | Microbiome-Standards | List of SOPs made by the microbiome community, aimed at establishing very good standard SOPs for a wide array of microbiome analysis and data generation | 373 |
Website | R Graph Library | Very cool website with all kinds of visualizations and how to create them in R - great inspiration | 386 |
Blog | Shiny Examples | Example dashboards that were built with shiny R. Good for inspiration with source code | 387 |
Tool | TrueBac ID | Online tool to do whole genome taxonomic identification using ANI and 16S, depending on which is more accurate | 388, 389 |
Tools | Pathogen Informatics Sanger | Many tools by Sanger institute for pathogen analysis: Resistance genes, circularizing genomes, rapid pan genome generation | 398 |
Blog | Klebsiella assembly and analysis | Nice Blog post describing up-to-date genome assembly and annotation and analysis of a virulent bacterium | 401 |
Tool | PlasFlow | Neural Network for identifying whether contig sequences are from a plasmid or chromosome | 402 |
Blog | Comparison of long-read assemblers | Comparison by rrwick of newest long read assemblers on how they can assemble bacterial genomes with plasmids | 407 |
Tutorial-Blog | Tyler Barnum | How to Use Assembly Graphs with Metagenomic Datasets | 412 |
Tutorial | Phylogenetic Tree visualization | Nice and complete tutorial about visualizing data on phylogenetic trees in R with ggtree, very nice example figures | 417 |
Tutorial | Functional enrichment analysis | Anvi'o v5.1: Functional Enrichment Analysis and Computing ANI | 431 |
Repository | Kipoi | repository of pre-made deep learning models for genomics | 454 |
Knowledgebase | KBase | a DOE Systems Biology Knowledgebase, an open-source software and data platform that enables data sharing, integration, and analysis of microbes, plants, and their communities | 455, 456, 457, 458, 459 |
Book | Computational Genomics with R | fundamentals for data analysis for genomics | 502 |
Blog | Rmarkdown help | A nice guide to make rmarkdown documents beautiful and nice | 517 |
Tutorial | Microbiome Analysis 2018 | A nice tutorial website for statistical microbiome analysis from Leo Lahti | 529 |
Tutorial | Microbiome Utilities | a wrapper tool R package for phyloseq | 530 |
Review | Data Science in Microbiome | A nice review by Leo Lahti of various tools and methods available for microbiome analysis, with references to the specific tools that implement the methods | 532 |
Tool, API | IPATH python wrapper | A nice wrapper in python for the IPATH3 API to computationally create graphs | 545 |
R Package | formattable | package to make nicely formatted tables in Rmarkdown output | 551 |
R Packages | awesome-r | A curated list of awesome R frameworks, libraries and software | 588 |
Book | R for Data Science | This book will teach you how to do data science with R: You’ll learn how to get your data into R, get it into the most useful structure, transform it, visualise it and model it | 612 |
Website | Git stuff explained | Nice website that easily explains all the git commands for command line | 624 |
Tool | Type Strain genome Server DSMZ | Web tool by DSMZ to type novel genomes based on their collection of type strains | 625 |
Website | Beta diversity distances | Nice website that has the math equations for most of the beta diversity distances | 630 |
App | Pitch | Collaborative presentation software for modern teams | 703 |
Reporting | conflr | an R package to post R Markdown documents to Confluence, a content collaboration tool by Atlassian | 737 |
Tutorial | Galaxy Training | Collection of tutorials developed and maintained by the worldwide Galaxy community | 765 |
R package | thesisdown | package to write thesis in Rmarkdown | 782 |
R package | blogdown | is an R package that makes blogging for R users as straightforward as possible | 801, 802, 803 |
Webpage | postsyoumighthavemissed | Search 000's of R & Python articles and packages! | 805 |
Tutorial | shell-how | Write down a command-line to see how it works | 806 |
Webpage | webpage-repository | the website of AllanLab academic research group at Leiden University | 808 |
Tutorial | Machine Learning | Machine Learning for Everyone | 809 |
R package | RPushbullet | a package to send messages to your devices from R | 815, 816 |
R package | portfoliodown | makes it painless for data scientists to create a polished professional website so they can host their project portfolios, get great job interviews, and launch their data science careers | 818 |
Scripts | blantyreESBL | This document contains reproducing analysis code which generates the tables and figures for the manuscript: Dynamics of gut mucosal colonisation with extended spectrum beta-lactamase producing Enterobacterales in Malawi | 891, 892 |
codes | nf-modules | A repository for hosting Nextflow DSL2 module files containing tool-specific process | 896 |
Tutorial | Kaggle | Data Science competition | 898 |
Tutorial | Perfect-bacterial-genome-tutorial | Assembling the perfect bacterial genome using Oxford Nanopore and Illumina sequencing | 904, 905 |
Website to look up Markdown Syntax [https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet]