TreeSummarizedExperiment (TreeSE) and phyloseq (pseq) objects are alternative containers for microbiome data. Here we evaluate their computational efficiency across varying sample and feature set sizes.
Multiple data sets, either in the form of a TreeSE or a phyloseq object, were processed through a few common analytical routines:

- Melting
- CLR transformation
- Agglomeration to Phylum level
- Alpha diversity estimation (Shannon)
- Beta diversity estimation (Bray-Curtis / MDS)

The data sets were split by taxonomic rank to vary the feature counts while keeping the data set and sample sizes constant. Execution times were measured and recorded for each method and each sample/feature count combination.
Standard data sets:
Big data set:
To reproduce the analyses, start R from within your local copy of this repository and run:
source("main.R")
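For orientation, the kind of measurement described above can be sketched in base R. This is a minimal illustration, not the code in `main.R`: the simulated count matrix, the helper names (`simulate_counts`, `shannon`, `clr`, `time_routine`), the pseudocount, and the chosen feature counts are all assumptions made for the example, and the two routines are plain-matrix stand-ins rather than the TreeSE or phyloseq implementations.

```r
# Hypothetical base-R stand-ins for two of the benchmarked routines.
set.seed(1)

# Simulate a features x samples count matrix (made-up data).
simulate_counts <- function(n_features, n_samples) {
  matrix(rpois(n_features * n_samples, lambda = 5),
         nrow = n_features, ncol = n_samples)
}

# Shannon index per sample (column), natural log; zero counts are skipped.
shannon <- function(counts) {
  apply(counts, 2, function(x) {
    p <- x[x > 0] / sum(x)
    -sum(p * log(p))
  })
}

# CLR transformation per sample; a pseudocount of 1 (assumption) avoids log(0).
clr <- function(counts, pseudocount = 1) {
  logx <- log(counts + pseudocount)
  sweep(logx, 2, colMeans(logx))
}

# Time one routine on one data set; returns elapsed seconds.
time_routine <- function(fun, counts) {
  unname(system.time(fun(counts))["elapsed"])
}

# Vary the feature count while keeping the sample size constant.
feature_counts <- c(100, 1000, 10000)
n_samples <- 50
results <- do.call(rbind, lapply(feature_counts, function(n) {
  counts <- simulate_counts(n, n_samples)
  data.frame(
    features = n,
    samples  = n_samples,
    method   = c("shannon", "clr"),
    seconds  = c(time_routine(shannon, counts), time_routine(clr, counts))
  )
}))
print(results)
```

Each row of `results` holds one method and feature count combination, which is the shape needed to compare execution times across containers once the stand-in functions are replaced by the actual TreeSE and phyloseq calls.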
This work is part of miaverse. The code and results in this repository are open source, released under the Artistic License 2.0.