This is the GitHub page for the dacomp package. The dacomp package implements nonparametric tests for differential abundance in microbiome count data.
To install the latest development version, run:
install.packages("devtools")
devtools::install_github("barakbri/dacomp", build_opts = c("--no-resave-data", "--no-manual"))
The package vignette briefly goes over the model, assumptions, and background. It also contains code snippets showing how to analyze data for several study designs, including two-sample groups. To open the vignette, run:
vignette('dacomp_main_vignette')
The full paper is found on arXiv: Brill, Barak, Amnon Amir, and Ruth Heller. 2019. “Testing for Differential Abundance in Compositional Counts Data, with Application to Microbiome Studies.” arXiv Preprint arXiv:1904.08937.
The introduction in the package vignette cites other highly relevant works.
A GitHub repository for reproducing the results of the paper is also available.
Version 1.1 of the package adds an additional variant of the DACOMP method called "normalization by ratio". This variant of the test procedure is disabled by default. See the paper above for additional details on this variant.
Version 1.2 of the package adds multiplicity-adjusted P-values for the DS-FDR procedure. Adjusted P-values are returned by the dacomp.test(...) function under the field p.values.test.adjusted. Users of the normalization by ratio option (which is not the default) will find the DS-FDR adjusted P-values under the field p.values.test.adjusted.ratio.normalization.
Version 1.24 improves the computation time of the Kruskal-Wallis test statistic (test statistics should be identical up to machine error). It also allows computing the DS-FDR multiplicity adjustment for only a subset of the taxa, i.e., specifying which taxa are tested; see the documentation for dacomp.test(...) for additional details.
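As an illustration of what "identical up to machine error" means for the Kruskal-Wallis statistic, the sketch below computes the statistic by hand in base R and compares it with the built-in kruskal.test. This is not the package's internal implementation; the helper kw_statistic and the simulated counts are hypothetical and serve only as an example of how two implementations can be checked against each other.

```r
# Hand-rolled Kruskal-Wallis H statistic with tie correction.
# Illustration only: kw_statistic is a hypothetical helper, not part of dacomp.
kw_statistic <- function(x, g) {
  g <- factor(g)
  n <- length(x)
  r <- rank(x)                               # mid-ranks are used for ties
  rank_sums <- tapply(r, g, sum)             # sum of ranks per group
  group_sizes <- tapply(r, g, length)        # number of samples per group
  h <- 12 / (n * (n + 1)) * sum(rank_sums^2 / group_sizes) - 3 * (n + 1)
  tie_counts <- as.numeric(table(x))         # sizes of tied blocks
  h / (1 - sum(tie_counts^3 - tie_counts) / (n^3 - n))
}

set.seed(1)
counts <- rpois(60, lambda = 5)              # simulated counts for one taxon
groups <- rep(c("A", "B", "C"), each = 20)   # three study groups

h_manual  <- kw_statistic(counts, groups)
h_builtin <- unname(kruskal.test(counts, factor(groups))$statistic)

# all.equal compares numerically, with a tolerance for floating-point error
agree <- isTRUE(all.equal(h_manual, h_builtin))
print(agree)
```

Comparing with all.equal rather than == is the usual choice here, since two algebraically equivalent computations can differ in the last few bits.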
- The Bioconductor package 'phyloseq' is now a suggested package rather than a required package. It is required only for running the functions that generate mock datasets in the example scripts.
- The function dacomp.test(...) now returns effect sizes.
- The function dacomp.select_references(...) now has a default value of 0 for the parameter median_SD_threshold, meaning that the reference taxa are selected, by default, as the smallest number of taxa such that the sequencing depth is at least minTA reads.
- The function dacomp.select_references(...) can now run in parallel mode.
- The function dacomp.select_references(...) now returns, for each candidate cutoff value for reference selection, the ID of the sample in which the minimal number of reads in the reference taxa was obtained. This can be used to determine whether a specific sample is limiting the rarefaction depth due to a small number of reads in the reference taxa.
- The function dacomp.check_reference_set_is_valid.k_groups has been removed. See the next item for the recommended procedure for assessing the validity of the reference set.
- Added dacomp.validate_references, a function for validating the reference set of taxa: the function checks whether the reference taxa contain signal. If a signal is identified, taxa are removed from the reference set based on the order of their reference selection scores. The procedure continues iteratively until no signal is found in the reference set, or ... The function was added to the documentation for the reference selection and testing functions, and also to the vignette.
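The idea of a sample that limits the rarefaction depth, mentioned in the list above, can be sketched in base R without the package. The count matrix, the sample names, and the reference indices below are made up for illustration; the code does not call dacomp.select_references(...) and only mimics the bookkeeping of finding the sample with the fewest reads in a given reference set.

```r
# Toy count matrix: rows are samples, columns are taxa (hypothetical data).
X <- rbind(
  s1 = c(10, 5, 0, 30,  2),
  s2 = c( 3, 1, 8,  2, 40),
  s3 = c(12, 7, 1, 25,  0),
  s4 = c( 9, 6, 2, 28,  5)
)
colnames(X) <- paste0("taxon_", 1:5)

# Hypothetical reference set: indices of the taxa assumed to be references.
ind_reference <- c(1, 2, 4)

# Number of reads each sample has in the reference taxa.
reads_in_reference <- rowSums(X[, ind_reference, drop = FALSE])

# The sample with the fewest reference reads bounds the rarefaction depth.
limiting_sample <- names(which.min(reads_in_reference))
max_rarefaction_depth <- min(reads_in_reference)

print(reads_in_reference)
print(limiting_sample)
print(max_rarefaction_depth)
```

In this toy example, sample s2 has only 6 reads in the reference taxa while the others have more than 40, so a single sample determines the depth. In a case like this one might consider excluding the sample or revisiting the reference set.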