
Code for a simulation experiment testing the performance of Dirichlet-multinomial modelling on data with varying attributes.

Primary language: R

DMM

The scripts provided in this repo allow reproduction of the simulation experiment conducted in the manuscript, "Dirichlet-multinomial modelling outperforms alternatives for analysis of microbiome and other ecological count data" by Joshua G. Harrison, W. John Calder, Vivaswat Shastry, and C. Alex Buerkle.

For the impatient: the SimulationCode.R script is what we used to conduct our simulation experiment.

Description of files (alphabetical):

competeModelPlot.R – script to create panels in Figures 4 & S3.

data/allv4.csv – output from the simulation experiment. File created by SimulationCode.R and wrangled in the plotting scripts.

DM.stan – Stan code for the Dirichlet-multinomial model (DMM)

effectSizeBoxplot.R – script to create Figures 3, S1, S2, and panel b in Fig. 5.

lm_of_simulation_params.R – script to perform linear modeling analysis of the influence of data attributes on model performance. Results are shown in Tables S2 & S3.

mcmcComparisonPlot.R – script to make Figure 5.

README.md – a highly entertaining document!

reanalysisLungsHMC.R – analysis of a portion of the data published in Duvallet et al. 2019. See main text. This script requires "patient_clinical_metadata.csv" and the directory "rosen_mincount10_maxee2_trim200_results_forpaper" from Duvallet et al. These data can be accessed at: https://doi.org/10.5281/zenodo.2678107. If you use these data, please cite Duvallet et al. 2019.

simParamMaker.R – a simple script to make a file that is read by SimulationCode.R

SimulationCode.R – the most important file here. This is the code used to conduct the simulation experiment. It is a collection of functions that are presented alphabetically. Each function is called from a "main" function, which is the place to start for those perusing this script.
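
For readers who want a feel for the kind of data a Dirichlet-multinomial simulation generates before opening SimulationCode.R, the sketch below draws counts in base R. It is not taken from SimulationCode.R and is not the procedure used in the manuscript. The function names `rdirichlet_one` and `simulate_dm_counts`, the `alpha` values, and the read depth are all illustrative assumptions.

```r
# Hypothetical helper (not from this repo): draw one Dirichlet vector
# by normalising independent gamma variates.
rdirichlet_one <- function(alpha) {
  g <- rgamma(length(alpha), shape = alpha, rate = 1)
  g / sum(g)
}

# Hypothetical helper (not from this repo): simulate a samples-by-features
# count matrix. Each sample gets its own Dirichlet draw of proportions,
# then multinomial counts at a fixed read depth.
simulate_dm_counts <- function(n_samples, alpha, depth) {
  counts <- t(vapply(
    seq_len(n_samples),
    function(i) {
      p <- rdirichlet_one(alpha)
      as.numeric(rmultinom(1, size = depth, prob = p))
    },
    numeric(length(alpha))
  ))
  colnames(counts) <- paste0("feature", seq_along(alpha))
  counts
}

set.seed(1)                      # for reproducibility of the illustration
alpha <- c(10, 5, 1, 0.5)        # assumed Dirichlet parameters
counts <- simulate_dm_counts(n_samples = 6, alpha = alpha, depth = 1000)

print(dim(counts))               # samples x features
print(rowSums(counts))           # every row sums to the read depth
```

Smaller `alpha` values give more variation among samples, which is one of the data attributes a simulation of this kind might vary.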