microbiome/mia

RDA speedup


Running mia::runRDA can be very slow for large data sets. This is a problem in particular when we want to fit alternative RDA models with different formulas (e.g. assay ~ BMI + AGE vs. assay ~ BMI vs. assay ~ AGE, etc.), as in:

mia::runRDA(tse, 
                assay.type = "relabundance",
                formula = assay ~ BL_AGE + MEN,
                distance = "bray",
                na.action = na.exclude)

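One problem is that the beta diversity (dissimilarity) matrix is recalculated for every formula, even though it depends only on the assay and the distance measure, not on the covariates.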

Speedups could be obtained by calculating the beta diversity matrix once, storing it in the TreeSE object, and letting runRDA use the stored matrix instead of recalculating it, e.g. something like:

mia::runRDA(metadata(tse)$betadiv,
                assay.type = "relabundance",
                formula = assay ~ BL_AGE + MEN,
                distance = "bray",
                na.action = na.exclude)
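
To illustrate the idea outside mia, here is a minimal sketch that uses vegan directly. It assumes that vegan::dbrda (which, as far as I understand, runRDA builds on) is a fair stand-in for the model fit, and it uses vegan's bundled `dune` / `dune.env` example data instead of a TreeSE object. The covariates `A1` and `Management` are just placeholders from that data set, not the variables from the call above.

```r
library(vegan)

data(dune)      # example community matrix (sites x species)
data(dune.env)  # matching site-level covariates

# Compute the Bray-Curtis dissimilarity matrix once.
d <- vegdist(dune, method = "bray")

# Candidate models; the left-hand side `d` refers to the cached matrix,
# so no dissimilarities are recomputed when the formula changes.
formulas <- list(
  both = d ~ A1 + Management,
  a1   = d ~ A1,
  mgmt = d ~ Management
)

# Fit every model against the same precomputed dissimilarities.
fits <- lapply(formulas, function(f) dbrda(f, data = dune.env))
```

With this pattern the cost of the distance calculation is paid once, and each additional formula only adds the cost of the constrained ordination itself. How the cached matrix would be stored in and retrieved from the TreeSE object is left open here.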

Implementation details can be discussed, but this would be a substantial speedup when comparing several models on the same data.