skembel/picante

randomizeMatrix with null.model = "independentswap" fails on small communities

joelnitta opened this issue · 2 comments

Thanks for the great package! I discovered this while trying to write tests for a function that uses randomizeMatrix().

5 x 5 seems to be the minimum size for the community when null.model = "independentswap".

This works:

library(picante)
data(phylocom)
randomizeMatrix(phylocom$sample[1:5,1:5], null.model="independentswap")

But this doesn't:

library(picante)
data(phylocom)
randomizeMatrix(phylocom$sample[1:4,1:5], null.model="independentswap")

It just hangs without finishing or generating an error message (I have to manually force R to quit).

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] picante_1.8.2   nlme_3.1-152    vegan_2.5-7     lattice_0.20-44 permute_0.9-5   ape_5.5        

loaded via a namespace (and not attached):
 [1] MASS_7.3-54    compiler_4.1.0 Matrix_1.3-4   parallel_4.1.0 tools_4.1.0    mgcv_1.8-36   
 [7] Rcpp_1.0.7     splines_4.1.0  grid_4.1.0     cluster_2.1.2 

Hi, what is likely happening is that the smaller matrix does not contain any checkerboard co-occurrences (2 x 2 submatrices in which each of two taxa occurs in only one of two sites, and not the same one). The independent swap null model keeps looping until it has performed the specified number of swaps, so if the matrix contains no checkerboard co-occurrences it will run forever. I suggest you use the trial swap null model (null.model = "trialswap") instead: it makes a fixed number of swap attempts and then stops. More generally, if your matrix is too small and there are no checkerboard co-occurrences to swap, the 'randomized' matrix will be identical to the original, and you'll probably get nonsensical output (standard deviation of zero, NA/Inf for the SES metrics).
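
To illustrate the difference, here is a minimal base-R sketch of the trial-swap idea: a fixed number of attempts, each of which swaps only if the randomly chosen 2 x 2 submatrix is a checkerboard. The function name trial_swap_sketch and its attempts argument are made up for this illustration; this is not picante's actual implementation, which is reached through randomizeMatrix(..., null.model = "trialswap").

```r
# Hypothetical illustration of a trial swap; NOT picante's code.
# m: sites x taxa matrix (presence/absence or abundance).
# attempts: number of swap attempts (successful or not) before stopping.
trial_swap_sketch <- function(m, attempts = 1000) {
  m <- as.matrix(m)
  # Nothing can be swapped without at least 2 rows and 2 columns.
  if (nrow(m) < 2 || ncol(m) < 2) return(m)
  for (a in seq_len(attempts)) {
    r  <- sample(nrow(m), 2)   # two distinct sites
    cc <- sample(ncol(m), 2)   # two distinct taxa
    sub <- m[r, cc]
    # A checkerboard has presences on one diagonal and absences on the other.
    diag_a <- sub[1, 1] > 0 && sub[2, 2] > 0 && sub[1, 2] == 0 && sub[2, 1] == 0
    diag_b <- sub[1, 1] == 0 && sub[2, 2] == 0 && sub[1, 2] > 0 && sub[2, 1] > 0
    if (diag_a || diag_b) {
      # Exchange the two rows of the submatrix; the number of presences
      # in every row and every column is unchanged.
      m[r, cc] <- sub[2:1, ]
    }
  }
  m
}

# A nested matrix has no checkerboards: the loop still terminates,
# and the matrix comes back unchanged.
nested <- matrix(c(1, 1, 1,
                   1, 1, 0,
                   1, 0, 0), nrow = 3, byrow = TRUE)
identical(trial_swap_sketch(nested, attempts = 100), nested)
```

Because the loop counts attempts rather than successful swaps, it always finishes, which is why a matrix without checkerboards returns unchanged instead of hanging.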

Thanks.

Although it's an edge case, I think it would be nice to preempt this behavior with an error. It would be simple to add a check for a minimum number of sites/taxa, but I suppose that wouldn't address the actual cause of the problem. If there is a way to check for any checkerboard co-occurrences before running the randomization, an informative error could be issued instead of hanging.
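
One possible way to do such a check, sketched in base R under the assumption that a swap is only possible when two sites each have at least one taxon the other lacks. The helper name has_checkerboard is hypothetical and is not part of picante.

```r
# Hypothetical pre-check; not a picante function.
# Returns TRUE if the sites x taxa matrix contains at least one
# checkerboard (a 2 x 2 submatrix with presences on one diagonal only).
has_checkerboard <- function(m) {
  pa <- as.matrix(m) > 0            # reduce abundances to presence/absence
  n <- nrow(pa)
  if (n < 2 || ncol(pa) < 2) return(FALSE)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      only_i <- any(pa[i, ] & !pa[j, ])   # a taxon in site i but not in site j
      only_j <- any(!pa[i, ] & pa[j, ])   # a taxon in site j but not in site i
      # Both conditions together imply two different columns forming a checkerboard.
      if (only_i && only_j) return(TRUE)
    }
  }
  FALSE
}

# Sketch of how a caller could fail early instead of hanging:
check_swappable <- function(m) {
  if (!has_checkerboard(m)) {
    stop("matrix contains no checkerboard co-occurrences; nothing can be swapped")
  }
  invisible(TRUE)
}
```

The check compares each pair of sites once, so its cost grows with the square of the number of sites times the number of taxa, which should be negligible next to the randomization itself.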