sylvainschmitt/SSDM

Model complexity

adelaide-ui opened this issue · 4 comments

First of all, this is not a problem.

I'm starting with species distribution modeling and it seems to me that an emerging issue is determining the optimal complexity of each model (number of parameters, response curves, collinearity of environmental variables).

I would like to know if there is a way to test this in SSDM.

Best,
Adelaide

SSDM is not specifically designed to directly handle the collinearity of environmental variables, but you can have a look at what we did with BIOCLIM variables in Schmitt, S., Pouteau, R., Justeau, D., de Boissieu, F., & Birnbaum, P. (2017, December 1). ssdm: An R package to predict distribution of species richness and composition based on stacked species distribution models. (N. Golding, Ed.), Methods in Ecology and Evolution, pp. 1795–1803. https://doi.org/10.1111/2041-210X.12841

And no, there is no specific methodology for complexity in SSDM, although the package can help you explore it with evaluation metrics. The question is too broad, though, so I invite you to do your own literature review. Have a look at #97 for some references. I'll let @lukasbaumbach add some refs if he has other ideas, as he is more up to date than me. Google Scholar is also a good place to start.

Hi,
as Sylvain said, SSDM doesn't ship with a function for estimating model complexity. I can also only invite you to dive into the topic yourself. A good starting point would indeed be the papers from #97 (Araujo as an overview; Warren impressively shows how bizarre response curves can appear), and you may further want to look at the Akaike Information Criterion, a relative measure that penalises the number of parameters and so helps compare models of different complexity. Good luck!
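To make the AIC suggestion concrete, here is a minimal sketch in base R, outside SSDM. It fits a simple and a deliberately over-parameterised GLM to the same synthetic presence/absence data and compares them with `AIC()`. The data and the variable names (`temp`, `rain`, `noise`) are made up for illustration and are not from SSDM or the papers in #97.

```r
# Synthetic presence/absence data; all variable names are hypothetical.
set.seed(42)
n <- 500
env <- data.frame(temp = rnorm(n), rain = rnorm(n), noise = rnorm(n))

# True presence probability depends linearly on temp and rain only.
p <- plogis(-0.5 + 1.2 * env$temp - 0.8 * env$rain)
env$presence <- rbinom(n, 1, p)

# A simple model and a more complex one (cubic terms plus an irrelevant variable).
fit_simple  <- glm(presence ~ temp + rain,
                   family = binomial, data = env)
fit_complex <- glm(presence ~ poly(temp, 3) + poly(rain, 3) + poly(noise, 3),
                   family = binomial, data = env)

# AIC = 2k - 2 * logLik, where k is the number of estimated parameters.
# Lower is better; only differences between models on the same data matter.
aic_table <- AIC(fit_simple, fit_complex)
print(aic_table)
```

The `df` column of the table is the parameter count `k`, so it doubles as a crude complexity measure next to whatever evaluation metric (AUC, etc.) you are tracking.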

Thanks for the answer, I am diving into this field now. I was testing some model configurations in SSDM and measuring complexity and AUC, but I can't build more complex models by changing the parameters. It seems that there are few configurable parameters in each model, right?

All algorithms can be configured with the full set of parameters that are available in their source functions. Please carefully read the algorithm sections of the help page for modelling. For GLM, for example, you would supply a list named glm.args=list(arg1=val1, arg2=val2, etc.) to the modelling function.
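As a rough illustration of what such a list can contain for GLM, the sketch below builds a `glm.args` list and passes it straight to `stats::glm` with `do.call`. This only shows that the entries are ordinary `glm` arguments; how SSDM forwards the list internally is not shown here, and the data and variable names are hypothetical. The SSDM call in the last comment is not run, and `Occurrences` and `Env` there stand for your own inputs.

```r
# Synthetic presence/absence data; variable names are hypothetical.
set.seed(1)
d <- data.frame(temp = rnorm(200))
d$presence <- rbinom(200, 1, plogis(0.3 + d$temp))

# Extra arguments in the list form described above for glm.args.
# maxit and epsilon are fitting controls that glm passes on to glm.control().
glm.args <- list(maxit = 100, epsilon = 1e-10)

# Call stats::glm directly with the base arguments plus the extra ones.
fit <- do.call(
  glm,
  c(list(formula = presence ~ temp, family = binomial, data = d), glm.args)
)

# The fitted object records the control settings that were actually used.
fit$control$maxit
fit$control$epsilon

# In SSDM the same list would be supplied to the modelling function (not run):
# modelling('GLM', Occurrences, Env, glm.args = glm.args)
```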