/Orlov_Sheludkov_paper_Worldclim

Bioclimatic Data Optimization for Spatial Distribution Models

Primary language: R

Orlov M., Sheludkov A. (2019) Bioclimatic Data Optimization for Spatial Distribution Models. In: Bychkov I., Voronin V. (eds) Information Technologies in the Research of Biodiversity. Springer Proceedings in Earth and Environmental Sciences. Springer, Cham. Pp. 86-95.

https://doi.org/10.1007/978-3-030-11720-7_13

Abstract

Spatial distribution models (SDMs) are widely used in biology, ecology, and environmental protection to predict the distribution areas of living species, including shifts in distribution caused by environmental change, invasions, and other dramatic alterations that affect both biota and humans. For training SDMs, the maximum entropy (Maxent) machine learning algorithm is the most applicable, and climatic variables are among the most widely used predictors. Numerous works addressing the problem have shaped a commonly used workflow. Here we consider extending that workflow by applying unsupervised machine learning techniques (clustering, PCA, and correlation analysis) to the input SDM data, both to optimize them and to explore the bioclimatic dataset. The need arises because highly correlated predictors and excessively large datasets are likely to decrease machine learning performance. Having obtained a list of less-contributing variables, we derived a reduced dataset from the initial one by removing the predictors on that list. Both datasets served as predictor sources for training classifiers based on various machine learning methods. The reduced dataset yielded better performance for some methods, including Maxent, while decreasing dataset size. Additionally, the distribution areas predicted by Maxent agreed well with those predicted by the other algorithms used, which implies that using them together might improve robustness.

Keywords

SDM, Crimea, Ecological modeling, Data optimization
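
Illustrative sketch

The abstract describes removing highly correlated predictors before training. The snippet below is a minimal sketch of one such step, a greedy pairwise correlation filter, and is not the code from this repository or the exact procedure used in the paper (which also involved clustering and PCA). It is written in Python with the standard library only; the function names `pearson` and `drop_correlated`, the 0.9 threshold, the column name `bio1_copy`, and the synthetic data are all assumptions made for illustration. `bio1` and `bio12` follow the WorldClim naming for annual mean temperature and annual precipitation.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(columns, threshold=0.9):
    """Greedy filter: keep a predictor only if its |r| with every
    predictor kept so far does not exceed the threshold.

    `columns` maps predictor name -> list of values; insertion order
    decides which member of a correlated pair survives.
    """
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) <= threshold
               for other in kept.values()):
            kept[name] = values
    return list(kept)


# Synthetic stand-in for a bioclimatic table (not real WorldClim data).
rng = random.Random(0)
n_cells = 200
temperature = [rng.gauss(0.0, 1.0) for _ in range(n_cells)]
precipitation = [rng.gauss(0.0, 1.0) for _ in range(n_cells)]
predictors = {
    "bio1": temperature,
    # Hypothetical near-duplicate of bio1, to give the filter something to drop.
    "bio1_copy": [2.0 * t + rng.gauss(0.0, 0.01) for t in temperature],
    "bio12": precipitation,
}

reduced = drop_correlated(predictors)
print(reduced)
```

In this sketch the near-duplicate column is discarded and the two independent predictors are kept. The threshold and the order in which predictors are visited are design choices that would need to be tuned for a real dataset.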