Evaluating the predictive performance of presence–absence models: Why can the same model appear excellent or poor?
Abrego, N., & Ovaskainen, O. (2023). Evaluating the predictive performance of presence–absence models: Why can the same model appear excellent or poor? Ecology and Evolution, 13(12), Article e10784. https://doi.org/10.1002/ece3.10784
Published in Ecology and Evolution
© 2023 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd.
When comparing multiple models of species distribution, models yielding higher predictive performance are clearly to be favored. A more difficult question is how to decide whether even the best model is "good enough". Here, we clarify key choices and metrics related to evaluating the predictive performance of presence–absence models. We use a hierarchical case study to evaluate how four metrics of predictive performance (AUC, Tjur's R2, max-Kappa, and max-TSS) relate to each other, to the random and fixed effects parts of the model, to the spatial scale at which predictive performance is measured, and to the cross-validation strategy chosen. We demonstrate that the very same metric can achieve different values for the very same model, even when similar cross-validation strategies are followed, depending on the spatial scale at which predictive performance is measured. Among metrics, Tjur's R2 and max-Kappa generally increase with species' prevalence, whereas AUC and max-TSS are largely independent of prevalence. Thus, Tjur's R2 and max-Kappa often reach lower values when measured at the smallest scales considered in the study, whereas AUC and max-TSS reach similar values across the different spatial levels included in the study. Nonetheless, the metrics provide complementary insights on predictive performance. The very same model may appear excellent or poor not only due to the applied metric, but also due to how exactly predictive performance is calculated, calling for great caution in the interpretation of predictive performance. The most comprehensive evaluation can be obtained by combining measures that provide complementary insights. Instead of following simple rules of thumb or focusing on absolute values, we recommend comparing the achieved predictive performance to the researcher's own a priori expectations of how easy it is to make predictions for the question that the model is used to address.
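To make the four metrics concrete, here is a minimal Python sketch (illustrative, not the authors' code) that computes them from binary presence–absence observations `y` and predicted occurrence probabilities `p`. The function and variable names are assumptions for this example.

```python
# Illustrative implementations of the four predictive-performance metrics
# discussed in the abstract. Inputs: y, a list of 0/1 presence-absence
# observations; p, a list of predicted occurrence probabilities.

def tjur_r2(y, p):
    """Tjur's R2: mean predicted probability at presences minus at absences."""
    pres = [pi for yi, pi in zip(y, p) if yi == 1]
    absn = [pi for yi, pi in zip(y, p) if yi == 0]
    return sum(pres) / len(pres) - sum(absn) / len(absn)

def auc(y, p):
    """AUC via the Mann-Whitney statistic: the probability that a randomly
    chosen presence is assigned a higher probability than a random absence."""
    pres = [pi for yi, pi in zip(y, p) if yi == 1]
    absn = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pres for b in absn)
    return wins / (len(pres) * len(absn))

def _confusion(y, p, thr):
    """Confusion-matrix counts when predicting presence for p >= thr."""
    tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi >= thr)
    fn = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi < thr)
    tn = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi < thr)
    fp = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi >= thr)
    return tp, fn, tn, fp

def max_tss(y, p):
    """TSS = sensitivity + specificity - 1, maximised over thresholds."""
    best = -1.0
    for thr in sorted(set(p)):
        tp, fn, tn, fp = _confusion(y, p, thr)
        tss = tp / (tp + fn) + tn / (tn + fp) - 1
        best = max(best, tss)
    return best

def max_kappa(y, p):
    """Cohen's kappa, maximised over classification thresholds."""
    n = len(y)
    best = -1.0
    for thr in sorted(set(p)):
        tp, fn, tn, fp = _confusion(y, p, thr)
        po = (tp + tn) / n                                   # observed accuracy
        pe = ((tp + fp) * (tp + fn)
              + (tn + fn) * (tn + fp)) / n ** 2              # chance accuracy
        if pe < 1:
            best = max(best, (po - pe) / (1 - pe))
    return best
```

For example, with y = [1, 1, 1, 0, 0, 0] and p = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1], AUC is about 0.89 while Tjur's R2 is about 0.43, illustrating that the metrics live on different scales and need not agree on how "good" the same model is.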
PublisherJohn Wiley & Sons
Related funder(s): European Commission; Research Council of Finland
Funding program(s): Academy Research Fellow, AoF
The content of the publication reflects only the author’s view. The funder is not responsible for any use that may be made of the information it contains.
Additional information about funding: Academy of Finland, Grant/Award Numbers: 309581 and 342374; H2020 European Research Council, Grant/Award Number: 856506; Jane ja Aatos Erkon Säätiö; Norges Forskningsråd, Grant/Award Number: 223257
Showing items with similar title or keywords.
Davidson, Pavel; Trinh, Huy; Vekki, Sakari; Müller, Philipp (MDPI AG, 2023). Oxygen uptake (V̇O2) is an important metric in any exercise test including walking and running. It can be measured using portable spirometers or metabolic analyzers. Those devices are, however, not suitable for constant ...
Predicting the working alliance over the course of long-term psychodynamic psychotherapy with the Rorschach Ego Impairment Index, self-reported defense style, and performance-based intelligence: An evaluation of three methodological approaches. Stenius, Jaakko; Knekt, Paul; Heinonen, Erkki; Holma, Juha; Antikainen, Risto; Lindfors, Olavi (American Psychological Association (APA), 2021). Better therapeutic alliances are known to predict better treatment outcomes, but little knowledge still exists on the patient characteristics that lead to better alliances. In a sample of 128 outpatients assigned to long-term ...
Linguistic, Contextual, and Experiential Equivalence Issues in the Adaptation of a Performance-Based Assessment of Generic Skills in Higher Education. Ursin, Jani; Hyytinen, Heidi; Silvennoinen, Kaisa; Toom, Auli (Frontiers Media SA, 2022). This qualitative study investigated the various linguistic, contextual, and experiential equivalence issues embedded in a performance-based instrument aimed at assessing generic skills in higher education. A rigorous ...
Species distributions models may predict accurately future distributions but poorly how distributions change: A critical perspective on model validation. Piirainen, Sirke; Lehikoinen, Aleksi; Husby, Magne; Kålås, John Atle; Lindström, Åke; Ovaskainen, Otso (Wiley, 2023). Aim: Species distribution models (SDMs) are widely used to make predictions on how species distributions may change as a response to climatic change. To assess the reliability of those predictions, they need to be critically ...
Forsberg, Joonas; Frantti, Tapio (Elsevier, 2023). This research introduces a novel framework for creating metrics intended for security operations centers (SOCs). The framework is developed using the design science research methodology and has been validated by generating ...