Evaluating the ability of habitat suitability models to predict species presences
Introduction
Models predicting the spatial distribution of species (Boyce and McDonald, 1999, Guisan and Zimmermann, 2000, Manly et al., 2002, Pearce and Boyce, 2006), sometimes called resource selection functions or habitat suitability models, are attracting growing interest. Because they often help both in understanding species' niche requirements and in predicting species' potential distributions, their use has been especially promoted to tackle conservation issues, such as managing species distributions, assessing the ecological impacts of various factors (e.g. pollution, climate change), assessing the risk of biological invasions or managing endangered species (Scott et al., 2002, Guisan and Thuiller, 2005). These models statistically relate field observations to a set of environmental variables that presumably reflect key factors of the niche, such as climate, topography, geology or land-cover. They produce spatial predictions indicating the suitability of locations for a target species, community or biodiversity.
Different modelling techniques are used to fit the different types of biological information recorded at each sample site: (1) presence-only: only occurrences of the target species are recorded; (2) presence/absence: each sample site is carefully monitored so as to establish with sufficient certainty whether the species is present or absent. With plants, for instance, this is commonly done by exhaustively listing all species present in each sample site. The reliability of absences depends on the species’ characteristics (e.g. biology, behaviour, history) (Hirzel et al., 2001), their local abundance and ease of detection (Kéry, 2002), and the survey design (Mackenzie and Royle, 2005). More rarely, the data record information about species’ abundance or demography (e.g. growth rate, survival).
Although models based on presence-only and presence/absence data provide the same kind of predictions (e.g. habitat suitability scores), they generally cannot be fitted with the same techniques. This is because presence-only methods cannot contrast their predictions with the characteristics of places where the species is absent. This partly explains why presence/absence methods have been developed more extensively. These differences, and the lack of absences, make comparison of the two model types difficult (Zaniewski et al., 2002).
Assessing the predictive power of a model is of paramount importance, for both theoretical and applied purposes. However, while presence/absence models have received much attention and many evaluators are available for them (Fielding and Bell, 1997), the evaluation of presence-only models is lagging behind. There is therefore a crucial need for reliable presence-based evaluation measures, as well as for an assessment of how they compare to the presence/absence measures.
The main problem with presence-only evaluation measures is the lack of absences to counterbalance the presences. It is thus difficult to discriminate a model predicting presence everywhere from a more contrasted model. Attempts to solve this problem have followed two main approaches: (1) the first is to generate pseudo-absences and then apply the standard presence/absence techniques (e.g. Zaniewski et al., 2002, Anderson et al., 2003); (2) the second is to assess how much the model predictions differ from random expectation (e.g. Boyce et al., 2002, Hirzel et al., 2002, Reutter et al., 2003). In this second category, the index recently proposed by Boyce et al. (2002) offers new insights. We tested it thoroughly and derived from it a new evaluator that does not depend on the choice of boundaries between habitat suitability classes. A third, original approach, proposed by Ottaviani et al. (2004), is based on compositional analysis. However, it is restricted to cases where evaluation data are in the form of polygons or large mapping units (e.g. large grid cells in an atlas), and thus does not apply here.
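To make the second approach more concrete, the following Python sketch compares, for each habitat suitability class, the proportion of evaluation presences falling in that class with the proportion expected if presences were spread at random over the map, and then measures with a Spearman rank correlation whether this ratio increases with suitability. This is only an illustrative reading of the general idea, not the procedure used in this paper: the function names, the ten equal-width classes, the skipping of classes absent from the map and the value returned when no trend is measurable are all assumptions made for the example.

```python
from statistics import mean


def _ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def _pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: report 0 when no trend can be measured
    return sxy / (sxx * syy) ** 0.5


def boyce_index(map_hs, presence_hs, n_classes=10):
    """Spearman correlation between class rank and predicted/expected ratio.

    map_hs      -- HS values (0 to 1) of all cells of the map
    presence_hs -- HS values (0 to 1) at the evaluation presence points
    """
    def class_of(value):
        # HS = 1 is assigned to the last class
        return min(int(value * n_classes), n_classes - 1)

    area = [0] * n_classes
    presences = [0] * n_classes
    for value in map_hs:
        area[class_of(value)] += 1
    for value in presence_hs:
        presences[class_of(value)] += 1

    classes, ratios = [], []
    for c in range(n_classes):
        if area[c] == 0:
            continue  # class absent from the map: ratio undefined, skipped
        expected = area[c] / len(map_hs)
        predicted = presences[c] / len(presence_hs)
        classes.append(c)
        ratios.append(predicted / expected)
    return _pearson(_ranks(classes), _ranks(ratios)), ratios


if __name__ == "__main__":
    # Hypothetical data: a map with evenly spread HS values and presences
    # that become more frequent as suitability increases.
    demo_map = [(i + 0.5) / 100 for i in range(100)]
    demo_presences = [(c + 0.5) / 10 for c in range(10) for _ in range(c + 1)]
    print(boyce_index(demo_map, demo_presences))
```

With the hypothetical data above, the ratio rises steadily from the least to the most suitable class, so the correlation is close to 1; a model no better than random would give ratios near 1 in every class and no clear trend. Because the result depends on where the class boundaries are drawn, a fixed-class version like this one is exactly the kind of index that a boundary-independent evaluator is meant to improve on.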
In this paper, we present various presence-only evaluation measures. To validate them, we build 114 presence/absence models chosen for the reliability of their absences and evaluate them with presence-only and presence/absence evaluators. We test correspondence between them and discuss how the new “Boyce indices” can improve the interpretation and utilisation of habitat suitability models.
Materials and methods
We define a habitat suitability (HS) map as composed of cells (or pixels) whose quantitative values range from 0 to 1. These values indicate how close the local environment is to the species’ optimal conditions, with higher values standing for the most suitable areas. This map may result from any statistical analysis (Guisan and Zimmermann, 2000, Pearce and Boyce, 2006). Model evaluation consists of quantifying how accurately the map predicts the presence and absence of the species (
Results
The chosen species cover a wide spectrum of ecological niche types and sample sizes. The quality of their habitat suitability models ranges from very poor to excellent. All the investigated evaluation measures convey similar information, with Pearson correlation coefficients greater than 0.5 in most cases (Table 2a). In particular, for the models with more than 50 presence points available, most evaluators show correlations above 0.7 (Table 2b).
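As an illustration of how such a comparison between evaluators can be set up, the sketch below computes the Pearson correlation coefficient for every pair of evaluators, given one score per model and per evaluator. The evaluator names and all scores are invented for the example and do not come from Table 2; the function names are likewise hypothetical.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def evaluator_correlations(scores):
    """Correlate every pair of evaluators across models.

    scores -- dict mapping an evaluator name to its list of scores,
              one per model, with models in the same order in every list
    """
    return {
        (a, b): pearson(scores[a], scores[b])
        for a, b in combinations(sorted(scores), 2)
    }


if __name__ == "__main__":
    # Hypothetical scores of three evaluators on five models.
    demo_scores = {
        "evaluator_a": [0.55, 0.62, 0.71, 0.80, 0.93],
        "evaluator_b": [0.10, 0.25, 0.30, 0.55, 0.70],
        "evaluator_c": [0.40, 0.35, 0.60, 0.58, 0.75],
    }
    for pair, r in evaluator_correlations(demo_scores).items():
        print(pair, round(r, 2))
```

Restricting the input lists to the models with more than 50 presence points before calling the function would give the kind of subset comparison described above.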
Except for those based on very wide
Discussion
Over the range covered by the 114 studied plant species, and given the environmental characteristics of our study area, all evaluators convey correlated information. This important result means that the presence-only evaluators can be trusted.
Acknowledgements
We wish to thank Mark S. Boyce, Gretchen G. Moisen and all the participants of the Riederalp Workshop, Switzerland, 2004, for stimulating discussions about model evaluation, and Patrick Patthey, Julie Jacquiéry and Pietro Persico for the first explorations of the Boyce index. We also wish to thank Jane Elith and two anonymous reviewers who helped improve this article. We are grateful to Fabien Fivaz for his help with R scripts as well as to all those who contributed to the field work: Pascal
References (40)
- et al. Evaluating predictive models of species’ distributions: criteria for selecting optimal models. Ecol. Model. (2003)
- et al. Relating populations to habitats using resource selection functions. Trends Ecol. Evol. (1999)
- et al. Evaluating resource selection functions. Ecol. Model. (2002)
- et al. Generalized linear and generalized additive models in studies of species distributions: setting the scene. Ecol. Model. (2002)
- et al. Predictive habitat distribution models in ecology. Ecol. Model. (2000)
- et al. Assessing habitat-suitability models with a virtual species. Ecol. Model. (2001)
- et al. Two statistical methods to validate habitat suitability models using presence-only data. Ecol. Model. (2004)
- et al. SPECIES: a spatial evaluation of climate impact on the envelope of species. Ecol. Model. (2002)
- et al. Predicting species spatial distributions using presence-only data: a case study of native New Zealand ferns. Ecol. Model. (2002)
- Maximum likelihood identification of Gaussian autoregressive moving average models. Biometrika (1973)