Population density estimates for terrestrial mammal species

Aim: Population density is a key parameter in ecology and conservation, and estimates of population density are required for a wide variety of applications in fundamental and applied ecology. Yet, in terrestrial mammals these data are available for only a minority of species, and their availability is taxonomically and geographically biased. Here, we provide the most plausible predictions of average population density, their natural variability and statistical uncertainty for 4,925 terrestrial mammal species. Location: Global. Time period: 1970– 2021. Major Terrestrial mammals. Methods: We fitted an additive mixed- effect model accounting for spatial and phylogenetic autocorrelation


| INTRODUC TI ON
Population density (the number of individuals per unit area) in terrestrial mammals varies greatly, by about two orders of magnitude within species and by about eight orders across species, ranging from c. .0001 individuals/km 2 in large carnivores to >50,000 individuals/ km 2 in small rodents . Understanding the drivers of this variation has been a key objective of macroecology (Brown & Maurer, 1989), and gaining predictive ability on this key population parameter will find many applications in conservation science.
The variation of population density across mammal species is largely explained by their body mass and diet (Damuth, 1981;Silva et al., 1997), which relate to metabolism and energy demand (Damuth, 1987;Smith et al., 2003), and diet-related differences in assimilation efficiency, resource availability and accessibility (Carbone & Gittleman, 2002;Jetz et al., 2004;Robinson & Redford, 1986;Silva et al., 1997). Wild mammal population densities also vary over space, within and across species, as a function of climate (Currie & Fritz, 1993;Santini, Isaac, Maiorano, et al., 2018), primary productivity (Pettorelli et al., 2009;Santini, Isaac, Maiorano, et al., 2018), prey abundance (Carbone & Gittleman, 2002;Hatton et al., 2015), habitat (Roseberry & Woolf, 1998;Šálek et al., 2014) and anthropogenic impact Tucker et al., 2020). Although at a large scale the average density varies as a function of species traits and environmental factors (Santini, Isaac, Maiorano, et al., 2018), populations also exhibit variable densities within the same location over time, which can result from predator-prey and demographic cycles (Korpimäki & Krebs, 1996;Oli, 2003) and from environmental or demographic stochasticity (Gaillard et al., 1998;Shanker & Sukumar, 1999). It has been argued that the range of possible population densities per species (termed "population scope") is bounded between a minimum, determined by the species' maximum home range size and travel distances, and a maximum, determined by primary productivity and resource availability (Stephens et al., 2019).
Population density also plays a key role in conservation because the reciprocal of density is considered one of the seven forms of rarity that are directly relevant to species vulnerability to extinction (Rabinowitz, 1981;Sykes et al., 2020), and population size is one of the key parameters used to assess the conservation status of species (IUCN, 2017).
Although a fundamental parameter in ecology and conservation, density estimates are rare and sparse across space and time . This is because robust estimates of mammalian population density can require long field seasons (from months to years) and can be highly expensive in terms of human resources (field assistants), logistics (travelling to the study area, obtaining permits and visas, vaccinations, accommodation, etc.) and equipment (camera traps, genetic analyses, flights for aerial counts, etc.) (Barea-Azcón et al., 2007;Liberg et al., 2012). Surveys may have to be conducted in areas spanning from hundreds of square metres (e.g., mice or voles; Castañeda et al., 2018) to thousands of square kilometres (e.g., tigers; Tempa et al., 2019), depending on the species' average density. Preferred sampling methods are species and habitat dependent. For example, aerial counts are often used to estimate population density over vast areas and for large animals, and are possible only in open habitats (savannas, grasslands or tundra; but for the use of thermal imaging in forests, see Spaan et al., 2019). Furthermore, population density estimates have a limited temporal validity for conservation management, because populations fluctuate over time. It becomes fundamental, therefore, to develop approaches to obtain preliminary estimates in a more efficient manner.
There have been multiple calls to move from a more "explanatory" to a more "predictive" ecology and conservation (Currie, 2019;Getz et al., 2018;Mouquet et al., 2015;Travers et al., 2019). Predictions allow us not only to test whether our understanding of natural mechanisms is solid and sufficient to explain what we observe (Currie, 2019), but also to inform conservation management directly and move to a more proactive conservation (Travers et al., 2019).
Estimates of population size over vast areas can be vital for a number of conservation applications, such as assessments of the Red List status of species at risk of extinction Yanqing et al., 2020), estimating the suitable size of protected areas to ensure species persistence (Clements et al., 2018;Santini et al., 2016) and applying population targets for conservation (Clements et al., 2011;Di Marco et al., 2016;Hilbers et al., 2017;Sanderson, 2006;Traill et al., 2010). Most large-scale conservation analyses consider only distribution data because of its broader accessibility, but given that abundance is unrelated to both geographical range size and probability of presence (Dallas & Hastings, 2018;Novosolov et al., 2017) and that population density changes substantially across taxonomic groups (Santini, Isaac, Maiorano, et al., 2018), ignoring density can introduce taxonomic biases in conservation assessments and planning (Di Santini et al., 2014).
Many studies have addressed the lack of data using published density estimates of the same or similar species (e.g., Sommer et al., 2002;Yanqing et al., 2020) or body mass-density allometries . Only a few well-studied species have sufficient data to develop species-specific predictive abundance K E Y W O R D S abundance, block cross-validation, body mass, carnivory, locomotor habits, predictive ecology, uncertainty estimation models (e.g., Sus scrofa, Lewis et al., 2017;Panthera onca, Jędrzejewski et al., 2018). A way to address limited data availability for most species is to develop cross-species models that can predict the average expected population density by considering species traits and environmental conditions, together with a predictive interval that reflects our level of confidence in the predictions. The interval should encompass the range of plausible population density values owing to statistical uncertainty, but also varying degrees of competition, predation, resource availability, anthropogenic pressure and demographic fluctuations that cannot be predicted by the model. A measure of uncertainty around average predictions is fundamental to interpret their applied value in different contexts. A first attempt was made by Santini, Isaac, Maiorano, et al. (2018), who explored several drivers of geographical variation in population density and demonstrated a good potential for large-scale predictions. However, Santini, Isaac, Maiorano, et al. (2018) tested the predictive ability of models using a random split-sample approach, ignoring spatial and phylogenetic autocorrelation in the data (Roberts et al., 2017;Yates et al., 2018). A robust validation of model transferability is needed to make a comprehensive assessment of model applicability to different poorly known taxa and undersampled areas. Finally, Santini, Isaac, Maiorano, et al. (2018) provided neither population density predictions nor uncertainty estimates.
Here, we provide predictions of the average population densities for 4925 extant terrestrial mammal species listed under the IUCN Red List and complement these with the range of possible density values, accounting for intraspecific variability and statistical uncertainty. We capitalize on an updated version of the TetraDENSITY database, the most comprehensive collection of population density data in terrestrial vertebrates, with 18,297 estimates for 877 terrestrial mammal species in the most recent version of the database . We aim to provide the best possible density estimates for mammals based on the available data to date.
Specifically, we predict species population density as a function of species body mass and diet, phylogeny, environmental variables and geographical distribution, and we assess the transferability of model predictions across mammal taxa and space. The empirical estimates and predictions for species, together with predictive models and all code necessary to replicate the analyses, are provided to the scientific community for their application in macroecological, conservation and community ecology studies. These estimates should be seen as empirically derived priors of animal densities to be used in further studies, not as estimates of real local density.

| Overview of the methods
In order to predict the average density of mammal species and assess the accuracy of predictions, we followed five main steps presented in the Supporting Information ( Figure S1). First, we collected, filtered and aggregated density data (Section 2.2; Supporting Information Figure S1a). Second, we derived reference values for prediction errors by estimating intraspecific variability in empirical population density estimates (Section 2.3; Supporting Information Figure S1b).
Third, we modelled the average population density of species as a function of their body mass and diet, phylogeny, environmental variables and geographical distribution (Section 2.4; Supporting Information Figure S1c). Fourth, we validated statistical models with taxonomic and spatially independent samples to assess the transferability of model predictions across mammal taxa and space and estimated the average error made, depending on the degree of extrapolation beyond fitted data (Section 2.5; Supporting Information Figure S1d). Finally, we predicted average population densities of terrestrial mammal species and the range of plausible values around the predictions using the estimated prediction error (Section 2.6; Supporting Information Figure S1e).

| Data collection
We extracted the population density estimates from an unpublished extended version of the TetraDENSITY database, which includes >35,000 density estimates for >3,000 species of terrestrial vertebrates . We extracted data for all mammals reported in the database (n = 18,297; 877 species), together with information on the sampling method used in each mammal survey (Supporting Information Figure S1a). We excluded (1) estimates for which the sampling method was not recorded, (2) estimates referring to non-native species, (3) estimates obtained before 1970, and (4) and estimates whose spatial coordinates were central to a wide region. Considering the widespread effect of humans on the environment and animal populations and the fact that c. 75% of the global terrestrial surface is modified by humans to some extent (Venter et al., 2016), an exclusion of non-natural population densities is not possible realistically. We retained estimates of species collected throughout the year depending on their availability, in order to approximate annual species averages. However, to avoid retaining pseudo-replicates in the model (e.g., monthly or seasonal estimates, or different habitats within the same study area), we averaged all estimates of the same species collected in the same location (set of coordinates provided in the study) and using the same sampling method.
A number of factors can influence population density in mammals; however, given the predictive goal of the paper, here we restricted the selection to variables that had complete data for most species and for which previous theoretical expectations and/or empirical exploration existed. Body mass and diet have been investigated widely as drivers of population density variation in mammals (Silva et al., 1997). In allometry studies focused on mammal density, species have typically been categorized into broad diet categories (e.g., herbivores, omnivores and carnivores; Currie & Fritz, 1993;Peters & Wassenberg, 1983;Santini, Isaac, Maiorano, et al., 2018), although other studies have shown that finer diet categories (e.g., insectivores, granivores and frugivores; Robinson & Redford, 1986;Silva et al., 1997) show consistent differences in the allometry between body mass and diet. However, because the classification of mammal species into clear-cut diet categories is non-trivial (Kissling et al., 2014;Wilman et al., 2014), we decided to use percentages of item consumption as continuous predictor variables. We extracted body mass and the percentage of seven dietary items for all species from the EltonTraits database (Wilman et al., 2014). We then used the percentage of carnivory (sum of fish, ectotherms, endotherms and unknown categories), insectivory, scavenging, nectarivory, frugivory, granivory and consumption of other plant items (herbivory/folivory). Given that different locomotor habits involve different mortality rates (Healy et al., 2014;Shattuck & Williams, 2010) and imply foraging in twoor three-dimensional habitats , we also included locomotion as a predictor for population density using the following categories: terrestrial, fossorial, arboreal, semi-arboreal and aerial. We classified all mammal species into these categories, searching per family, genera or species, based on diverse sources, including the Mammalian Species series (https://www. mamma lsoci ety.org/publi catio ns/mamma lian-species), and Animal Diversity web (https://anima ldive rsity.org/) and the Handbook of Mammals of the World series (Wilson & Mittermeier, 2011).
Finally, we included two environmental variables that were important determinants of population density in previous studies on mammals: precipitation seasonality (coefficient of variation of monthly precipitation values) and net primary productivity expressed as MODIS Normalized Vegetation Difference Index (NDVI) (Pettorelli et al., 2009;Santini, Isaac, Maiorano, et al., 2018), which are proxies for annual resource variability and energy availability, respectively. Precipitation can itself be a resource, but it also determines the availability of resources that can, in turn, influence demographic fluctuations (e.g., Lima & Jaksic, 1999;Madsen & Shine, 1999). Yearly resource fluctuations can increase mortality and limit seasonal carrying capacity, and have been shown to affect mammal densities negatively (Santini, Isaac, Maiorano, et al., 2018).
In turn, primary productivity can show a positive relationship at the intraspecific level (Pettorelli et al., 2009), but has been shown to yield a nonlinear relationship across species globally (Santini, Isaac, Maiorano, et al., 2018). Precipitation seasonality was downloaded from WorldClim v.2.0 (Fick & Hijmans, 2017) and covers the period 1970-2000, whereas NDVI was downloaded from the NASA Earth Science Data System program (https://lpdaa csvc.cr.usgs.gov/) and averaged for all years available . Environmental variables were resampled at 1° resolution to accommodate the maximum uncertainty of spatial coordinates in density estimates. Although the environmental data do not perfectly match the period considered for the density estimates (1970-2021), we considered them sufficiently representative of the global variation of climate and primary productivity.
All data sources for the density estimates included in the dataset are listed in the Supporting Information (Appendix S1). The codes and datasets used to run the analyses are accessible at: https://doi.

| Empirical estimates of intraspecific variability
For all mammal species listed in TetraDENSITY, we provide the average across all density estimates for each single species (n = 689), plus an estimate of intraspecific variability (Supporting Information Figure S1b). An estimate of intraspecific variability serves as a reference value for interpreting model predictive errors. In fact, even in an ideal model the predictive errors are expected to be high because predictions of average density are compared with empirical estimates, which might not reflect the long-term local average density.
Although averaging multiple empirical estimates per location can help to reduce this temporal noise, these estimates can still deviate substantially from the local average density. Hence, comparing variability in empirical estimates with predictive error enhances the interpretability of the validation results.
We first estimated the mean and SD of empirical estimates per species, then used the ln-mean and ln-SD to draw 1,000 random values from a log-normal distribution. We summarized this distribution using the interquartile range (25-75%) and the 95% confidence interval (95% CI). Variability was estimated only for species with a minimum sample size, in order to encompass natural variability as accurately as possible. To assess the minimum number of estimates required to estimate intraspecific variability, we fitted a locally weighted smoother model (loess) between the log 10 -transformed sample size per species and the coefficient of variation (Supporting Information Figure S2). Although the underlying data show large variation, the fitted model indicates that the coefficient of variation increases rapidly until an inflection point at a sample size of c. 30, above which the addition of further estimates increases the coefficient of variation at a lower rate (Supporting Information Figure S2).
To avoid underestimating the coefficient of variation because of a small sample, and considering that a minority of species have larger samples, we set the minimum sample size at 30. Given that not all species had 30 or more density estimates, we used a hierarchical approach. For species with >30 estimates, we used the SD at the level of the species. For all other species, we estimated SD at the level of each species with >30 estimates within the same genus, family or order, and we calculated the pooled SD (Thalheimer & Cook, 2002) using the formula: where n is the size of each sample, and k is the number of samples. This gives more weight to SDs based on larger samples, which gives more weight to SDs based on larger samples.
This hierarchical approach assumes that the intraspecific variability in population density is, in part, a biological property that depends, for example, on whether the species undergoes demographic cycles (e.g., rodents) or belongs to a trophic level that tends to have more variable densities. The estimated variability can also represent errors in empirical estimates, which can depend on the type and diversity of methods typically used to estimate the population density of a particular group of species. In 124 cases, there were no species with sufficient sample within the same order (e.g., Pilosa and Peramelemorphia), hence we used the pooled SD across all species with >30 density estimates.

| Model fitting
We used a generalized additive mixed-effect model (GAMM) to predict the population density of species with no empirical estimates (Supporting Information Figure S1c). Population density was log 10transformed before model fitting. We included species as a random effect to control for pseudo-replication, and an additional random effect for the sampling method, encompassing broad categories of methods for estimating population density: censuses ("complete" counts, which assume full detection of individuals), distance sampling, home range extrapolations, mark-recapture, N-mixture models, random encounter models, incomplete counts (e.g., strip transects) and trapping (removal methods, which indicate the minimum number known to be alive)  To control for phylogenetic non-independence in the estimates, we extracted 1,000 random trees from the phylogeny in the study by Upham et al. (2019) and generated a majority-rule consensus tree, setting the frequency with which each clade or bipartition is encountered at .8. We then computed a principal components analysis on the phylogenetic distance matrix of the consensus tree and extracted the first two eigenvectors, which explained >98% of the total variance. We subsequently used these two eigenvectors as fixed factors in the model. Using the phylogenetic eigenvectors is preferable over phylogenetic least squares approaches for our purpose, because it allows the use of the phylogeny for predictions, hence accounting for latent trait variables (i.e., traits not included in our model).
To control for spatial autocorrelation in the model residuals, we carried out a trend surface analysis using the tensor product interaction between the values of coordinates (northing and easting; Fletcher & Fortin, 2018). The GAMMs automatic shrinkage ensures that this trend surface does not distort estimates of the other predictors. Although other, arguably more sophisticated approaches are available to control for spatial autocorrelation (e.g., autoregressive models), this approach has the advantage of accounting for additional latent geographical factors that have not been taken into consideration while being computationally tractable given our large sample size.
Body mass was log10-transformed and dietary variables were logit-transformed after rescaling values from .1 to .99 before fitting the model, and all variables were scaled and centred before model selection. We allowed for nonlinearity using smooth terms for all continuous predictors (body mass, diet and phylogenetic eigenvectors), using four knots for body mass (which is known to show a cubic relationship with population density) (Santini, Isaac, Maiorano, et al., 2018;Silva & Downing, 1995) and three knots for all other predictors. The model was simplified as part of the fitting process by adding a parameter penalty that attempts to shrink smooth terms towards zero, using the "select" argument of the "bam" function (Wood & Wood, 2015). Further, we fitted the model using the restricted maximum likelihood (REML) method. Based on this model, we assessed the relative importance of each predictor by partitioning the variance explained across predictors. We estimated the squared correlation coefficient between the prediction of a model with one variable kept at its mean and the observed data. Each of these estimates was substracted from the full variance explained, in order to estimate the relative importance of each variable. Relative importance was expressed as a percentage.

| Model validation
The prediction interval for species included in the training dataset was obtained by using the SEs associated with the predictions. To estimate the prediction error for species not included in the training dataset or for novel geographical areas, we used several validation techniques that address taxonomic similarity and spatial autocorrelation in the estimates (Supporting Information Figure S1d). For the selected model, we estimated the root mean square error (RMSE) using a taxonomic and geographical block cross-validation procedure (Roberts et al., 2017). The taxonomic blocking was applied to taxonomic orders, families and species. Then we ran two types of spatial-block cross-validations. First, we divided the globe into 5° × 5° geographical blocks and then grouped the blocks into 10 folds.
The size of the blocks ensures spatial independence of the data and avoids training the model on data that overlap spatially with those used for testing. We then fitted the model iteratively on all folds but one and validated on the left-out fold. Second, we divided the dataset into biomes using the map from Olson et al. (2001) and iteratively see Section 2.6).

| Model predictions and uncertainty estimation
We used the selected model fitted on the full dataset for the final predictions and estimated the 75% and 95% prediction intervals using the SE and RMSE values of the model obtained by validation (Supporting Information Figure S1e). Both SE and RMSE values are on a log 10 scale because population density was log 10 -transformed before model fitting. We used the RMSE obtained by (1) Figure S1e). Finally, we estimated the mean bias estimate (MBE; average deviation between predicted and observed) using predictions based on the full dataset to assess possible prediction biases per taxonomic group or biome.
All analyses were conducted in R v.4.0.3 (R Core Team, 2020) using the packages "mgcv" (Wood & Wood, 2015), "raster" (Hijmans & van Etten, 2014), "ape" (Paradis & Schliep, 2019) (Figure 1a,b). Most included species were primates, rodents, cetartiodactyls and carnivorans (Figure 1a,b). The included species covered the entire range of body mass of terrestrial mammals but were biased towards large species (Figure 1c). We provide average population densities for 689 species based on empirical data (Supporting Information Table S1). Among these, the variability was calculated at the level of species for 85 species, at the level of family for 177 species, at the level of order for 327 species and across all species for 100 species.

| Drivers of population density
The selected model included all variables apart from nectarivory, frugivory and herbivory. The fixed effects explained 55.4% of variance (marginal R 2 ), and the fixed and random effects together ex- Furthermore, the two phylogenetic components were positively related to average population density (Supporting Information Figure S3), whereas the spatial coordinates exhibited a complex nonlinear pattern (Supporting Information Figure S4).

| Intraspecific variability and model prediction errors
Empirical density estimates within species varied 3.7 times on average (interquartile range = 2.9-to 5.0) from the median values ( Figure 4). Predictions of average population density for species and geographical areas included in the dataset deviated by 1.4 times on average (1.2-1.6) from the median values, those for novel areas c. 5.9 (5.6-6.9) and those for novel biomes c. 6.5 (4.9-8.6).
Predictions for species not included in the dataset (but belonging to families included) deviated by 4.7 on average (2.6-10.1), for families not included in the dataset by 6.9 (4.2-11.3), and those for orders not included by 8.5 times (5.6-14.9; Figure 4). Overall, the predictive accuracy achieved was higher than in previous modelling attempts using the same dataset (Supporting Information Appendix S2).

| Population density predictions
We provide model predictions of average population densities for 4,925 species ( Figure 5; Supporting Information Table S1).
Predictive interval estimates were obtained using the SEs in 700 species, by species-block cross-validation for 3,701, family-block cross-validation in 461 species and order-block cross-validation for 63 species (Supporting Information Figure S1e). All empirical averages, variability estimates and model predictions are provided as Supporting Information (Table S1).
We found that predicted densities varied across orders roughly following the range of body mass and diet included in each taxon ( Figure 5). We predicted the highest densities for rodents (c. 10,000 individuals/km 2 ) and the lowest for carnivores (c. .01 individuals/ km 2 ). The magnitude of variation in predicted densities spanned one to five orders for some taxonomic groups (e.g., Cetartiodactyla, Carnivora and Rodentia). Our density prediction errors show no consistent pattern across different biomes (Supporting Information Figure S5). It does, however, show consistent biases for some taxonomic groups. For example, species in the order Paramelemorphia (bandicoots and bilbies) tend to be overpredicted. Likewise, our models tend to overpredict densities for species in the families Geomydae (e.g., gophers), Camelidae (e.g., camels and guanacos), Paramelidae (bandicoots) and Pteropodidae (flying foxes).

Conversely, the families Erethizontidae (New World porcupines)
and Eupleridae (Madagascan carnivores) are consistently underpredicted. This might indicate that the density of these groups might be influenced by factors that are not included in our model. For most groups, however, species could be either over-or underpredicted, with no systematic bias to either consistently higher or lower densities when compared with empirical estimates (Supporting Information Figure S5).

| DISCUSS ION
In this study, we provide predictions of average population densities for all terrestrial mammal species, considering their biological traits, environmental conditions, phylogenetic relatedness and geographical location. Acknowledging that population density can vary substantially within species, we also provide a range of plausible density  Figure S6-S7).

| Drivers of average population density
Our results concur with previous studies regarding the effect of body mass and diet on population density as the primary drivers of density across a wide range of taxa. The nonlinearity in log-domain between body mass and density in mammals was first described by Silva and Downing (1995) and later supported independently by studies addressing the allometry of home range size (Kelt & Van Vuren, 2001). Very small mammals (e.g., shrews) have extremely high Conversely, above a certain body size, foraging over a very large area (or defending a territory) as predicted by a linear relationship becomes energetically unsustainable (Kelt & Van Vuren, 2001). We found that animal-based diets were associated with lower density values, as expected given the lower availability and sparse nature of trophic resources and the costs involved in searching and subduing prey (Carbone & Gittleman, 2002;. Likewise, the percentage of granivory was also associated with lower density values, which reflects the nature of seeds as resources. Seeds are temporally and spatially variable in abundance in comparison to other plant material and are better (mechanically and chemically) defended than other plant parts (Hulme & Benkman, 2002). Although seeds have a high energy content, this might not compensate for the time and energy costs of foraging by specialized granivores (i.e., rodents) in comparison to other herbivores that forage on lowerquality but remarkably more abundant vegetation. Furthermore, temporal fluctuation in seed availability contributes to the marked demographic cycles observed in granivores (Hansson, 1998). Hence, although the population density of granivores can vary substantially through time and space, their average is expected to be lower than that of other similar-sized herbivorous species.
Our results also show a novel pattern regarding locomotor habits, with aerial species living at lower population densities than terrestrial and arboreal species and with fossorial species living at higher densities. Aerial, arboreal and fossorial species are generally characterized by lower mortality rates than terrestrial species owing to lower predation rates (Healy et al., 2014), which, all else being equal, might lead to higher densities because of the altered balance between birth and mortality rates. Fossorial mammals are also characterized by lower metabolic rates than similar-sized species, which might also explain their higher densities (McNab, 1979). Volant mammals (i.e., bats), in contrast, are characterized by smaller reproductive outputs and slower reproductive rates than other similar-sized mammals (Barclay & Harder, 2003). The density of bat species might also be limited by the availability of roost sites (e.g., caves, rock crevices and tree cavities). Finally, although the resources of frugivorous bats might be comparable to those of other frugivorous mammals, insectivorous bats rely on different trophic resources from terrestrial insectivorous species. The reproductive strategy and availability of roosts and resources might explain why bats exhibit lower average density than insectivorous species of a similar sized, despite their lower mortality on average. However, a minority of estimates in the database refer to aerial species, and only relatively few density estimates are available for fossorial species; in both cases, the sample is not taxonomically representative, hence uncertainty around F I G U R E 4 Comparison of error estimates. The pooled SD of species-level estimates is used as a reference point representing the average variability in recorded population density estimates within species. Model's predictions standard errors (SEs) indicate the average error for the predictions of species included the training dataset. The root mean square error (RMSE) estimates obtained by block cross-validation indicate the average error made when extrapolating to novel geographical areas or different taxa. Spatial-block = 5° × 5° geographical areas not included in the training dataset; species-block = species not present in the training dataset but belonging to families present in the training dataset; family-block = species belonging to families not present in the training dataset but belonging to orders present in the training dataset; orderblock = species belonging to orders not present in the training dataset. Finally, our model also supports previous findings on the negative effect of precipitation seasonality on mammal densities (Santini, Isaac, Maiorano, et al., 2018), which might be attributable to higher seasonal mortality and seasonal resource bottlenecks, and a nonlinear relationship with primary productivity with lower average density in highly productive areas, which might reflect the higher competitive and predatory pressure owing to the higher number of species in these regions (Santini, Isaac, Maiorano, et al., 2018).

| Uncertainty in model predictions
Our predictions exhibited variable amounts of uncertainty, which depended on the degree of taxonomic extrapolation. Predictions of average density for species and areas included in the dataset have relatively small error, substantially smaller than the error one would make by assuming that a single density estimate from the literature can be representative of the average density of a species. Instead, the error associated with the prediction for species in families with density estimates included in the training dataset is comparable to the observed intraspecific variability. On the contrary, the predictions for species belonging to families or orders not included in the training dataset yield considerably higher errors, although they are plausibly the best estimates available to date. In general, average population densities can be informative for species for which no family or order extrapolation was made, but focusing on predictive intervals is safer for family and order extrapolations. The error of extrapolating to novel areas is higher than that for intrafamily extrapolations (to new species) but lower than the error associated with extrapolations to families or orders that were not included in the training datasets (e.g., Monotremata). Yet, the density estimates on which the model is fitted encompass all biomes and a large portion of dry lands, except for large deserts and the northern Asiatic continent; therefore, the spatial errors of most predictions are expected to be lower than those estimated by spatial-block crossvalidation. All in all, although the error of our density predictions F I G U R E 5 Distribution of population density predictions per taxonomic order. Coloured dots are used for orders with insufficient predictions to produce a kernel density. Black triangles represent the averages of empirical estimates for all species in the dataset can exceed one order of magnitude in poorly known species (thus being uninformative for many applications), the range of density estimates across all species span more than eight orders of magnitude.
Hence, focusing on distribution data only, while not accounting for the fact that species can vary enormously in their average population density, can be more problematic for estimating population protection and/or persistence (e.g., Santini et al., 2014) Obviously, all applications that can benefit from such estimates are those that do not require a real quantification of density, population size or biomass, but consistent and evidence-based estimates of how species are expected to differ in these respects. In the next section, we illustrate a variety of applications that might benefit from integrating such density estimates. Some of these cases typically ignore how density varies across species and rely only on distribution data, whereas others have integrated such information but using simpler approaches.

| Applications
Although acknowledging that many wildlife management applications require more accurate estimates from ad hoc field surveys (which will hardly ever be replaceable by model predictions), costefficient methods for estimating the potential size of populations over vast areas can find a broad array of applications in ecology and conservation. As in previous studies relying on different approaches (e.g., Fechter & Storch, 2014;Galaverni et al., 2016), our estimates allow us to estimate a range of plausible population abundance given a predefined area. Given that defining a distribution area shares its amount of uncertainty (Fourcade et al., 2018;Norberg et al., 2019;, a hierarchical bootstrapping approach can com- biomass estimates. Furthermore, species-specific abundance maps based on our density predictions and their uncertainty can be combined to infer community-level metrics, such as species diversity (alpha, beta and gamma) and community size (Wang et al., 2021), providing valuable information for understanding the facets of regional biodiversity. Other applications include the calculation of species abundance distributions (SADs) at broad spatial scales (e.g., ecoregions), which might provide clues into the evolutionary underpinnings of biodiversity (Fukaya et al., 2020), and the exploration of phylogenetic patterns of species abundance across taxa (Pie et al., 2021). Finally, estimates of average population density can potentially also be used to improve estimates of large-scale trophic web structures (i.e., meta-webs), which currently rely on only species distribution data and binary species interactions (Braga et al., 2019;O'Connor et al., 2020).
Estimation of the average expected size of a population over a large area can be used for preliminary assessments for the management of human-wildlife conflict and problematic species (Fechter & Storch, 2014;Galaverni et al., 2016;Lewis et al., 2017), disease control (Cheeseman et al., 1985;Jaenson et al., 2012) or species of conservation concern (Jędrzejewski et al., 2018;Okello et al., 2015;Tempa et al., 2019). Likewise, such predictions can find application in large-scale and multispecies conservation analyses. For example, population density estimates can be used to set baselines for conservation, to estimate the change in abundance from a pristine condition (Rodrigues et al., 2019) or to set targets for population reintroduction in the absence of more accurate information (e.g., introductions of functional counterparts of locally extinct species; Svenning et al., 2016). Other conservation-oriented applications include the combination of population density estimates with population targets to estimate minimum required areas for conservation (Hilbers et al., 2017) or their application for setting spatial conservation targets (Di . They can also be used to perform large-scale assessment of protected areas that accounts for the diversity of estimates of abundance per unit area across the set of species considered (Clements et al., 2018;Santini et al., 2014Santini et al., , 2016. For example, some studies have shown that overlooking differences in species population density consistently over-estimates the protection level of protected area networks for species living at low density (Clements et al., 2018;Santini et al., 2014Santini et al., , 2016. Conservation planning should also rely on the best possible data available to derive plausible targets of population persistence (e.g., by combining habitat suitability and density estimates) (Di Flather et al., 2011;Hilbers et al., 2017;Pressey, 2004;Traill et al., 2007). For example, approximate density estimates have been used for largescale conservation planning exercises (de Oliveira et al., 2009) and to estimate population viability at the landscape level and identify management options (Akçakaya et al., 2004;Carroll et al., 2003).
Large-scale assessments of species conservation status have mostly relied on geographical range maps and expert-based suitability models (e.g., Bird et al., 2012;Buchanan et al., 2008;Tracewski et al., 2016). Predictive models of population density can inform Red List assessments for species lacking abundance data (Cazalis et al., 2022;Santini et al., 2019). However, given that under the Red List criteria the conservation status of a species depends on the criterion that indicates the most threatened category, the application of criteria C and D using large-scale abundance estimates can be very informative at identifying species that would not be categorized as threatened by considering their distribution only . For example, a species with an area of occupancy >2,000 km 2 is not considered threatened according to criterion B. However, a large carnivore with a density of .01 individuals/km 2 would be expected to have a population of c. 40 individuals in an area of 4,000 km 2 , which would be sufficient to be listed as "critically endangered" according to criterion C. Estimates of average population density have also been used to project global biodiversity indicators into the future  and are thus informative for global biodiversity assessments and forecasting population trends in the future when combined with species distribution models.

| Caveats and future research lines
In this study, we modelled the average population density of mammal species as a function of both species characteristics and climatic variables, in line with previous studies (Currie & Fritz, 1993;Pettorelli et al., 2009;Santini, Isaac, Maiorano, et al., 2018). for spatially explicit predictions. The development of spatially explicit species-specific models for data-rich species would be preferable in this respect (e.g., Jędrzejewski et al., 2018;Lewis et al., 2017).
Our estimates are likely to be influenced by anthropogenic pressure, both because densities have been modified depending on how animals are positively or negatively impacted by humans (Said et al., 2016;Šálek et al., 2015;Tucker et al., 2020) and because some species might typically be studied in more pristine environments, whereas others are typically studied in human-dominated areas (e.g., badgers and foxes; Schley et al., 2004;Soulsbury et al., 2007;Van Apeldoorn et al., 2006). For example, some species of carnivore might occur at higher densities in disturbed habitat than in their natural habitat (Šálek et al., 2015), whereas highly threatened species, such as pangolins, might occur at lower densities because they are disproportionately persecuted (Ingram et al., 2017). In a way, the predictions we make reflect the taxonomically biased knowledge on animal abundance. It is, for example, conceivable that predictions for species in functional groups with generally higher extinction risk (e.g., Davidson et al., 2009) are biased to lower densities than those expected in pristine conditions (Santini & Isaac, 2021). Integrating variables of human impact in predictive models would be problematic for extrapolation to different areas because species cam show diverse responses to human impact and because the overall positive effect at the global scale is likely to arise from a filtering effect, whereby sensitive species are absent in highly anthropogenic areas, whereas resilient species thrive as a result (Tucker et al., 2020).
Although different techniques are deemed appropriate and used consistently to estimate densities in different taxa, they might influence model predictions to some extent. For example, the differences detected between aerial, terrestrial or fossorial mammals might, in part, reflect a sampling artefact. Further research on such differences and their underlying causes is needed.
Our results also highlight future avenues of research in the ecology and macroecology of population density. First, they highlight priorities for data collection, particularly in Paucituberculata, Notorycteromorphia, Monotremata, Tubulidentata, Afrosoricida and Chiroptera. Even a few empirical estimates for these groups might substantially improve our predictive ability. Second, they highlight knowledge gaps about species-specific drivers of variation in population densities. To ensure model predictions of all species, here we focused on biological variables with complete information, namely body mass, diet and locomotor activity. However, the resolution and quality of these variables could be improved in the future.
For example, expressing diet as percentage of item consumption is definitely an improvement compared with coarse diet categories, but still ignores that in some species diet can vary substantially across habitats. For example, the CarniDIET database (Middleton et al., 2021) contains several locally recorded diets for >100 species of carnivores, therefore allowing assessment of how local prey availability and diversity can influence predator population densities. Furthermore, life-history traits (e.g., litter size and sexual maturity age) and social (e.g., group size), reproductive (e.g., monogamy vs. polygamy) and territorial behaviour are unexplored factors that might also explain why some taxa tend to deviate substantially from model predictions. Data on such factors are limited and prevent the development of better predictive models for all mammals. Yet, data on a minority of taxa might be available and could allow a first theoretical exploration that might indicate the necessity of expanding our ecological knowledge on that particular trait. For example, it has recently been shown that species with large brain masses relative to body mass tend to live at lower densities owing to higher energetic costs (Gonzalez-Suarez et al., 2021); however, data on brain mass are available for only a limited number of mammal species, pointing out that measurement of brain sizes in other species could definitely open up novel research lines in macroecology and, eventually, increase our predictive capacity.
This study provides a comparative overview of population densities in terrestrial mammals that can find applications in many ecology and conservation studies. The quality of our estimates is contingent on the data available and, as more data are collected, we will be able to provide more accurate and reliable predictions of species population density. Our approach can be applied to other taxa for which a sufficiently representative set of density estimates is available. Future studies might explore spatially explicit approaches to predict population density in species for which many empirical estimates are available (Jędrzejewski et al., 2018;Lewis et al., 2017).

DATA AVA I L A B I L I T Y S TAT E M E N T
All datasets and codes used for the analyses are accessible at: