Digital mapping of GlobalSoilMap soil properties at a broad scale: A review

, collecting representative environmental covariates, improving the performance and interpretability of advanced spatial predictive models, relating performance indicators such as accuracy and precision to cost-benefit and risk assessment analysis for improving decision support; moving from static DSM to dynamic DSM; and providing high-quality, fine-resolution digital soil maps to address global challenges related to soil resources.

To address these global and regional concerns, the demand for up-todate and updatable soil information capable of supporting decisions at every level is increasing (Sanchez et al., 2009).Conventional soil mapping relies on soil survey; is labour intensive, time-consuming, and expensive; and heavily relies on experts' knowledge (e.g.McBratney et al., 2003;Grunwald et al., 2011;Sanchez et al., 2009;Arrouays et al., 2014a;Minasny and McBratney, 2016;Zhang et al., 2017).For a nonsoil scientist, these soil maps are difficult to interpret and use for decisionmaking in land management (Sanchez et al., 2009) because they are mostly based on taxonomic classification rather than quantifying soil properties.In response to these challenges, digital soil mapping (DSM, McBratney et al., 2003) has emerged over the last two decades to predict soil properties by integrating soil survey data, geographic information systems, geostatistics, terrain analysis, machine learning, remote sensing, and high-performance computing (Minasny and McBratney, 2016;Arrouays et al., 2017a).
As summarised by Lagacherie and McBratney (2006), "DSM is the creation and population of spatial soil information systems by numerical models inferring the spatial and temporal variations of soil types and soil properties from soil observation and knowledge from related environmental variables".This numerical or quantitative approach was trialled in the 1990s by McKenzie and Austin (1993), who proposed an environmental-correlation approach.The concept of DSM was later formalised as a framework for making digital soil maps, rooted in Jenny's five soil-forming factors (climate, organisms, relief, parent materials, and time; Jenny, 1941).The factors were developed into the Scorpan-SSPFe (soil spatial prediction function with spatially autocorrelated errors) framework (Eq. 1) by McBratney et al. (2003) for a quantitative spatial prediction.
S a or S c = f (s, c, o, r, p, a, n) + e (1) where S a and S c represent the soil attributes and soil classes, respectively.s refers to soil information; c refers to climate; o refers to organisms, vegetation, fauna or human activity; r refers to relief; p refers to parent material; a refers to age or time factor; n refers to spatial or geographic position; and e are spatially correlated residuals.Most of these variables are spatially and temporally explicit.

Scientific activity
The use of DSM has increased rapidly since the 1 st Global Workshop on Digital Soil Mapping organised in Montpellier, France, in 2004.The workshop was held biennially between 2006 and 2016 (Table 1).The idea of a global grid of soil properties emerged at the 2 nd Global Workshop on Digital Soil Mapping held in Rio de Janeiro, in 2006, as a solution to the increasing need for soil information to address global challenges.This workshop culminated in the establishment and development of the GlobalSoilMap.netProject, the name of which was changed to GlobalSoilMap in 2012.This project aims to coordinate the production of a fine-resolution soil information grid for a limited set of soil properties.The main outcome of the project was the production of a consensus-based specification document for the global grid (details in Table 2, Arrouays et al., 2014a).A set of six standard depth intervals (0-5, 5-15, 15-30, 30-60, 60-100, 100-200 cm) was suggested for generating digital soil maps.The 1 st GlobalSoilMap conference was held in Orléans, France, in 2013 (Arrouays et al., 2014b).Since then, DSM has evolved from a theoretical, academic focus to an operational status for delivering soil information to the scientific community and decision makers and policy makers (Scott et al., 2016;Grundy et al., 2020;Kidd et al., 2020) through the GlobalSoilMap project and Pillar 4 of the Global Soil Partnership (GSP) initiative (Minasny and McBratney, 2016;Arrouays et al., 2017a).In 2016, the International Union of Soil Sciences endorsed the Global Soil Map Working Group.
Several reviews of DSM have been conducted since 2009.Grunwald (2009) summarised the DSM progress on soil modelling from 90 articles published in two high-impact international soil science journals.Minasny et al. (2013) reviewed and discussed the advances in digital mapping of soil carbon from 40 articles, spanning field to national scales.Zhang et al. (2017) reviewed the progress in legacy soil data, environmental covariates, soil sampling, predictive models, and the applications of DSM products before 2017 and summarised their prospects.Minasny et al. (2019) reviewed 90 studies on peatland mapping in 12 countries or regions using the DSM.Lamichhane et al. (2019) reviewed DSM algorithms and covariates for soil organic carbon (SOC) mapping from 120 articles published between 2013 and 2019.Ma et al. (2019a) reviewed the relevance and synergy of pedology in DSM and discussed how DSM supports further advances in pedology.Wadoux et al. (2020) reviewed the application of machine learning algorithms in DSM.Piikki et al. (2021) performed a systematic review of the validation methods used in the DSM of continuous attributes.
In contrast with the listed reviews, this study focuses on the mapping of the 12 mandatory GlobalSoilMap properties (Table 2, Arrouays et al., 2014a) on a broad scale.In this case, a spatial extent greater than 10,000 km 2 is used to define broad-scale studies because this threshold corresponds to 83% of the countries and 99.99% of their total landmass.These soil property maps are produced using shared, international specifications that can be applied from country to globe.They relate to key soil information and can be used to address global environmental challenges.This review examines sampling strategies, environmental covariates, predictive models, efficient geospatial predictions, validation strategies, and uncertainty quantifications.Based on this review, we provide suggestions for further applications and developments in broadscale DSM.

Methods
To assess the current progress in broad-scale DSM, we performed a ) of the articles.The search expressions were "digital soil mapping" OR "globalsoilmap" OR "soilgrids" OR "soil-landscape modelling" OR "soil predictive modelling" OR "predictive soil mapping" (Scull et al., 2003).Because this literature review focuses on broad-scale DSM, we retained only the articles that had a spatial extent larger than 10,000 km 2 and with at least one soil property of interest listed in GlobalSoilMap specifications.We found 244 relevant articles published in English and recorded in the Web of Science and Scopus.We then extracted a list of variables to derive systematic plots for the results section.Table 3 shows the 26 variables recorded for this review.

Frequency of articles per year and journal
Fig. 1a shows the annual number of articles published between January 2003 and July 2021.A few articles addressed broad-scale DSM prior to 2011 (no more than six per year), and a significant increase in publications was observed after 2012.After the publication of the 1 st GlobalSoilMap Conference book in 2014 (13 articles), we observed a small increase in the number of publications for that year (i.e.32 articles), with 19 articles from journal publications.Between 2016 and 2018, the number of annual publications was stable (between 22 and 24), whereas in 2019 and 2020, there was an increase in the number of publications (i.e.37 and 31 articles).Seventeen articles were published from January to July 2021.
The number of journals relevant to broad-scale DSM is shown in Fig. 1b.Fifty-seven journals were identified, and journals with only one publication were classified as "Others."Geoderma had the greatest number of broad-scale DSM publications, with 72 articles.Science of the Total Environment and Geoderma Regional ranked second, with 17 publications each.In addition, 24.2% of the articles (59 of 244) were open access, with the majority published after 2014 (not shown in Fig. 1b).

Spatial distribution of articles
Fig. 2 presents the sum of articles at the national/sub-national scale, from which the continental or global studies were excluded (three in Africa, one in North America, one in North and South America, one in the Middle East, 11 in Europe, and seven worldwide).The figure shows that DSM has been used to deliver soil information worldwide.Among these countries, China, France, Australia, and the United States were the most active, with the highest number of publications at 43, 29, 25, and 24, respectively.We classified these publications according to continents: 35.2% of the articles (86 articles) were published in Europe, representing 6.8% of the total landmass (Antarctica included).Representing 29.5%, 5.9%, and 16.5% of the total landmass, Asia, Oceania, and North America accounted for 24.2%, 10.7%, and 13.1% of the articles, respectively.Africa and South America published less than 10% of the publications (17 and 13 articles), although they accounted for 20.4% and 12% of the total landmass, respectively.

Soil sampling
Trends in the spatial extent and soil sampling density are presented in Fig. 3.The sampling density varied from 1 to 0.0001 sample km -2 for the 244 articles.Sampling density tended to decrease with increasing spatial extent.This result also showed that the articles reported with high sampling density (>0.1 sample km -2 ) were mostly from Europe (e.g.Denmark, Sweden, Italy, Belgium, Northern Ireland, France, Estonia, the Netherlands, Scotland).
Fig. 4 shows the soil sampling year reported in 244 articles, 40.2% (98) of which did not provide relevant information when producing digital soil maps.The sampling year was a long time span, and the oldest samples were from the 1920s.Among the 146 articles reporting a sampling year, 24.7% had soil samples collected before the 1980s, and 50% used soil data collected after the 2000s.
The soil sampling design used for broad-scale DSM is shown in Fig. 4.More than half (52.4%) of the articles did not report relevant information, partially because soil databases were compiled from various sources of legacy soil information for different purposes; thus we can reasonably assume that they are nonprobability sampling designs (i.e. using the legacy data available).Fig. 4b also shows that probability and nonprobability sampling were used in 20.2% and 22.6% of the articles, respectively.A small proportion of the studies (4.8%) had a soil database that included samples collected from both probability and nonprobability sampling designs.For the most recent sampling campaigns (data not shown), there is a trend to use probability sampling more frequently and focus on filling geographical gaps based on legacy soil data or maps (e.g.uncertainty-guided sampling).

Soil property and maximum depth of interest
Soil properties mentioned in the GlobalSoilMap specifications are shown in Fig. 5a.SOC and soil organic matter (SOM) content had the highest number of articles (126).If the 63 articles on mapping SOC stock were added to this number, 77.5% of the articles (189 of 244) addressed SOC or SOM.They were followed by articles on soil particle size fraction, with 69, 57, and 39 articles related to clay, sand, and silt mapping, respectively.Soil pH was another soil property of high interest (49 articles).Other soil properties, namely, bulk density (BD), available water capacity (AWC), soil depth (SD), and cation-exchange capacity (CEC), were relatively less predicted in the broad-scale DSM.

Table 2
Overview of soil functional properties for GlobalSoilMap (Arrouays et al., 2014a).The maximum soil depth of interest for the soil maps produced is shown in Fig. 5b.Thirty-seven articles (15.2%) did not report the exact depth of the produced maps.These articles either provided broad categorical descriptors of soil layers (i.e.topsoil and subsoil) or reported soil genetic horizons (i.e.A horizon, B horizon).More than half the studies focused on topsoil mapping with a maximum depth of 30 cm, and 21.7% of the studies produced digital soil maps for soil depths greater than 100 cm.Among these studies, 98% reported values to a depth of 200 cm.was no clear tendency between the resolution of the produced map and spatial extent.Indeed, 70.9% of the articles produced digital soil maps with a rather fine spatial resolution (≤250 m), which is close to the spatial resolution (3 arc-second or approximately 90 m) mentioned in the GlobalSoilMap specifications.
The disaggregation model (6) refers to the spatial disaggregation by downscaling soil map units into soil classes or types (e.g.DSMART, Odgers et al., 2014a).Model averaging (7, or ensemble modelling) refers to soil property predictions that combine multiple models or map products.
As indicated in Fig. 8a, the number of articles related to the broadscale DSM was low prior to 2013, dominated by geostatistics and linear models.Since 2014, a large increase was found for nonlinear models with/without kriging, which peaked in 2019 and 2020 (i.e. 30 and 28 articles, respectively).Disaggregation and model averaging were used in broad-scale DSM since 2012 and 2015, respectively.Usage of model averaging showed an increasing trend, reaching a peak of four articles in 2020, and disaggregation was relatively less used in broadscale DSM.
We grouped the papers into 2D, 2.5D, and 3D models.A 2D model indicates that the predictive model does not account for vertical variation in soil properties; thus, only one soil layer is mapped.A 2.5D (or pseudo-3D) model uses harmonised soil properties for each specified depth interval (e.g.depth-weighted, equal-area spline), and then a predictive model is fitted independently for each depth interval.A 3D model uses depth as a covariate in modelling or predicting the parameters of a depth function spatially.The results showed that many studies (63.6%) applied 2D models, and a small percentage (8%) used 3D Fig. 2. Numbers of articles by country at national/sub-national scale with their sums at continental scale.Articles at continental and global scales (11 in Europe, 3 in Africa, 1 in North America, 1 in North and South America, 1 in the Middle East, and 7 for Globe) are excluded.The proportions of the countries and landmass covered by digital soil mapping for each continent are indicated.S. Chen et al. models for predictive mapping.

Validation strategy, performance indicators, and uncertainty estimation
Data splitting was the most frequent validation strategy used to evaluate map accuracy (86.9 %).Data splitting comprises single random holdback (i.e.divide data into the calibration and validation sets only once), stratified random holdback, cross-validation (i.e.k-fold crossvalidation, leave-one-out cross-validation, repeated k-fold crossvalidation), and other strategies (e.g.bootstrap, Kennard-Stone, nearest-neighbour distance), which accounted for 45.5%, 4.5%, 37.7%, and 2.9%, respectively (nine articles applied two validation strategies).Only 8.9% of the studies used independent validation.However, 5.8% of the articles did not show any validation procedure for evaluating the prediction performance of digital soil maps.
Fig. 8b presents the performance indicators used in the model evaluation.The coefficient of determination (R 2 ) and root mean square error (RMSE) were the most frequently used performance indicators (more than 180 articles).In third place (87 articles) was the indicator mean error (ME).Lin's concordance correlation coefficient (CCC) and mean absolute error were used for model evaluation in 55 and 53 articles, respectively.Other indicators, namely, the prediction interval coverage probability (PICP, assessing the quality of uncertainty estimates), the ratio of performance to deviation (RPD), adjusted R 2 (R 2 adj ), the mean and median of standardised squared prediction errors (θ and θ), the spatial structured variance ratio (SSVR), and the ratio of performance to inter-quantile (RPIQ), were used less frequently.
More than 56% of the studies provided uncertainty estimates (i.e.uncertainty maps) associated with the predicted soil properties produced by DSM.

Model performance of soil properties
In this study, we used R 2 was used to assess and compare the prediction performance across studies because R 2 was the most frequently used indicator (80.3%) in broad-scale DSM studies.R 2 is also dimensionless, which allows a comparison of the accuracy of different soil properties.However, the method for calculating R 2 differed by the author: some used the R 2 calculated by the square of the correlation coefficient between observations and predictions, and others used the R 2 by using the percentage of explained variance of the target soil property.
In our study, we focused on R 2 expressed as the percentage of explained variance to represent performance.For studies that did not report R expressed by percentage of explained variance but had Lin's CCC, we attempted to estimate R 2 by the CCC collected in this study.The fitted 3rd order polynomial regression model between the CCC and R explained 90% of the variance (R 2 = 0.90); therefore, it is suitable to estimate R 2 from CCC using this model.
Fig. 9 presents the performance of different soil property maps using DSM that are grouped in three depth intervals (0-30, 30-100, and 100-200 cm), except for SD.Most soil properties showed a decreasing trend in performance with increasing depth intervals, except for pH, AWC, and CEC.Additionally, the performance of 3D models (indicated by R 2 ) for most soil properties was greater than that of the 2.5D models, except for the BD.Soil pH had the best performance with a median R 2 of 0.60, 0.63, and 0.56 at depths of 0-30, 30-100, and 100-200 cm, respectively.A significant decrease in performance (0.49, 0.28, and 0.14) was observed for SOC/SOM over the three sequential depth intervals.SOC stocks had a similar performance as a median R 2 of 0.51, 0.35, and 0.06 at 0-30, 30-100, and 100-200 cm depth intervals, respectively.BD also decreased with depth, with a median R 2 of 0.56 at 0-30 cm, to 0.47 at 30-100 cm, and finally to 0.40 at 100-200 cm.Soil particle size fractions (i.e.clay, silt, and sand) had a median R 2 of approximately 0.50 in 0-30 cm, 0.40 in 30-100 cm, and 0.29 in 100-200 cm.Performance for AWC was higher at 0-30 cm (R 2 = 0.34) and 30-100 cm (R 2 = 0.42) than that (R 2 = 0.27) at 100-200 cm.CEC had a median R 2 between 0.34 and 0.40 at three depth intervals.In general, coarse fragments and SD were poorly predicted, with a median R 2 of less than 0.28.
(2) Uncertainty estimates was provided associated with digital soil map predictions.
In total, 21.7% of the articles (53 of 244) were classified as Global-SoilMap products, all of which were published after 2012.An increasing trend was found for GlobalSoilMap products, which increased from three articles in 2012-2013 to 11 articles in 2014-2015, and reached 13 articles biennially from 2016 to 2021.

The future, remaining challenges, and perspectives
This section is presented in four parts: i) trends in current and future products (i.e.publications and maps), ii) how to use legacy data and gather new soil information, iii) issues in modelling and spatial prediction, and iv) performance evaluation and uncertainty estimation.

Diversity challenges
Notably, 43.4% of the articles were published in three journals (i.e.Geoderma, Science of the Total Environment, and Geoderma Regional).The benefit of this phenomenon is the exchange of ideas in a limited environment, making it easier for the DSM community to use state-ofthe-art methodologies and approaches and keep abreast of the most recent updates.Consequently, we observed that the studies only attempted to communicate with soil science communities.Therefore, expanding the use of DSM products for soil property maps by other disciplines and, ultimately, stakeholders require publication and communication in multidisciplinary and open-access journals.Indeed, the DSM community is moving toward this target, for example, the diversification of journals increased from less than four before 2011 to more than 14 after 2018, expanding into nonsoil science journals and open-access articles.
Four countries (China, France, Australia, and the United States) have published nearly half the articles related to broad-scale (>10,000 km 2 ) DSM for the required soil properties.This result may be due to their huge land mass and/or their state of progress in delivering GlobalSoilMap products.In the future, we expect articles on broad-scale DSM to originate from other countries as part of the GlobalSoilMap initiative.
We may expect that the links between some institutes will be reinforced and expanded through the activities of the GlobalSoilMap Working Group and future workshops and meetings.In addition, we expect that some exchanges of researchers and DSM practitioners will reinforce these links and that capacity building will result in new partners.We also posit that the cluster constituted by the UN-FAO GSP will grow as they attempt to deliver products for other soil properties in addition to SOC, and if the effort in training and capacity building is pursued.

Increase in broad-scale digital soil maps
We observed a large increase in the number of publications over time.This increase began at the end of the 2000s, which may be partially attributed to the article in Science by Sanchez et al. (2009) and the activities of the GlobalSoilMap consortium.A peak was observed in 2014, corresponding to the publication of the 1st GlobalSoilMap Conference book.If we exclude this exceptional peak, the number of publications continues to increase.This increase occurs because some countries (e.g.Cameroon, China, Chile, Denmark, France, Hungary, India, Nigeria, Scotland, New Zealand) published complete GlobalSoilMap products (the United States, Australia) and released some selected GlobalSoilMap soil properties (Adhikari et al., 2013;Akpa et al., 2014;Viscarra Rossel et al., 2015;Mulder et al., 2016;Padarian et al., 2017;Poggio and Gimona, 2017;Ramcharan et al., 2018;Dharumarajan et al., 2019;Laborczi et al., 2019;Liang et al., 2019;Roudier et al., 2020;Silatsa et al., 2020;Reddy et al., 2021;Liu et al., 2021).What is likely is that the increasing trend in the number of maps will continue because some required soil properties remain under development and more countries (e.g.The Netherlands, Canada, Russian Federation, Iran, some countries in Africa, many countries in South and Central America) are joining the Global-SoilMap project or similar national/continental initiatives (Guevara et al., 2018;Pfeiffer et al., 2020;Zeraatpisheh et al., 2020;Helfenstein et al., 2021).This trend may be amplified by the increasing number of available environmental covariates (e.g.satellite imagery) with time, and efforts from the GSP to apply DSM methods to deliver soil property products.The strength of the trend is not certain, however, because it may depend on national soil mapping programme commitments and/or political situations or conflicts.

Current sources for input soil data
In most cases, the soil information used from DSM is from legacy data; thus, the sampling design cannot be controlled.Furthermore, for large territories, soil data may be assembled from local soil surveys, which can result in clustered sampling and unharmonised databases.
Although the technical and theoretical challenges for selecting the appropriate soil sampling schemes are well elaborated in the DSM community, the practical and operational challenges regarding how to implement them at global scales are not well defined.Additionally, most future input data will probably remain beyond the researchers' control, either because the data will be historical or because new data collected by others will mainly be used (i.e. with purposes other than mapping or performing some type of statistical sampling).Soil data used in broad-scale DSM dates back to the 1920s.Consequently, most of the available soil data used for modelling soil properties do not consider changes with time because of the scarcity of soil information (Fig. 4a).Although this practice may be acceptable for relatively stable soil properties, such as SD and particle size fractions, because that there is no evidence of erosion or redeposition in the short term (i.e. 100 years), it introduces large uncertainty for dynamic soil properties that may change over short time scales (from years to decades), such as topsoil SOC, pH, and CEC.

Improvement of imperfect soil data
Considering the imperfections (e.g.spatial coverage, sampling density, strategy) of data sources, developing techniques that mitigate the Fig. 9. Performance of DSM products for different soil properties at three depth intervals (0-30, 30-100 and 100-200 cm, except for soil depth).The dash horizontal grey line indicates the median performance of DSM products for relevant soil properties using 3D models (not built on specific depth intervals separately).The p values between different depth intervals are calculated by Mann-Whitney test.
S. Chen et al. unevenness and clustering of soil data is necessary.One solution to improve imperfect soil data is to correct and harmonise input data.A method to solve discrepancies in data density is declustering legacy data, either spatially or using covariates.In other words, clustered data can be weighted differently according to their spatial or covariate clusters and soil data collected from multiple sensors, platforms, and surveys (Richer- de-Forges et al., 2017;An et al., 2018;Liu et al., 2020a;Taghizadeh-Mehrjardi et al., 2020b).Another solution is to add new soil observations by using specific sampling techniques that correct the initial sampling design (Carré et al., 2007).
In solving the inconsistency of soil sampling time, two solutions may be helpful: ( 1 Another cause of the heterogeneity of legacy data can be related to different sampling protocols (e.g. between legacy soil profile databases and the grid-based soil monitoring network in France) and/or different laboratory methods (e.g.Walkley-Black method, dry combustion for SOC measurement) changing over time.One solution is to develop pedotransfer functions to make the results comparable, but this may require many soil observations where various methods have been applied (Ciampalini et al., 2013;Louis et al., 2014).The use of pedotransfer functions and differences resulting from multiple laboratories that measure soil properties based on different quality control standards is a potential error source for predictions when using soil data of different vintages from different countries for global DSM products (Libohova et al., 2019;Hu et al., 2021).
In a review, Arrouays et al. ( 2017b) indicated that approximately 800,000 legacy soil profiles were rescued in countries that responded to their survey and that this number of soil profiles is probably and largely underestimated.Despite the substantial success of DSM and data rescue efforts, the majority of soil data remain lost or only available in hardcopy format.However, new emerging deep learning techniques (e.g.image analysis, text recognition) are promising for converting data from hardcopy to digital format, speeding up soil data rescue.In a recent paper, Bui et al. (2020) demonstrated the necessity of considering the positional error (50 to 100 m) for data collected before 2000.We agree that this error can be a substantial source of data uncertainty in fieldscale studies, and it often remains in the same order of magnitude as most of the covariates (e.g.relief, climate) used for broad-scale DSM, especially for global studies.Moreover, the data collected in the past are precious regarding their use to assess soil changes at different periods.
Another advantage of using legacy data could be the use of field measured soil texture categories by soil surveyors; Malone andSearle (2021a, 2021b) presented a new approach to converting these data into quantitative estimates of soil particle size fractions by soil texture centroids and then integrating them into DSM modelling to improve map accuracy.

Suggestions for new data collection
Regarding future sampling campaigns, Brus (2019) asserted that there is no single best sampling design for DSM, and which is the best depends on the DSM technique.To produce digital soil maps, systematic grid sampling or regular geographical coverage sampling (even adding supplementary closer samples to better capture local variability) is recommended when no environmental covariate is available (Marchant and Lark, 2007;Walvoort et al., 2010;Wadoux et al., 2019a).However, these sampling schemes, particularly regular grids, may be costprohibitive.These two sampling designs are also preferred for constructing soil-monitoring networks to evaluate various soil properties simultaneously.
In the presence of environmental covariates, stratified random sampling, if sufficiently dense, offers an efficient manner to cover the feature and geographical spaces of some soil properties of interest and provides efficient statistical estimates (De Gruijter et al., 2006;Minasny et al., 2013).Proposed by Minasny and McBratney (2006), the conditioned Latin hypercube sampling (cLHS) is a modified version of a stratified sampling design that enables the selection of sampling locations based on the distribution values of a set of environmental covariates (feature space) and has been applied in many DSM studies (Mulder et al., 2013;Pahlavan Rad et al., 2014;Thomas et al., 2015;Omuto and Vargas, 2015).Other methods of stratification to maximise sampling efficiency (i.e.covering similar spatial variation with lowest cost) have been advocated, such as feature space coverage sampling with k-means (Brus et al., 2007;Yang et al., 2016;Brus, 2019;Ma et al., 2020;Wadoux et al., 2019b).In addition, covering not only the feature space but also the geographical space is necessary (Lagacherie et al., 2020).The best sampling design could result from a trade-off depending on the field conditions and relations between the target properties and covariates.If the relationship between covariates and the target property is strong, more effort can be put into sampling using the covariate feature space; if this relationship is weak, the preference may be to cover the geographical space.In any of these cases, leaving entirely "empty" areas is unsuitable because some spatial structures not captured by the covariates might be missed.Moreover, to use regression kriging, regularly spread points are necessary.For regions that have already used legacy data for DSM, the outcomes could be used to design efficient supplementary sampling campaigns in locations with greater uncertainty.
In addition to conventional laboratory analysis, significant efforts have been dedicated to cost-effective soil measurement using soilsensing techniques (e.g.soil spectroscopy, hyperspectral remote sensing) to significantly increase the density of soil data (or pseudodata) for DSM (Viscarra Rossel et al., 2011;Lagacherie and Gomez, 2018;Lagacherie et al., 2019).Although these studies have been mainly conducted to a much smaller spatial extent, we expect that these soilsensing techniques will be available to a wider spatial extent with the development of national and global soil spectral libraries and soilsensing platforms and systems (Shi et al., 2015;Viscarra Rossel et al., 2016, 2017;Demattê et al., 2018;Seybold et al., 2019).Some challenges remain: i) integration of the prediction errors from sensors in DSM modelling and ii) cost-efficient data fusion between sensors having different wavelengths and precisions.
Other new datasets collected outside soil surveys and DSM applications, such as data collected by farmers or through citizen science approaches, can also be utilised and integrated into DSM studies.Román Dobarco et al. (2017) and Caubet et al. (2019) have provided examples of how to incorporate information collected on farmers' demand for merging digital maps of soil texture using ensemble modelling.Citizen science (participation of nonspecialists in scientific research) has been proposed by Rossiter et al. (2015) in DSM studies.They stated that "citizens can contribute primary observations or note discrepancies on existing maps" and that the major issues are how to stimulate citizens to participate and how to integrate observations from citizens and professionals.
Fig. 5b shows a large gap in soil information for SDs greater than 30 cm.Ascertaining soil properties at depth is essential to understanding and the complete accounting for ecosystem service provision and response.The SOC stocks in the first 30 cm represent less than half and one-third of that stored in the first 100 and 200 cm, respectively (Batjes, 1996).Pries et al. (2017) showed that warming of the whole soil (down to 100 cm) revealed a larger soil respiration response than in many in situ experiments focused on topsoil.Deep carbon may also play a major role in C cycling, because it was shown to be more recalcitrant than topsoil carbon (Balesdent et al., 2018).Considering crops and plants, some essential parameters, such as SD and AWC, cannot be correctly assessed by monitoring only the topsoil.Moreover, climate change (e.g.anomalously warm droughts) will probably influence deep soil properties, especially through changes in water behaviour in soil, impacting crops and plants (Montagne and Cornu, 2010;Schlaepfer et al., 2017;Goulden and Bales, 2019).In many cases, rooting depth will exceed cm, and this is one of the main reasons why the GlobalSoilMap specifications include the characterisation of soil properties down to 200 cm.Deep soil properties are also linked to other important soil functions, such as water filtering, mediating pollutants, habitat support for animals, and human activities (e.g.infrastructure) (Baveye et al., 2016;Jónsson et al., 2017;Kelleway et al., 2017).However, because of limited resource availability, difficulty in deep soil sampling, and shallow soil conditions, only a few legacy soil data provide information for the 100-200 cm layer, and few current initiatives aim to develop soil databases at this depth.Data to this depth would allow predictions with acceptable precision but also support decisions that consider soil as a whole.Therefore, there is an urgent need to include data to greater SDs when building soil databases to improve soil monitoring, modelling, and functioning in further research and applications (Lal, 2018).Proximal soil sensors (e.g.electromagnetic induction, ground-penetrating radar) reveal information in subsoil, and these data are mainly used in local surveys; thus, further research is necessary to upscale them to larger areas and to search for more sensors relevant to subsoil.

Soil information of interest and its prediction performance
The first step of global soil mapping is to map essential soil properties related to flows, changes, and stocks of water and nutrients in soils.The 12 soil properties defined by GlobalSoilMap are central to nearly all modelling, monitoring, and forecasting exercises involving soils.We fully agree that mapping the threats to soil quality is also necessary (Montanarella et al., 2016), but we should not put the cart before the horse, and mapping these essential soil properties is a necessary precondition for soil threats and for soil functions, services, and security (McBratney et al., 2014).
In broad-scale DSM, studies on mapping SOC/SOM content and SOC stocks account for the largest proportion (77.5%).Because of the need to ensure food security and adapt to climate change, the potential of soil to store or sequester additional SOC has received considerable attention in recent years (e.g.Wiesmeier et al., 2014Wiesmeier et al., , 2020;;McNally et al., 2017;Minasny et al., 2017;Zomer et al., 2017;Chen et al., 2018Chen et al., , 2019a;;Lal et al., 2018;Chenu et al., 2019;Ma et al., 2021a).The UN-FAO GSP published their first global soil map on SOC stocks (GSOCmap) to establish a baseline.This map is used as a baseline to monitor soil conditions, identify degraded areas, set restoration targets, explore SOC sequestration potentials, support the reporting of greenhouse gas emission reporting, and make evidence-based decisions on adapting to and mitigating climate change.In addition, a global SOC sequestration potential map was prepared using the UN-FAO GSP.Soil scientists that recognised the significant role of SOC in the global C cycle and ecosystem services (Koch et al., 2013;Adhikari and Hartemink, 2016;Rumpel et al., 2018;Amelung et al., 2020); however, there remains a large uncertainty associated with global SOC estimates, and it consequently promotes the need to map soil carbon stocks (IPCC, 2013;Scharlemann et al., 2014).DSM can be a useful tool in the broad-scale mapping of SOC sequestration and storage potential.DSM can be an efficient decision-making platform for implementing proper, sustainable management practices and identifying areas with high potential for sequestering atmospheric carbon or for protecting soils to avoid CO release into the atmosphere (Akpa et al., 2016;Chen et al., 2018Chen et al., , 2019b;;Minasny et al., 2019;Martin et al., 2021).
Soil particle size fractions (i.e.clay, silt, and sand) are the second most frequently studied and are important for soil hydrologic model parameters (i.e.AWC and soil moisture), erosion, biogeochemical, and crop modelling.SD and thickness are among the most poorly predicted soil properties, mainly for four reasons: (1) high spatial heterogeneity, (2) difficulty in modelling with the most used Scorpan factors, (3) high cost of measurement, and (4) different definitions (Lacoste et al., 2016).
Right-censored SD (when the measured SD is less than the actual SD) is also notable because it underestimates SD for deep soil in modelling.Several solutions are available for correcting this limitation: (1) estimating the pseudo SD at the censored location by the simulated beta function from SD observations (Kempen et al., 2015); (2) integrating pseudo-observations generated by expert knowledge in modelling, which is applicable for deep soils and bare rocks (Shangguan et al., 2017); (3) using random survival forest to produce a probability map of exceeding a given SD (Chen et al., 2019c); and (4) integrating data mining with binary models representing both rock outcrops and deep soil (Malone and Searle, 2020).
Coarse fragments were almost the least predicted soil properties.This finding is probably due to its high spatial variability.Because of the importance of coarse fragments in weight-to-volume conversion for soil water, SOC, nutrients, or trace elements, improving its predictive ability is necessary.A more accurate measurement of coarse fragments would require large sample volumes.In addition to the larger volumes required for such analysis, in many routine soil surveys and laboratory measurements, coarse fragments from a soil profile are often only visually estimated.In some cases, they are measured in the field for fractions between 20 and 70 mm (in diameter) but measured in the laboratory for fractions between 2 and 20 mm (Soil Survey Staff, 2014), which may be another reason for its poor prediction accuracy.Regardless of the method of coarse fragment estimation, one challenge is to integrate the measurement error propagated in DSM predictions (Román Dobarco et al., 2019b).
Plant exploitable (effective) depth, also called rootable depth, is rarely mapped in broad-scale DSM.This phenomenon is due to two reasons: (1) rootable depth is not recorded and/or directly related to any soil property that can be observed during soil field surveys, and (2) rootable depth varies between plant species.Leenaars et al. (2018) provided a good reference for plant exploitable (effective) depth by estimating the rootability index for maize.This index expresses the adequacy (0-100%) of the 10 selected soil factors to support root growth relative to optimal root growth.A threshold index of 20% for each soil factor describes inadequate conditions for plant exploitable (effective) depth.
Currently, DSM has a very unbalanced vision of the soil that mainly focuses on SOC and nearly ignores key properties such as SD or coarse fragments.In the future, DSM should address these properties, soil types, and soil quality indicators that match user demands.Many national and global concerns (e.g.food security, water security, human health, mitigation, adaptation to climate change, biodiversity protection) require other soil information.This is also relevant for mapping threats to soils (e.g.erosion, salinity, loss in SOC, contamination, nutrient imbalance, loss of biodiversity, compaction).Thus, applying broad-scale mapping to other soil properties, such as electrical conductivity, soil moisture, Na 2+ , N, P, K, S, trace elements and persistent organic pollutants, infiltration capacity, structural stability, and biological properties (e.g. the abundance, diversity, and activity of soil organisms) is necessary.Some soil structure descriptions and related properties are often stored in databases as qualitative variables, but they are rarely used for DSM.In addition, Richer-de Forges et al. ( 2019) stated that based on a survey of users' needs in France, some indicators of soil properties, such as structural stability, macro-porosity, water infiltration, and rooting potential, are often asked by end-users as they integrate information on many soil functional properties.Maps of functional soil properties rather than maps of directly analytical soil variables are necessary, which emphasises the need to move from DSM to digital soil assessment (DSA) (e.g.Carré et al., 2007;Finke, 2012;Minasny et al., 2012;Kidd et al., 2015Kidd et al., , 2020;;Arrouays et al., 2020b).Finally, there remains large potential for coupling process knowledge, pedology, and DSM (Finke, 2012;Ma et al., 2019aMa et al., , 2019b;;Wadoux, 2019) and for mapping services rendered by soils (e.g.Dominati et al., 2010;Robinson et al., 2014;Turner et al., 2016;Kidd et al., 2020).In summary, future DSM should involve stakeholder requirements to produce products that decision makers (not only soil scientists) need and want to use.
Overall, model performance (in R 2 ) for most of the soil properties was not high in broad-scale DSM, which may interactively result from the following reasons: (1) the low quality of input soil data, including the unrepresentativeness of soil samples and differences in soil sampling dates; (2) the low quality of environmental covariates, including the inconsistency of resolution in the original data, the absence of data relevant to soil formation, and target properties; (3) the mismatch between point-based soil sampling and pixel-based modelling; and (4) the low predictive ability of the model.The solutions to these issues are discussed in the following sections.

Space-time modelling
Most DSM studies focus on predicting soil properties for a particular time frame but not on their changes.To map past changes, Meersmans et al. (2011) resampled soil profiles at the same locations as a soil survey from the 1960s.Most other studies have used soil data from several periods obtained using different sampling designs (Sun et al., 2012;Minasny et al., 2016;Schillaci et al., 2017;Song et al., 2018;Huang et al., 2019;Zhou et al., 2019;Sun et al., 2021).However, the range of the estimated trend should be compared with the map uncertainty to test if this trend is plausible or at the same order of magnitude as the cumulated errors linked to the spatial predictions at different dates.
Few studies have attempted to predict future soil changes, and most of them are related to SOC.Minasny et al. (2013) stated that two methods are used to address this issue: a dynamic mechanistic simulation model and a static empirical model.In a dynamic mechanistic simulation model, DSM is first used to estimate an initial soil state; next, the model is simulated per pixel under future climate, land use/land cover (LULC), and land management scenarios (Martin et al., 2021).In a static empirical model, future soil changes can be predicted using a fitted Scorpan model, in which the present climate, LULC, and land management are replaced by future scenarios (Yigini and Panagos, 2016;Gray and Bishop, 2016;Meersmans et al., 2016;Adhikari et al., 2019;Reyes Rojas et al., 2018).These last broad-scale DSM studies used a static empirical model, which mainly results from several constraints, such as (1) the large disconnection between DSM and mechanistic dynamics modelling, and (2) complex parameter initialisation and heavy computing, which is challenging for mechanistic dynamics modelling on a broad scale (Walter et al., 2006;Luo et al., 2016).These challenges require collaboration among scientists from multiple disciplines and improved integration of DSM and dynamic mechanistic modelling to speed up the simulation efficiency and improve the prediction accuracy (e.g.simulate observed locations by mechanistic dynamics model and then map soil information by DSM on simulated data).

Environmental covariates
The frequency of the Scorpan factors used in DSM is often restricted by the availability of environmental covariates (Grunwald, 2009).Benefiting from global, free, and available remote sensing data (Fig. 11), relief, organisms (mainly vegetation), and climate factors have been widely used in broad-scale DSM, and the frequency of other factors, such as soil, parent material, age, and position, have had more limited use (Fig. 6).Some proximal sensors, such as electromagnetic induction, ground-penetrating radar, and X-rays, are commonly used in field-scale soil mapping and are absent in large-scale mapping studies.
LULC, vegetation index (e.g.normalised difference vegetation index, enhanced vegetation index), and net primary productivity are mainly derived from remote sensing products.Because of substantial advances, satellites can now provide products with a high spatiotemporal resolution.Examples include the newly launched multitemporal Sentinel (5-20 m resolution), Sentinel-2 (10-60 m resolution), and Sentinel (300 m resolution), which has proven its high potential in DSM (Poggio and Gimona, 2017;Loiseau et al., 2019;Dharumarajan et al., 2020;Zhou et al., 2020;Zhou et al., 2021).The potential of future Sentinel-10 hyperspectral data requires further exploration.

S. Chen et al.
Notably, hyperspectral remote sensing can also be used as a proxy for some soil properties in topsoil for bare soil surfaces (Gholizadeh et al., 2018;Lagacherie and Gomez, 2018;Castaldi et al., 2019;Vaudour et al., 2019).Fine-resolution and multitemporal imagery offer the possibility of detecting vegetation dynamics and changes in some topsoil properties (e.g.SOC, salinity).Using a time series average (from month to decade) or a particular snapshot is a commonly used practice in DSM modelling.However, spectral images may only represent "recent" times (at least since the widespread deployment of satellites) where soil has been more disturbed by human beings.As a result, poor correspondence between covariate snapshots (the time coverage of spectra images) and the status of the soil property is expected.This limitation from satellite imagery can probably be solved by integrating other Scorpan factors, such as parent material and age, in spatial predictive modelling.Including recent LULC changes (e.g.conversion from forest or grassland to cropland) and land management practices helps improve map accuracy, especially in regions with intensive human activities.In addition, new effective covariates can also be explored in DSM, such as land surface dynamic feedback information and land surface phenology variables (e. g.Zeng et al., 2020;Yang et al., 2021).
Evidence shows that soil micro-and macro-organisms contribute significantly to global biogeochemical cycles in a changing climate (Wieder et al., 2015;Luo et al., 2016;Cavicchioli et al., 2019;Jansson and Hofmockel, 2020).These below-ground organisms are rarely used in DSM because they are much more difficult to measure than aboveground organisms, in addition to being highly variable both in space and time (Banfield et al., 2017).Global maps for soil fungi (Tedersoo et al., 2014), bacteria (Delgado-Baquerizo et al., 2018), nematodes (van den Hoogen et al., 2019), and earthworms (Phillips et al., 2019) have been published, providing good resources for broad-scale DSM practices.Global soil biodiversity maps were produced using a DSM approach.However, concerns remain about their correlation with other environmental covariates (e.g.LULC, climate).In addition, such biological indicators are highly variable in space and time (Kuzyakov and Blagodatskaya, 2015), making accurate predictions difficult.
In most studies, only the annual averages of climatic data have been considered in the DSM.We suggest that the short-term average (monthly or seasonal) of climatic data and its variance within the corresponding period may also be a useful covariate to characterise some soil properties (e.g.Keskin et al., 2019;Liu et al., 2020b) because this information provides insights into key soil processes (e.g.wetting and drying frequency, soil moisture movement).Changes in vegetation response to extreme climatic events, such as changes in NDVI or other remote sensing indices after extreme droughts, could also be very useful (Mulder et al., 2019).
In broad-scale DSM studies, soil factors are often characterised by soil class maps and/or soil texture maps derived from historical soil surveys (Grunwald, 2009).Soil information from proximal soil sensing can also be spatially interpolated and serve as a covariate for DSM.For example, the first three principle components of visible-near infrared (Vis-NIR) spectra have been used to produce an Australian threedimensional soil grid (Viscarra Rossel et al., 2015).Other soil information, including soil moisture and soil property maps from other sources, can also be used as covariates in spatial predictive modelling (Keskin et al., 2019;Liang et al., 2019).
For broad-scale studies, parent material is mainly derived from geological maps and partially from airborne gamma-rays in some countries or regions, such as the United States, Australia, the United Kingdom, and some regions of France (Lacoste et al., 2011;Beamish, 2014;Viscarra Rossel et al., 2015;Keskin et al., 2019).Based on the statistics in Fig. 6, the usage of the parent material is low because of the absence of data sources.With technical advances, such as airborne gamma-ray spectrometry and proximal gamma-ray spectrometers, the availability of this factor is expected to increase and is already used in DSM practice (Loiseau et al., 2020;Chen et al., 2021a).Gray et al. (2016) demonstrated strong improvement in DSM predictions when the lithology variable was classified for pedologic purposes and then used for DSM.They pointed out that lithology data could have substantial potential for use in DSM.Despite some recent progress (Miller and Juilleret, 2020;Simon et al., 2021), lithology maps remain scarce in most parts of the world; thus, they require further exploration.
Despite its significant role in pedogenesis, age remains the least used Scorpan factor because of the difficulty of direct measurement at a broad scale (McBratney et al., 2003;Zhang et al., 2017).Considerable advances in technology (e.g.soil dating, material dating) and expert knowledge are necessary to derive the age factor, especially for broadscale DSM.Geology and geomorphology, however, may help derive information on age (e.g.ancient periglacial landscapes).
Because the number of environmental covariates used for DSM has rapidly increased recently, the principle of parsimony has become more crucial than ever for covariate selection in DSM (Wadoux et al., 2020).Although the use of more environmental covariates may improve the model accuracy, especially in combination with machine learning, this approach may introduce more uncertainty from input data and make the results less interpretable from a soil science point of view.Therefore, to find the correct balance between parsimony and model performance, covariate selection is necessary in DSM, not only based on a pure statistical selection or case-based reasoning but also on their pedological relevance (Wadoux et al., 2020;Liang et al., 2021).
Notably, the importance of environmental covariates varies in different soil properties of interest (Fig. 12).SOC and SOC stocks are jointly dominated by temperature, precipitation, elevation, and parent materials in broad-scale studies, and BD is mainly controlled by soil class, temperature, precipitation, land use, topographic wetness index, and parent materials.If the soil properties of interest were soil particle size fractions, then parent materials, elevation, solar radiation, temperature, and precipitation are major controlling factors, and the use of gamma-ray data can also help improve their model performance.2020) after investigating the environmental controllers of SOC with scale.To manage multiscale interactions between soil properties and environmental covariates, wavelet transformation, empirical mode decomposition, and the Gaussian scale space are suggested to produce multiscale covariate layers to potentially improve the prediction accuracy in DSM (e.g.Zhou et al., 2016;Behrens et al., 2018;Sun et al., 2019;Taghizadeh-Mehrjardi et al., 2021).
A commonly observed issue in DSM using field point data and covariates is that they are not linked to the same spatial support (points vs "pixels" or map units) and do not have the same accuracy.Thus, one challenge is how to make the best use of both data when knowing that there are discrepancies in spatial support and precision of measurements.

Predictive models
Machine learning has become the most commonly used predictive model in broad-scale DSM since 2011 (Fig. 8a).This trend has also been confirmed by Arrouays et al. (2020b) and Padarian et al. (2020), and mainly results from three reasons: (1) machine learning can manage complex nonlinear relationships between the soil property of interest and an increasing number of environmental covariates, and it thus often performs better than classic statistical models and geostatistics; (2) the rapidly increasing computing power and techniques (e.g.parallel computing, cloud computing, high-performance computing) make it more efficient to produce digital soil maps from big data using DSM than ever before; and (3) machine learning is a nonparametric method that does not require any hypothesis on distribution and stationary, which are no longer valid with large spatial extents and legacy data.However, geostatistical models remain useful because they can capture other spatial structures (e.g.diffuse contamination) better than pure machine learning models, and the use of hybrid modelling (e.g.random forest spatial interpolation, Sekulić et al., 2020) for broad-scale DSM may improve the model performance when some spatial structures have not been captured by machine learning models.An alternative to using geostatistical models, which may be difficult to undertake at a broad scale, is to consider the geographical locations as inputs of machine learning algorithms, enabling these algorithms to capture the spatial structures not explained by classical environmental covariates (Hengl et al., 2018).Nevertheless, using spatial coordinates as inputs in hierarchical models should be avoided because these models partition the input space and do not fit a trend.
A recent advance in predictive models of DSM is the introduction of deep learning (i.e.2-dimensional convolutional neural network), explicitly described by Padarian et al. (2019), Wadoux (2019), Wadoux et al. (2019c), andTaghizadeh-Mehrjardi et al. (2020a).Deep learning opens new possibilities for predicting soil properties because (1) the input data for model training is a stack of spatial patterns, not spatial points, and (2) the trained model enables simultaneous predictions of multiple soil properties (Padarian et al., 2019).
Despite substantial advances in machine learning and deep learning, predictive models focus on prediction performance and overlook the importance of pedological knowledge for DSM and the use of DSM in understanding controlling factors of the soil property of interest.Therefore, further DSM research should further endeavour to open the "black boxes" of machine learning and deep learning (e.g.convolutional neural network) and integrate more pedological knowledge (e.g.structural equation modelling) in both the predictive model and environmental covariate selection (Angelini et al., 2017;Angelini and Heuvelink, 2018;Arrouays et al., 2020a;Ma et al., 2019b).Conversely, DSM can be used not only for predictive mapping but also for enhancing pedological knowledge (Wadoux et al., 2020;Ma et al., 2019a).

Comparison between 2.5D and 3D models
There is no general consensus on whether the 2.5D model is better than the 3D model or vice versa (Ma et al., 2021b).Each has pros and Fig. 12. Variable importance in broad-scale DSM.Considering the differences in determining the variable importance, such as correlation coefficient, t value, F value or partial R 2 using MLR, rule usage in Cubist, %IncMSE, or IncNodePurity in machine learning, the importance of each variable is ranked from 10 (most important) to 1 (least important).We excluded AWC, CEC, and coarse fragments because the variable importance for these soil properties were reported in less than five articles (not robust enough).Notably, we only keep widely used Scorpan-related environmental covariates.AWC, available water capacity; PET, potential evapotranspiration; ET, evapotranspiration; SR, solar radiation; EVI, enhanced vegetation index; NDVI; normalised difference vegetation index; NDWI, normalised difference water index; NPP, net primary production; LULC, land use and land cover; SP, slope position; SL, slope length; PlanCur, plan curvature; ProCur, profile curvature; TWI, topographic wetness index; TRI, terrain ruggedness index; TPI, topographic position index; CTI, compound topographic index; MrVBF, multi-resolution valley bottom flatness; MrRTF, multi-resolution ridge top flatness; VD, valley depth; CA, contribution area; PM, parent materials.
S. Chen et al. cons.In a 2.5D model, an equal-area spline is commonly used in vertical interpolation or depth interval standardisation (Ma et al., 2021b).This spline technique can be sensitive to outliers and cannot be extrapolated outside the range of observations.
Another approach is to fit a depth function (e.g.exponential function) in 3D modelling (Meersmans et al., 2009a(Meersmans et al., , 2009b;;Ottoy et al., 2017;Rentschler et al., 2019).Poggio and Gimona (2014) used a 3D GAM model using geographical location (x, y) and depth (z) as coordinates to fit the model.Hengl et al. (2017) directly used depth as a covariate in random forest and gradient bootstrap tree models.Liu et al. (2016) proposed an approach to predict the 3D variation of SOM concentration by integrating a similarity-based method with depth functions.They concluded that the proposed approach is effective and accurate for 3D SOM prediction and that it overcomes the two drawbacks of the 2.5D approach: (1) the neglect of vertical soil patterns when performing horizontal soil predictions and (2) the repeated applications of depth function fittings in the mapping process, both of which may lead to prediction errors.However, Nauman and Duniway (2019) noted that 3D modelling of soil properties with strong variation with depth can result in substantial areas with much higher uncertainty that coincide with unrealistic predictions relative to 2.5D models, although 3D models performed slightly better.Roudier et al. (2020) proposed a 3D modelling approach relying on data augmentation and demonstrated that there were no differences in 2.5D and 3D models in predicting soil properties at defined depth intervals.Ma et al. (2021b) found similar results; however, using depth as a predictor in tree-based models could create "stepped" depth functions.The choice between 2.5D and 3D models may be case-specific, depending on the variation of soil properties with depth and soil sampling volume.

Performance evaluation uncertainty estimation and map resolution 4.4.1. Validation strategy
As aforementioned in the results, data splitting either in single random holdback (45.5%) or cross-validation (37.7%) was the most commonly adopted validation strategy for evaluating DSM products, which is in line with the recent finding of Piikki et al. (2021).Although more studies chose a single random holdback, it may result in nonrobust accuracy because the single randomly selected validation data may not represent the entire dataset (Lagacherie et al., 2019;Chen et al., 2021b).Therefore, we recommend repeated random holdback (i.e. 100 times) or cross-validation for reporting robust validation results.Indeed, such an approach has provided uncertainty estimates in many broad-scale DSM studies (e.g.Mulder et al., 2016;Kempen et al., 2019;Loiseau et al., 2019).In addition, spatial cross-validation needs to be further tested for clustered legacy data to avoid overly optimistic model performance due to spatial autocorrelation (Meyer et al., 2018;Ploton et al., 2020;Poggio et al., 2021).Brus et al. (2011) recommended the use of independent validation datasets because data splitting may not provide an unbiased accuracy assessment because of the nonrandom sampled soil data.These additional independent data can be collected using a design-based sampling strategy involving probability sampling and design-based estimation.Due to the high cost of additional soil sampling, only a few broad-scale studies (9.1%) have used independent validation for map evaluation (Thomas et al., 2015;Rial et al., 2016;Vaysse et al., 2017;Ellili Bargaoui et al., 2019).Lagacherie et al. (2019) suggested that if this independent validation is not conducted with a proper sampling density, it can lead to uncertain prediction performance assessments.

Indicators for model evaluation
As indicated in Fig. 8b, R2 is the most commonly used indicator for model evaluation of continuous soil properties.The use of R 2 allows the comparison of the accuracy for different soil properties with various units and magnitudes; thus, a recommendation is to report it in DSM studies.However, R 2 has several limitations in interpretation because it strongly depends on the number of points used to calculate it, and it is very sensitive to the presence of extreme values or outliers.Notably, the method used to calculate R 2 differs by the authors; some may take the R by the square of the correlation coefficient between observations and predictions, and others use the R 2 by using the percentage of explained variance of the target property.If different DSM studies are to be compared, a consistent method for R 2 calculation must be adopted.RPD is another indicator that eliminates the difference in units and magnitudes.However, Minasny and McBratney (2013) suggested using either RPD or R 2 because they are the same measure, and the ratio of performance to interquartile range (RPIQ) is a better indicator than RPD for data that are not normally distributed (Bellon-Maurel et al., 2010).
In addition to the aforementioned indicators, we suggest the following in DSM studies: (1) SSVR and PICP.SSVR has been used to fill the gap if none of the aforementioned indicators are relevant to the spatial structure of the produced digital maps (Poggio and Gimona, 2018).It is defined as a complement to one of the nugget to sill ratios (Kerry and Oliver, 2008).An SSVR value closer to one indicates a higher proportion of the data explained by the spatial component.PICP is an indicator that determines the efficacy of uncertainty estimates, which could be important for end-users in decision-making.If the uncertainty estimates are reasonably defined, PICP should result in 90% for a 90% prediction interval (Malone et al., 2016).
Caution should be used when employing large-scale (i.e.national, continental, and global) DSM products for studies at a local scale, because map accuracy at this scale may be much lower than global accuracy (Gomez and Coulouma, 2018).Notably, uncertainty may be underestimated because of sampling imperfections.For instance, Lagacherie et al. (2020) showed underestimations of the uncertainty of DSM predictions, especially for sparse samplings poorly covering the distribution of the target soil property and the dense samplings unevenly distributed in the geographical space.

Estimates of map uncertainty
More than half (137 of 244 articles) studies provided estimates of map uncertainty, of which a large percentage were published after and related to GlobalSoilMap products.Approaches used for uncertainty quantification can be classified into the five groups (Malone et al., 2017): (1) universal kriging prediction variance, (2) bootstrapping, (3) empirical uncertainty quantification through data partitioning and cross-validation, (4) Monte Carlo simulation, and (5) the Bayesian approach.Groups 1, 4, and 5 were mainly used in geostatistical models, and Groups 2 and 3 were mainly used for machine learning models.The produced map and its associated uncertainty are useful in decisionmaking for end-users and allow quantification of uncertainty propagation in some secondary soil information (e.g.SOC stocks, AWC) and digital soil assessment (Finke, 2012;Poggio and Gimona, 2014;Román Dobarco et al., 2019a, 2019b).For example, Román Dobarco et al. (2019b) found that the main sources of uncertainty for soil AWC maps were not the pedotransfer function for predicting AWC but the input maps of coarse fragments and particle size fractions.
Among the reviewed papers, none provided a complete probability distribution for each property at six depth intervals.In addition, no study provided the joint probability distribution of several soil properties, which are necessary to combine these soil properties to develop soil indicators for digital soil assessment.Thus, further efforts are necessary to fulfil the GlobalSoilMap specifications (marginal probability distribution and joint probability distribution), as Heuvelink (2014) stressed.
In GlobalSoilMap specifications (Tier 1), the uncertainty of the estimates of soil properties should be presented at 90% prediction intervals (PIs) (Arrouays et al., 2014a).An important question is whether 90% PIs are useful for decision-making or should be narrower, especially for poorly predicted soil properties.In some cases, as Helmick et al. (2014) demonstrated, the actual width of these intervals might be larger than is practically useful.For some decision makers, a PI of 75% might be adequate to support a decision.If we compared the PI with conventional S. Chen et al. map purity, which rarely exceeds 80% (Marsman and de Gruijter, 1986), even for detailed maps, we might wonder if 90% PI is practical and useful for some end-users.Models such as quantile regression forest (Meinshausen, 2006;Vaysse and Lagacherie, 2017;Kasraei et al., 2021) allow the use of different quantiles to estimate PIs.Moreover, communication on uncertainty would be more efficient if it was based not only on the primary soil attributes but also on the consequences of their uncertainties when they are used as inputs for final products delivered by end-users (Heuvelink, 1998;Richer-de-Forges et al., 2019) or modellers.For the modeller, knowing only a modal or a mean value and a PI seems useless unless hypotheses on the distribution of the target soil property can be formulated.In some cases, simple hypotheses such as lognormal or triangular distributions may be used (Odgers et al., 2014b).Heuvelink (2014) proposed the representation of uncertainty in soil property maps by using probability distribution functions.
Although the maps we reviewed were mainly shown as "pixel" or "voxel" representations, they are point-based predictions.In moving to the Tier 2 GlobalSoilMap product, predictions on an area (or a block) rather than a point must be made.This should have consequences on the way the uncertainties of these predictions are modelled and estimated.This task is not trivial, because most soil data used for training and validation are, at present, point observations.Moreover, in many applications, end-users are interested in obtaining information on areal units (e.g.watershed, municipality, farm, field); thus, spatially aggregating soil property prediction and uncertainty is often necessary (Vaysse et al., 2017).

Conclusions
In this paper, we reviewed 244 articles on the use of DSM to map GlobalSoilMap soil properties at a broad scale (>10,000 km 2 ) published from January 2003 to July 2021.This review provides the following insights.
(1) The number of DSM publications increased exponentially; however, most of the growth occurred on the application, not in the development of new DSM approaches.
(2) Many articles do not provide information on how the soil sample data were collected, such as sampling year and sampling strategy.We suggest that this information should be reported because of its high relevance to DSM quality and the design of future sampling campaigns.
(3) Most of the studies focused on mapping SOC/SOM, SOC stocks, and soil particle size fractions.Further efforts are necessary to predict other soil properties essential for soil function.
(4) Half the articles produced maps for topsoil only (<30 cm).Studies on deep soil, down to 200 cm, need to be more represented (21.7%).We also observed a decreasing model performance trend with deeper depth intervals for many soil properties.New sampling efforts should focus on the whole soil profile and finding improved covariates and new sensors for deep soil.
(5) In many cases, missing covariates constrain the prediction accuracy of the models.For example, the age factor has rarely been used, because of the lack of relevant covariates; therefore, advances in technology are necessary to better represent age.
(6) Nonlinear models (i.e. machine learning) have been increasingly used in DSM.Incorporating more pedological knowledge for reasonable modelling is necessary from scientific and technical viewpoints.Additionally, a focus should be figuring out how DSM could provide new insights into pedological processes.
(7) In addition to commonly used model performance indicators (i.e.R 2 , RMSE, and ME), the SSVR and PICP are suggested for model performance evaluation so that the spatial structure and the efficiency of the uncertainty estimates can be accounted for.
(8) We call for a continual effort in legacy data rescue and new data collection for mapping historic and recent soil status, forming the basis for monitoring and/or forecasting soil dynamics to support evidencebased decision-making on soil resources.Furthermore, instead of using either a dynamic mechanistic simulation model (e.g.CENTURY, RothC) or a static empirical model (DSM) for mapping soil changes, we suggest integrating both to provide accurate estimates.
(9) Insights gained in this review indicate that the DSM community is working progressively to provide fine-resolution digital soil maps to address the global challenges related to soil resources.However, challenges remain, especially in integrating DSM efforts with other disciplines and developing functional soil maps and metrics to incorporate them in decision-making processes at multiple levels.

Fig. 6
Fig. 6 shows the frequency of environmental covariates (classified by Scorpan factors) used in broad-scale DSM.Relief was used in 204 articles and was the top covariate.The organisms and climate covariates were ranked second and third in 201 and 191 articles, respectively.They were followed by soil and parent material, used in 159 and 106 articles, respectively.Position (geographic coordinates) was used in 20 articles, and age was rarely used (3 articles).Fig. 7 presents the resolution of the maps produced under different spatial extents.The map resolution ranged from 10 m to 12 km.There

Fig. 1 .
Fig. 1.Number of articles in digital soil mapping at a broad scale from January 2003 to July 2021 (a) and number of articles per journal (b).In (a), the first and second GlobalSoilMap Conferences were held in 2013 and 2017, resulting in two conference books published in 2014 and 2018.The articles in these books are included.In (b), whether the article is open access or not is indicated by colour.The category "Others" contains the journals with only one record among all the articles in digital soil mapping at a broad scale.

Fig. 4 .
Fig.4.The time span of soil sampling year and soil sampling strategy (pie plot) reported for the 244 articles.For soil sampling year, it excludes 40.2% of the articles (98 of 244) that did not report sampling year.The percentage of oldest samples in the soil database shown on the right (y axis) is proportional to the articles reporting sampling year.In soil sampling strategy (pie plot), the category "Mixture" represents the articles compiling soil data from both probability and non-probability sampling.

Fig. 5 .
Fig.5.Frequency of target soil properties (a) and maximum soil depth of interest for the produced soil property maps (b).Note that only 12 soil properties recommended by GlobalSoilMap specifications are listed.SOC stock is listed separately from SOC/SOM content as it integrates other soil properties such as bulk density, coarse fragments and soil depth.For maximum soil depth of interest (b), the counts exclude 15.2% of the articles (37 of 244) with missing reporting soil depth.The percentage of maximum depth intervals shown on the right (y axis) is proportional to the articles reporting maximum depth of interest.

Fig. 6 .
Fig. 6.Frequency of environmental covariates used in broad-scale digital soil mapping.
S.Chen et al.

Fig. 8 .
Fig. 8. Type of prediction models (a) and frequency of performance indicators in model evaluation (b).Note that many articles compared the model performance using several types of spatial prediction models (a) and therefore the sum of all the counts is far more than the total number of articles.
) only use the soil data within a given period(i.e.2001-2010)  in model training and (2) incorporate the sampling year as a covariate and build a space-time model (2D + T or 3D + T).The first strategy was adopted byStockmann et al. (2015) to map SOC stocks at a global scale in the 1960s, 1980s, 1990s, and 2000s and to assess spatialtemporal changes in SOC.The second strategy was recently developed byHeuvelink et al. (2020) in modelling topsoil SOC change for Argentina by using a machine learning-based space-time method and bySun et al. (2021) in modelling SOM change by using integrated nested Laplace approximation with the stochastic partial differential equation.

Fig. 11 .
Fig. 11.Sensing techniques and their relevance to Scorpan factors.Counts indicate the frequency of different bands used as environmental covariates.

Table 1
International workshops on Digital Soil Mapping and GlobalSoilMap review of DSM articles published between January 2003 and July 2021.On 19 July 2021, Web of Science and Scopus were queried using several expressions applied to the topic (i.e.title, abstract, and keywords *TBD: to be defined S.Chen et al.literature