Estimating soil organic carbon stock change at multiple scales using machine learning and multivariate geostatistics

Many national and international initiatives rely on spatially explicit information on soil organic carbon (SOC) stock change at multiple scales to support policies aiming at land degradation neutrality and climate change mitigation. In this study, we used regression cokriging with random forest and spatial stochastic cosimulation to predict the SOC stock change between two years (i.e. 1992 and 2010) in Hungary at multiple aggregation levels (i.e. point support, 1 × 1 km, 10 × 10 km square blocks, Hungarian counties and entire Hungary). We also quantified the uncertainty associated with these predictions in order to identify and delimit areas with statistically significant SOC stock change. Our study highlighted that prediction of spatial totals and averages with quantified uncertainty requires a geostatistical approach and cannot be solved by machine learning alone, because it does not account for spatial correlation in prediction errors. The total topsoil SOC stock for Hungary was predicted to increase between 1992 and 2010 with 14.9 Tg, with lower and upper limits of a 90% prediction interval equal to 11.2 Tg and 18.2 Tg, respectively. Results also showed that both the predictions and uncertainties of the average SOC stock change were smaller for larger spatial supports, while spatial aggregation also made it easier to obtain statistically significant SOC stock changes. The latter is important for carbon accounting studies that need to prove in Measurement, Reporting and Verification protocols that observed SOC stock changes are not only practically but also statistically significant.


Introduction
Soil is the largest terrestrial pool of organic carbon (Stockmann et al., 2013). Soil organic carbon (SOC) directly or indirectly influences various soil-related functions and services, such as food production and climate change mitigation. Therefore, there is an increasing demand for information about how SOC stocks vary not just in space but also in time. This increasing demand is explicitly demonstrated by a number of recent global and continental initiatives, e.g. SoilGrids (Hengl et al., 2017;Hengl et al., 2014;Poggio et al., 2021), GlobalSoilMap (Arrouays et al., 2014), 4 per mille Soils for Food Security and Climate , Global Soil Organic Carbon Mapping Campaign (Yigini et al., 2018), Soil organic carbon projections for Europe (Yigini and Panagos, 2016) and Australian SoilGrids (Viscarra Rossel et al., 2015). Besides, a large effort has been made by many countries in order to compile their own national, high resolution SOC stock map (e.g. Adhikari et al., 2014;Kempen et al., 2019;Liang et al., 2019;Padarian et al., 2017;Vitharana et al., 2019).
More recently, the assessment of SOC stock change is in the spotlight and a number of international initiatives, studies and researches rest on it, e.g. achieving land degradation neutrality (Keesstra et al., 2018;Stavi and Lal, 2015) and mitigating climate change (Lal, 2004;Minasny et al., 2017;Stockmann et al., 2013). If detailed, spatio-temporally referenced observations on SOC stock are available, then an explicit space-time model can be built for studying and assessing the spatio-temporal variation of SOC stocks (e.g. Heuvelink et al., 2020). However, if the observations on SOC stock are only detailed in space and not so much in time (e.g. SOC stock was only measured at two points in time), then building an explicit space-time model is not possible. In such a case, a different approach must be used. A frequently adopted approach is to model and predict separately the spatial distribution of SOC stocks for the different years, after which the SOC stock change is computed by subtraction (e.g. Schillaci et al., 2017;Zhou et al., 2019). However, such procedure does not account for the correlation between the SOC stock prediction errors for the two years and will therefore fail to characterise the uncertainty about the SOC stock change reliably. Therefore, it is better to jointly model the spatial distributions of SOC stock for different years. In geostatistics, the cokriging approach can be used to jointly model the spatial distributions of interdependent variables (McBratney and Webster, 1983;Wackernagel, 2003). Cokriging gives both the variances and covariance of the prediction errors for each prediction location, which allows to compute the prediction error variance associated with the SOC stock change for each location. Knowledge about the uncertainty of SOC stock change is important to identify and delimit areas with a statistically significant change in SOC stock.
It is common that data on SOC are frequently found to be positively skewed (e.g. Liang et al., 2019;Orton et al., 2014;. From a geostatistical point of view, this could pose difficulties for spatial modelling and inference. For example, a property of a variable with a positively skewed distribution is the proportional effect, meaning that variability is higher in areas with high values than in areas with low values, which may corrupt the experimental variogram (Chilès and Delfiner, 2012;Webster and Oliver, 2007). Furthermore, most kriging techniques (e.g. ordinary, simple, universal and regression kriging) are only optimal if the variable of interest is normally distributed. In the case of a multivariate normal distribution, the conditional distribution is also normal and the conditional mean is a linear function of the conditioning data. Therefore, in case of a variable with a positively skewed distribution a commonly adopted approach is to: (i) transform the positively skewed variable to normal scale; (ii) perform geostatistical modelling and prediction of the transformed variable; and (iii) back-transform the kriging results to the original, untransformed scale (Webster and Oliver, 2007).
The scale at which spatially explicit information on SOC stock change is needed may vary widely from application to application. For example, in case of a country, state or smaller region (county, landscape, etc.), one may need SOC maps at a relatively fine spatial scale (≤1 ha) (Adhikari et al., 2019;Ellili et al., 2019;Schillaci et al., 2017;, whereas for a global or continental assessment it is sufficient to know the SOC stock change at a coarser scale (≥1 km 2 ) (Lugato et al., 2015;Yigini et al., 2018). An extreme example is carbon accounting, in which only the spatial total or average SOC stock change for a given time period is required for a country as a whole. In such cases stakeholders may be interested in information on SOC stock change referring to an area or region that is larger than the support of the observations, an issue which is commonly referred to as change-of-support. In change-of-support, we are interested in the spatial total or average of a soil property over a region or "block" that may be irregularly shaped (Webster and Oliver, 2007). In geostatistics, this is achieved with block kriging, which not only derives a prediction of the spatial average but also quantifies the associated prediction uncertainty (Cressie, 2006;Lark and Lapworth, 2012;Webster and Oliver, 2007). However, block kriging may be challenging in case observations are transformed prior to geostatistical modelling and predictions are back-transformed afterwards (Cressie, 2006;Cressie, 1993;Orton et al., 2015). In such case one often resorts to spatial stochastic simulation, where the spatial average over a region is approximated by generating many simulated region values as spatial averages of simulated point values (Goovaerts, 2001). This method can also be applied in a cokriging case and thus be used to compute the change in the average SOC stock for regions between two years.
The objective of this study was to use cokriging and spatial stochastic simulation to predict the SOC stock change between two years, i.e. 1992 and 2010, in Hungary at multiple aggregation levels. Simultaneously, we also aimed to quantify the uncertainties associated with these predicted SOC stock changes in order to be able to identify and delimit areas with statistically significant change.

Soil data
We derived soil data from the Hungarian Soil Information and Monitoring System (SIMS), which is a countrywide soil monitoring system with the aim of providing geographically referenced biological, physical and chemical information on the status and temporal change of Hungarian soils. It stores measurements of 4,859 genetic soil horizons belonging to 1,236 soil profiles (Fig. 1). The density of soil profiles is 0.013 profiles per km 2 and all soil profiles are benchmark locations, which were selected by expert knowledge-based purposive sampling in 1992. According to the sampling protocol of SIMS, all soil pits are revisited after a certain time period and sampled at each genetic soil horizon by taking point samples. In practice, a new soil pit is dug a few meters from where the previous pit was, thus introducing short-distance variation in measured soil properties. Note that these differences have no systematic component and will average out when spatial aggregates are derived. The laboratory measurements of the soil samples are carried out according to the Hungarian standards, where SOC is determined by wet combustion.
SOC stock can be readily computed by the following soil properties: (i) soil organic carbon content [unit: wt. %]; bulk density (BD) [unit: g cm − 3 ]; and (iii) proportion of coarse fragment (CF) (i.e. proportion of stones and gravels) [unit: vol. %]. These soil data for the years 1992 and 2010 were derived from the genetic soil horizons of SIMS and used in this study. BD and CF were only measured in 1992, nonetheless, they can be considered static that have negligible change over time. Therefore, we used the BD and CF measurements of 1992 for both years.

Environmental covariates
We used the same set of environmental covariates for statistical modelling of SOC stocks as that used in . The covariates are listed in Table 1. A detailed description on the selection of the covariates and their preprocessing can be found in . We used this set of 26 environmental covariates for modelling the trend component of SOC stock for both years.

Supports for spatial aggregation
In this study we chose supports which are frequently used in Hungary and that are of interest to stakeholders. We used 1 km × 1 km and 10 km × 10 km blocks, which could be useful for countrywide crop simulation (Fodor et al., 2014) and terrestrial ecosystem process modelling (Hidy et al., 2016), in national ragweed (Ambrosia artemisiifolia L.) pollen forecasting (Csépe et al., 2020), in natural vegetation mapping and in national contributions to global/continental assessment of soil organic carbon stock (Yigini et al., 2018). In addition, we also aggregated to the Hungarian counties and entire Hungary (Fig. 1). The latter is important in carbon accounting, in which the spatial total SOC stock change has to be reported by a country.

Spatial modelling of SOC stocks and SOC stock change
Fig. 2 presents a flowchart summarizing the methods and approaches used in this study. The various steps shown in the flowchart are explained in the subsections below. We used a direct mapping approach (Miller et al., 2015), also referred to as "calculate-then-model" (Orton et al., 2014) or "calculate first, interpolate later" (Heuvelink and Pebesma, 1999) for modelling the spatial distribution of SOC stocks in 1992 and 2010. This means that SOC stocks are computed at the level of soil profiles first, and subsequently used for spatial modelling.

Computing SOC stock at the level of soil profiles
First, we computed SOC stocks for each genetic soil horizon at each soil profile for 1992 and 2010 using soil data listed in Section 2.1. We used the following equation  (Bishop et al., 1999) for modelling the vertical distribution of SOC stock at each soil profile. The fitted MPSs were used to derive the SOC stock for the depth interval 0-30 cm at each soil profile.

Spatial modelling
As indicated in the Introduction, SOC stock data frequently show positively skewed distribution and the same was found in Hungary by . Therefore, we transformed the SOC stock values by taking their square root and spatial modelling was conducted on the transformed, normal scale.
In general, the spatial variation of soil properties can be described and modelled in terms of a deterministic component and a stochastic component. Thus, we can write where Z is a vector of transformed SOC stocks for the years 1992 and 2010, m is a vector of the deterministic part describing SOC stock spatial variation that can be explained from covariates for the years 1992 and 2010, ε is a vector of the stochastic residuals that can be spatially auto-correlated as well as cross-correlated for the years 1992 and 2010, and u is a vector of geographical coordinates.
In both years, we separately modelled the deterministic part using random forest (RF; Breiman, 2001). Before fitting RF between the transformed SOC stock values and the environmental covariates listed in Table 1, we fine-tuned the hyperparameter mtry, which is the number of input covariates selected randomly at each split. Repeated 5-fold crossvalidation was used for evaluating the performance of RF fitting in both years. We used the fine-tuned values of mtry for fitting the final RFs, which were then used to predict the deterministic component for all locations of a 100 m × 100 m grid covering Hungary in both years.
Since we wanted to account for correlation between the stochastic residuals for both years, we used regression cokriging (coRK), where RF predictions for the years 1992 and 2010 were taken to be m 1992 (u) and m 2010 (u) of Eq. 2, respectively. Geostatistical modelling of ε of Eq. 2 was obtained by first subtracting the RF predictions from the computed SOC stock values for all sampling locations for both years. Next we computed experimental variograms and a cross-variogram from these residuals and fitted a linear model of coregionalization (LMC). We used the LMC to ensure a statistically valid model (Goovaerts, 1997).
As we already pointed out in the Introduction, there is no analytical geostatistical solution to predict a spatial average over a region or block in case a non-linear transformation is applied prior to geostatistical modelling. We therefore used sequential Gaussian cosimulation to generate possible realities of the geostatistical coRK model. We generated 500 stochastic realizations for both years. These realizations were conditional to the SOC stock observations from both years and reproduced the joint spatial variability.
We added the RF predictions to the stochastic realizations before we transformed the results back to the original, positively skewed scale, by taking the square. The corresponding realizations of SOC stock change were determined by taking the difference of the paired backtransformed realizations of the two years (i.e. subtracting the SOC

Spatial aggregation
The back-transformed realizations at point support and 100 m × 100 m resolution were the basis for predicting the spatial average of SOC stocks at the different supports listed in Section 2.3 for both years. For predicting the spatial average SOC stock change, we used the realizations of SOC stock changes computed in Section 2.4. The spatial average over a block at a given support is obtained by first averaging the simulated values falling in that block for each realization. This yields 500 average values for that block, which can be interpreted as 500 realizations of the block average. Second, we take the average of the 500 block averages to obtain a prediction for the spatial average over that block. This was done for predicting (i) the spatial average of SOC stocks for both years and (ii) the spatial average of SOC stock change.
For identifying blocks with statistically significant SOC stock change we counted how many of the 500 simulated block values of SOC stock change had a positive or negative change for each block. If the number of positive (negative) changes in SOC stock was larger than 475 (i.e. 95% of the 500 simulated block values), then we concluded that the SOC stock had statistically significantly increased (decreased) for that block.

Quantification of uncertainty
It is routine in digital soil mapping to quantify the prediction uncertainty. Thus, we also computed the 90% prediction interval (PI) in this study. The 90% PI reports the range of values within which the true value is expected to occur 9 times out of 10. The width of the 90% PI is a useful measure of the prediction uncertainty. We used the backtransformed realizations for quantifying the prediction uncertainty, not just for the two years but also for the SOC stock change between the two years. At the point support, the 90% PI at a prediction location was identified by the 0.05 and 0.95 quantiles of the 500 realizations at that location. In the case of spatial aggregation, the 90% PI for a given block was identified by the 0.05 and 0.95 quantiles of the 500 block values of that block.

Validation
We had no SOC stock observations at larger support and we therefore restricted the statistical validation to the point support predictions. We used 10-fold cross-validation to evaluate the performance of point support predictions and uncertainty quantifications. Common validation measures, i.e. bias (or mean error, ME) and root mean square error (RMSE), were computed: where n is the number of observations; P i and O i are the predicted and observed SOC stock or SOC stock change for observation location i, respectively. We also computed Lin's concordance correlation coefficient (CCC; Lin, 1989): where P and O are the mean of the predictions and observations. Uncertainty quantifications were validated by accuracy plots and G statistics. An accuracy plot presents the observed fraction of validation data falling within symmetric prediction intervals of varying width. If the quantified uncertainty is a realistic assessment of the 'true' uncertainty, then the observed and expected fraction should be similar. For instance, in such case about 90% of the validation data would fall within the 90%. If the observed fraction is lower than the expected fraction, then the uncertainty has been underestimated. If it is higher, the uncertainty has been overestimated (i.e. too conservative). Ideally, an accuracy plot follows the 1:1 line. The G statistics measures the overall closeness of the accuracy plot to the 1:1 line by where ξ(p) and p are the observed and the expected fraction for a p-width symmetric PI, respectively. Ideally, G is equal to 1. As it follows from Eq. 6, we did not distinguish between over-and underestimation of uncertainty by applying differing weights as recommended by Deutsch (1997). Indeed, we think that applying different weights on over-and underestimation could harm the objectivity of G statistics. Some studies also used non-weighted G statistics for validating uncertainty quantifications (e.g. Szatmári et al., 2020;Wadoux et al., 2018).   Fig. 3 shows the histograms of the original (i.e. untransformed) and transformed data on SOC stock for both years. The transformed data on SOC stock showed a fairly normal distribution in both cases (Fig. 3, right column). Fine-tuning of RF fittings showed that mtry values of 9 and 15 were found to be optimal for the years 1992 and 2010, respectively. The fitted RF models explained 41% and 38% of the total variance of SOC stocks in 1992 and 2010, respectively. The importance of the environmental covariates in both RF models are presented in Fig. 4. In this study, soil type proved to be the most important covariate for both years, which is not surprising because the genetic type of soils (according to the Hungarian soil classification system) depends on the amount and quality of soil organic matter, among others. We also found that parent material, geomorphometric parameters (e.g. altitude above sea level, ridge top flatness and vertical distance to channel network) and climate attributes (e.g. evaporation and evapotranspiration) were informative covariates in both RF models. Fig. 5 presents the experimental direct-and cross-variograms of the residual and the fitted LMC. The model has a nested structure, where the first structure models discontinuity at the origin (i.e. nugget effect), the  second models a short scale variability with a range value of 10 km, and the third describes a larger scale of spatial continuity with a range value of 50 km. Table 3 summarizes the shapes and parameters of the LMC. Fig. 6 presents the point-support predictions of SOC stocks and SOC stock change between the two years. It also shows the quantified prediction uncertainty (i.e. the lower and the upper limit of the 90% PI) at point support. Table 4 summarizes the performance of the point-support predictions of SOC stocks and SOC stock change by the computed values of ME, RMSE and CCC. The ME results show that the predicted SOC stocks and SOC stock change are fairly unbiased. Fig. 7 presents the validation  of the uncertainty quantifications by the compiled accuracy plots and the G statistics. The accuracy plots approximately follow the 1:1 line, proving that the accuracy of uncertainty quantifications is acceptable. This is also supported by the G statistics, which are fairly close to the expected value. Based on the results of cross-validation it seems that the performance of spatial predictions and uncertainty quantifications was realistically assessed.

Spatial aggregation
The resulting maps of spatial aggregation of SOC stocks for the two years can be found in the supplementary material (SM). Fig. 8 presents the results of the spatial aggregation of SOC stock change for all supports considered. In addition, it also shows the associated prediction uncertainty expressed by the width of the 90% PI and presents the map of statistically significant SOC stock change. Note that the area of significant change is larger for larger spatial supports. Table 5 summarizes descriptive statistics of the point-support predictions and the spatially aggregated SOC stocks and SOC stock change at all supports used in this study. Note that Table 5 does not include the results on spatial aggregation of the entire country because there is just a single value for each case, as shown in Fig. 8.

Interpretation and discussion of the results
In Fig. 8 we observed that the larger the support, the closer to zero the spatially aggregated predictions of SOC stock change. This is also confirmed by Table 5 in which the range values decrease as the support size increases. This can be explained by the fact that positive and negative values of SOC stock change neutralize each other in the course of spatial aggregation. This could be important from a practical point of view because it means that the absolute value of SOC stock change (i.e. regardless whether it is a positive or negative change) decreases as the size of area for which spatial prediction is required increases. This is  nicely demonstrated especially in the case of the two largest aggregation levels (i.e. entire Hungary and the Hungarian counties) in Fig. 8, where the absolute value of SOC stock change is only a few tons ha − 1 . Another important observation from Fig. 8 and SM is that the larger the support, the smaller the prediction uncertainty, both for SOC stocks and SOC stock change. The reason for this is that within-block negative and positive interpolation errors partially cancel out when values are aggregated, a phenomenon well-known in geostatistics and also reflected in the fact that block kriging variances are typically much smaller than point kriging variances (Webster and Oliver, 2007). Furthermore, as was shown in Fig. 8, parallel to increasing the support size the area that has a statistically significant change in SOC stock increases. Thus, decreasing prediction uncertainty by spatial aggregation made it possible to identify and delimit larger areas with statistically significant decrease or increase in SOC stock. This is an important message for users of SOC stock change maps, because we can rule out chance effects only for areas that have a statistically significant change. A broader discussion on this specific topic is given in Section 4.2.
Based on the spatially aggregated prediction of SOC stock change at the highest aggregation level (i.e. entire Hungary) (Fig. 8), the overall amount of SOC stock increase between 1992 and 2010 was 14.9 Tg C (1 teragram = 10 12 g) for entire Hungary with a 90% PI of [11.2 Tg C, 18.2 Tg C]. This means that there was a substantial enrichment in SOC stock in Hungary for the respective period, which was statistically significant, since the lower limit of the 90% PI (i.e. 11.2 Tg C) is much higher than zero. This can be explained by the fact that Hungary experienced substantial afforestation in the past three decades, which significantly increased the total area of forests. This is in line with the findings of . At the level of Hungarian counties, we could also show that there was a significant change in SOC stock in 12 out of the 19 counties in Hungary. Moreover, 10 out of the 12 counties showed significant enrichment. This could be mostly attributed to the increasing   Fig. 7. Validation of uncertainty quantifications by accuracy plots and G-statistics. Abbreviation: SOC: soil organic carbon.
area of forests, since most efforts made on afforestation took place in these counties. The remaining two counties showed significant loss in SOC stock, which may be explained by cultivation of pastures (county in the northwest) and degradation of woodlands (county in the northeast).

General discussion
A commonly adopted approach in DSM is to add kriged residuals to the trend predictions obtained with multiple linear regression or machine learning, in order to improve the accuracy of the predictions (Keskin and Grunwald, 2018). However, we should emphasize that in this study we aimed for more than just improving the prediction  accuracy. Of foremost importance was that we wanted to quantify the prediction uncertainty of the spatial aggregates. Indeed, change-ofsupport problems with uncertainty quantification cannot be handled and solved without using geostatistics. Although several studies addressed the spatial distribution of observations in random forest fitting and prediction (Hengl et al., 2018;Møller et al., 2020;Sekulić et al., 2020), we should note that neither random forest nor other machine learning techniques are currently able to quantify spatial correlation. Therefore, there is a need to apply geostatistics or its combination with machine learning because geostatistics quantifies the spatial correlation (Matheron, 1963;Webster and Oliver, 2007), which is indispensable when the uncertainty of the spatially aggregated predictions is targeted.
In this study, we used the cokriging approach for jointly modelling the spatial distribution of SOC stocks for both years since it has the advantage of not just taking the spatial interdependency of SOC stocks between the two years into account but also exploiting this interdependency in spatial modelling (Webster and Oliver, 2007). In addition, cokriging yields coherent results with respect to the difference between SOC stock data observed in the two years (i.e. SOC stock change), which is very useful in further assessments. Due to these advantageous properties of the cokriging approach we advocate its use in cases when measurements are taken at only a limited number of times and therefore a full space-time model (Gräler et al., 2016) cannot be developed.
We used stochastic simulation instead of block kriging for predicting block means because there is no analytical geostatistical solution to predict a spatial average over a block in case a non-linear transformation is applied prior to geostatistical modelling. Furthermore, stochastic simulation is more flexible than block kriging because: (i) not just a linear average but also non-linear averages can be computed (e.g. geometric or harmonic mean, proportions above thresholds); and (ii) once the simulated point values have been generated, predictions for various block sizes and shapes can be derived at little computational cost (Goovaerts, 2001;Heuvelink and Pebesma, 1999).
A number of papers discussed that only a small area with a statistically significant change in SOC stock could be identified and delimited due to the relatively high prediction uncertainty (e.g. Heuvelink et al., 2020;. Indeed, this is a key problem if, for example, a country wants to verify that its SOC conservation strategy carried out in a given period significantly increased SOC stock or not.  and Heuvelink et al. (2020) extensively discussed the possibilities of reducing prediction uncertainty, e.g. collection of more observations and optimization of observation locations in the geographic and feature space, application of more and spatially (or spatiotemporally) detailed environmental covariates or improving the DSM technique. However, all these possibilities come at a cost and may have limited potential. This study highlighted that spatial aggregation substantially decreases prediction uncertainty and can effectively identify and delimit larger areas with significant decrease or increase in SOC stock. This is an important message for further studies on the spatiotemporal assessment of SOC stock change. The price paid is a loss of spatial detail, which means that spatial aggregation may not be a proper solution for local studies, precision agriculture and high-resolution insitu management decision support. Moreover, it requires a geostatistical approach in which the spatial correlation of the prediction errors is quantified.
In case spatial averages or totals are required for blocks that cover many sampling locations, and when in addition these sampling locations were selected using probability sampling, an alternative to the modelbased approach used in this study would be to apply design-based estimation (Brus, 2019;de Gruijter et al., 2016;de Gruijter et al., 2006). We could not use this approach in our study because the Hungarian SIMS is not a probability sample.

Limitations and further research
In this study, we could only validate point predictions since we had no block-support SOC stock observations. In practice it may be difficult to collect block-support observations, although they may be estimated from averaging sufficient point samples within the block (Brus et al., 2011;Vaysse et al., 2017). Cross-validation statistics at point support are indicative of the quality of the block-support predictions, but not in all cases. For example, if the point-support predictions are unbiased then so will the block-support predictions. The block-support RMSE is likely substantially smaller than the point-support RMSE, but it is difficult to tell how much. Therefore, more effort should be made to collect validation data at block support, such as by using a probability sample of point observations within the block (Brus and de Gruijter, 1997;Brus and de Gruijter, 1993). Cleary, these validation data are not error-free since they are estimates based on point observations, and so one must assure that the sample size is sufficiently large to ensure that the estimation error is small compared to the error in the DSM map.
When spatial stochastic simulation is used, attention should be paid to how many realizations are needed to give reliable statistical inferences. In this study, we generated 500 realizations of SOC stocks and SOC stock change, which is likely sufficient to give reliable predictions both at point and at block supports. In the literature, some rule-of-thumb values have been published, e.g. it has been claimed that at least 100 realizations are needed to be able to give statistically sound inferences (Deutsch, 1997;Goovaerts, 2001). However, we should note that the number of realizations needed depends on what parameter is predicted. For instance, estimation of the mean of a distribution needs fewer realizations than estimation of the tails of a distribution. Therefore, we do not recommend the use of such rule-of-thumb values. Indeed, the literature of uncertainty propagation discuss this issue in more detail and provides approaches on how to determine the number of realizations needed to achieve stable results (Heuvelink, 1998).
We should note that we assumed in this study that there was no uncertainty about the trend (i.e. RF predictions). Thus, we only considered uncertainty about the stochastic residual in Eq. 2. Although cross-validation showed that uncertainty quantifications were appropriate (Fig. 7), it is advised to also take the uncertainty about the trend into account in spatial modelling in order to get a full picture about the overall uncertainty.
We used a direct mapping approach for modelling the spatial distribution of SOC stocks in 1992 and 2010, and then derived the SOC stock change between the two years since the cokriging approach yields coherent results as we pointed out in Section 4.2. However, there are further alternative trajectories or approaches (Heuvelink and Pebesma, 1999;Styc and Lagacherie, 2019) that may be interesting to check and compare with the approach used in this study, since Styc and Lagacherie (2019) observed important differences of performances between such inference trajectories.

Conclusions
The objective of this study was to predict the SOC stock change between 1992 and 2010 in Hungary at multiple aggregation levels, and to quantify the uncertainty associated with these predicted SOC stock changes in order to be able to identify and delimit areas with statistically significant change. Our study highlighted that the sole application of machine learning techniques in spatial aggregation is not sufficient for this goal since these do not quantify the spatial correlation, while this is needed to quantify the uncertainty of spatially aggregated predictions. We also illustrated that by increasing the support the prediction uncertainty decreases, which is a well-known but frequently forgotten fact in geostatistics. Increasing the support thus allows to identify and delimit larger areas with statistically significant change in SOC stock. This is an important message for practitioners who not only need to estimate the SOC stock change resulting from a SOC conservation strategy but also need to show that the effect of measures to increase the SOC storage is statistically significant.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.