Spatial Prediction of Organic Matter Quality in German Agricultural Topsoils

: Soil organic matter (SOM) and the ratio of soil organic carbon to total nitrogen (C/N ratio) are fundamental to the ecosystem services provided by soils. Therefore, understanding the spatial distribution and relationships between the SOM components mineral-associated organic matter (MAOM), particulate organic matter (POM), and C/N ratio is crucial. Three ensemble machine learning models were trained to obtain spatial predictions of the C/N ratio, MAOM, and POM in German agricultural topsoil (0–10 cm). Parameter optimization and model evaluation were performed using nested cross-validation. Additionally, a modification to the regressor chain was applied to capture and interpret the interactions among the C/N ratio, MAOM, and POM. The ensemble models yielded mean absolute percent errors (MAPEs) of 8.2% for the C/N ratio, 14.8% for MAOM, and 28.6% for POM. Soil type, pedo-climatic region, hydrological unit, and soilscapes were found to explain 75% of the variance in MAOM and POM, and 50% in the C/N ratio. The modified regressor chain indicated a nonlinear relationship between the C/N ratio and SOM due to the different decomposition rates of SOM as a result of variety in its nutrient quality. These spatial predictions enhance the understanding of soil properties’ distribution in Germany.


Introduction
Soil organic matter (SOM) plays a crucial role in determining the ecosystem services provided by soil by influencing key properties such as nutrient availability, water retention, structure, and stability [1,2].However, it is widely acknowledged that dividing SOM into its two main constituents, mineral-associated organic matter (MAOM) and particulate organic matter (POM), simplifies the conceptualization, comprehension, and large-scale prediction of SOM dynamics [3].This is because MAOM and POM differ significantly in terms of their formation, stability, and function [3].POM mainly consists of relatively undecomposed plant residues, while MAOM is composed of carbon compounds that have been transformed by microorganisms.POM has a lower nitrogen content, undergoes faster turnover, and contributes to soil fertility by providing nutrients and energy to decomposers.In contrast, MAOM has a higher nitrogen content and is considered a more stable component of SOM due to its chemical bonding with the mineral surface of the soil.This stability allows MAOM to persist as a long-term reservoir for nutrient storage [3][4][5].Acknowledging the importance of SOM's dynamics through its two main constituentsthe ratio between soil organic carbon, as the main component of SOM, and total nitrogen, i.e., the C/N ratio-also provides an indicator of soil quality and health [6].For example, the C/N ratio impacts soil microbial activity, subsequently influencing the decomposition rate of SOM and nutrient cycle [7].For instance, Blanco et al. [8] showed that a higher C/N ratio in grasslands reduces the decomposition rate, whereas a lower C/N ratio increases the decomposition rate.For these reasons, the quantification of SOM and the C/N ratio and an understanding of the factors that influence them on a large scale is essential.
Modeling soil properties is a challenging task due to the complexity and diversity of soils.Additionally, the dynamic nature of soil-forming factors, which continuously evolve over time, adds another layer of challenge [9].This necessitates an interdisciplinary approach encompassing data science, soil science, and geoinformatics, which leads to the field of pedometrics [10].Pedometrics seeks to create a continuous approximation of soil properties as a function of soil-forming factors with the addition of other soil properties and geographical positions.This approach is cost-effective, scalable, and can be utilized for multidimensional and spatial-temporal modeling [11], and thus is suitable for the continuous approximation of soil attributes on a larger scale.However, one of the main challenges in pedometrics is determining the appropriate sample size and sampling design for developing and evaluating models.Consequently, the importance of comprehensive soil inventories and monitoring networks has increased [12].The first harmonized agricultural soil inventory in Germany was conducted to address the lack of large-scale, high-quality data on agricultural soils [13].This inventory comprised 3104 sites that were sampled along an 8 km raster at fixed depth intervals from 10 cm down to 100 cm with samples analyzed for different soil properties.More in-depth information on this inventory can be found in the work of Poeplau et al. [13].
Studies in pedometrics have shown that machine learning algorithms, such as random forest (RF), boosted regression trees (BRTs), and support vector machine (SVR), are capable of modeling different soil properties such as soil organic carbon (SOC) and the C/N ratio [14,15].Nonetheless, the challenge of capturing multiple characteristics and underlying structures of the data using a single algorithm has led to the emergence of ensemble learning (EL) as an alternative approach [16].For example, Hengl et al. [17] predicted various soil properties, including SOC and total nitrogen (TN), for Africa using EL in which RF, extreme gradient boosting (XGBoost), a neural network (NN), Cubist regression trees, and a generalized linear model provided the initial prediction, i.e., the base learners, for a final prediction by a linear model, i.e., the meta learner.Similarly, Sun et al. [18] compared the performance of RF, SVR, XGBoost, and NN with EL for the spatial prediction of SOC density in China.Their results showed that the EL model obtained the best performance followed by RF and XGBoost.They also identified annual mean temperature, leaf area index, and soil pH as the most important predictors to predict SOC density in China.In another study, Adeniyi et al. [19] used EL to predict different soil properties including SOC in agricultural soils of Lombardy in Italy.They further showed the vertical distance to the channel network and channel network base level to be the most important predictors for predicting their response variables.However, these algorithms and methods are commonly used to predict multiple soil properties independently, which can be referred to as single-target regression (STR), without taking their interaction into account.This lack of consideration of interaction might lead to the absence of crucial correlation patterns among the soil properties, potentially leading to inaccurate interpretations or predictions.To address this limitation, a better option might be multi-target regression (MTR), which aims to capture the underlying interdependencies.Nevertheless, the focal point of most works in pedometerics that have used MTR has been model performance rather than interpretation.For example, Padarian et al. [20] used a convolutional neural network (CNN) architecture with MTR capability for the spatial prediction of SOC at multiple depth intervals in Chile.They compared this neural network (NN) architecture with Cubist and reported a better performance with the former.Another study conducted by the authors of [21] used an MTR CNN to spatially predict various soil properties such as SOC, cation exchange capacity (CEC), and soil texture, and compared the results with Cubist and partial least squares regression.Similar to [20], they demonstrated a better performance with the MTR CNN than with the other two.However, the authors of ref. [22] tried using MTR with RF to predict SOC and TN in Germany and the Netherlands, but concluded that the performance of MTR RF was comparable with that of STR.
Despite the advantages of MTR, its practical application with algorithms such as SVR and BRT, which do not natively support MTR, presents a methodological challenge that attempts are being made to address using frameworks such as regressor chains (RCs) [23][24][25].Thus, by extending the MTR capabilities of these algorithms and methods, RCs offer a potential route to integrate the interdependencies of multiple soil properties when predicting them.In contrast to the broader field of data science, to the best of the author's knowledge, RC has not been implemented in the modeling of soil properties, particularly for the purpose of spatial prediction or interpretation.Probably the most notable work in soil science was conducted by Santana et al. [26], who implemented RF and SVR within different frameworks of MTR and STR to predict 10 parameters of soil fertility in Brazil.Nevertheless, firstly, the aim of this work was to compare the model performance of these frameworks, and secondly it was not expanded to pedometrics.Thus, the potential of RC or target stacking for model interpretation remains unknown.
Therefore, the aims of the present study were as follows: (1) to model the soil C/N ratio, MAOM, and POM on a national scale and explore their main drivers; (2) to model the underlying interdependencies between the C/N ratio, MAOM, and POM; and (3) to provide a spatial prediction of the C/N ratio, MAOM, and POM and their prediction uncertainties for German agricultural topsoils.Furthermore, this paper introduces a modification of RC, i.e., modified RC, and demonstrates how it can be used for the purpose of interpreting machine learning models.

Research Area and Response Variables
This study focused on the top 10 cm of agricultural mineral soils in Germany, henceforth referred to as top10.Although the German Agricultural Soil Inventory consists of samples from both mineral and organic soils, the latter were excluded using the organic soil map because the models cannot approximate the spatial extent of organic soils due to their small sample size (119 samples) in the present data [15].The models were trained with respect to three response variables: the C/N ratio, MAOM, and POM.The C/N ratio was obtained from the first harmonized German agricultural soil inventory, where the SOC and TN were directly measured via dry combustion using an elemental analyzer [13].The MAOM and POM data used in this study were based on Jaconi et al. [27], in which 105 topsoil samples (0-10 cm) representative of the German Agricultural Soil Inventory were used for density fractionation.Near-infrared spectroscopy was then implemented to quantify MAOM and POM for all mineral soils.More details are provided in the studies of Schneider et al. [28] and Jaconi et al. [27].

Predictor Set
A comprehensive series of predictors, including the coordination of the samples, with a potential influence on the spatial approximation of the response variables, was collected following the SCORPAN framework.When several products were available for a predictor, the one with the fewest artifacts and the best spatial resolution was selected.The polygon-based predictors (six predictors) were rasterized in compliance with the INSPIRE standard grid [29] using ArcGIS software [30].The spatial resolution of the predictors was set to 25 m since the organic soil map used to spatially distinguish the mineral and organic soils classified the relevant soil samples and masked the predictors better at this resolution.Additionally, the rasterized land use map was less susceptible to misclassifying the sampling points at a 25 m resolution compared with a coarser resolution.The predictors with a lower resolution were continuous predictors and were resampled using the INSPIRE standard grid and the bilinear interpolation technique.

Soil and Parent Materials
Considering the importance of soil properties and parent materials on SOM components, these factors were approximated using four rasterized categorical predictors in the models: the hydrological unit map [31], which describes hydrologically relevant attributes such as permeability, type of rock, and geochemical classification; the soilscapes map [32], which divides the country according to geological-topographic factors such as river landscapes, young and old moraine landscapes, and loess and sand loess landscapes [33]; the pedo-climatic region [34], which describes German soils by linking a climate condition to a relatively homogeneous soil class; and finally the soil type map [35], which distinguishes different soil classes across Germany.

Climate
Climate information obtained from Germany's national meteorological service (DWD) was included via four long-term averaged predictors: sunshine duration [36], minimum air temperature [37], the number of summer days [38], and precipitation [39].These predictors were selected from a set of 34 parameters based on the work of Schneider et al. [28], who used principal component analysis to identify the most relevant and least correlated climate parameters for the same study area.

Land Use and Land Cover
Land use has been shown to be one of the important parameters determining SOM [40] and the C/N ratio, particularly for topsoils [28].Information on land use was captured by a detailed land use map from Germany's official Topographic-Cartographic Information System [41].Furthermore, land cover was approximated using remote sensing (RS) data obtained from LANDSAT 8.The bands included blue, red, green, near-infrared (NIR), SWIR1, SWIR2, and thermal reflectances.Each reflectance was aggregated to capture the seasonality in a multi-annual composition from 2011 to 2018 that matched the sampling period of the inventory.Furthermore, spectral indices were calculated: the Normalized Difference Vegetation Index (NDVI) (Equation (1)), the Normalized Difference Moisture Index (NDMI) (Equation (2)), and the Normalized Difference Water Index (NDWI) (Equation (3)).While the NDVI quantifies vegetation greenness that is linked to plant productivity [42], the NDMI and NDWI are useful for evaluating a crop's water content [42,43].The NDMI has also been used to determine surface wetness and soil moisture [44,45].

Relief
Relief plays an important role in soil formation and its properties by affecting various factors such as solute movement, soil moisture, soil fertility, and erosion [46,47].This factor was reflected by eight predictors, of which six were derived from the digital elevation model EU-DEM [48].Elevation, slope, northernness-easterness aspect, and surface plan-profile curvatures describe the terrain surface by a slope's gradient, direction, convexity-concavity, and convergence-divergence.These parameters relate to solute movement, moisture, and soil redistribution throughout the landscape, which can subsequently affect the chosen response variables.Furthermore, the topographic wetness index quantifies the effect of topography on hydrological processes [49] and potential changes in soil texture [50].The derivation of the relief factor was performed in SAGA GIS software.The geomorphic map of Germany [51] is a categorical predictor that classifies relief into four classes, dis-tinguishing between the Northern Lowland, Central Uplands, Alpine foothills, and Alps of Germany.

Algorithms and Modeling Framework
BRT, RF, and SVR were applied in EL and modified RC frameworks to model the response variables.Although BRT and RF are both tree-based algorithms, BRT relies on combining several weak learners sequentially to improve its accuracy while RF depends on bagging to increase the robustness and stability of its results.In contrast, SVR applies kernel tricks to transform a nonlinear problem to a higher dimension where the problem can be solved linearly.It then obtains an estimation function that has at most ε deviation from the response values of the training data while minimizing model complexity [52].All the frameworks and modeling were performed using mlr3 package [53] in R.

Ensemble Learning
The aim of EL is to identify a set of estimation functions or committees that, when combined, can best approximate the unknown function between the predictor vectors and its corresponding response variable [54].Therefore, C/N, MAOM, and POM were predicted by ensembling the aforementioned algorithms as the base learners and averaging their predictions, i.e., ensemble averaging.Equal weight was designated to the base learners since a previous study has shown that the main difference between these algorithms is more in the way they obtained information from the predictor set rather than their predictive performance [15].This framework was used to obtain the spatial prediction of each response variable and interpret their main drivers on a national scale.

Multi-Target Regression
The prediction of the C/N ratio, MAOM, and POM independently, i.e., using the STR method, undermines their interdependencies, which can then mislead model interpretation.To address this limitation, a combined approach of RC and stacked single target (SST) was implemented.While both the RC and SST methods expand the model's input space by including the predictions of response variables as predictors, they differ in their implementation.In RC, the prediction of each response variable is included one at a time in a cascade framework, i.e., chaining.Conversely, in SST, the response variables are independently predicted.The predictions are then used as additional predictors for training meta models, which predict each response variable, an approach known as stacking [23][24][25][26].Although SST can potentially enhance model performance by effectively leveraging information from multiple models, it also complicates the interpretability of the final model since each response variable is predicted once by a base model and once by a meta model.In contrast, although the RC's cascade inclusion of response variables can leverage the relationships between them, the order of their inclusion influences their effect on the response variable of interest [23], which subsequently leads to a misinterpretation of the relationships between response variables.Therefore, a solution can be to combine both SST and RC approaches to incorporate their advantages.In this hybrid method, all response variables, except the response variable of interest, are predicted independently and then used as additional predictors.This step resembles SST, which aims to expand the original input space by incorporating the information of the response variables.However, it differs from SST in that the response variable of interest is not predicted, and hence is not included in the original input space.In the next step, a meta model is trained on the expanded input space to predict the response variable of interest.This step resembles the last chain of the cascade framework in RC, where the prediction of a response variable is used as an additional predictor to train the meta model and predict the next response variable.However, it differs from RC in that the order of inclusion of response variables does not play a role in the final prediction (Figure S1).Thus, this method allows a direct and interpretable prediction of the response variable of interest while incorporating information from other response variables.

Performance Evaluation
K-fold cross-validation (CV) is a commonly used technique in model evaluation that involves partitioning the dataset into k subsets or folds.During each iteration, k-1 folds are used to train the model, while the remaining fold is used to test and evaluate the model's prediction error.This process is repeated k times, with each fold taking turns as the testing set.The final prediction error is obtained by averaging the errors from all the folds [15].In this study, Germany was divided into 50 strata using a 100 km × 100 km INSPIRE grid.Random samples were taken from each stratum to create five folds.This sampling process was repeated three times, constituting the outer loop of the CV, which was used for model evaluation.Within each iteration of the outer loop, the training set was further divided into five folds, forming the inner loop, i.e., the nested CV.The nested CV was used for parameter optimization and to address high multicollinearity (r ≤ 0.9).These steps helped ensure unbiased error estimation [55].Furthermore, an average 8 km distance between sample points prevented spatial auto-correlation between the training and test data [15,56].By employing an identical CV across all models and frameworks, information leakage between the training and test sets of different frameworks was avoided, thereby providing fair and comparable evaluations.The root mean square error (RMSE) (Equation ( 4)) and the mean absolute percentage error (MAPE) (Equation ( 5)) were used to evaluate the model's performance.
In Equations ( 4) and ( 5), n is the number of samples, and P i and O i are the predicted and observed values, respectively.The MAPE was interpreted according to Lewis [57] (Table 1).

Parameter Optimization
The differential evolution algorithm was chosen to optimize both EL and the modified RC models because SVR and BRT comprise continuous tuning parameters.This optimization strategy is particularly suitable since the whole continuous parameter space can be taken into account [58].This method is described in more detail by Storn and Price [58].As shown by Shahhosseini et al. [59], parameter optimization in EL should be considered a single task.In other words, even though it is possible to optimize the parameter set of each base learner separately, optimizing all the parameters combined leads to better model performance.Therefore, the ensemble models of C/N, POM, and MAOM were optimized in line with the suggested approach.The same logic can be applied to the modified RC models, which means that stacking and chaining should be considered a single optimization task.However, the optimized parameters from the ensemble models of C/N, MAOM, and POM were used for the stacking step of the modified RC, and only the ensemble model in the chaining step was optimized.This strategy was chosen in order to reduce the computational cost of optimization since EL was used in both the stacking and chaining steps.The optimization task was performed using "DEoptim" R package [60,61].

Model Interpretation
Since the EL and modified RC models were comprised of multiple base learners, model interpretation could only be made using model-agnostic methods.These methods are mainly based on input manipulation and observe its effect on model prediction [62].Thus, they are independent of a particular model and suitable for complex machine learning pipelines [62].Model-agnostic methods can be categorized into two groups: global, which provides insights into the overall behavior of the model, and local, which focuses on an individual prediction, i.e., sampling point [63].These methods were applied using the iml [64] and fastshap [65] packages in R.

Predictor Importance and Effect for Global Model Interpretation
Grouped permutation feature importance (GPFI) was used as a global method to measure the importance of a group of predictors in the ensemble models.GPFI quantifies importance by measuring the impact of permutating a group of predictors on the error of the model's prediction.For this method, the predictors of the test set were grouped according to Section 2.2 since they belonged to the same SCORPAN factor.These groups will be referred to as groups or factors throughout the manuscript.Each group was permuted 10 times, the change in RMSE was averaged, and the relative change in RMSE was calculated.As mentioned by Molnar et al. [66], GPFI has an advantage over permutation feature importance (PFI) since dependency among predictors is inevitable in most real cases, leading to potentially misleading model interpretation.Thus, permuting these predictors as a group can minimize this risk.
However, GPFI quantifies the importance of a group for model performance when other groups are present in the model [67].Recognizing this limitation in GPFI, Au et al. [67] propose an alteration called grouped-only permutation feature importance (GOPFI).GOPFI compares the performance loss of joint permutation of all groups with the performance loss of joint permutation of all groups excluding the group of interest [67].This method was implemented with a similar number of permutations as GPFI as an additional approach to evaluate the difference in the model's behavior.
Unlike GPFI/GOPFI, which are global feature importance methods, global feature effect methods, such as accumulated local effect (ALE) and partial dependence plot (PDP), were employed to quantify the effect of the values of a predictor on the direction and magnitude of change in a model's outcome [66].These methods are suitable for tasks such as investigating the relationships between the response variables in the modified RC framework or decomposing the effect of a single predictor on the response variable in EL.

Predictor Importance for Local Model Interpretation
The Shapley method, derived from cooperative game theory, was applied to determine the predictor's importance for ensemble models on a per-sample basis.The Shapley value quantifies the average contribution of each predictor to the overall model score by considering all possible permutations of the features and evaluating their marginal contributions to the model's predictions.The marginal contribution of a predictor is calculated as the difference in the model's output when the predictor is included and when it is excluded.However, Štrumbelj and Kononenko [65] proposed a Monte Carlo approach to approximate Shapley values by systematically perturbing predictors, which is a more feasible approach, particularly for handling interactions and redundancy among predictors.The difference between GPFI and the Shapley value is that while the former shows the effect of a predictor on model performance, the latter illustrates its contribution to the prediction [66].

Results and Discussion
The mean C/N ratio, MAOM, and POM for the German Agricultural Soil Inventory were measured to be 11.2, 77.5% of total SOC, and 22.5% of total SOC, respectively, with medians of 10.6, 82.8%, and 17.3%, respectively.As expected, MAOM showed an inverted distribution to POM (Figure 1).POM values ranged from 0.6% to 99.1%, and MAOM from 0.9% to 99.5%, while the C/N ratio ranged from 5.9 to 27.1.The spatial distribution of the measured response variables for top10 is presented on the left of Figure 2. As shown, the samples with the lowest measured C/N are mainly distributed in the west and partially in the south of Germany.However, higher C/N samples are mainly located in the north and extend to the east, with the highest C/N measured in the northwest.The lowest MAOM and the highest POM were mainly clustered in the northwest and to some extent in the northeast.

Predictive Performance of the Ensemble Models
The prediction accuracy of the C/N ratio model was high, as indicated by its low MAPE of 8.2% (Tables 1 and 2).Additionally, the MAOM model demonstrated good accuracy (Tables 1 and 2) while the accuracy obtained by the POM model was reasonable (Tables 1 and 2).Even though the RMSE of both MAOM and POM models was measured to be approximately 8.8%, when the data distribution (Figure 1) of the corresponding response variables was taken into account, it confirmed that, on average, POM was predicted with lower accuracy.The analysis showed that the first and third quartiles of absolute percent error for C/N (2.8% and 11.0%) and MAOM (1.9% and 9.0%) were relatively similar, and POM was noticeably higher (9.0% and 35.5%).Moreover, although the MAPE of C/N was lower than that of MAOM (Table 2), MAOM was predicted with a lower median absolute percent error (MDAPE) (4.4%) than C/N (6.0%), with POM further apart (19.2%).Further analysis revealed that 71.0% of C/N, 77.4% of MAOM, and 27.6% of POM samples were predicted with a high MAPE accuracy (<10%).Subsequently, 0.2% of C/N, 3.9% of MAOM, and 13.8% of POM samples had a MAPE greater than or equal to 50%, which reduced the overall model performance.The mean and median of the predicted values were calculated as 11.2 and 10.7 for C/N, 77.5% and 88.7% for MAOM, and 22.5% and 18.3% for POM.The close proximity between the mean and median of the measured and predicted values, respectively, indicate a low bias (Tables S1 and S2).However, the prediction of extreme values was difficult, particularly for MAOM and POM.The minimum predicted values of C/N, MAOM, and POM were 8.6, 20.9%, and 6.0% while the maximum values were 18.9, 94.0%, and 79.6%, respectively.The descriptive statistics of the measured and predicted values of the response variables are summarized in Table S1.The scatterplots (Figure 1b) show the deviation between the measured and predicted values of the response variables.MAOM and POM exhibit similar heteroscedastic patterns, while the C/N ratio is less impacted by this issue.This suggests that the predictive uncertainty varies across the measured values of the response variables, particularly for MAOM and POM.The displayed patterns indicate that higher measured values for POM and the C/N ratio, and lower ones for MAOM, lead to increased predictive uncertainty, and vice versa.Several factors can account for this increased uncertainty, such as the wide range of response variables, particularly for MAOM and POM; the data distribution (Figure 1a), which results in the rarity of high C/N ratio and POM values; and low MAOM values.The figure also reveals that the models underpredicted higher measured POM and overpredicted lower measured MAOM.While a similar underprediction can also be seen for high values of the C/N ratio, it is less pronounced than for POM.It should be noted that challenges with under-or overprediction, particularly for extreme and rare values in datasets with similar distribution characteristics, have also been reported by other studies in pedometrics [2,68,69].Thus, addressing the prediction of these extreme values could benefit from methodologies such as that proposed by Ribeiro and Moniz [70] combined with the error metrics they recommend [71].To the best of our knowledge, this method has not yet been evaluated within pedometrics, demanding a more detailed analysis that falls outside the scope of the present study.
The spatial distribution of the absolute percent error of the C/N ratio and especially of MAOM coincide with the extreme values of these two response variables, with the highest values in the northwest (Figure 2).Various factors contribute to the clustered spatial distribution of the C/N ratio, MAOM, and POM.For instance, Poeplau and Don [72] reported that POM levels were nearly double in top10 grassland areas compared with croplands, which was attributed to greater C input from the abundant root systems and constant vegetation in grasslands [73].Considering that northwest Germany predominantly comprises grasslands and the northeast is mainly utilized for cropland, this can partly explain the regional gradients in the POM and MAOM fractions.The diverse soil types associated with these land uses also play a role in these variances [73].Soil types could affect soil texture and moisture content, which in turn can influence a soil's capacity to stabilize SOC as MAOM [73].The high fractions of POM in the northwest can also be traced back to historical land use with peatland and heathland in this region [73].Furthermore, the north-south gradient can be connected to a sandier soil texture in the north, and a siltier, clayey soil in the central and southern regions.Consequently, the coarse soil texture of the north provides less available surface on mineral particles, affecting OM stabilization [74].Additionally, the northern region of Germany can mostly be characterized by its low pH [75] which reduces the microbial activity and decomposition rate, aiding the retention of plant materials as POM.All these factors highlight the importance of soil-forming factors for SOM quality indicators, such as C/N ratio, POM, and MAOM, within Germany.
Unexpectedly, the absolute percent error of POM had a more random distribution and did not closely mirror the pattern of its extreme values (Figure 2c).Therefore, in contrast to MAOM and the C/N ratio, where the absolute percent error was region-specific and could be attributed to data characteristics, the random error pattern in POM suggests other factors at play.This hints at inherent limitations in the predictive model besides data characteristics.Unfortunately, to the best of our knowledge, there is a lack of similar studies for direct comparison.However, studies outside of the field of pedometrics have also pinpointed a lower prediction accuracy of POM compared with MAOM.Ramírez et al. [76] used diffuse reflectance mid-infrared spectroscopy (MIRS) in conjunction with partial least squares (PLS) and noted higher R 2 for MAOM (0.79) than POM (0.61).Similarly, Jaconi et al. [27], who used NIRS with PLS, also reported similar outcomes in two independent datasets from agricultural and forest sites.

The Bigger Picture Using Global Model Interpretation
The models indicated that the variance in the C/N ratio, MAOM, and POM is primarily attributed to two factors: the soil and parent material group, and the land use and land cover group.The former has a more significant influence, as depicted in Figure 3a.Specifically, in the MAOM and POM models, the soil and parent materials group accounted for nearly a 75% change in RMSE.In contrast, the land-use and land cover group was responsible for a change of over 25%.For the C/N model, these groups accounted for changes of 50% and 20% in RMSE, respectively.This underscores the heightened importance of the soil and parent materials group in the MAOM and POM models relative to the C/N model.It should be noted that the quantified importance comprises the importance of each group for model performance, i.e., the main effect, and the importance of their interaction with other groups, i.e., the interaction effect.Thus, GOPFI was performed to evaluate the importance of each group when other groups did not contribute to the model's performance.Figure 3b shows that, similar to GPFI, the same groups remain the most important ones to explain the variance in the C/N, MAOM, and POM models.However, when comparing GPFI and GOPFI, there was a more marked reduction in importance for these groups in MAOM and POM compared with the C/N model.This reduction suggests a greater interaction effect of these groups with others in the MAOM and POM models than in the C/N model.In other words, the lower interaction effect of the soil and parent material group with the other groups in the C/N model, for example, means that the effect of the predictors in this group on the C/N ratio is more constant across all values of the other groups.This is in contrast to the MAOM and POM models, where the effect of the soil and parent material group on the response variables is more dependent on the values of the other groups.This makes the interpretation of the C/N model relatively straightforward in terms of main effects.It is important to be mindful of such an interaction when interpreting a model as it can lead to interpreting predictors that might not have a major effect on the response variable.However, for the models in this study, the most important groups remained unchanged.
The importance of the soil and parent materials in the models was not in line with the results of Ballabio et al. [77], who predicted five topsoil chemical properties, including C/N ratio, on the European scale using Gaussian process regression.Although they reported that their C/N model had the best performance, they considered the land cover predictors derived from satellite images, rather than soil parameters, to be the most important predictors.The divergence from the findings of the present study might stem from the use of different algorithms, the broader scale of their study, or variations in predictor sets [77].For example, with regard to soil parameters, they attributed soil age and parent material geochemistry to the C/N ratio as opposed to those in this study.The predictors used in this study are summarized in Table S3.Such disparities could alter the interaction between the predictors and also yield different main effects between the predictors and the C/N ratio within models, subsequently impacting their importance.Nonetheless, it is crucial not to interpret the predictor importance as indicating causality.Supervised machine learning models typically identify associations, not direct cause-effect relationships, between predictors and response variables [66].

The Main Effect of Predictors
The importance of the soil and parent material group for model performance can be explained by the four predictors in this group.These predictors describe soil by different attributes that are directly associated with the response variables, such as soil type, pedoclimatic characteristics, or hydro-geochemical properties (Figure S2).The main effect of each predictor in this group on the response variable was assessed using first-order ALE due to its robustness against potential dependency among predictors and its lower computational cost [66,78] (Figure 4).The first-order ALE of each predictor class is represented by a bar in Figure 4.Each bar is interpreted as the main effect of that class relative to the average data prediction, which is centered at 0.
The most influential soil type class was Podzol-Regosol (class 33).It increased the mean prediction of the C/N ratio (Figure 4a) and POM (Figure 4c) by 1.0 and 8.7%, respectively, while decreasing the mean prediction of MAOM (Figure 4b) by 8.8%.These soil types are formed from sandy parent materials rich in silicate [79].Other classes exerting similar effects on the response variables include various Podzol types (57, 25, and 17) and Hyperskeletic Leptosol derived from dry, nutrient-poor sands (class 34).Podzols cover the northern lowland, with a higher concentration in the northwest, and stretching to the east and southeast.Hyperskeletic Leptosol predominantly lies from central to northern Germany.In general, Podzols are known for their nutrient-poor, partly recalcitrant litter with a wide C/N ratio, some of which may date back to pre-agricultural times.Decomposition in these soils is also slow due to their low pH, fostering the preservation of POM as minimally transformed plant litter.Similarly, Hyperskeletic Leptosol is a shallow, nutrientdeficient soil in the early phases of organic matter accumulation [80], making it deficient in MAOM.These characteristics elucidate the models' associations between these soils and the response variables.
In contrast to Podzols, marshlands (classes 4 and 5) and Cambisol derived from alkaline-rich tuff (class 54) increased the MAOM fraction (Figure 4b), while marshes (classes 3 and 4) and Vertic Cambisol derived from marlstone and claystone weathering (class 51) reduced POM (Figure 4c) and the C/N ratio (Figure 4a).The marshlands are located in the northwest of Germany along the coastline and are derived from marine and fluvial sediments.These soils were influenced during the Holocene by the inward rise in sea level, which led to fine sediment decomposition, and are now distinguished by their fine texture and their intensive use in agriculture due to their high productivity [81].Some of these marshlands have a high calcium carbonate content which increases the stability of soil structure [81] and pH.In this regard, with higher pH, it can be expected that these soils have a higher microbial activity than the adjacent Podzols, thus increasing the decomposition rate, while its fine and stable mineral structure provides a large surface area that can stabilize SOM [82].Moreover, Vertic Cambisols are found in the southwest, but mainly stretch from the central to the southern parts of the country in its mountainous regions.These young soils are characterized by their intermediate weathering stage and are relatively poor in SOC, particularly under cropland use [13].Those derived from alkaline-rich tuff are expected to have a higher pH.The models linked Fluvisols derived from loamy-to-clayey floodplain sediments (class 8) to a reduction in the C/N ratio and POM.These soils are found alongside rivers.Fluvisols are young with little horizon differentiation.They generally tend to have a low infiltration rate, which might not lead to an increase in the decomposition rate and fixation of SOM.However, because they are mainly sedimentary transport/deposition, they can also be poor in plant litter and low in POM.Thus, the association between these soil types with the response variables is reasonable and explainable by different processes [83].
The models used the soilscapes map to associate the old moraines (classes 8, 5, and 6) formed during the Pleistocene period with a high C/N ratio (Figure 4a) and POM (Figure 4c) and low MAOM (Figure 4b).During the Pleistocene, northern Germany came under the influence of the Fenno-Scandinavian ice sheets.This occurred as they advanced into the central German uplands during the Elster and Saale glaciation, leading to the formation of the northern old moraine [84,85].Later in this period, deposits from the Weichselian glaciation covered the northeast and parts of the northern region, giving rise to the northern young moraine [85].Although the parent material of both moraines is predominantly Pleistocene deposits, the young and old morainic areas are distinguished by their relief and soil formation.The soils covering the strongly planated features of the old moraines have generally been subjected to extensive decalcification, eluviation, and podzolization.This provides insight into why the models associated them with a high C/N ratio and POM and a reduced MAOM.
However, the effects of the Pleistocene period were not confined solely to northern Germany.The young moraine (class 33) in the south of the country, found in the Alpine foothills, is a consequence of the Würm glaciation of this period [86,87].This young moraine has a finer soil texture, ranging from clay to sandy loam, compared with the young moraine of the north [87], which in the model of the present study was correlated with a low C/N ratio (Figure 4a) and POM (Figure 4c) and high MAOM (Figure 4b).Furthermore, the models indicated that loess and loess loam soils, originating from carbonate rocks (class 27), play an influential role in decreasing the C/N ratio and POM and increasing MAOM.These soils originate from mountainous and hilly areas that have a high proportion of non-metamorphic carbonate rocks, situated within the basins of the Franconian Alps and the Kraichgau in the central and southeastern parts of the country [32,88].A similar association between the response variables with the loess of German uplands, the Upper Rhine Valley, and most parts of the loess from the northern Central European loess belts (class 22) [88] was observed.Given that these soils are aeolian sediments, they have a fine texture and are rich in carbonate content, at least in their original state [89].The carbonate content is the main factor in soil properties that determines the soil pH and serves as a buffer system to stabilize it [75], thus promoting a faster decomposition rate.Moreover, it helps stabilize the fine-textured soils of the region for the sorption of decomposed SOM.
The main effect of the hydrological unit classes indicates that the unconsolidated or loose silicate sediments, which are characterized by a porous structure and hydraulic conductivity ranging from >10 −6 to 10 −3 m/s, and more consolidated sediments of silicate or silica-carbonate, which have relatively lower conductivity ranging from >10 −7 to 10 −5 m/s (classes 19,26,41,143,199) increase the C/N ratio and POM while decreasing MAOM.In contrast, consolidated sediments of silicate or silica-carbonate with low hydraulic conductivity (<10 −5 m/s) (classes 261, 263, 264, 287, 449, 576, 642, 700, and 927) are associated with a reduction in the C/N ratio and POM, and an increase in MAOM.Hydraulic conductivity, soil water tension, and soil moisture content are closely interlinked [90].Whereas short-term fluctuations in soil moisture can influence microbial activity and the decomposition rate of SOM [91], long-term soil moisture variations alter mineral and chemical composition, thereby changing organo-mineral interactions [92].For example, a decline in soil water content leads to the thinning of water films in soil pores.This results in an increase in flow path length and solution viscosity.Consequently, the diffusion of dissolved compounds in the soil solution becomes limited, reducing the likelihood of encounters between microbes, extracellular enzymes, and substrates [92,93].
The pedo-climatic regions, namely the northwest and southwest Weser-Ems regions (class 50, 32), the upper and lower Geests (class 38), and the Elbe-Weser triangle (class 33), are located in the northwest of the country, and are linked to a high C/N ratio (Figure 4a) and POM (Figure 4c), and low MAOM (Figure 4b).These soils represent the historical peatlands-heathlands that have been severely disturbed, degraded, and mixed with or coveredby mineral soils.Moreover, characterized by their sandy soil, oceanic climate with high precipitation, low evapotranspiration, and low pH, the SOM formation of these regions is mainly dependent on ombrotrophic conditions and is nutrient-poor [94].These conditions lead to a lower decomposition and accumulation of SOM in the form of POM.However, Uecker-Randow, southern Schleswig-Holstein, and Erzgebirge (classes 4, 36, and 45) stretch eastward toward a more continental climate with lower precipitation and higher pH.The decomposition of the SOM in these regions is more heterogeneous, yet more prominent than in the soils of the northeast [94], which explains the decline in POM and C/N ratio.Furthermore, the low mountain ranges of the east and central Hesse, Odenwald, Hunsrück, and south Hannover (class 21, 23, 24, 25, 29) that stretch from the central part to the southwest are correlated with high MAOM and with a low C/N ratio and POM.S4.

Spatial Prediction of the Response Variables
Spatial prediction of the C/N ratio and MAOM for agricultural land use for the top10 of mineral soils was generated using their ensemble models (Figure 5).However, since the POM model performed similarly to when it was calculated as a subtraction of 100% from MAOM (Table 2), the spatial prediction of POM was obtained using this method.The mean C/N ratio, MAOM, and POM ranged from 8.9 to 19.3, 21.9% to 94.1%, and 2.6% to 100%, respectively.The spatial prediction of the response variables indicated that, in general, the north of Germany contains a higher C/N ratio and POM, and a lower MAOM than the rest of the country.This follows the general trend of the soil texture across Germany, with coarser textures distributed in the northern region and finer textures distributed in the center and south of the country [56].Although some of the predictors in the present study and [56], particularly in the soil and parent material group, were similar, they also revealed the close relationship between soil texture and these response variables.Nonetheless, in detail, the spatial predictions resemble the predictors of the soil and parent material group.Within the northern region, higher MAOM, as well as lower POM and C/N ratio, are distinguishable in low terraces along the Elbe River, the young moraine of the northeast, and the marshlands along the North Sea coast in the northwest.In contrast, the old moraine stretching across the northern region is identifiable by its lower MAOM, and higher POM and C/N ratio.The previously observed gradient of the response variables from northwest to northeast (Figure 2) is also reflected in their spatial predictions with the highest C/N ratio and POM and the lowest MAOM located in Podzols in the northwest.
Despite the gradient, the magnitude of these variables was not fully captured, primarily because of the models' subpar performance in the northwest (Figure 2).This shortcoming is further revealed by the interquartile ranges of the response variables, which signify model uncertainty (Figure 5).The figure signifies that the highest level of uncertainty is in the northwest, then decreases toward the northeast, and even more so toward the center and south of the country.Such a trend highlights the models' challenges in predicting the extremes of the northwest as well as the heterogeneity of the response variables in the north.Moving toward the central and south of the country, the variability in the C/N ratio, POM, and MAOM is not as noticeable as in the northern region and has a more homogenous pattern.In these regions, the C/N ratio and POM tend to reduce, while MAOM increases.Nevertheless, regions with a slightly higher C/N ratio, POM, and lower MAOM are identifiable, notably within the Upper Rhine Plain, the foothills of the Alps, and the valleys of the Danube River.

From Global to Local Model Interpretation
As mentioned above, the C/N and MAOM models exhibited high and good performance, respectively, according to their MAPE.However, this naturally leads to the question of why the POM model did not demonstrate a similar performance to that of the MAOM model.Such a disparity was unexpected, especially considering that POM and MAOM share an inverse relationship with their sum adding up to 100%.In other words, these two response variables have an inverse function.Thus, the initial assumption was that the POM model should be able to obtain the necessary information from a similar feature space from which MAOM conveyed high accuracy.To gain a visual insight into the models' behaviors, the Shapley method, described in Section 2.5.2, was employed for each group.This helped decompose the MAOM and POM models, cross-examine them, and discern any variances in how the models accessed information from their respective feature spaces during the training phase.
Figure 6 illustrates the order of importance of the predictors for the MAOM and POM models.In this context, the focus is not on the order of predictors within an individual model, but on their similarities across both models.The figure reveals that the most im-portant predictors are similar for the two models.The magnitude of the contribution of each sample for each predictor is shown on the x-axis of the figure, i.e., the SHAP value.For example, soilscapes, hydrological unit, and soil type cover a wider range of SHAP values, thus implying that they make a more varied contribution compared with other predictors in both models.The gradient in color serves to highlight the effects of higher or lower values (for numerical predictors) and varied classes (for categorical ones) on the model outcomes.By way of illustration, higher values of the Landsat green band for summer (Green 9.12), represented by a darker hue, increase the predicted MAOM by up to around 5%, while simultaneously decreasing the predicted POM by a comparable amount.Most importantly, the distribution of the SHAP values suggests that the POM and MAOM models are essentially mirrors of one another, and obtained relatively similar information in relation to their respective response variables-an outcome that was anticipated.The values on the y-axis are the absolute mean contribution of each predictor.These values, derived from SHAP values, reveal the overall contribution of a predictor to the final prediction.The figure emphasizes that the overall contributions are very similar across the two models.In general, the figure confirms that the POM model's behavior aligns with expectations.However, the query surrounding the divergent performances of the two models remains unresolved.One advantage of the Shapley method is its capability for the local interpretation of a model, i.e., interpretation per sample.Using this method, this capability was implemented to identify the predictors' contribution for samples where, for example, the predictions from the MAOM model had an absolute percent error below 1% (387 samples), and to compare it with the same samples but predicted by the POM model.The same approach and threshold were used with MAOM and POM switched (87 samples).However, due to the high number of samples and for better visualization, the mean contribution of each predictor across the samples was calculated (Figure 7).This approach provides a detailed perspective of the model's behavior on a part of the data where it performed the best.Figure 7 reveals the direction and magnitude of a predictor's contribution in a model to shift the mean prediction of all samples, i.e., E[ f (x)], toward the mean prediction of given samples, i.e., f (x).The sum of the contribution of all predictors for the best-predicted samples of MAOM and the corresponding samples of POM was 6.1% (Figure 7a), while it was 4% for the best-predicted samples of POM and its MAOM counterpart (Figure 7b).Despite an equal sum of the contribution of all predictors, the POM model performed worse than the MAOM model.The analysis for the best-predicted MAOM samples showed that the MAOM model had 0.47% MAPE, while the MAPE for these samples for the POM model was calculated to be 3.9%.This demonstrates a noticeable difference in error between the two models, particularly as they were expected to obtain similar accuracy considering their observed similar behavior.Interestingly, even for the best-predicted samples of the POM model, the MAPE of the MAOM model was calculated to be lower than the POM model, i.e., 0.5% MAPE for MAOM and 0.6% MAPE for POM.The disparity between the models' behavior and their performance indicates that the magnitude of the contribution of predictors is not adequate for the POM model to have a similar accuracy as to the MAOM model.Model interpretation methods can reveal the model's behavior to a certain level.Consequently, one possible explanation is that the feature space lacked a predictor or predictors to provide additional information to the POM model either through the main effect or interaction effect.
As previously discussed, POM and MAOM are formed from two different paths.POM, primarily originating from plant residues, is more susceptible to disturbance and land management [74].However, several studies have highlighted the influence of soil properties, such as texture, on carbon fractions [74,82,95].Therefore, in addition to the effects of the predictors discussed in Section 3.3.1,a plausible reason for the efficacy of the feature space in predicting MAOM could be that the predictors within the soil and parent material group capture the nuances of German soil texture, as previously shown by Gebauer et al. [56].As a result, it was the MAOM model-and not the POM model-that obtained adequate information from its feature space.This observation also suggests that the land-use and land-cover data did not provide sufficient information for the POM model, despite the importance of vegetation cover for POM [76].The absence of direct or indirect data on pivotal aspects such as residue decomposition and land management [76] could have hindered the POM model's performance.Additionally, this highlights that while model interpretation techniques offer valuable insights into the intricacies of the behavior of the machine learning model, these models do not fully elucidate or account for intricate cause-effect relationships.

Modeling the Underlying Relationship between Response Variables
The modified RC framework was implemented to identify the relationship of the response variables.Figure 8 presents second-order PDP plots that clarify the main effect and interaction effect of the two predicted response variables when used as additional predictors to predict another response variable.Figure 8a reveals that higher values of the predicted C/N ratio and lower predicted MAOM values lead to an increase in predicted POM.Notably, the white horizontal lines in the figure demonstrate an interaction between the predicted MAOM (especially in its lower range) and predicted C/N ratio (specifically when it is above approximately 15).This interaction results in the highest predicted POM in the model.Figure 8b   The first-order ALE (Figure 9) was used to isolate the main effect of a single response variable in the modified RC model when employed as a predictor.For these response variables, the main effect could only be evaluated by the first-order ALE, given the interdependence between POM, MAOM, and the C/N ratio.Figure 9a,c depict a negative linear correlation between POM and MAOM in their corresponding models.This echoes their inverse relationship previously discussed.Each grey line in the figure represents a fold in the CV for which ALE was calculated.The divergence of the grey lines indicates the effect of model uncertainty on ALE, exacerbated by the scarcity of samples at extreme values.In contrast, model uncertainty drops near the central tendency of the data, as denoted by the rug plot (Figure 9). Figure 9b,d demonstrate that the C/N ratio has a logistic relationship with two other response variables.This logistic relationship suggests that diverse mechanisms control the interaction between the response variables.The slope of the red line, the interpolated line, indicates that the C/N ratio has a linear relationship with POM (Figure 9d) and an inverse one with MAOM (Figure 9b) in the majority of German soils.These are generally the finer-textured soils encompassing vast stretches of the country, especially the central and southern regions as explained above.However, the linear relationship does not continue in soils characterized by a high C/N ratio, typical of the sandy soils in the north.A plausible rationale for such a nonlinear relationship is the variety in the decomposition rate of SOM in this region [94].This could be driven by multiple factors, including the nutrient quality of SOM spread across the north of Germany [94] and the pH gradient from the northeast to the northwest, with the latter having the country's lowest pH [75].This heterogeneity in SOM decomposition intensifies in the northeast and central northern regions, contrasting with the northwest [94].These soils essentially mediate between the central and southern regions, as manifested in the linear portion of the figures (Figure 9b,d), and the northwestern soils, which have the highest C/N values.Consequently, it is conceivable that the heightened variability of factors affecting the response variables in this particular region, compared with the rest of the country, amplifies the variability in the response variables (Figure 5).This, in turn, leads to a nonlinear relationship between the C/N ratio and the two other response variables.

Conclusions
We successfully trained three ensemble models, comprising SVR, BRT, and RF, for spatial prediction of the C/N ratio, MAOM, and POM in agricultural topsoils across Germany at a 25 m resolution.Additionally, we introduced a modified version of RC, designed to delineate the relationships among the response variables.This method proved to be scalable and applicable in pedometrics.Most importantly, this method addresses the challenge of building an interpretable model that takes into account complex interactions between multiple soil properties at a large scale.This is particularly important for algorithms that inherently do not support multi-target prediction.The effectiveness of the approach was ensured through the application of the differential evolution algorithm for optimal model parameter tuning.Subsequently, to gain deeper insights into model behavior and spatial predictions, we employed model-agnostic interpretation methods, enabling us to decompose the models thoroughly.This analysis revealed that soil type, pedo-climatic region, hydrological unit, and soilscapes accounted for the most variance of the C/N ratio, MAOM, and POM compared to other soil-forming factors.Furthermore, our analysis of modified RC identified a nonlinear relationship between the C/N ratio and the other two response variables (POM and MAOM), underscoring the intricate complexity of SOM interactions on a national scale.We argued that this complexity could be caused by various reasons such as different decomposition rates of SOM as a result of variety in its nutrient quality.Moreover, our models unveiled regional heterogeneity in the distribution of the response variables in the north of the country.Here, the young moraine of the northeast exhibited a lower soil C/N ratio, POM, and higher MAOM as opposed to the old moraine where this observation was reversed.The spatial predictions also identified a transition to a lower C/N ratio and POM, and an increase in MAOM, in central and southern Germany, corresponding with a soil transition from sandy to more silty and clayey compositions.However, it is crucial to acknowledge that every model, irrespective of its predictive prowess, bears inherent uncertainty.To address this, we presented the uncertainty associated with each model alongside its spatial predictions for each response variable.This would help users to consider not just the predictive performance but also recognize regions marked by pronounced uncertainty.

Supplementary Materials:
The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/agriculture14081298/s1; Figure S1: The framework of (a) stacked single−target (SST), (b) regressor chain (RC), (c) modified regressor chain; Table S1: The summary of measured, and predicted data of ensemble models (EL) of the response variables; Table S2: The evaluation of the performance of the ensemble models (EL) of the response variables; Table S3: The predictors and their types; Table S4: The description of classes of predictors for the models: C/N ratio, mineral-associated organic matter (MAOM), and particulate organic matter (POM); Figure S2

Figure 1 .
Figure 1.(a) Box-violin plot of measured C/N ratio, mineral−associated organic matter (MAOM), and particulate organic matter (POM) from the German Agricultural Soil Inventory.(b) Scatterplot of measured versus predicted values for C/N, MAOM, and POM ensemble models.

Figure 2 .
Figure 2. (a) Spatial distribution of C/N ratio, (c) mineral−associated organic matter (MAOM), and (e) particulate organic matter (POM) in the top 10 cm of mineral soils in the German Agricultural Soil Inventory, and the spatial distribution of the absolute percent error of (b) C/N ratio, (d) MAOM, and (f) POM.The absolute percent errors of MAOM and POM were calculated from the corrected predictions.

Figure 3 .
Figure 3. Grouped permutation feature importance (GPFI) (a) and grouped−only permutation feature importance (GOPFI) (b) of the ensemble model for C/N ratio, mineral−associated organic matter (MAOM), and particulate organic matter (POM).The importance is calculated based on the percent relative change in RMSE regarding the permutation method.

Figure 4 .
Figure 4. (a) First−order accumulated local effect (ALE) of predictors for soil and parent materials factor for the C/N ratio, (b) mineral−associated organic matter (MAOM), and (c) particulate organic matter (POM).Six classes of each predictor with the three highest and three lowest ALE values are plotted for better visualization.The classes are described in TableS4.

Figure 5 .
Figure 5. (a) Spatial prediction of the C/N ratio, mineral−associated organic matter (MAOM) (%), and particulate organic matter (POM) (%) for the top 10 cm of German mineral agricultural soil, with (b) the corresponding model uncertainty.

Figure 6 .
Figure 6.SHAP values for the mineral−associated organic matter (MAOM) (a) and particulate organic matter (POM) (b) models.Each dot indicates a sample from the training set.The color of the dot indicates the value of the predictor for the given sample.The darker the dot color, the higher the value of the predictor.The predictors are ordered from highest to lowest based on their absolute mean of the contribution to the prediction.The absolute mean is indicated on the y-axis.The numbers on the x-axis indicate the magnitude that each predictor for the given sample contributed to the final prediction of the model.

Figure 7 .
Figure 7. (a) Contribution of predictors in the mineral−associated organic matter (MAOM) model for samples with the lowest absolute percent error, (b) the same samples in the particulate organic matter (POM) model, (c) the POM model for the samples with the lowest absolute percent error, and (d) the same samples in the MAOM model.E[ f (x)] is the mean predicted value of all samples.f (x) is the mean of the predicted values for the given samples.The numbers on the y-axis indicate the mean contribution of a given predictor to the final prediction of the given samples.
displays the inverse of what is shown by Figure 8a because the predicted MAOM was substituted by POM.

Figure 8 .
Figure 8.(a) The second−order partial dependence plot (PDP) contour plot of particulate organic matter (POM) and (b) mineral−associated organic matter (MAOM) in the modified regressor chain models.

Figure 9 .
Figure 9.The first−order accumulated local effect (ALE) plot of the relationship between mineral−associated organic matter (MAOM) and particulate organic matter (POM) (a), MAOM and C/N ratio (b), POM and MAOM (c), and POM and C/N ratio (d) from their corresponding modified RC model.The gray lines represent the interpolated ALE for each CV fold and the red line indicates the overall interpolation.The rug plot indicates the distribution of data from which the first−order ALE was computed.
: The predictors of the soil and parent material groups.(a) Soil type, (b) soilscapes of Germany, (c) hydrological unit, (d) pedo-climatic region.References for each map are cited in the main text.Author Contributions: Conceptualization, A.S.; Methodology, A.S.; Software, A.S.; Validation, A.D., M.L., T.S. and R.T.-M.; Formal Analysis, A.S.; Investigation, A.S.; Data Curation, A.S.; Writing-Original Draft Preparation, A.S.; Writing-Review and Editing, A.D., M.L., T.S. and R.T.-M.; Visualization, A.S.; Supervision, A.D. and T.S.; Funding acquisition, A.D. All authors have read and agreed to the published version of the manuscript.Funding: The AI4SoilHealth project has received funding from the European Union's Horizon Europe research and innovation program under grant agreement No. 101086179.

Table 1 .
Interpretation of mean absolute percent error (MAPE) for evaluation of model performance.

Table 2 .
The evaluation of the performance of the ensemble models (EL) of the response variables.