SOC stocks prediction on the basis of spatial and temporal variation in soil properties by using partial least square regression

Global warming is a wide-scale problem, and soil carbon sequestration is a local-scale, natural solution to it. The role of soil as a carbon sink has been researched extensively, but knowledge of the role of soil variables in predicting soil carbon uptake and retention is scarce. The current study predicts soil organic carbon (SOC) stocks in the topsoil of the Islamabad-Rawalpindi region, using soil properties as explanatory variables and applying a partial least squares regression (PLSR) model to datasets from two different seasons. Samples collected from the twin cities of Islamabad and Rawalpindi were tested for soil color, texture, moisture content, soil organic matter (SOM), bulk density, soil pH, electrical conductivity (EC), SOC, sulphates, nitrates, phosphates, fluorides, calcium, magnesium, sodium, potassium, and heavy metals (nickel, chromium, cadmium, copper and manganese) using standard protocols. PLSR was then applied to predict the SOC stocks. Although current SOC stocks ranged from 2.4 to 42.5 Mg/hectare, the PLSR outcomes projected that, if soil variables remain unaltered, SOC stocks are likely to concentrate around 10 Mg/hectare in the region. The study also identified variable importance for both seasons' datasets so that noisy variables can be ruled out in future research and more precise and accurate estimates can be made.

At the global scale, there is growing interest in better management of soil organic carbon, not only to address food security problems but also to tackle the changing climate. The major initiatives addressing this problem include the 4p1000 initiative, REDD+ and the Global assessment of SOC sequestration potential program (GSOCseq) [1][2][3].
Soil is considered to be the greatest of all sinks for fixing atmospheric carbon. It holds double the amount of carbon held by terrestrial vegetation 4,5. Soil holds carbon in the form of soil organic carbon (apart from the calcareous soils) 6,7. The uptake of carbon in soil, commonly referred to as carbon sequestration, occurs either directly, when CO2 is transformed into inorganic compounds such as calcium and magnesium carbonates through particular inorganic chemical reactions 8, or indirectly, when biomass is degraded and becomes part of the soil system in the form of organic compounds, broadly referred to as soil organic matter, which consists of soil organic carbon along with other organic substances such as humus 9. Most structural and functional aspects of the soil system, such as moisture retention, complex formation with metal ions and cation exchange capacity, depend on soil organic carbon 10,11. But the impact of SOC on soil is not unidirectional. Soil properties also influence the capture, quality, distribution and retention time of SOC, depending on various external and internal factors such as land use category and seasonal fluctuations in temperature and moisture 12,13.
Uptake and retention of atmospheric carbon in soil is a complex phenomenon. This intricate process involves multiple variables belonging to all the spheres (atmosphere, biosphere, hydrosphere and lithosphere) of the ecosystem 14,15 .
Within the soil regime, the spatial and temporal changes in soil organic carbon stocks depend strongly on the innate properties of soils. However, the statistics on the distribution of soil organic carbon (SOC) stocks in soil profiles in relation to soil variables are barely adequate.
The present research provides a holistic picture of the different variables shaping the SOC stocks in the topsoils of the Islamabad-Rawalpindi region.

Results and discussion
Multiple studies have estimated the level, quality and distribution pattern of SOC stocks around the globe [23][24][25]. But data on the factors reshaping these stocks are scarce. In recent times, management of soil organic carbon has come to be considered one of the major and efficiently exploitable tools for combating rising greenhouse gas emissions (particularly carbon emissions). A number of factors govern the uptake, retention and variability in the behavior of soil organic carbon 14. Among the controlling factors, such as climatic conditions, hydrological regime, biomass input and land use variation, the one major system that consistently affects SOC uptake as well as its storage is the soil regime. Soil variables, as a unit, can be regarded as one of the most significant and decisive factors in either enhancing or hindering the entire process of carbon uptake and its preservation.
Availability of comprehensive knowledge regarding the impact of soil variables on SOC stocks is undoubtedly a vital prerequisite for increasing the rate of carbon sequestration in a particular area.
Partial least squares regression (PLSR) is a predictive statistical model that projects a dependent variable on the basis of one or more independent variables. In the present research, the current soil properties were treated as the independent variables and the SOC stocks as the dependent variable. On the basis of the whole summer data distribution, a single observation, S23, was selected as the validation dataset. The summary statistics of the summer and winter datasets, along with their respective validation datasets, are provided in Tables 1 and 2 respectively. During the summer season, SOC stock in the topsoil of the study area ranged from 2.352 Mg/hectare to 42.453 Mg/hectare, with a mean of 11.131 Mg/hectare and a standard deviation of 7.391 Mg/hectare.
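To make the PLSR step concrete, the following is a minimal sketch of single-response PLS (PLS1) fitted with the NIPALS procedure in plain Python. It is not the software or configuration used in this study, which the text does not specify. The function names `fit_pls1` and `predict_pls1`, the number of components and the toy numbers are illustrative assumptions, and the sketch centers but does not scale the predictors.

```python
from math import sqrt


def fit_pls1(X, y, n_components):
    """Fit PLS1 by NIPALS. X is a list of rows, y a list of responses."""
    n_rows, n_cols = len(X), len(X[0])
    x_mean = [sum(col) / n_rows for col in zip(*X)]
    y_mean = sum(y) / n_rows
    # Work on centered copies; they are deflated after every component.
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]
    f = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X-space with maximal covariance with y.
        w = [sum(row[j] * fi for row, fi in zip(E, f)) for j in range(n_cols)]
        norm = sqrt(sum(v * v for v in w))
        if norm == 0.0:  # nothing left to explain
            break
        w = [v / norm for v in w]
        # Scores, X-loadings and y-loading for this component.
        t = [sum(a * b for a, b in zip(row, w)) for row in E]
        tt = sum(v * v for v in t)
        p = [sum(row[j] * ti for row, ti in zip(E, t)) / tt for j in range(n_cols)]
        q = sum(fi * ti for fi, ti in zip(f, t)) / tt
        # Deflate X and y so the next component models what remains.
        E = [[e - ti * pj for e, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def predict_pls1(model, x):
    """Predict the response for one new observation x (list of predictors)."""
    x_mean, y_mean, components = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(a * b for a, b in zip(e, w))  # score of the new observation
        y_hat += q * t
        e = [ei - t * pj for ei, pj in zip(e, p)]  # same deflation as in fit
    return y_hat


if __name__ == "__main__":
    # Toy numbers for illustration only; NOT data from the study.
    X_train = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 7.0]]
    y_train = [1.0 + 2.0 * a + 3.0 * b for a, b in X_train]
    model = fit_pls1(X_train, y_train, n_components=2)
    print(predict_pls1(model, [6.0, 5.0]))
```

In a setting like the one described, each row of `X` would hold the measured soil properties of one sample and `y` the corresponding SOC stocks, with held-out observations passed to `predict_pls1` for validation. With correlated predictors, fewer components than predictors would normally be retained.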

Predictions and residuals.
On the basis of the explanatory datasets of each season, SOC stock predictions were made. The detailed datasets, along with the standardized residuals for each season (summer and winter), are given in Tables 3 and 4 respectively. For the majority of observations, the predicted SOC stock values were lower than the current SOC stock values.
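As a rough illustration of how standardized residuals can be obtained from observed and predicted SOC stocks, the sketch below divides each residual by the sample standard deviation of all residuals. This is one common definition and an assumption here; the software used in the study may scale residuals differently. The function names, the cut-off of 2 and the inputs are hypothetical.

```python
from statistics import stdev


def standardized_residuals(observed, predicted):
    """Residuals (observed minus predicted) scaled by their sample std. dev."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    scale = stdev(residuals)  # assumed scaling; other conventions exist
    return [r / scale for r in residuals]


def flag_large_residuals(std_residuals, threshold=2.0):
    """Indices of observations whose |standardized residual| exceeds threshold."""
    return [i for i, r in enumerate(std_residuals) if abs(r) > threshold]
```

A positive standardized residual under this convention means the observed stock exceeds the predicted one, which matches the pattern reported above for the majority of observations.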
Factors of soil formation. The dynamics of SOM accumulation and stabilization differ in different types of soils 27 . Among other governing factors such as climate, microbiota and vegetation types; the pedogenic processes not only regulate the storage of SOM but also predict the behavior of SOM in long term. The innate structural properties of soil such as its texture and bulk density define how SOM will be retained within different soil horizons. Similarly, other soil formation processes i.e., deposition and removal of nutrients (illuviation and eluviation) define how SOM will get adsorbed onto mineral surfaces in both short as well as long run 28 . Furthermore, the physicochemical characteristics of soil also govern the permanence and stability of SOM.
The current research discusses these soil formation processes such as the innate physicochemical properties and the nutrient deficiencies (nitrates and sulphate deficiencies) within the alkaline soils of Islamabad and Rawalpindi in terms of variables of importance.
Variable importance in the projection. According to the PLSR results for the summer dataset, the most important variables were bulk density, soil organic carbon, calcium ions, soil EC and soil nitrates, as shown in Fig. 1a. The most important variables in the winter season were soil organic carbon, nickel concentration, soil moisture content, phosphate content, potassium ions, soil EC and soil nitrates, as shown in Fig. 1b. The standardized coefficients of the independent variables against SOC stock as the dependent variable for the summer and winter seasons are shown in Fig. 1c,d.
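For reference, variable importance in the projection (VIP) for a single-response PLS model is commonly defined as below. The notation is an assumption introduced here, not taken from the study: m is the number of predictors, A the number of retained components, w_a the weight vector, t_a the score vector and q_a the response loading of component a.

```latex
\mathrm{VIP}_j = \sqrt{\, m \,
  \frac{\sum_{a=1}^{A} SS_a \left( w_{ja} / \lVert \mathbf{w}_a \rVert \right)^{2}}
       {\sum_{a=1}^{A} SS_a} \,},
\qquad
SS_a = q_a^{2}\, \mathbf{t}_a^{\top} \mathbf{t}_a
```

Because the squared VIP values average to 1 across the m predictors, variables with VIP above 1 are conventionally read as more important than average. The exact cut-off applied in Fig. 1a,b is not stated in the text.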
Among the samples collected in the summer season, a large share (77%) had bulk density values below 2 g/cm³. In the winter dataset, 82% of the data lay between 1 g/cm³ and 2 g/cm³. These values agree with the available literature for the region, i.e., Islamabad and Rawalpindi [29][30][31]. Thus, they can be considered the predominant bulk density values in the region. The range of bulk density in the study area denotes a potential for increased SOC retention. Bulk density naturally reflects the soil's air spaces and SOM content. In sandy-textured soils such as the sandy loams of the study area, bulk density values greater than 1.7 g/cm³ hinder natural root growth, whereas in fine-textured soils this threshold drops to 1.5 g/cm³ 32,33. In terms of carbon storage, bulk density values below 1.3 g/cm³ are considered good, while values above 1.8 g/cm³ are considered very bad 32. Besides soil texture, nutrient concentration also affects bulk density: Chaudhari et al. reported negative relationships of soil bulk density with soil silicates, soil calcium carbonate and total micro- and macronutrient contents 34. Hence, the soils of Islamabad and Rawalpindi should be managed to lower the bulk density.
Within the soils of Islamabad and Rawalpindi, soil organic carbon in both seasons mostly lay between 0.16% and 0.25%. The majority of the soil samples had an SOC percentage below 0.25%. Around 47% of the samples in the summer dataset and 45% in the winter dataset had SOC ranging from 0.16% to 0.25%. SOC values below 2% indicate that the soil is of poor quality in terms of its structural stability 35. By agricultural criteria too, soils with less than 2% SOC are widely considered more likely to deteriorate in terms of productivity and yield 36. In dryland cropping systems, even no-tillage alone does not increase the SOC percentage [37][38][39]. Dry-aggregate-associated carbon is the prime SOC stock stored in the semi-arid soils of Pakistan 40. So, in order to increase it, the focus should shift to broader soil management strategies such as integrated nutrient management and planned plantation.
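For orientation, SOC percentage, bulk density and sampling depth are often combined into an areal stock with the fixed-depth relation stock (Mg/hectare) = SOC (%) × bulk density (g/cm³) × depth (cm). The text does not state which formula or sampling depth was used in this study, so the sketch below is an assumption. The function name is hypothetical, and any correction for coarse fragments is omitted.

```python
def soc_stock_mg_per_ha(soc_percent, bulk_density_g_cm3, depth_cm):
    """Fixed-depth SOC stock in Mg/hectare (assumed formula, no stone correction).

    One hectare of soil 1 cm thick with bulk density 1 g/cm3 weighs 100 Mg,
    so the stock is (soc_percent / 100) * bulk_density * depth * 100.
    """
    if soc_percent < 0 or bulk_density_g_cm3 <= 0 or depth_cm <= 0:
        raise ValueError("inputs must be positive (SOC may be zero)")
    return soc_percent * bulk_density_g_cm3 * depth_cm
```

For a hypothetical sample with 0.2% SOC, a bulk density of 1.5 g/cm³ and an assumed depth of 30 cm, this relation gives about 9 Mg/hectare. Note that in this relation the stock rises with bulk density at a fixed SOC percentage, so any benefit of lower bulk density would have to come through higher SOC retention.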
The largest share of samples (57% in the summer and 35% in the winter) lay in the range of 1000 ppm to 1500 ppm of calcium content in both seasons. Available research also supports these data ranges 41. Compared with calcium carbonate content, exchangeable calcium plays an important role in the protection of SOC. However, in the presence of adequate nutrients, especially organic amendments in the form of compost, calcium carbonate is readily converted to exchangeable calcium. This change within the soils of a region can further enhance organo-calcium complex formation and thus play a key role in the long-term preservation of soil organic carbon [42][43][44].
For both seasons, most of the samples had nitrate concentrations below 3 ppm. In the summer dataset, 71% of the samples lay between 0 and 3 ppm, while in the winter season, 88% of the samples had nitrate levels below 1 ppm. Within the winter dataset, the number of samples decreased continuously with increasing nitrate content. Shaheen (2016) also described the soils of the Rawalpindi region as nitrate deficient. Possible reasons for this deficiency are leaching because of the coarse-textured soils 45 of the region and volatilization 46 due to the alkaline pH of the soils as well as increasing temperatures 47-49. The minimum soil nitrate concentration required for plants is 10 ppm (preplant season) to 30 ppm (growing season) 50, whereas the soils of Pakistan are deficient in nitrates 51,52.
Due to its high mobility, the nitrate ion is not adsorbed at cation exchange sites within soils. Hence, it is readily lost, especially from calcareous soils such as those of Pakistan 53,54. This loss not only affects soil quality but also translates into economic loss in terms of annual crop production [55][56][57]. Nitrogen use efficiency within the Pakistani agricultural system rarely exceeds 40%. Much of the added nitrogen (around 22-53%) is lost in the form of ammonia, due to the alkaline nature of the soils and rising temperatures. Other factors that contribute to this volatilization are low moisture content and salinity. This volatilization can be reduced by up to 80% by adopting good soil management practices 52.
Phosphate content in both seasons mostly lay between 5 and 100 ppm. The summer dataset was concentrated between 5.1 ppm and 10 ppm, a range containing 61% of the samples. In the winter season, 61% of the samples lay between 0.1 ppm and 15 ppm. The phosphate concentration ranges also agree with the available studies of the region 58,59.
According to the Minimum Levels for Sustainable Nutrition (MLSN) guidelines, the phosphate range in healthier soils is 7 ppm to 50 ppm. So, the levels of phosphates within the soils of Islamabad and Rawalpindi are satisfactory 60,61.
Nickel in both seasons' datasets lay below 6 ppm. In the summer dataset, about 80% of the samples lay in the range of 3.1 ppm to 6 ppm. In the winter season, the largest share of samples (35%) lay in the range of 5.1 ppm to 6 ppm. The results of the present study are close to the nickel concentrations reported by Ashraf et al. (2019) for the study region 62.
The presence of nickel ions is crucial for soil organic matter. In alkaline soils with pH values above 8, the formation of Ni-calcite complexes protects organic matter 63,64. In a recent in-situ study, nickel nanoparticles were used to enhance the mineralization of CO2, using brines to permanently sequester impure carbon dioxide as carbonates 65. However, in ex-situ environments, the role of Ni in sequestration has not been studied in depth recently. In 1979, research was conducted to assess the role of nickel, applied as an oxide, in carbon mineralization in sandy soils of different pH ranges. This research concluded that mineralization diminished with increasing nickel concentration; however, at the highest soil pH studied, 7.6, the extent of this mineralization was not the same as in the lower-pH soils 66. So, there is potentially a research gap regarding the role of nickel in carbon sequestration.
Potassium content for the majority of samples (67%) in the summer season lay in the range of 1 ppm to 55 ppm. Around 22% of the samples lay in the concentration bracket of 56 ppm to 110 ppm. In the winter dataset, 75% of the samples lay in the range of 1 ppm to 40 ppm. The number of samples decreased with increasing potassium concentration during both seasons. The data range also agrees with the available literature 41,[67][68][69]. Potassium content is predominantly high within the soils of most Pakistani regions, including the Pothwar region, which has mica in the parent rock material 70. Alkaline soils with high sodium and potassium values should be irrigated with low-salt water 71. Other soil factors associated with the potassium ion are charge balance and the regulation of enzymatic activity within the soil microbial population as well as in higher plants 72,73. Potassium also affects soil moisture content by regulating osmotic uptake in plants; thus, indirectly, it plays an important part in carbon retention within the soil 74,75.
[...] Fig. 2a,d. Then, based on the standardized residual values, the SOC predictions were made, which are given in Fig. 2b,e for the summer and winter seasons respectively. According to the PLSR outcome, the majority of the predicted SOC stock values were concentrated around 10 Mg/hectare, as shown in Fig. 2c,f for the summer and winter datasets respectively. The major probable reasons (in terms of soil properties) behind such low SOC stock values might be the higher bulk density values and low nitrate concentrations, as depicted in the variable importance (Fig. 1a,b).
Outliers analysis. Outlier analysis results are provided in Fig. 3. The distance to model (DMoD) value for the explanatory variables was found to be 1.368 for both the summer and winter datasets, as shown in Fig. 3a,c. The DMoD for the dependent variable, i.e., SOC stock, was found to be 1.510 for the summer dataset (Fig. 3b) and 1.518 for the winter dataset (Fig. 3d).

Conclusion and recommendations
According to the outcomes of the partial least squares regression, it can be concluded that if the current status of the soil variables persists unaltered, the SOC stocks of the topsoil of the Islamabad and Rawalpindi region are likely to deteriorate further. However, if the variables identified as most important in both seasons' datasets (such as bulk density, soil nitrates and moisture content) are managed at the regional level, the SOC stock is likely to improve and could thus play a major role in mitigating the impacts of climate change, particularly the warming of this region. Some recommendations for enhancing carbon stocks in the alkaline soils of Islamabad and Rawalpindi are as follows.
• In order to increase soil carbon stocks, integrated nutrient management approaches can be applied.
• The alkaline soils of the region should be irrigated with low-salt water in order to naturally optimize their alkalinity.
• The coarse-textured, alkaline soils of Islamabad and Rawalpindi are deficient in nitrate content. This deficiency can directly contribute to decreasing soil carbon stocks. Hence, the nitrate content of the soil must be managed in order to improve the current soil carbon stocks.

Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.