Modelling of permanent wilting point from routine soil properties on a typical Alfisol

Knowledge of soil water content at the permanent wilting point is essential for assessing plant water stress in a specific soil type. This study was undertaken to formulate a regression model for predicting the permanent wilting point (PWP) of soils on a typical Alfisol of basement complex origin at the Teaching and Research Farm of the University of Ilorin. A total of forty-five (45) disturbed and forty-five (45) undisturbed soil samples were collected along a toposequence (upper, middle and bottom slope) at three depths: 0 cm – 30 cm, 30 cm – 60 cm, and 60 cm – 90 cm. Soil properties of the disturbed and undisturbed samples were determined using standard laboratory methods and/or calculated using established techniques. The measured soil properties include the proportions of soil separates, bulk density, total porosity, PWP and organic matter. Three different models were developed for predicting PWP using multiple regression. PWP showed no significant relationship with bulk density, total porosity or the soil separates, except for silt content, which was weakly and positively correlated with PWP (r = 0.22; p < 0.05). Model three, which had the highest adjusted coefficient of determination (0.2952), emerged as the optimal choice. The model explains about 30 % of the variance in PWP, with sand, silt and clay contributing significantly to the model. This implies that additional variables and techniques, such as spatial analysis and machine learning, beyond those used in the present study would provide a more reliable pedotransfer function for predicting PWP in the soil.


Introduction
The essence of modeling lies in simplifying complex realities into manageable representations called models (Kinoshita et al., 2012). These models capture key elements that are essential for understanding or predicting a specific outcome (Van Looy et al., 2017). For example, in soil science, models can represent the behavior of soil systems (Obi et al., 2012). One key parameter in such models is the permanent wilting point (PWP) (Kinoshita et al., 2012). This is the soil moisture content at which plants wilt permanently because they can no longer extract water from the soil. Understanding soil water retention, including the PWP, is crucial for optimizing water management practices in agriculture, ultimately leading to sustainable and improved production. Knowledge of soil moisture content is vital for deciding which plant variety to grow and for assessing available water for plant growth, water stress, solute movement, evapotranspiration, cropping systems, tillage management, infiltration, drainage, irrigation scheduling and other hydro-physical processes.
Soil water holding capacity is critical for simulating the hydrological behaviour of landscapes and assessing the suitability of soil for various applications (Kukal et al., 2023). This ability to retain water is primarily determined by capillarity, the physical phenomenon arising from the interaction between water molecules and soil particles. The effectiveness of capillarity, in turn, depends heavily on the structure of soil pores.
Several key factors collectively influence soil water retention, including texture (the size and proportion of particles greatly affect pore size and water holding capacity), structure (the organization of particles and aggregates affects pore connectivity and water movement), bulk density (the density of the soil influences the volume of pores available for water storage) and organic matter (OM) content (organic matter increases pore space and water holding capacity) (Amsili et al., 2022; Vereecken et al., 1989). Since water retention is affected by these physical properties, empirical relationships, accelerated by advances in computer modelling (Minasny et al., 1999), could be developed for its prediction in soils, which can help in making informed decisions about land use and management. In line with this, the pedotransfer function (PTF), defined as converting available data (those we have) into useful information (what we need), was devised by Bouma (1989). This allowed for the creation of functions that predict the values of specific soil properties from other, more readily and economically measurable properties. These PTFs are often established from empirical observations, so their applicability is restricted to the datasets employed in generating the model (Donatelli et al., 1996; Wosten et al., 1999). The general form of the linear regression equation is: $Y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_6 x_6$, where $Y$ denotes the dependent variable, such as water content at a selected water potential, $b_0$ is the intercept, $b_1$ to $b_6$ are the regression coefficients and $x_1$ to $x_6$ represent the independent variables signifying the basic soil properties.
The diverse characteristics of soil across a toposequence, a sequence of soils along a slope, arise from a complex interplay of natural and human influences. Geological processes establish the initial foundation, while soil formation further shapes these properties. Land use and management practices over time leave their mark, and natural forces like erosion and deposition further sculpt the landscape, leading to the unique mosaic of soil properties observed across the toposequence (Phillips, 2007).
Research has shown strong links among soil particle size, its derivatives, solute transport properties and other characteristics (Amsili et al., 2024; Mbagwu et al., 1983; Ogban & Ekerette, 2001), and even mineralogy (Souza et al., 2009). Understanding these relationships could help model soil properties related to the PWP (-1500 kPa), the critical soil moisture threshold for plant survival. This study aims to create a regression model for predicting the permanent wilting point in Alfisols at the Teaching and Research Farm of the University of Ilorin. By establishing these relationships, the model could aid in optimizing water management and ensuring timely irrigation in agricultural practice.

Description of Study Area
This research was carried out on a toposequence (upper, middle, and bottom slope positions) at the Teaching and Research Farm of the University of Ilorin, Ilorin, Nigeria. The region lies within the Southern Guinea Savanna zone (lat. 9° 29' N, long. 4° 35' E, 307 m elevation) and features a tropical climate with bimodal rainfall (1000-1240 mm annually) and temperatures ranging from 20 °C to 35 °C (Kolo et al., 2012). The dominant soil type is a gravelly Alfisol formed over basement complex (Olaniyan, 2003). Historically used for agriculture, the site was in a fallow state during the sampling period.

Soil Sampling and Analysis
Soils were sampled along the toposequence (upper, middle, and bottom slope positions) at three depths (0 cm - 30 cm, 30 cm - 60 cm, and 60 cm - 90 cm). Fifteen mini-pits were excavated, with five located at each slope position. Ninety samples were collected (45 disturbed, 45 undisturbed). Undisturbed samples were collected using metallic cylinders (8.3 cm height, 5.5 cm diameter). The soil within the cylinders was held in place with calico and rubber bands, then carefully labelled. A soil auger was used to collect disturbed samples, which were put in labelled polythene bags. These samples were transported to the laboratory for determination of physical and chemical properties using standard procedures.

Preparation of Soil Samples
The disturbed soil samples were air-dried, then ground and sieved to pass through a 2 mm mesh size sieve prior to analysis.

Particle Size Analysis
Particle size analysis was conducted by the hydrometer method reported by Gee & Or (2002) employing sodium hexametaphosphate (calgon) as dispersant.

Bulk Density
The core method was employed for the determination of bulk density, following the procedure outlined by Blake & Hartge (1986). First, the undisturbed soil was dried in a hot-air oven at 105 °C until constant weight was attained. Bulk density was computed using the formula: $\rho_b = M_s / V_t$, where $\rho_b$ = bulk density (kg/m$^3$), $M_s$ = oven-dried soil mass (kg), and $V_t$ = total volume of the soil core (m$^3$).
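As a rough illustration of the core-method calculation, the sketch below uses the cylinder dimensions reported for the undisturbed samples (8.3 cm height, 5.5 cm diameter). The oven-dry mass of 300 g is a hypothetical value, and the porosity step assumes a particle density of 2.65 g/cm³, which is a common convention and not a figure stated in this study.

```python
import math


def core_volume_cm3(height_cm: float, diameter_cm: float) -> float:
    """Volume of a cylindrical sampling core in cm^3."""
    radius_cm = diameter_cm / 2.0
    return math.pi * radius_cm ** 2 * height_cm


def bulk_density_g_cm3(oven_dry_mass_g: float, volume_cm3: float) -> float:
    """Bulk density = oven-dry soil mass / total core volume."""
    return oven_dry_mass_g / volume_cm3


def total_porosity(bulk_density: float, particle_density: float = 2.65) -> float:
    """Porosity as a fraction; the 2.65 g/cm^3 particle density is an assumption."""
    return 1.0 - bulk_density / particle_density


# Core dimensions come from the sampling description; the mass is hypothetical.
volume = core_volume_cm3(height_cm=8.3, diameter_cm=5.5)
rho_b = bulk_density_g_cm3(oven_dry_mass_g=300.0, volume_cm3=volume)
porosity = total_porosity(rho_b)

print(f"core volume   = {volume:.2f} cm^3")
print(f"bulk density  = {rho_b:.3f} g/cm^3 ({rho_b * 1000:.0f} kg/m^3)")
print(f"porosity      = {porosity:.3f}")
```

If total porosity is derived from bulk density in this way, the two variables are linear transformations of each other, which would be consistent with a near-perfect negative correlation between them.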

Soil Organic Matter
Soil organic carbon (OC) was determined using the Walkley-Black wet oxidation method (Nelson & Sommers, 1982). To convert the measured OC content into soil organic matter (OM), a standard conversion factor of 1.724 was applied. This factor accounts for the approximate 58 % carbon composition of SOM (Brady & Weil, 1999).
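The conversion described above can be written as a short worked equation. The OC value of 1.20 % is hypothetical and is used only to show the arithmetic.

```latex
\mathrm{OM}\,(\%) = 1.724 \times \mathrm{OC}\,(\%), \qquad 1.724 \approx \frac{100}{58}

% Hypothetical example: OC = 1.20 %
\mathrm{OM} = 1.724 \times 1.20 \approx 2.07\,\%
```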

Permanent Wilting Point (PWP)
PWP was determined using a procedure adapted from Odu et al. (1986). From each sampling point, 300 g of air-dried soil that had passed through a 2 mm sieve was moistened in a pot, and three maize seeds were planted in each pot. After thinning to one seedling per pot, an aluminum ring was placed around the base of each plant before the opening of the coleoptile and pressed slightly into the soil.
The plants were then allowed to grow until they reached the four-leaf stage. Next, the soil surface of each pot was sealed with a ¼-inch layer of molten paraffin wax. Finally, cotton wool was used to fill any gaps between the seedling stem and the aluminum ring, ensuring that water loss could only occur through the plant. The seedlings were left to grow and monitored for clear signs of wilting. Upon initial wilting, plants were shaded overnight to observe whether they recovered.
If wilting persisted, PWP was confirmed. The soil samples were then weighed before and after oven-drying, and the difference in weight, expressed as a percentage of the dry soil mass, was taken as the PWP.
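The gravimetric calculation in the final step can be sketched as follows. The symbols $M_w$ (soil mass at wilting) and $M_d$ (oven-dry soil mass) are introduced here for illustration, both masses are assumed to be net of the container, and the numbers are hypothetical.

```latex
\theta_{PWP}\,(\%) = \frac{M_w - M_d}{M_d} \times 100

% Hypothetical example: M_w = 330 g, M_d = 300 g
\theta_{PWP} = \frac{330 - 300}{300} \times 100 = 10\,\%
```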

Statistical Tool Correlation Analysis
When two or more quantities vary together such that movement in one tends to be accompanied by corresponding movement in the others, the quantities are said to be correlated.
Correlation analysis was employed to explore the relationship between each measured pair of variables. The statistic is given as: $r = \frac{1}{n-1}\sum_{i=1}^{n} z_{x_i} z_{y_i}$, where $z_{x_i} = (x_i - \bar{x})/s_x$ and $z_{y_i} = (y_i - \bar{y})/s_y$ are the standardized values of the explanatory and response variables, and $s_x^2$ and $s_y^2$ are their sample variances. This correlation coefficient is generally called Karl Pearson's coefficient of correlation.
The correlation coefficient quantifies the degree of linear association between two random variables. It may be positive or negative. A significant positive correlation coefficient indicates that an increase in one variable tends to be accompanied by an increase in the other, while a significant negative correlation coefficient indicates an inverse relationship between the two variables in question. A non-significant correlation coefficient, on the other hand, provides no evidence of a linear relationship between the variables under study.
A test statistic suggested by Morrison (1976) was used to test the significance of this correlation coefficient. The test is given as: $t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$, where $r$ is Karl Pearson's correlation coefficient and $n$ is the sample size. It can be shown that when $H_0$ is true, $t$ follows a Student's $t$ distribution with $n-2$ degrees of freedom, so $H_0$ is rejected at significance level $\alpha$ when $|t| > t_{\alpha/2}(n-2)$.
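A minimal sketch of this significance test is given below. It takes the silt-clay coefficient (r = 0.603) reported in the pair-wise correlation results as an illustrative input and assumes n = 45 observations, which may differ from the number actually entering each correlation. The critical value is an approximate figure from standard t tables and is likewise an assumption of the sketch.

```python
import math


def correlation_t_statistic(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, given Pearson's r and sample size n."""
    if n <= 2:
        raise ValueError("n must exceed 2 so that n - 2 degrees of freedom exist")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Approximate two-sided 5 % critical value for 43 degrees of freedom,
# taken from standard t tables (an assumption, not a value from the study).
T_CRIT_DF43 = 2.017

# r is the reported silt-clay correlation; n = 45 is assumed.
t_value = correlation_t_statistic(r=0.603, n=45)
is_significant = abs(t_value) > T_CRIT_DF43

print(f"t = {t_value:.3f}, significant at 5 %: {is_significant}")
```

The same function can be applied to any other coefficient in the correlation table once the corresponding sample size is known.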

Permanent Wilting Point
An underlying assumption in multiple regression is that the predictors (independent variables) are known without any uncertainty in their values. Consequently, for the model to be applicable for both estimation and prediction, all assumptions were investigated and any violations were corrected based on the fit of the models. Diagnostic plots from the regression models were used to check assumptions, including normality of residuals and the presence of potential outliers. This was done using "Residuals versus Fitted" charts to show whether there was a trend in the residuals, and the Shapiro-Wilk test in the R software for statistical computing and graphics (R Core Team, 2022) to confirm normality of the residuals. To ensure the validity of the developed model, it is crucial to assess whether the model's residuals (errors) follow a normal distribution. Two graphical methods were employed for this purpose. First, a residual plot was visually inspected. Ideally, this plot should exhibit a random scattering of points around zero, indicating no systematic relationship between the residuals and any individual predictor variable used in the model (Olorede et al., 2013; Olorede & Mudasiru, 2013). Second, a quantile-quantile (Q-Q) plot was generated.
In this plot, if the points, especially those in the central region, fall close to a diagonal line, it suggests good agreement between the observed residuals and a normal distribution. A standardized residual criterion was employed to check the effects of potential outliers in the dataset on the models by removing observations with standardized residuals outside the accepted interval (Barnett & Lewis, 1994; John & Prescott, 1975; Stefansky, 1972). The residuals of the models were standardized by dividing each by the root mean square error of its respective model. The expectation was for the lowest standardized residual to lie within ±1, and the highest to be within ±2; deviations from these ranges indicated potential outliers. All statistical analyses, model fitting and diagnostics were done using R version 4.2.2 (R Core Team, 2022).
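The standardized residual screening described above can be sketched in a few lines. The observed and fitted values below are hypothetical, the function names are illustrative, and the ±2 cut-off is taken from the upper limit mentioned in the text. The analyses in the study were run in R, so this Python version is only an equivalent illustration.

```python
import math


def standardized_residuals(observed, fitted, n_params):
    """Divide each residual by the model's root mean square error.

    n_params is the number of estimated coefficients (including the
    intercept), so the error degrees of freedom are n - n_params.
    """
    residuals = [o - f for o, f in zip(observed, fitted)]
    dof = len(residuals) - n_params
    if dof <= 0:
        raise ValueError("need more observations than estimated parameters")
    rmse = math.sqrt(sum(e * e for e in residuals) / dof)
    return [e / rmse for e in residuals]


def flag_outliers(std_resid, threshold=2.0):
    """Indices of observations whose standardized residual exceeds the threshold."""
    return [i for i, z in enumerate(std_resid) if abs(z) > threshold]


# Hypothetical PWP values (%), not data from the study.
observed = [12.0, 15.5, 18.0, 22.0, 14.0, 30.0, 16.5, 19.0]
fitted = [13.0, 15.0, 17.5, 21.0, 14.5, 20.0, 16.0, 19.5]

z = standardized_residuals(observed, fitted, n_params=2)
outliers = flag_outliers(z)
print("flagged observation indices:", outliers)
```

In practice the model would be refitted after removing flagged observations, and the diagnostics repeated on the refitted model.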
In multiple linear regression scenarios like this, it is commonly acknowledged that various hypothesis tests concerning the model parameters are valuable for assessing the model's effectiveness. Hence there is a need to describe and test hypotheses about the parameters of the proposed regression model, both collectively and individually. This ascertains whether there is a notable relationship between the response variable (PWP) and a specific subset of the predictors. The built models were thus screened based on the number of significant parameters, the proportion of variability in the response explained by these parameters, and satisfaction of model assumptions using diagnostic plots.
Rejection of the null hypothesis about the full model (the model with all parameters) implies that at least one of the predictors contributes meaningfully to the model. Rejection of the null hypothesis about an individual regression parameter (ANOVA table) indicates that the variable cannot be deleted from the model. The coefficient of multiple determination ($R^2$) measures the reduction in the variability of permanent wilting point obtained by using the 6 predictors in the model. Merely having a high $R^2$ value does not automatically indicate the regression model's quality (Myers & Montgomery, 1995). Adding an extra predictor always increases $R^2$, regardless of the statistical significance of the added variable. Consequently, models with high $R^2$ values might yield inaccurate predictions for new observations or mean response estimates. Hence, some regression model developers opt for the adjusted $R^2$ ($R^2_{adj}$) (Myers & Montgomery, 1995). Generally, $R^2_{adj}$ will not always increase as variables are added to the model. If the additional variables are superfluous, $R^2_{adj}$ will often drop. A marked difference between $R^2$ and $R^2_{adj}$ is a good indication that the model includes terms that are not statistically significant. It is important to mention that the presence of multicollinearity in the data set makes estimates of coefficients from the least squares fit imprecise and possibly statistically insignificant (Martens & Naes, 1989). Nevertheless, when the aim is basically to forecast Y using a set of X variables, multicollinearity is not a significant concern. The predictions are still precise, and the overall $R^2$ (or $R^2_{adj}$) indicates the accuracy with which Y values are predicted. However, if the objective is to understand how the various X variables affect Y, then multicollinearity is a serious issue. First, individual p-values may be deceptive; a high p-value may suggest insignificance even if the variable is crucial. Second, confidence intervals on regression coefficients could be wide, potentially encompassing zero. This ambiguity makes it difficult to determine whether an increase in X corresponds to a rise or fall in Y. Furthermore, wide confidence intervals mean that excluding or adding an observation can drastically alter coefficients, possibly even changing their signs.
In some instances, multiple regression results may appear inconsistent. Despite a low overall p-value, individual p-values for all X variables are high. This situation arises when two X variables are highly correlated, essentially conveying the same information such that one becomes redundant once the other is included. However, together, they significantly contribute to the model. Removing both variables would degrade the model fit considerably. Thus, while the overall model fits the data well, neither X variable makes a substantial contribution when added individually. This scenario indicates collinearity among the X variables and manifests as multicollinearity in the results.
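For reference, the adjusted coefficient of determination discussed above is commonly written as shown below, where $n$ is the number of observations and $p$ the number of predictors excluding the intercept. The notation is the standard textbook form and is assumed here, since the source equation is not reproduced in the text.

```latex
R^2_{adj} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```

Because the factor $(n-1)/(n-p-1)$ grows with $p$, adding a predictor raises $R^2_{adj}$ only when the accompanying reduction in $1 - R^2$ is large enough to offset it.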

Results
The descriptive statistics obtained for the soil properties are presented in Table 1. Results of the pair-wise correlation of the variables measured during the PWP experiment are presented in Table 2. Parameter estimates for the models generated for PWP are presented in Table 3. The Analysis of Variance (ANOVA) for the multiple regression models established for permanent wilting point is presented in Table 4.
The data indicated a high sand content, implying a coarse-textured soil with a loose, crumbly structure, while silt ranged from moderate to high and clay content varied between high and very high. On the basis of the soil separates, the soils employed in the present study are characterized as having low water availability, moderate drainage and high water retention. Bulk density, an index of soil compaction, was high; thus, the soils have low available water capacity and permanent wilting point, since water fills small pores, limiting its availability, and plants struggle to extract water from the dense soil. Total porosity was rated moderate to high while organic matter content varied from low to moderate, and the PWP data classify the soils as low (10 % - 20 %), moderate (20 % - 30 %), and high (> 30 %).

Pair-wise correlation
Sand was negatively related to silt (r = -0.870*) and clay (r = -0.9098*), though a positive relationship existed between sand and OM (r = 0.668*) (Table 2). This implies that as sand increases, silt and clay tend to decrease while OM tends to increase. Silt was positively correlated with clay (r = 0.603*) and negatively correlated with OM (r = -0.512*); as silt increases, clay tends to increase and organic matter to decrease. Clay was inversely correlated with OM (r = -0.689*); as clay increases, organic matter tends to decrease.
Bulk density was negatively correlated with total porosity (r = -0.999*); when bulk density increases, total porosity decreases almost proportionally. Permanent wilting point did not show any clear relationship with the basic soil properties.

Permanent Wilting Point Model Development
Three models were created for permanent wilting point, each with a p-value below 0.05, with corresponding adjusted $R^2$ values of 0.2351, 0.2949 and 0.2952 for models 1, 2 and 3, as shown in Table 3. Model 3 was selected as the best predictive model since it recorded the highest adjusted $R^2$ value (0.2952). This model retains three predictors (sand, silt and clay) after removal of observations identified as potential outliers and of the non-significant predictors bulk density, total porosity and organic matter. The model accounts for about 30 % of the variance in permanent wilting point, with sand, silt and clay making statistically significant contributions. For model 3, the residuals are approximately normal and there are no potential outliers, as shown in Figure 1. The Cook's distance plot confirms this, with all Cook's distances less than 1 (Figure 1). The residuals plot and normal quantile-quantile plot presented in Figure 1 also support this. Normality of the residuals is further confirmed by the histogram presented in Figure 2.

Based on the individual statistical significance of each predictor variable, the findings shown in
The regression equation created for model 3 was used to predict permanent wilting point values; the observed values, predicted values and residuals of the prediction are presented in Table 5. The observed and predicted permanent wilting point values in Table 5 were analysed for the extent of their relationship and gave a correlation coefficient of 0.5727. Based on the guideline outlined at http://www.westgard.com/lesson42.htm for assessing correlation coefficients, values of r in the ranges 0.90 to 1.00, 0.70 to 0.89, 0.50 to 0.69, 0.30 to 0.49, and 0.00 to 0.29 are said to show very high, high, moderate, low, and little if any correlation, respectively. This indicates that the permanent wilting point predictions of model 3 have a moderate correlation with the observed values. The trend of observed and predicted values of PWP is shown in Figure 3.
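The interpretation guideline quoted above amounts to a simple lookup, sketched below. The function name is illustrative, and applying the bands to the absolute value of r (so that negative coefficients are classified by strength) is an assumption of this sketch.

```python
def classify_correlation(r: float) -> str:
    """Label the strength of a correlation coefficient using the quoted bands."""
    magnitude = abs(r)  # assumption: bands apply to the absolute value
    if magnitude > 1.0:
        raise ValueError("a correlation coefficient cannot exceed 1 in magnitude")
    if magnitude >= 0.90:
        return "very high"
    if magnitude >= 0.70:
        return "high"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.30:
        return "low"
    return "little if any"


# Correlation between observed and predicted PWP reported for model 3.
label = classify_correlation(0.5727)
print(label)
```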
Variable importance based on the parameter estimates for the PWP models (Table 3) showed that sand and silt were the most influential predictor variables for model 1, while clay was the most influential predictor for model 2, followed by sand, and then clay. Although model 3 also had the three soil separates as its influential predictors, silt ranked first, followed by sand and then clay. Sand content is the principal determinant of total porosity; hence, there was a negative relationship between sand and PWP. Both silt and SOM were of secondary importance as predictor variables of PWP. Silt was relatively more important than SOM for enhancing the prediction of PWP, as indicated by a larger standard error (SE) and negative t-value (Table 3). This corroborates the submission of Amsili et al. (2024), who opined that PWP was mostly defined by texture and SOM.
Silt was the most important variable for the prediction of $\theta_{PWP}$, followed by sand, and then clay (Model 3, Table 3). The importance of silt content in predicting PWP is logical because it is the primary determinant of the proportion of total porosity with a pore diameter equal to or less than 0.20 µm, which is the theoretical pore size that can hold water at -1500 kPa, in the coarse-textured soil studied, which had a relatively low mean clay content (166.53 g kg$^{-1}$) (Table 1). This finding was further supported by the weak positive relationship (r = 0.22) between silt content and $\theta_{PWP}$ (Table 2). SOM also had a positive influence on $\theta_{PWP}$, attributable to an increase in total porosity with a diameter less than 0.20 µm (Libohora et al., 2018), while sand had a negative effect on $\theta_{PWP}$ because increases in sand result in a smaller proportion of total porosity with a diameter less than 0.20 µm. There was a strong negative relationship between clay and SOM (r = -0.69*, p < 0.05), which implied that the clay content of the soil decreased as SOM increased and vice versa. This finding agrees with previous research that found that SOM had a small impact on $\theta_{PWP}$ in soil, but runs counter to it in that the soil in the present study had a relatively low clay content (Minasny & McBratney, 2018; Saxton & Rawls, 2006). Hence, the statement that the texture components (sand, silt and clay) are sufficient for predicting PWP is not supported in the present study, as is evident from the relatively low coefficient of determination of 29.52 %. The low predictive power of this model could therefore be attributed to the heterogeneity of soil and the complexity of its properties, such as texture, structure, organic matter and mineral composition, coupled with the limited sample size along varying topographic positions, which was probably not representative of the entire population, thus leading to reduced model performance. Also, the relatively high correlation of silt with the other predictor variables reduced the model's explanatory power.
However, the $R^2$ value of the multiple linear regression (MLR) model was 5.57 % higher than its adjusted $R^2$, which indicates that inclusion of additional variables beyond sand, silt and clay would not provide meaningful improvements to the prediction of $\theta_{PWP}$. Also, diagnostic plots for residuals, standardized residuals and $\theta_{PWP}$ gave the same adjusted $R^2$ value (0.30), indicating that the simple modelling approach, MLR, employed in the present study for estimating $\theta_{PWP}$ was appropriate. Consequently, MLR would be adequate when 3 to 4 predictor variables are being used. This agrees with the earlier submission by Amsili et al. (2024), who also observed that the silt fraction was the most important variable for predicting $\theta_{PWP}$, owing to its particle size diameter range of 2 µm - 53 µm, corresponding most closely to the theoretical pore size range that prevents permanent wilting in plants.

Conclusion
The research conducted at the Teaching and Research Farm of the University of Ilorin to develop a model for predicting the permanent wilting point (PWP) of the soil formulated three different models. Model three was selected as the optimal choice, with the highest adjusted $R^2$ value of 0.2952. Prediction of the permanent wilting point of a representative Alfisol developed on basement complex at the Teaching and Research Farm of the University of Ilorin, Ilorin, Kwara State, will depend on reliable determination of the proportions of particle sizes. The present study was carried out across three topographic positions, whose changes along the sequence inform variability in soil properties and characteristics. This variability resulted in systematic underestimation of the soil available water capacity, which accounted for the relatively low coefficient of determination that explains just 29.52 % of the variance in PWP. Hence, the model in its current form is not viable, efficient or effective and should be re-evaluated, with the present model serving as an insight for future studies. Accordingly, it is advocated that further research be conducted to identify additional predictors and thereby improve the model's explanatory power through the use of more robust and representative data, consideration of multiple soil properties and characteristics across the toposequence, exploration of nonlinear relationships among predictor variables, and checks for measurement errors to improve measurement accuracy. Techniques such as spatial analysis and machine learning should also be used to enhance model accuracy and generalizability across different soil types and topographic positions, and the derived model should be compared with actual field permanent wilting point experiments to validate the predicted values against field data.
Null hypothesis ($H_0$): $\beta_1 = \beta_2 = \dots = \beta_{13} = 0$ [No single predictor showed a statistically significant contribution to the model]
Alternative hypothesis ($H_1$): $\beta_j \neq 0$ for at least one $j$ [At least one predictor showed a statistically significant contribution to the model]
Test statistic: $F = MS_{Regression} / MS_{Error}$ [Global F-test]

Figure 1: Diagnostic Plots for Model 3 of Permanent Wilting Point

Figure 3: Graph of Observed versus Predicted Permanent Wilting Point

Decision rule: Reject the null hypothesis and accept the alternative hypothesis if the p-value of the test is below the 0.05 significance level; otherwise, do not reject the null hypothesis.

Table 2: Pair-wise correlation results of variables measured during permanent wilting point experiment
Table 4 for model 3 of permanent wilting point indicate that the null hypothesis for both silt and clay with corresponding p-values of 0.003126

Table 3: Parameter Estimates for Permanent Wilting Point Models
*implies significant at 1 or 5% level of probability

Table 4: ANOVA Table for Models Developed for Permanent Wilting Point (columns: Model No., Source of Variation, Degree of Freedom, Sum of Squares)
*implies significant at 1 or 5% level of probability