Determination and Prediction of Some Soil Properties Using Partial Least Square (PLS) Calibration and Mid-Infra Red (MIR) Spectroscopy Analysis

Soil chemical, physical and biological analyses are a crucial but often expensive and time-consuming step in the characterization of soils. Rapid, accurate and relatively simple methods are therefore needed for soil analysis. The objective of this study was to predict some soil properties (pH, EC, total C, total N, C/N, NH4-N, NO3-N, P, K, clay, silt, sand and soil microbial biomass carbon) across the Wickepin farm during the summer season using a Mid-Infra Red - Partial Least Square (MIR-PLS) method. The 291 soil samples were analyzed both with soil extraction procedures and with an MIR spectrometer. Calibrations were developed between the MIR spectral data and the results of the soil extraction procedures. The MIR-predicted values of total carbon, total nitrogen and soil pH were highly correlated with the values measured by the soil extraction methods. Values for EC, NH4-N, NO3-N, C/N, P, K, clay, silt, sand and soil microbial biomass carbon were not successfully predicted by the MIR-PLS technique. There was a tendency for these properties to correlate with the MIR-predicted values, but the correlations were very low. This study has confirmed that the MIR-PLS method can be used to predict some soil properties based on calibrations of MIR values.


INTRODUCTION
Available online at: http://journal.unila.ac.id/index.php/tropicalsoil DOI: 10.5400/jts.2011.16.2.93
Soils are rarely homogeneous, and their variability occurs both laterally and with depth. This variability can result from changes in the chemical balance of the soil associated with agricultural practices including nutrient uptake, crop rotations, fertilizer use, lime application (Viscarra Rossel and McBratney 1998) and leaching. To understand more about the variation in soil properties, spatially dense soil analyses are often required. Soil analytical procedures also need to be able to cope with a large number of samples.
An alternative method for assessing soil properties and their variability across a landscape is mid infrared (MIR) spectroscopy. MIR spectroscopy offers a possible alternative to conventional methods through increased speed and sensitivity. The fundamental molecular frequencies of this technique are generally in the MIR 2,500-25,000 nm wavelengths or 4,000-250 cm-1 range (Janik et al. 1998). This method has been used to determine some macro and micro elements of soil including soil carbon (McCarty et al. 2002), organic matter composition (Cheshire et al. 1993; Spaccini et al. 2001) and many other soil properties (Janik et al. 1998) as well as soil minerals. However, none of these studies was conducted in agricultural systems with different land use histories or on different sampling occasions.
Mid infrared-partial least square (MIR-PLS) has not been widely used for soil property prediction in agricultural systems. However, some studies have been conducted for (i) prediction of organic carbon, nitrogen and other properties of peat soil (Holmgren and Norden 1988), (ii) hydrocarbon contamination in wet soil (Hazel et al. 1997), (iii) a range of soil properties including mineralogical analysis, organic carbon, nitrogen, carbonate, air-dry moisture and cation exchange capacity (Murphy and Milton 2004) and (iv) prediction of total soil carbon, clay content and soil N supply (Murphy and Milton 2004).
In relation to soil biological characteristics, the MIR spectral contribution of in-situ soil microorganisms is expected to be minor because only about 5% of the total organic carbon in soil is microbial carbon (Janik et al. 1998). The signatures of microorganisms are usually difficult to detect in situ in most soils because they are masked by soil mineral and other organic peaks. However, a study by Pankhurst et al. (1997) showed that there was a correlation between GC-FAME spectra and MIR spectra of the same soils using PLS regression analysis. The correlation values ranged from 0.5 to 0.75, showing that the MIR spectra could explain up to 75% of the variability in the FAME data. This study suggested that MIR spectra can be obtained much more rapidly than conventional measurements of bacteria and fungi. Therefore, there is an opportunity to use the predictive capacity of MIR-PLS models as a simple bio-monitor of soil health and for soil biological mapping.
Another study (Grube et al. 1999) showed that changes in the concentration of the principal components of microbial cells can be studied by quantitative estimation of the biomass MIR spectra. In addition, MIR absorption spectra have also been applied to determine the biomass of bacteria (Zymomonas mobilis) grown in sucrose or glucose medium (Grube et al. 2002).
The objective of this study was to predict some soil properties across the farm during the summer season using the MIR-PLS method.

Location and Soil Sampling
Two hundred and ninety-one (291) soil samples were collected from a farm (Fairlawn) in the Wickepin Shire in south-western Australia, about 300 km south east of Perth (32° 47' 4'' S, 117° 30' 9'' E), during the summer season in 2004. The farm covers an area of 647 ha and was divided into 14 paddocks. The region had a Mediterranean climate with hot, dry summers and cold winters, and elevations ranged from about 250 to 350 m. The area was dominated by flats with some rolling hills. The major soil textures in this area were medium loam, sandy loam, clayey loam and clay. The principal land uses were sheep farming and wheat production, with annual pasture dominated by grass species.

Soil Analysis
The soil samples were oven dried, ground and sieved to a size fraction smaller than 2 mm. The number of samples collected and the soil analyses performed are summarized in Table 1.

MIR Analysis
Two hundred and ninety-one (291) soil samples were prepared so as to pass through a 0.2 mm mesh. A small amount of each sample was poured into a 10 mm diameter stainless steel cup and the top of the powder was leveled without applying pressure. Samples were scanned from 7,000 to 500 cm-1 at 2 cm-1 resolution on a Spectrum One Fourier-Transform mid Infrared (FT-IR) Spectrometer Vers. B Model L120000B. Each sample was scanned in duplicate and the spectra were averaged. The infrared spectra were recorded with the diffuse reflectance (DRIFT) sampling accessory. Peak areas from the spectra were used for statistical analysis.

Table 1 (partial). Soil analysis; number of samples; method; reference.
Microbial biomass C (mg kg-1); 291; Fumigation-Extraction (FE) method; Vance et al. (1987) and Sparling (1990)
Total C and total N (%); 291; CHN-1000 Elemental Analyzer version 1.1 (1991), LECO; Rayment and Higginson (1992)
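The two spectral pre-processing steps mentioned above (averaging duplicate scans and taking peak areas) can be illustrated with a minimal Python sketch. The function names `average_duplicate_scans` and `band_area`, the trapezoidal integration and the absence of any baseline correction are assumptions made for illustration; the source does not state how the peak areas were computed.

```python
def average_duplicate_scans(scan_a, scan_b):
    """Return the point-by-point mean of two scans of the same sample."""
    if len(scan_a) != len(scan_b):
        raise ValueError("duplicate scans must have the same number of points")
    return [(a + b) / 2 for a, b in zip(scan_a, scan_b)]


def band_area(wavenumbers, intensities, low, high):
    """Trapezoidal area under the spectrum between `low` and `high` (cm-1).

    Assumption: no baseline correction is applied. The points may be given
    in ascending or descending wavenumber order.
    """
    points = sorted(
        (w, v) for w, v in zip(wavenumbers, intensities) if low <= w <= high
    )
    return sum(
        (w2 - w1) * (v1 + v2) / 2
        for (w1, v1), (w2, v2) in zip(points, points[1:])
    )
```

For example, two duplicate scans recorded at 504, 502 and 500 cm-1 would first be passed to `average_duplicate_scans`, and the averaged spectrum would then be passed to `band_area` with the limits of the band of interest.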

Data Analysis
The data files containing the spectral data points were imported in GRAMS format for the PLS calculation. The MIR calibrations were performed by Partial Least Squares (PLS) regression using the multivariate modeling program 'Unscrambler' Version 7.5 (Camo A/S, Trondheim, Norway). PLS was used to determine the best correlation between the chemical reference data and the spectral data in order to develop a calibration. The PLS regression model was built on a calibration set of randomly selected observations comprising about 2/3 of the spectra (213 of the 291 observations in the overall data set). Cross validation using partial least squares regression was performed for each soil variable.
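The calibrations in this study were run in Unscrambler. The following is only a minimal sketch of single-response PLS regression (the NIPALS PLS1 algorithm) in plain Python, intended to show what such a calibration computes. The function names `pls1_fit` and `pls1_predict` and the data layout (one list of spectral values per sample) are illustrative assumptions, not the software's implementation.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """Fit a PLS1 model: X is a list of spectra, y the reference values."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centred spectra
    f = [v - y_mean for v in y]                                 # centred reference
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the residual y.
        w = [_dot([E[i][j] for i in range(n)], f) for j in range(m)]
        norm = math.sqrt(_dot(w, w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in E]          # sample scores
        tt = _dot(t, t)
        p = [_dot([E[i][j] for i in range(n)], t) / tt for j in range(m)]
        q = _dot(f, t) / tt                      # regression of y on the scores
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return {"x_mean": x_mean, "y_mean": y_mean, "components": components}


def pls1_predict(model, x):
    """Predict the reference value for one new spectrum x."""
    e = [v - mu for v, mu in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, p, q in model["components"]:
        t = _dot(e, w)
        y_hat += q * t
        e = [ev - t * pv for ev, pv in zip(e, p)]
    return y_hat
```

In practice the number of components would be chosen by cross validation, for instance by refitting the model with each candidate number of components and keeping the one with the lowest cross-validated prediction error.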
The predictive model was tested on the remaining, unused 1/3 of the data as an independent validation set (78 and 70 of the whole observations). The calibration and prediction models were evaluated from their correlation coefficients (r), r2, the root mean square error of prediction (RMSEP), and the standard error of prediction (SEP). The values in the prediction set were recorded and this procedure was repeated two times with different samples in the validation set until all samples were predicted.
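The random split and the evaluation statistics named above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the split fraction is a parameter, and RMSEP and SEP follow the common chemometric definitions (SEP being the bias-corrected standard deviation of the prediction errors), which the source does not spell out.

```python
import math
import random


def split_calibration_validation(sample_ids, calibration_fraction=2 / 3, seed=0):
    """Randomly assign samples to a calibration set and a validation set."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_cal = round(len(ids) * calibration_fraction)
    return ids[:n_cal], ids[n_cal:]


def prediction_statistics(measured, predicted):
    """Return r, r2, bias, RMSEP and SEP for a validation set.

    Requires at least two samples and non-constant measured and predicted
    values, otherwise r is undefined.
    """
    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]
    bias = sum(errors) / n
    rmsep = math.sqrt(sum(e * e for e in errors) / n)
    # Assumed definition: SEP removes the bias and uses n - 1 in the divisor.
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_m * var_p)
    return {"r": r, "r2": r * r, "bias": bias, "rmsep": rmsep, "sep": sep}
```

Repeating the split with a different `seed`, or rotating which third of the shuffled samples is held out, gives every sample a turn in the validation set.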
The best calibration and prediction models were then taken to be those with the lowest standard error of prediction and the highest correlation coefficients. Two other statistics were used to evaluate the calibration: (i) RPD, the ratio of the standard deviation (SD) of the measured or reference values in the prediction set to the root mean square error of prediction (RMSEP), and (ii) RER, the ratio of the range of the measured or reference values in the prediction set to the RMSEP (Malley et al. 1999).
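As a worked illustration of the two ratios defined above, the sketch below computes RPD and RER from measured and predicted values. The function names are hypothetical, and the use of the sample standard deviation (n - 1 divisor) is an assumption, since the source does not say which form of SD was used.

```python
import math
import statistics


def rmsep(measured, predicted):
    """Root mean square error of prediction."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n)


def rpd(measured, predicted):
    """RPD: SD of the reference values divided by RMSEP.

    Assumption: the sample standard deviation (n - 1 divisor) is used.
    """
    return statistics.stdev(measured) / rmsep(measured, predicted)


def rer(measured, predicted):
    """RER: range of the reference values divided by RMSEP."""
    return (max(measured) - min(measured)) / rmsep(measured, predicted)
```

With reference values 1, 2, 3, 4, 5 and predictions that are each off by 0.5, the RMSEP is 0.5 and the range is 4, so RER = 4 / 0.5 = 8. RPD is the sample SD (about 1.58) divided by 0.5, roughly 3.16.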

DRIFT Spectra for Whole Soil
The DRIFT spectra for a typical soil from the Wickepin farm are shown in Fig. 1. The high peak around 900-500 cm-1 reflected other clay mineral characteristics (Nguyen et al. 1991). This showed that mid infrared spectra contain definable peaks which could be used in spectral interpretation and to differentiate between two or more samples.

Soil Characteristics
The summary of soil characteristics across the farm is presented in Table 2.

Table 2. Summary statistics of 291 soil samples for 13 soil characteristics at a farm scale at the Wickepin farm (soils sampled in summer).

Calibration of Results
The results obtained from the cross validation procedure are presented in Table 3 and confirmed the feasibility of predicting some soil variables. Total C, total N, EC and soil pH were predicted more successfully than the other soil properties, for which the correlation values were low. The best correlations, for total C, total N and soil pH, were also accompanied by acceptable RPD and RER values (Table 3); RPD>3 and RER>10 are considered acceptable. This indicated that the infrared spectra could explain 76% of the variability in the laboratory extraction data for EC and soil pH, 86% for total nitrogen and 90% for total carbon. Other soil properties were not successfully predicted using this technique because the correlation values were lower than 50%. Among the soil variables measured in this study, total carbon, total nitrogen and soil pH were the best predicted with a PLS-MIR calibration. Total soil carbon is the most common soil property used in NIR and MIR studies and has been reported to have very high correlations with conventional soil analytical data compared to other soil properties (e.g. McCarty et al. 2002). However, the study by Brown et al. (2005), which predicted soil C using the same technique, failed at some sites; the authors attributed this to "pseudo-independent" validation (random selection of non-independent test samples), which can overestimate predictive accuracy relative to independent validation.
A major advantage of diffuse reflectance spectroscopy for soil analysis is that many properties may be (accurately) determined from a single spectrum, thus offering the possibility of considerable cost savings and increased efficiency over conventional laboratory analysis. Furthermore, the technique is rapid, making it possible to analyze a large number of samples in a practical and timely manner. These properties make spectroscopic analyses combined with PLSR very attractive for environmental monitoring, modeling and precision agriculture (Viscarra Rossel et al. 2006).
Other studies have also shown that not all conventional soil extractions can be predicted using the MIR-PLS technique (Janik et al. 1998). They showed that soil properties such as EC, K-available, P-available, P buffer, DTPA-extractable Zn and Cu, CaCl2, exchangeable Na and K had very low correlations (r2 < 0.50) between conventional and infrared assays. Reeves et al. (2001) found that soil pH (r2 = 0.94) was predicted significantly better than other soil properties such as total carbon and nitrogen, and some biological properties of soil. Other soil properties measured in this study were less well correlated using the MIR-PLS technique. Although the prediction was less successful for these variables, MIR may be useful for screening purposes.

CONCLUSIONS
Calibrations can be developed from mid infrared spectra and conventional assays for several soil parameters, particularly total carbon, total nitrogen and soil pH. This study also provides evidence that total carbon, total nitrogen and soil pH can be measured simultaneously by rapid and non-destructive mid-infrared spectroscopy. The calibrations developed in this study have not been validated beyond this location. There is an indication that the calibrations might not be widely applicable because only the calibration for soil pH data corresponded well to the other data, but this needs to be further investigated.