Review of preprocessing techniques used in soil property prediction from hyperspectral data

Soil properties are neither static nor homogeneous in space and time. Capturing their spatial variation through conventional methods is difficult. Hyperspectral remote sensing data provide a rich source of information in the form of a spectrum at each pixel, which can be used to identify surface materials. Airborne and spaceborne narrowband hyperspectral sensors have come to the fore, providing spectral information across large areas. Hyperspectral remote sensing is thus a promising tool for studying soil properties and can serve as an alternative to conventional methods. However, atmospheric attenuation and a low signal-to-noise ratio are major problems with this type of data, so airborne and spaceborne hyperspectral data must be preprocessed before soil properties can be extracted. This paper reviews previous studies on the prediction of soil properties from airborne and satellite hyperspectral data and the preprocessing techniques used in those predictions.

Subjects: Earth Sciences; Engineering & Technology; Environment Agriculture


ABOUT THE AUTHORS
Our group works on the application of hyperspectral data to soil and vegetation discrimination. A major issue in applying these data is to account appropriately for the effects of the atmosphere. Although several algorithms are available for this purpose, there is no guideline on how to apply them. One of the issues we wish to address is how best to account for the effect of the atmosphere so that the proper signal of the targets is extracted for further analysis.

PUBLIC INTEREST STATEMENT
This review should be useful for digital soil mapping from satellite data. As soil is a precious non-renewable resource, it has to be examined periodically. Prediction from satellite data provides a continuous method of monitoring soil quality. The accuracy of prediction depends on the quality of the satellite data. The methods used to improve data quality are reviewed in this paper.

Introduction
Remotely sensed hyperspectral satellite data have great potential for the quantitative, spatially explicit assessment of soil and vegetation parameters. The development of methods to map soil properties using optical remote sensing data in combination with field measurements has been the objective of several studies during the last decade (Ben-Dor et al., 2009). It has also been a challenge to find the most appropriate technique for studying soil properties from optical data, and thus to reduce the time and effort involved in field sampling and laboratory analysis.
Soil reflectance in the visible, near-infrared, and mid-infrared regions has been widely used in many studies. The soil properties predicted from reflectance data include organic matter (OM), soil organic carbon (SOC), total nitrogen (TN), pH, moisture content (MC), electrical conductivity (EC), phosphorus (P), potassium (K), calcium (Ca), magnesium (Mg), sodium (Na), manganese (Mn), zinc (Zn), and iron (Fe), with various levels of prediction accuracy. Various prediction models, such as multiple linear regression (MLR), principal components regression (PCR), stepwise multiple linear regression (SMLR), partial least squares regression (PLSR), and artificial neural networks (ANN), have been used. These models work well with signals obtained under laboratory conditions, where sources of noise are minimal. Their performance on remotely sensed airborne or spaceborne data, however, is affected by atmospheric interference and spectral noise. The role of preprocessing techniques in the prediction accuracy of soil properties from remotely sensed data therefore needs to be studied.
Preprocessing techniques consist of atmospheric correction algorithms as well as spectral pretreatment and smoothing methods. Over the years, atmospheric correction algorithms have evolved from empirical approaches to methods based on rigorous radiative transfer (RT) modeling (Minu & Shetty, 2015). Spectral pretreatment and smoothing methods remove noise and unwanted spectral signals. Only good-quality data with a high signal-to-noise ratio can be used reliably for prediction.
Minu and Shetty (2015) review the hyperspectral atmospheric correction algorithms developed in past years. The internal average reflectance approach (Kruse, Raines, & Watson, 1985), the flat field approach (Roberts, Yamaguchi, & Lyon, 1986), the empirical line (EL) method (Roberts, Yamaguchi, & Lyon, 1985), and QUick Atmospheric Correction (QUAC) (Bernstein et al., 2005) are empirical or semi-empirical atmospheric correction methods. RT codes simulate the transfer of electromagnetic radiation through the atmosphere. The commonly used RT codes are LOWTRAN (Kneizys et al., 1988), MODTRAN (Berk, Bernstein, & Robertson, 1989), 5S (Tanré, Deroo, Duhaut, Herman, & Morcrette, 1990), and 6S. A range of software packages is available to model the atmosphere, including the ATmospheric REMoval algorithm (ATREM) (Gao, Heidebrecht, & Goetz, 1993), ATmospheric CORrection (ATCOR) (Richter, 1996), Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) (Adler-Golden et al., 1998), the Imaging Spectrometer Data Analysis System (ISDAS) (Staenz, Szeredi, & Schwarz, 1998), High-accuracy ATmosphere Correction for Hyperspectral data (HATCH) (Qu, Goetz, & Heidebrecht, 2001), and Atmospheric CORrection Now (ACORN) (ACORN 4.0, 2002). Hybrid methods combine empirical approaches with radiative transfer modeling to derive surface reflectance from hyperspectral imaging data. Each preprocessing technique rests on its own assumptions, so the limitations of the different techniques need to be analyzed with a view to arriving at a universal method.
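To make the simplest of the empirical corrections named above concrete, the following is a minimal sketch of the internal average reflectance and flat field normalizations. It assumes an image stored as a plain list of pixel spectra; the function names, the radiance values, and the choice of the flat field pixel are hypothetical and serve illustration only. Both functions return relative reflectance, not absolute surface reflectance.

```python
from statistics import fmean


def internal_average_reflectance(pixels):
    """Divide every pixel spectrum by the scene-average spectrum, band by band."""
    n_bands = len(pixels[0])
    scene_mean = [fmean(p[b] for p in pixels) for b in range(n_bands)]
    return [[p[b] / scene_mean[b] for b in range(n_bands)] for p in pixels]


def flat_field(pixels, flat_indices):
    """Divide every pixel spectrum by the mean spectrum of a bright,
    spectrally flat area, given here as a list of pixel indices."""
    n_bands = len(pixels[0])
    flat_mean = [fmean(pixels[i][b] for i in flat_indices) for b in range(n_bands)]
    return [[p[b] / flat_mean[b] for b in range(n_bands)] for p in pixels]


# Hypothetical at-sensor radiance: three pixels, three bands.
radiance = [
    [10.0, 20.0, 30.0],
    [20.0, 40.0, 60.0],
    [30.0, 60.0, 90.0],
]

iar = internal_average_reflectance(radiance)
# Assumption: pixel 2 lies in the flat field area.
ff = flat_field(radiance, flat_indices=[2])
```

Neither normalization needs atmospheric or ground measurements, but each depends on its own assumption: the scene average, or the chosen flat area, must be free of spectral features of its own.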

Prediction of soil properties from airborne/spaceborne hyperspectral data
Hyperspectral sensors operate with hundreds of bands at good spatial and spectral resolution, producing continuous spectra. With the progress and maturity of the technology, hyperspectral remote sensing has found a wide range of applications in mapping soil types and quantifying soil constituents, as the review papers by Ben-Dor et al. (2009), Ge, Thomasson, and Sui (2011), and Mulder, de Bruin, Schaepman, and Mayr (2011) show. Airborne sensors provide data with high spatial resolution (2-20 m), high spectral resolution (10-20 nm), and high SNR (>500:1). Although satellite hyperspectral imagery has been available since 2000, only a few attempts have been made to use it for mapping soil properties, possibly because of its low signal-to-noise ratio. Tables 1 and 2 summarize previous studies that used airborne and satellite hyperspectral imagery to predict soil properties. The preprocessing techniques used are also listed in the tables.
It is seen that RT models are mainly used in the preprocessing of airborne imagery. This may be because more information on atmospheric conditions is available for airborne sensors, so the atmosphere can be modeled precisely and its effect removed to obtain a pure signal. In contrast, correction models such as FLAASH are mainly used for satellite hyperspectral imagery. Comparisons of different models are still lacking in this field. The EL method, which requires ground information, also gives good results, but it is limited to areas where ground information is available. Prediction of SOC gives better results than prediction of other properties, possibly because the soil reflectance curve is more strongly affected by the presence of OM.
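The dependence of the EL method on ground information can be illustrated with a short sketch. It assumes the usual formulation in which, for each band, field-measured reflectance of calibration targets is regressed linearly on the image radiance of the same targets, and the fitted gain and offset are then applied to every pixel. The function names and all numeric values below are hypothetical.

```python
def fit_empirical_line(target_radiance, target_reflectance):
    """Least-squares fit of reflectance = gain * radiance + offset for one band."""
    n = len(target_radiance)
    if n < 2:
        raise ValueError("at least two calibration targets are required")
    mean_l = sum(target_radiance) / n
    mean_r = sum(target_reflectance) / n
    sxx = sum((l - mean_l) ** 2 for l in target_radiance)
    if sxx == 0:
        raise ValueError("calibration targets must differ in radiance")
    sxy = sum(
        (l - mean_l) * (r - mean_r)
        for l, r in zip(target_radiance, target_reflectance)
    )
    gain = sxy / sxx
    offset = mean_r - gain * mean_l
    return gain, offset


def apply_empirical_line(pixel_radiance, coefficients):
    """Convert one pixel spectrum to reflectance using per-band (gain, offset) pairs."""
    return [g * l + o for l, (g, o) in zip(pixel_radiance, coefficients)]


# Hypothetical calibration data: a dark and a bright target, two bands.
targets_radiance = [[12.0, 8.0], [92.0, 68.0]]      # image values at the targets
targets_reflectance = [[0.05, 0.04], [0.45, 0.40]]  # field spectrometer values

coefficients = [
    fit_empirical_line(
        [t[b] for t in targets_radiance],
        [t[b] for t in targets_reflectance],
    )
    for b in range(2)
]

# Apply the per-band lines to an arbitrary image pixel.
surface = apply_empirical_line([52.0, 38.0], coefficients)
```

The fit cannot be computed without field spectra of at least two targets of differing brightness, which is why the method is confined to areas where such measurements exist.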

Inference
Several surface soil properties have been modeled from remotely sensed hyperspectral imagery. Since soil is a highly heterogeneous material, careful spectral manipulation is needed when assessing its properties from spectral data. For any prediction system to perform at its best, the key influencing factors must be identified and optimized. Although many models exist for predicting soil properties, their prediction accuracy is still found to be very low.
Noise should be removed from hyperspectral imagery so that the data can be used to best effect, and the signal-to-noise ratio should be as high as possible. Several spectral preprocessing methods have been employed in various studies to improve the performance and robustness of the prediction models. Even though preprocessing techniques affect the prediction model considerably, they have not been given much importance. Developing a good model therefore requires better preprocessing. With this in view, the preprocessing techniques used in various studies are listed in this review. Hybrid methods, which combine a physical model with image statistics, need to be promoted. Guidelines are also needed on selecting a suitable preprocessing technique for the prediction of soil chemical properties.
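As a rough illustration of the two quantities discussed above, the sketch below smooths a spectrum and estimates a signal-to-noise ratio. Both choices are assumptions made for illustration: the review does not name a specific smoothing filter, so a centered moving average stands in for one, and the SNR is estimated as the mean divided by the sample standard deviation of one band over a homogeneous area, which is one common image-based estimate among several. All values are hypothetical.

```python
from statistics import fmean, stdev


def moving_average(spectrum, window=3):
    """Smooth a spectrum with a centered moving average.
    The window shrinks at both ends so the output keeps the input length."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(spectrum)):
        lo = max(0, i - half)
        hi = min(len(spectrum), i + half + 1)
        smoothed.append(fmean(spectrum[lo:hi]))
    return smoothed


def band_snr(values):
    """Estimate the SNR of one band as mean / sample standard deviation,
    assuming the values come from a spectrally homogeneous area."""
    return fmean(values) / stdev(values)


# Hypothetical noisy reflectance spectrum (five bands).
noisy = [0.20, 0.26, 0.20, 0.26, 0.20]
smooth = moving_average(noisy, window=3)

# Hypothetical reflectance of one band over four pixels of a uniform bare field.
snr = band_snr([0.30, 0.32, 0.28, 0.30])
```

A wider window suppresses more noise but also flattens narrow absorption features, so the window size is itself a preprocessing decision that can affect the prediction model.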