Estimation of Soil Moisture Percentage Using LANDSAT-based Moisture Stress Index



Introduction
Crop production systems are highly dependent on soil water availability. Soil moisture is a parameter in the water cycle that has been identified as the link between rainfall and crop growth [1,2]. Accurate ground-based measurement of soil moisture percentage over large regions is difficult, labor intensive, expensive, and time consuming. Selecting representative field sites is especially challenging, because their soil moisture measurements must accurately represent the region despite differences in soil properties, topography, and land cover [3,4]. This difficulty in obtaining large-scale soil moisture measurements through traditional ground-based sampling networks has led to several studies on the use of remote sensing techniques for large-scale soil moisture measurement [2,3,5,6].
Satellite imagery captures soil surface and vegetation characteristics, both of which are affected by soil moisture. This forms the basis for using remote sensing to estimate soil moisture in various studies. Some studies have used active or passive microwave data to directly estimate volumetric soil water content in the surface soil layer (0-10 cm) [7-9], while others use indices derived from optical and/or thermal data to indirectly infer soil moisture status from changes in biophysical factors (e.g., vegetation cover, surface energy balance) affected by soil water availability [10-12]. Results from these studies show that vegetation indices derived from thermal and/or optical data have great capacity for monitoring both surface and root zone soil moisture. Over the years, different vegetation indices have been applied in estimating soil moisture and the vegetation response to its spatial and temporal variations [2,13-15]. To date, limited work has been done to investigate the applicability of the Landsat-based Moisture Stress Index in estimating soil moisture at different soil depths.
The main objective of this study was to investigate the relationship between the Moisture Stress Index (MSI) and soil moisture percentages at selected Soil Climate Analysis Network (SCAN) sites in Alabama. Specifically, the study aimed to use in situ SCAN soil moisture measurements to evaluate the derived LANDSAT moisture stress index in the estimation of soil moisture at different depths during the growing season, and to test the models' predictive ability at distant or unmonitored sites. This approach could be used to quickly determine soil moisture for large and out-of-reach regions.
Findings from this study are likely to further enhance our understanding and the applicability of remote sensing techniques in the estimation of spatial and temporal variations of soil moisture content.

Abstract
The global agronomy community needs quick and frequent information on soil moisture variability and spatial trends in order to maximize crop production to meet growing food demands in a changing climate. However, in situ soil moisture measurement is expensive and labor intensive. Remote sensing-based biophysical and predictive regression modeling approaches have the potential to estimate soil moisture content efficiently over large areas. This study investigates the use of the Moisture Stress Index (MSI) to estimate soil moisture variability in Alabama. In situ data were obtained from Soil Climate Analysis Network (SCAN) sites in Alabama, and MSI was derived from LANDSAT 8 OLI and LANDSAT 5 TM data. Pearson product moment correlation analysis showed that MSI strongly correlates with 16-day average growing season soil moisture measurements, with negative correlations of -0.519, -0.482 and -0.895 at 5, 10, and 20 cm soil depths, respectively. The correlations of MSI and growing season moisture were low at sites where soil moisture was extremely low (<-0.3 at all depths). A simple linear regression model constructed for soil moisture at 20 cm depth (R²=0.79, p<0.05) correlated well with MSI values and was successfully used to estimate soil moisture percentage within a standard error of ± 3. The resulting MSI products were used to successfully produce the spatial distribution of soil moisture percentage at 20 cm depth. The study concludes that MSI is a good indicator of soil moisture conditions and could be efficiently utilized in areas where in situ soil moisture data are unavailable.

LANDSAT satellite data
Landsat 5 Thematic Mapper (TM) and Landsat 8 Operational Land Imager (OLI) satellite data were downloaded from the USGS Earth Explorer website (Table 1). Landsat 5 TM has seven spectral bands, with a spatial resolution of 30 meters for bands 1 to 5 and 7; band 6 (thermal infrared) has a 120 meter spatial resolution and is resampled to 30 meter pixels. Landsat 8 carries the OLI and the Thermal Infrared Sensor (TIRS): the OLI has nine spectral bands, with a spatial resolution of 30 meters for bands 1 to 7 and 9 and 15 meters for the panchromatic band 8, and the TIRS has two thermal bands (10 and 11) with a spatial resolution of 100 meters. Table 1 lists the satellite data and acquisition dates for the research counties. The images from 1985-2000 were obtained from Landsat 5 TM and those from 2013-2015 from Landsat 8 OLI.

Moisture stress index
The Moisture Stress Index is used for canopy stress analysis, productivity prediction, and biophysical modeling. It was proposed by Hunt et al. [17], who first used the index to detect changes in leaf water content using the ratio of near- and middle-infrared reflectance. As the leaf water content in vegetation canopies increases, absorption around the 1599 nm region of the electromagnetic spectrum increases, while absorption at 819 nm remains nearly unaffected by changing water content. Interpretation of the MSI is inverted relative to other water vegetation indices: higher values of the index indicate greater plant water stress and, by inference, lower soil moisture content. The values of this index range from 0 to more than 3, with the common range for green vegetation being 0.2 to 2 [17]. Because the index detects leaf water content, the satellite images used in the study were downloaded for the growing season of each year (April-September). MSI was calculated for each satellite image using the near-infrared band 4 and the mid-infrared band 5 of the Landsat images, as shown in equation 1.
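Equation 1 does not appear in this text. As a minimal sketch, assuming the standard form of the index (mid-infrared reflectance divided by near-infrared reflectance, i.e. band 5 over band 4 for Landsat 5 TM), the per-pixel calculation could look like the Python below. The function name and the sample reflectance values are hypothetical and are not taken from the study. Note that on Landsat 8 OLI the near-infrared and first shortwave-infrared bands are numbered 5 and 6, so the band inputs would need to be chosen per sensor.

```python
import numpy as np

def moisture_stress_index(mid_ir, nir):
    """Return MSI = mid-infrared / near-infrared reflectance for each pixel.

    Pixels with non-finite inputs or non-positive NIR reflectance become NaN.
    """
    mid_ir = np.asarray(mid_ir, dtype=float)
    nir = np.asarray(nir, dtype=float)
    msi = np.full(np.broadcast(mid_ir, nir).shape, np.nan)
    valid = np.isfinite(mid_ir) & np.isfinite(nir) & (nir > 0)
    np.divide(mid_ir, nir, out=msi, where=valid)
    return msi

# Hypothetical 2x2 reflectance tiles (illustrative values, not study data).
nir_band = np.array([[0.40, 0.35], [0.30, 0.00]])
mid_ir_band = np.array([[0.16, 0.21], [0.30, 0.10]])
msi = moisture_stress_index(mid_ir_band, nir_band)
```

Masking invalid pixels as NaN, rather than returning zero or infinity, keeps them from being mistaken for very wet or very dry vegetation later on.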
GIS analysis was used to extract the MSI values at the selected moisture probe points located within the different counties and to pair them with the corresponding soil moisture percentage (SMP) measured at the probe sites at depths of 5, 10, and 20 cm.

Statistical analysis and model development
Pearson product moment correlation: Statistical analysis was carried out using R language version 3. The data were analyzed for correlation using the Pearson Product Moment Correlation [18,19]. Correlation coefficients (r) were calculated between MSI and soil moisture at in-situ sites, for three depths (5 cm, 10 cm and 20 cm) during the growing season.
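The study computed these coefficients in R. As an illustration only, the Pearson product moment correlation between paired MSI and soil moisture values can be sketched in plain Python as the covariance of the two series divided by the product of their standard deviations. The function name and the sample values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product moment correlation coefficient of two paired sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations at one depth (not data from the study).
msi_values = [0.6, 0.8, 1.0, 1.2, 1.4]
smp_values = [31.0, 27.5, 25.0, 20.5, 18.0]
r = pearson_r(msi_values, smp_values)
```

A negative r here means that soil moisture falls as MSI rises, which is the direction the index is expected to take.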

Regression and validation:
Based on the correlation results, the soil moisture variables (moisture percent at different depths) with the highest correlation with MSI (|r| > 0.7) were selected to develop the regression models. A regression model was then developed for the 20 cm soil moisture depth, which had the strongest correlation with MSI values. The hypothesis was that an SMP~MSI regression model developed at SCAN sites could be used to estimate soil moisture for unmonitored areas using MSI. Simple linear regression was employed to test this hypothesis: a regression model was fitted using MSI values as the independent variable and 16-day average soil moisture (soil moisture values for the 15 days before plus the day the satellite image was taken) as the dependent variable at the selected SCAN sites. A 10-fold cross-validation technique was used for model validation to assess its predictive ability. Statistical findings are shown and discussed in the results section. The resultant maps of soil moisture variability for Limestone and Autauga Counties are also presented.

Soil moisture-MSI correlation
The growing season for the study area is April to September, and this is the period when the MSI has the highest correlation with soil moisture. The MSI values and corresponding SMP values are shown in Table 2.
Correlation coefficients between MSI values and in situ 16-day average soil moisture during the growing season are shown in Table 3. The data indicate that MSI is sensitive to soil moisture fluctuations, increasing in value as soil moisture decreases, as evidenced by the strong negative correlations (P<0.05) at the various depths. The strongest correlation between MSI and soil moisture (r=-0.895) occurred at the 20 cm soil depth. These strong correlations suggest that growing season MSI not only reflects the response of vegetation to soil moisture variation but can also be used to estimate that variation. Similarly strong correlations have been observed between other vegetation indices and soil moisture [5,15]. As a result, growing season MSI and soil moisture data at 20 cm were used for the regression model to estimate soil moisture.
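The 16-day average used in these correlations is described in the methods as the soil moisture on the image date plus the 15 preceding days. A minimal Python sketch of that window is given below. The function name, the dictionary layout, and the choice to skip days with no reading are assumptions, and the sample readings are hypothetical.

```python
from datetime import date, timedelta

def sixteen_day_mean(daily_smp, image_date):
    """Mean soil moisture over the image date and the 15 days before it.

    daily_smp maps datetime.date to soil moisture percent. Days with no
    reading are skipped (an assumption); returns None if the window is empty.
    """
    window = [image_date - timedelta(days=k) for k in range(16)]
    values = [daily_smp[d] for d in window if d in daily_smp]
    return sum(values) / len(values) if values else None

# Hypothetical daily readings for 1-20 June 2014 (not data from the study).
start = date(2014, 6, 1)
readings = {start + timedelta(days=i): 10.0 + i for i in range(20)}
mean_smp = sixteen_day_mean(readings, date(2014, 6, 20))
```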

Linear model and validation
After verifying the MSI response to soil moisture variability, a linear model of the form C = a·MSI + c was developed using the lm function in R (Table 4), where C is the estimated soil moisture percentage, a is the slope, and c is the intercept. MSI had the strongest correlation with SMP at 20 cm (-0.895), so these were the variables used in the linear model. Table 4 shows the components and statistics of the simple linear regression model. The R² indicates that approximately 80% of the variation in SMP can be explained by MSI, which for this type of data indicates that MSI has at least a modestly high ability to predict SMP at 20 cm. However, this ability diminishes with increasing MSI values (decreasing water content in leaves). This weakness could be overcome by using SMP values obtained at greater depths, which are less transient and may have a higher correlation with MSI values.
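The study fitted this model with R's lm function. The Python below is only an illustrative sketch of the same kind of least-squares fit for C = a·MSI + c, using the closed-form slope and intercept. The function name and the sample values are hypothetical, and the fitted coefficients from Table 4 are not reproduced here. For a simple linear regression, R² equals the square of the Pearson coefficient, so r = -0.895 corresponds to R² ≈ 0.80.

```python
def fit_simple_linear(x, y):
    """Least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1 - ss_res / syy

# Hypothetical MSI / SMP pairs at 20 cm (not data from the study).
msi_values = [0.6, 0.8, 1.0, 1.2, 1.4]
smp_values = [31.0, 27.5, 25.0, 20.5, 18.0]
a, c, r_squared = fit_simple_linear(msi_values, smp_values)

# Estimate SMP at an unmonitored pixel with a hypothetical MSI of 0.9.
estimated_smp = a * 0.9 + c
```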
The results of the 10-fold cross-validation using the cv.lm function (R library DAAG) are summarized in Table 5. The function divided the data into k subsets; each time, one of the k subsets was used as the test set and the remaining k-1 subsets were combined to form the training set. This was repeated 10 times, once for each of the 10 folds. To estimate the predictive error of the model, the mean square values from each of the folds were averaged to produce a single estimate. The model (based on in situ soil moisture) has a predictive error within ± 3.08 (in units of soil moisture percent) if used to predict the soil moisture percent for unmonitored sites or areas without soil measurement data. Figure 2 shows all 10 folds in the cross-validation analysis.
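The study ran this validation with cv.lm from the R package DAAG. The Python below is a simplified sketch of the same procedure, not a port of that function: the data are split into k folds, the line is refitted on the other k-1 folds each time, and the per-fold mean squared errors are averaged. The function names, the seeded random fold assignment, and the sample data are assumptions.

```python
import random

def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def k_fold_mse(x, y, k=10, seed=0):
    """Average of the per-fold mean squared prediction errors."""
    n = len(x)
    largest_fold = -(-n // k)  # ceiling of n / k
    if k < 2 or k > n or n - largest_fold < 2:
        raise ValueError("need 2 <= k <= n and at least 2 training points")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)          # assumed random assignment
    folds = [indices[i::k] for i in range(k)]     # k disjoint test sets
    fold_mse = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train_idx],
                                    [y[i] for i in train_idx])
        sq_err = [(y[i] - (slope * x[i] + intercept)) ** 2 for i in test_idx]
        fold_mse.append(sum(sq_err) / len(sq_err))
    return sum(fold_mse) / k

# Hypothetical MSI / SMP pairs with a small alternating disturbance.
msi_values = [0.5 + 0.05 * i for i in range(20)]
smp_values = [35.0 - 12.0 * m + (1.0 if i % 2 else -1.0)
              for i, m in enumerate(msi_values)]
cv_mse = k_fold_mse(msi_values, smp_values, k=10)
cv_rmse = cv_mse ** 0.5  # back in the units of SMP
```

Taking the square root of the averaged mean squared error gives a figure in the same units as the soil moisture percentage, which is easier to compare with a reported error such as ± 3.08.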
The results suggest that growing season MSI could serve as a reliable proxy for soil moisture estimation at 20 cm depth. However, increasing dryness seems to introduce small errors into the estimation, i.e., a standard error of the estimate of ± 3.08. Figures 3 and 4 show MSI variability maps derived from satellite images of Limestone and Autauga counties. Based on the analysis of MSI and SMP values for the SCAN probe sites, locations with an MSI less than 0.2 were classified as very wet and locations with an MSI greater than 2.2 were classified as dry, based on [17].

Conclusion
MSI, a remotely sensed vegetation index, was used in large-scale soil moisture estimation in selected counties in Alabama. The index captures changes in leaf water content using the ratio of near- and middle-infrared reflectance. In situ SCAN SMP measurements from different depths were taken and related to this derived moisture index. In this study, the dependency of the index on soil type or climate was not investigated. The index showed a strong correlation with in situ soil moisture percent measured at 20 cm depth. The linear regression model subsequently developed showed that MSI had at least a modestly high ability to predict SMP (R²=0.8). However, the model revealed that this ability would decline with increasing MSI values (indicative of increasing dryness). This weakness highlights the sensitivity of the model to transient soil moisture content at shallow depths.

Correlation coefficients between MSI and SMP at each depth:
        SMP at 5 cm    SMP at 10 cm    SMP at 20 cm
MSI     -0.519         -0.482          -0.895

The preliminary analysis of model performance at unmonitored sites suggests that the MSI methodology is robust for estimating soil moisture over larger areas and that it may be insensitive to soil type, at least in Alabama. Further research is needed to assess the strength of the MSI-soil moisture correlation at depths greater than 20 cm and the effectiveness of MSI for soil moisture estimation in arid or dry regions.