Retrieval of total suspended matter concentrations from high resolution WorldView-2 imagery: a case study of inland rivers

Satellite imagery plays an important role in monitoring the water quality of lakes and coastal waters, but has scarcely been applied to inland rivers. This paper assesses the feasibility of applying regression models to quantify and map the concentration of total suspended matter (C_TSM) in inland rivers that cover a large area and have a high dynamic range of C_TSM, using high resolution WorldView-2 satellite data. An empirical approach was developed to quantify C_TSM by the integrated use of WorldView-2 multispectral data and 21 in situ C_TSM measurements. Radiometric, geometric and atmospheric corrections were applied to derive surface reflectance, which was then related to C_TSM using single-variable and multivariable regression. The single near-infrared (NIR) band 8 of WorldView-2 showed a relatively strong relationship with C_TSM (Pearson R = 0.93). Prediction models were developed for various combinations of WorldView-2 bands, and the Akaike Information Criterion (AIC) was used to choose the best model. The model involving bands 1, 2, 3 and 8 of WorldView-2 performed best, with an R² of 0.92 and a standard error of estimate (SEE) of 53.30 g/m³. Spatial distribution maps were produced using this best multiple regression model. The results indicate that it is feasible to apply an empirical model based on high resolution satellite imagery to retrieve C_TSM in inland rivers for routine water quality monitoring.


Introduction
Total suspended matter (TSM), which includes organic and inorganic materials, plays an important role in the water quality of aquatic ecosystems. The concentration of TSM (C_TSM) directly influences the clarity, turbidity, color and other optical properties of the water column, especially in coastal and inland waters [1][2][3]. Suspended materials serve as a carrier and storage agent for pesticides, absorbed phosphorus, nitrogen and organic compounds, and can be an indicator of pollution [4]. Sediment transport also plays an important role in the global carbon cycle, since half of the terrestrial organic carbon exported by rivers is ultimately buried in marine sediment [5]. C_TSM has therefore become an increasingly important variable in water quality monitoring as industrialization and urban expansion proceed. Traditional methods of estimating C_TSM require considerable field and laboratory effort, including water sampling, filtering and dry weight measurement, and are time consuming. Because inland water variables are spatially heterogeneous, synoptic information cannot be obtained in this way. Remote sensing, which provides spatially distributed information, therefore offers a distinct and effective tool for water quality management [6][7][8][9][10]. Since the 1980s, several multispectral satellite sensors, including the Coastal Zone Color Scanner (CZCS), the Sea-Viewing Wide Field-of-View Sensor (SeaWiFS), the Moderate Resolution Imaging Spectroradiometer (MODIS), the Medium Resolution Imaging Spectrometer (MERIS) and the Landsat Thematic Mapper (TM), have been launched to obtain detailed information about the Earth's surface. These sensors were initially applied to the open ocean, known as Case-Ⅰ waters, where the optical properties are dominated by phytoplankton.
With the development of remote sensing techniques, however, satellite data have been widely used to monitor inland and coastal water quality parameters (e.g., chlorophyll-a, turbidity, total suspended matter, temperature and chromophoric dissolved organic matter) [11][12][13][14]. Many researchers have analyzed surface water quality using multispectral satellite data and demonstrated relationships between remote sensing reflectance and C_TSM [15][16][17][18][19][20][21]. Empirical and semi-analytical models are used most frequently in these studies. However, previous studies commonly used medium or low resolution satellite images as the data source for establishing relations with in situ measurements. Since the late 20th century, commercial satellite companies have provided high resolution imagery (e.g., IKONOS, the SPOT series, the WorldView series and GeoEye-1) that is better suited to monitoring inland waters, especially narrow rivers. For example, [22] used an approach based on an analytical optical model to estimate C_TSM from TM and SPOT satellite data, while [23] examined and validated retrievals of C_TSM and chlorophyll-a from high resolution IKONOS multispectral data. High resolution satellite data cannot resolve specific absorption features, because of their broad bandwidths and relatively low signal-to-noise ratio compared with medium and low spatial resolution data. Nevertheless, their spatial resolution makes them capable of monitoring inland rivers and reservoirs, which usually cover a relatively small area.
Whether high resolution WorldView-2 imagery, with its low radiometric resolution, can be applied to C_TSM retrieval over inland rivers remains to be evaluated. The objectives of this study were (1) to develop a regression model using high resolution WorldView-2 data to quantify C_TSM in the inland rivers of Deqing County, Zhejiang Province, China, and (2) to evaluate the capability of WorldView-2 to monitor total suspended matter over a large region with a high dynamic range. The study provides some evidence that WorldView-2 imagery is capable of detecting water quality, and offers an alternative approach for investigating C_TSM in inland rivers.

Study area
Deqing County is situated in the northwest part of Hangzhou, Zhejiang Province, China, and several rivers flow through it. The WorldView-2 image covers an area of approximately 214 km², from 30°28′27.83″N to 30°35′44.56″N in latitude and from 119°58′45.21″E to 120°10′0.98″E in longitude. Water covers about one-tenth of the total area of Deqing County. The main river channels and some aquaculture areas have high C_TSM because of the resuspension of sediments caused by shipping and aquaculture, respectively. Figure 1 shows the geographical location of the study area.

Field campaigns
A series of surface water samples was collected from the designated sampling areas in different river branches on January 9 and January 10, 2014. Total suspended matter was measured at 21 locations in the study area; the sites are shown in figure 1. C_TSM analysis was carried out using the weighing method, following the Chinese National Standard protocols [24]. Briefly, the procedure was as follows:
1) Water samples were collected at a depth of 0.3 m and filtered immediately on site onto dried, pre-weighed 47 mm diameter Sartorius™ acetate fiber filters, and the filtering records were written down.
2) The filters were stored in a refrigerator at −20 °C until laboratory measurement. Each filter was repeatedly dried in a thermal infrared dryer at 40 °C for more than 4 hours and weighed until the difference between the last two measurements was less than 0.01 mg.
As mentioned above, samples were collected on January 9 and 10, 2014, whereas the WorldView-2 overpass was on January 10, 2014. To confirm that the field data can be regarded as nearly contemporaneous with the WorldView-2 image acquisition, meteorological data were analyzed; they show no rainfall during the sampling period.
The satellite data used in this study were a high resolution WorldView-2 multispectral image acquired on January 10, 2014. Another WorldView-2 image with accurate geometric coordinates was selected as the reference image for geometric correction. A WorldView-2 image has eight multispectral bands with a spatial resolution of 1.8 m and a swath width of about 16 km; the characteristics of the WorldView-2 bands are given in table 1. First, after the multispectral bands were composited, geometric correction was carried out against the reference image, which had been precisely geo-registered using ground control points (GCPs); the geometric error of the corrected image was kept below one pixel. The next step was radiometric correction, which converts digital numbers (DNs) to spectral radiance using the following equation [25]:

$$ L_\lambda = \frac{DN_\lambda \cdot \mathrm{Calcoef}_\lambda}{\mathrm{Bandwidth}_\lambda} \qquad (1) $$
where L_λ (μW/m²/nm/sr) is the radiance for spectral band λ (nm), DN_λ is the digital number for band λ, Calcoef_λ is the radiometric calibration coefficient, and Bandwidth_λ is the bandwidth of the spectral band at λ. Calcoef_λ and Bandwidth_λ for the WorldView-2 bands are reported by DigitalGlobe and recorded in the metadata file (*.IMD); the calibration parameters of the WorldView-2 image acquired on January 10, 2014 are shown in table 2. In this study, geometric correction and radiometric calibration were performed using the corresponding modules of ENVI™ 5.1.
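The conversion in equation (1) can be sketched in a few lines of Python. This is only an illustration and not the ENVI™ implementation used in the study: the function name `dn_to_radiance` and all numeric values are hypothetical placeholders, and the real Calcoef_λ and Bandwidth_λ values must be read from the *.IMD metadata file.

```python
import numpy as np

def dn_to_radiance(dn, calcoef, bandwidth):
    """Apply equation (1) to one band: L = DN * Calcoef / Bandwidth."""
    dn = np.asarray(dn, dtype=np.float64)
    return dn * calcoef / bandwidth

# Placeholder inputs (NOT the values of the January 10, 2014 scene).
dn_band = np.array([[200, 350],
                    [410, 0]])
calcoef = 0.01     # hypothetical calibration coefficient for this band
bandwidth = 0.05   # hypothetical bandwidth for this band

radiance = dn_to_radiance(dn_band, calcoef, bandwidth)
```

Each of the eight bands would be converted with its own pair of coefficients.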

Atmospheric correction
The radiance received by the satellite, also known as top-of-atmosphere radiance, was obtained from radiometric calibration. To derive water quality parameters from satellite imagery, reliable spectra must be retrieved from the satellite measurements through appropriate atmospheric correction [26], and several atmospheric correction methods for water bodies have been developed to minimize atmospheric effects [27][28][29][30][31][32]. In this study, the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) module was used, with the input parameters given in table 3. The mean elevation of the study region was obtained from a digital elevation model (DEM); the remaining parameters were taken from the metadata file or technical documents. After atmospheric correction, water surface reflectance was obtained by removing the atmospheric contribution from the calculated radiance image.

Extracting WorldView-2 spectra
The water sampling sites shown in figure 1 were geo-located with global positioning system (GPS) receivers with a positioning accuracy of 3 meters. Each sampling site was then related to the satellite image by recording the surface reflectance and the C_TSM value at the corresponding pixel. Previous research suggests that, for high resolution data, which are susceptible to random noise, it is better to use the average pixel value within a moving window to minimize the influence of the surroundings of the water [23][37]. Different window sizes were tested for correlating WorldView-2 data with in situ measurements, and a 5 × 5 pixel window was finally selected in this study.
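The window averaging described above can be sketched as follows. This is a minimal illustration and not the procedure actually run in the study: the function name `window_mean` is hypothetical, the sampling site is assumed to have already been converted from GPS coordinates to a pixel row and column, and non-water pixels are assumed to be stored as NaN so that they are skipped.

```python
import numpy as np

def window_mean(band, row, col, size=5):
    """Mean reflectance in a size x size window centred on (row, col).

    The window is clipped at the image edges and NaN pixels are ignored.
    """
    half = size // 2
    r0, r1 = max(row - half, 0), min(row + half + 1, band.shape[0])
    c0, c1 = max(col - half, 0), min(col + half + 1, band.shape[1])
    return float(np.nanmean(band[r0:r1, c0:c1]))

# Synthetic 7 x 7 band used only to exercise the function.
demo_band = np.arange(49, dtype=np.float64).reshape(7, 7)
centre_value = window_mean(demo_band, 3, 3)   # full 5 x 5 window
corner_value = window_mean(demo_band, 0, 0)   # clipped to 3 x 3 at the corner
```

Calling the function with `size=3` or `size=7` gives the other window sizes that could be compared when choosing the best one.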

Water body extraction
To separate water from land, a mask must be created that confines the satellite imagery to the area covered by water. A decision tree classification (DTC) method, a type of multistage classifier that can be applied to a single image or a stack of images [38], was used to distinguish water from other surface targets. It consists of a series of binary decisions that determine the category of every pixel, and the decisions can be based on any available characteristic of the dataset. In this study, the Normalized Difference Water Index (NDWI) was selected in the DTC module of ENVI™ 5.1 as the characteristic for extracting water bodies. Overall, water body extraction performed well, apart from slight errors in mining regions and shallow paddy fields.
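A single NDWI decision of this kind can be illustrated with a short Python sketch. The text does not state which bands or which threshold were used, so everything here is an assumption: the index uses the common green/NIR form, NDWI = (Green − NIR) / (Green + NIR), the threshold of 0 is a placeholder, and the function names are hypothetical. It does not reproduce the ENVI™ decision tree.

```python
import numpy as np

def ndwi(green, nir):
    """NDWI = (green - nir) / (green + nir); NaN where the sum is zero."""
    green = np.asarray(green, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    total = green + nir
    out = np.full(total.shape, np.nan)
    np.divide(green - nir, total, out=out, where=total != 0)
    return out

def water_mask(green, nir, threshold=0.0):
    """Binary decision: True where NDWI exceeds the (assumed) threshold."""
    return ndwi(green, nir) > threshold

# Synthetic reflectances: a water-like, a land-like and an empty pixel.
green_demo = np.array([0.10, 0.05, 0.0])
nir_demo = np.array([0.02, 0.20, 0.0])
mask_demo = water_mask(green_demo, nir_demo)
```

In very turbid water the NIR reflectance is elevated, so the threshold would probably need tuning against the image.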

Regression model development
Many studies have addressed C_TSM retrieval from remotely sensed data, and the algorithms fall into two categories: empirical models and semi-analytical models. Empirical models are built on statistical relationships between C_TSM and single-channel or multi-channel reflectance [19][39]; selecting sensitive bands is crucial for developing a robust model. Semi-analytical models are based on the relationship between inherent optical properties (IOPs) and water constituents, and may incorporate some empirical relationships [40][41]. Because of its restricted spectral range, WorldView-2 imagery cannot resolve specific absorption features or other characteristics of IOPs. In this study, therefore, statistical techniques were used to derive a correlation between satellite spectral data and in situ TSM concentrations. Existing C_TSM retrieval algorithms have frequently been applied to open ocean or coastal waters but rarely to inland rivers, for two possible reasons: 1) inland productive waters are optically more heterogeneous and complex, so suspended matter may not dominate the signal from the water surface; and 2) inland rivers are relatively small and narrow, so most satellite data lack the spatial resolution to identify the rivers, let alone quantify their constituents. Before building the regression model, Pearson correlation coefficients were calculated to assess the strength of the correlation between C_TSM and WorldView-2 surface reflectance. The Pearson correlation coefficients, which indicate the sensitivity of each band to C_TSM, are listed in table 4.
The results indicate that the correlation coefficient increases with wavelength: the Pearson correlation coefficients of all bands exceed 0.75, and the maximum value of 0.93 occurs at a wavelength of 908 nm. This is consistent with findings that the band most sensitive to total suspended matter shifts toward the red or NIR bands in highly turbid waters [42][43], because the relatively strong scattering by suspended matter dominates the reflectance signal, in contrast to clear waters and chlorophyll-a dominated waters.
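The band-by-band sensitivity analysis amounts to computing one Pearson coefficient per band. The sketch below illustrates this on synthetic numbers; the function name `band_correlations` and the input layout (one row per sampling site, one column per band) are assumptions made for illustration, and the data are not the measurements of this study.

```python
import numpy as np

def band_correlations(reflectance, ctsm):
    """Pearson r between C_TSM and each band.

    reflectance: array of shape (n_samples, n_bands)
    ctsm:        array of shape (n_samples,)
    """
    reflectance = np.asarray(reflectance, dtype=np.float64)
    ctsm = np.asarray(ctsm, dtype=np.float64)
    x = reflectance - reflectance.mean(axis=0)   # centre each band
    y = ctsm - ctsm.mean()                       # centre the concentrations
    covariance = (x * y[:, None]).sum(axis=0)
    return covariance / np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum())

# Synthetic example with three "bands" of known behaviour.
ctsm_demo = np.array([1.0, 2.0, 3.0, 4.0])
refl_demo = np.column_stack([
    2.0 * ctsm_demo + 1.0,               # perfectly correlated
    -ctsm_demo,                          # perfectly anti-correlated
    np.array([1.0, -1.0, -1.0, 1.0]),    # uncorrelated
])
r_demo = band_correlations(refl_demo, ctsm_demo)
```

With the real data, the input would be the 5 × 5 window means of the eight bands at the 21 sampling sites.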
To improve the accuracy of C_TSM retrieval, various linear regression models were used to explore the best relationship between C_TSM and WorldView-2 satellite data. The equation can be expressed as:

$$ C_{TSM} = \sum_{i=1}^{n} a_i x_i + b + \varepsilon \qquad (2) $$

where x_i are the independent variables (WorldView-2 band reflectances or band ratios), a_i and b are the regression coefficients, and ε is the residual of the model.
A first single-channel model was built using the single variable WV8, which has the highest correlation coefficient with total suspended matter. A significant relationship (R² = 0.867, p < 0.001) was observed, and the standard error of estimate (SEE) was 57.89 g/m³. The fitted regression line and its confidence and prediction intervals at the 0.95 confidence level are shown in figure 2.
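A single-variable fit of this kind, together with its R² and SEE, can be sketched as follows. The text does not give the formula used for SEE, so the sketch assumes the usual definition for simple linear regression, SEE = sqrt(RSS / (n − 2)). The function name `fit_single_band` and the data are hypothetical.

```python
import numpy as np

def fit_single_band(x, y):
    """Least-squares line y = slope * x + intercept, with R^2 and SEE."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    rss = np.sum(residuals ** 2)              # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)         # total sum of squares
    r2 = 1.0 - rss / tss
    see = np.sqrt(rss / (len(y) - 2))         # assumed SEE definition
    return slope, intercept, r2, see

# Synthetic reflectance/concentration pairs, for illustration only.
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = np.array([0.0, 1.0, 1.0, 2.0])
slope_demo, intercept_demo, r2_demo, see_demo = fit_single_band(x_demo, y_demo)
```

Here `x` would hold the WV8 reflectance at each sampling site and `y` the measured C_TSM.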
Different combinations of WorldView-2 bands and band ratios were used as independent variables to determine which performs best for C_TSM retrieval. The results of the various regression models are shown in table 5. Among the single-variable models shown in figure 3, WV5, which lies in the red band, has a weaker relationship with C_TSM (R² = 0.735, SEE = 81.85 g/m³) than WV8 (R² = 0.867, SEE = 57.89 g/m³); model 3, with WV1, has the weakest relationship (R² = 0.589, SEE = 131.31 g/m³), while the ratio WV8/WV1 has a higher correlation with C_TSM. The multi-variable models performed better than the single-variable models, and the strongest relationship was observed for model 7, with variables WV1, WV2, WV3 and WV8. The regression models with various variables in table 5 (models 2, 5, 6 and 7) show that adding variables to WV8 generally increased R² slightly. To find the best model for C_TSM retrieval, the Akaike Information Criterion (AIC), an estimator of the relative quality of statistical models for a given data set, was calculated for each model.
Considering both the characteristics of the regression models and the AIC values, model 7 (R² = 0.916, AIC = 115.14), which has the strongest relationship and the smallest AIC value, was chosen as the best model for retrieving C_TSM. The model can be expressed as:

$$ C_{TSM} = a_1\,WV1 + a_2\,WV2 + a_3\,WV3 + a_4\,WV8 + b + \varepsilon \qquad (3) $$

where WV1, WV2, WV3 and WV8 are the atmospherically corrected reflectances of the corresponding WorldView-2 bands, a_1, a_2, a_3, a_4 and b are the regression coefficients of equation (3), and ε is the residual of the model.
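Model selection by AIC can be sketched as below. The text does not state which AIC formula was used, so the sketch assumes the common least-squares form AIC = n ln(RSS/n) + 2k, where k counts the fitted coefficients including the intercept. Adding one for the error variance would shift every model by the same constant and leave the ranking unchanged. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with intercept; returns (coefficients, RSS)."""
    X = np.asarray(X, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if X.ndim == 1:
        X = X[:, None]
    design = np.column_stack([X, np.ones(len(y))])   # last column = intercept
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    return coef, rss

def aic_least_squares(rss, n, k):
    """Assumed AIC form for least-squares fits: n * ln(RSS / n) + 2k."""
    return n * np.log(rss / n) + 2 * k

def rank_models(candidates, y):
    """Fit every candidate predictor set and sort by ascending AIC."""
    scores = {}
    for name, X in candidates.items():
        coef, rss = fit_ols(X, y)
        scores[name] = aic_least_squares(rss, len(y), len(coef))
    return sorted(scores.items(), key=lambda item: item[1])

# Synthetic data in which the response depends on both predictors.
x1 = np.arange(8.0)
x2 = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0])
noise = np.array([0.1, -0.2, 0.05, 0.1, -0.1, 0.0, 0.15, -0.1])
y_synth = 2.0 * x1 + 5.0 * x2 + 1.0 + noise

ranking = rank_models({"x1": x1, "x1+x2": np.column_stack([x1, x2])}, y_synth)
```

The first entry of `ranking` is the preferred model. With the real data, each candidate would be one of the band combinations in table 5.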
Since El Saadi [20] and Dorji [21] established their own regression models for C_TSM retrieval based on WorldView-2, we compared their models with ours. El Saadi developed an algorithm for estimating total suspended matter in a single branch of inland rivers using WorldView-2 data and stepwise regression without atmospheric correction; the model contains bands 1, 3, 5, 6 and 7 and has an R² of 0.39. In this study, by contrast, bands 1, 2, 3 and 8 were used to build the retrieval model after atmospheric correction, and R² reaches 0.916, much higher than that of El Saadi's model. Dorji developed a model based on the single band 5 of WorldView-2 over coastal waters, using the 6S (Second Simulation of a Satellite Signal in the Solar Spectrum) radiative transfer code for atmospheric correction. We found, however, that the single band 8 of WorldView-2 has the best relationship with C_TSM, and that the R² of the single-band models is lower than that of the models using multiple bands. It is worth noting that the dynamic [...]
The model for C_TSM retrieval in this study was developed by regression analysis. Table 5 presents the C_TSM retrieval results of the multiple regression models. Among the different regression models, the one using WV1, WV2, WV3 and WV8 was finally chosen to predict C_TSM. To validate the model, we compared the C_TSM estimated from equation (3) with in situ data that had not been used for modelling.
The results are shown in figure 4, with an SEE of 62.95 g/m³ and a mean error of 63.29 g/m³.

Mapping the TSM concentrations
The regression model of equation (3) was used to map C_TSM, taking the WorldView-2 remote sensing data as input. Figure 5 presents the spatial distribution of total suspended matter concentration over the study region. The results indicate that C_TSM in fish breeding areas and along shipping routes is higher than in other regions. The likely cause is that fine particles are resuspended from the bottom to the surface by the roiling of fish and the disturbance of boats. Figure 6 shows two representative areas. Statistically, C_TSM in the northern zone is on the whole higher than in the southern part; the maximum TSM concentration is 820 g/m³, and the standard deviation is 36.61 g/m³. It should be noted that the C_TSM of ponds and paddy fields, colored blue in figure 5, may in part be retrieved incorrectly, because bottom reflectance may dominate the signal received by WorldView-2.
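Applying a fitted linear model pixel by pixel can be sketched as follows. The coefficient values are placeholders, since the fitted coefficients of equation (3) are not reproduced in the text, and the function name `map_ctsm` and the array layout are assumptions. Pixels outside the water mask are set to NaN.

```python
import numpy as np

def map_ctsm(bands, coefs, intercept, water):
    """Evaluate a linear model on every pixel of a reflectance stack.

    bands: array (n_predictors, rows, cols), e.g. WV1, WV2, WV3, WV8
    coefs: array (n_predictors,) of regression coefficients
    water: boolean array (rows, cols); non-water pixels become NaN
    """
    bands = np.asarray(bands, dtype=np.float64)
    coefs = np.asarray(coefs, dtype=np.float64)
    ctsm = np.tensordot(coefs, bands, axes=1) + intercept
    return np.where(water, ctsm, np.nan)

# Tiny synthetic stack with two predictors and placeholder coefficients.
bands_demo = np.array([[[1.0, 2.0]],
                       [[3.0, 4.0]]])
water_demo = np.array([[True, False]])
ctsm_map = map_ctsm(bands_demo, coefs=[10.0, 1.0], intercept=5.0, water=water_demo)
```

Masking before mapping keeps land pixels from receiving a meaningless concentration. It does not remove the bottom-reflectance effect in shallow ponds and paddy fields noted above.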

Conclusions
In this study, C_TSM retrieval in inland rivers from high resolution WorldView-2 multispectral imagery by multiple regression was demonstrated as a case study. The results give some evidence that high resolution satellite data can be applied to monitoring water quality parameters and can provide valuable information for analyzing water quality variations over inland rivers. The technique used in this paper offers an alternative approach for investigating C_TSM in inland waters.
Despite the restricted spectral range of multispectral WorldView-2 imagery, a sensitivity analysis of each WorldView-2 band was carried out. The results show that the bands most sensitive to TSM concentration lie in the red and near-infrared range, which is consistent with previous research, and that the highest Pearson coefficient occurs for WV8 (R = 0.93). This supports the feasibility of establishing a regression model between satellite remotely sensed data and in situ measurements.
Since regression models have been the most commonly used approach for deriving TSM concentrations, the regression model for C_TSM retrieval was built from WorldView-2 reflectance and in situ C_TSM data. Eight regression models with various variables were constructed, and the results show that the multiple regression model with four independent variables (WV1, WV2, WV3 and WV8) performs best (R² = 0.916, SEE = 53.30 g/m³). The linearity observed between reflectance and TSM concentration, and the precision of all tested methods, may be related to the atmospheric correction and calibration procedure.
Although the regression models established in this paper are regional algorithms, the overall procedure and methods for C_TSM retrieval might be extended to other regions to analyze the spatial pattern of TSM in other turbid waters. Future work should collect more in situ data (including C_TSM and bio-optical data) and map the C_TSM distribution with other retrieval algorithms, such as neural network or semi-analytical approaches, for comparison.