Study on Spatiotemporal Variability of Water Quality Parameters in Florida Bay Using Remote Sensing

Florida Bay is the other important waterbody of South Florida that is a dynamic and biologically productive system, which provides unique habitats. Based on the volume of fresh water flow coming from the Everglades, the salinity of the Florida Bay is varies in different wet and dry seasons. Therefore, in this area, the flow of freshwater from the Everglades determines the conditions in Florida Bay. The regular inflow from the everglades also contain significant amounts of nutrients, which provide the required energy for aquatic organisms, and the constant variation of inflow and sediments discharged into the region make the Florida Bay and its surrounded estuaries one of the most biologically productive systems on earth.


Introduction
Florida Bay is the other important waterbody of South Florida that is a dynamic and biologically productive system, which provides unique habitats. Based on the volume of fresh water flow coming from the Everglades, the salinity of the Florida Bay is varies in different wet and dry seasons. Therefore, in this area, the flow of freshwater from the Everglades determines the conditions in Florida Bay. The regular inflow from the everglades also contain significant amounts of nutrients, which provide the required energy for aquatic organisms, and the constant variation of inflow and sediments discharged into the region make the Florida Bay and its surrounded estuaries one of the most biologically productive systems on earth.
Optically active variables, such as chlorophyll-a, total suspended solids, and turbidity consist of the majority of the water quality studies using the application of remote sensing, while other important water quality variables such as total phosphorous (TP) and total nitrogen (TN) are rarely considered due to their weak optical characteristics and low signal noise ratio. Turbidity and total suspended matters are considered as important variables in many studies due to their linkage with incoming sunlight that in turn affects photosynthesis for growth of algae and plankton. Water turbidity is an optical property of water, which causes the scattering and absorption the light more than its transmitting. As water turbidity is mainly a result of the presence of suspended matter, it is used to measure the concentration of fluvial suspended sediment and is commonly regarded as the opposite of clarity. The complex nature of suspended substances in water changes the reflectance of the waterbody and therefore causes variation in color. To this end, interpretation of remotely sensed data just based on the color of water is not adequate and accurate. In this study, the spatial and temporal changes of four water quality parameters including turbidity, chlorophyll-a (chl-a), total phosphate, and total nitrogen (TN), were investigated by using the application of integrated remote sensing, GIS data, and statistical techniques. The simultaneous observed data of four studied parameters were obtained from 20 monitoring stations and were used for the development and validation of the models. The optical bands in the region from blue to near infrared and all the possible band ratios were used to explore the relation between the reflectance of water body and observed data.

Study area
South Florida is one of the unique parts of the United States with a subtropical climate. It contains two important vast waterbodies of Lake Okeechobee and the Everglades, which the first one is the Florida's largest fresh water lake and the second one is the largest subtropical wilderness in the United States. In this study, Lake Okeechobee, which is the largest and the most important lake in South Florida, is selected to be investigated for its bio-physical parameters associated with water quality using remote sensing. Florida Bay is located at the southern part of Florida and on the south-east, is bordered by the Florida Keys and is open to the Gulf of Mexico along its western margin. Runoff from the Everglades enters this area and a number of its northern creeks, which in its way flows through different saw-grass areas, sloughs and wetlands. Figure 1 shows the location of the study area and the selected water quality monitoring sites. The average annual temperature ranges from 19.2°C to 28.7°C and the annual rainfall in the entire area of South Florida is generally about 55 inches (1,400 mm). Considering the subtropical climate of South Florida, the average rainfall is still considerable in the dry season. In addition, during El Niño phenomenon, greater amounts of rainfall in dry season are observed in South Florida.

Limnological data
The monitoring stations, downloaded from the South Florida Water Management District's (SFWMD) geographic information systems data catalog, were overlaid with the Florida bay map in ArcGIS to design a network of sampling stations that include sufficient historical data to construct a robust statistical database of studied parameters, considering a suitable spatial distribution on the Lake. The limnological data on chl-a, turbidity, total phosphate, and TN used in this study were obtained from the DBHYDRO (environmental database of SFWMD), United States Geological Survey (USGS), the Environmental Protection Agency (EPA), and the National Water Quality Monitoring Council (NWQMC) from 20 selected monitoring stations. Then, a database was developed for the observed data of four studied parameters simultaneous or in the closest day of the month as the satellite image are obtained in wet (May 15 th through October 15 th ) and dry seasons (October 16 th through May 14 th ), and was used for the development and validation of the models. The SPSS 16.0 software package was employed for data treatment. The descriptive statistics of the four studied parameters are shown in Table 1.
Due to differences in units of the studied water quality parameters in the dataset, pre-treatment of data is required in order to the homogenization [1]. In this study, data pre-treatment methods, such as the elimination of non-informative variables, the treatment of missing data values, and the detection and treatment of outliers were performed before the statistical analyses.

Satellite data
The remotely sensed data were acquired from the Landsat Thematic Mapper (TM) and Landsat OLI sensors onboard Landsat 5 and 8, respectively, in 2000Landsat 5 and 8, respectively, in , 2007Landsat 5 and 8, respectively, in , and 2015. Pre-processing of the Landsat data including the radiometric calibration and atmospheric correction is essential for quantitative studies [2,3], especially for lake waters that the reflected light is small [4]. The radiance of small lakes in average is usually less than 10%, and even in many cases values less than 1% of total radiance are also reported [5]. For this purpose, in order to remove the effects of high local variability, the digital number values should be first converted to unitless planetary reflectance before the process of atmospheric correction. As two types of imagery were used in this study, two different methods were used for the radiometric calibration and Atmospheric correction of TM and OLI data. ERDAS IMAGINE 2014 and ESRI ArcGIS 10.0 platforms were used for image processing.
Preprocessing of landsat-5/TM data: The digital number (DN) values of each band were converted to radiance values to remove the voltage bias and gains from the satellite sensor as follows: where, L λ is the spectral radiance at the sensor's aperture in watts/ (meter squared × ster × µm), Lmin λ is the spectral radiance that is scaled to QCALmin in watts/(meter squared × ster × µm), Lmax λ is the spectral radiance that is scaled to QCALmax in watts/(meter squared × ster × µm), QCALmin is the minimum quantized calibrated pixel value (corresponding to Lmin λ ) in DN, QCALmax is and the maximum quantized calibrated pixel value (corresponding to Lmax λ ) in DN.
As the process of atmospheric correction, the radiance values are converted to at-satellite reflectance values that consider the variation of sun angle in different latitude, time of day, season, and the distance between the earth and sun. The simplified model of the effects of the atmosphere is [6]: where, ρ p is the unitless planetary reflectance, L λ is the spectral radiance at the sensor's aperture, d is the Earth-Sun distance in astronomical units, ESUN λ is the mean solar exo-atmospheric irradiance, and θ s is the solar zenith angle in degrees.
Preprocessing of landsat-8/OLI data: COST-DOS was used as the radiometric correction method for mitigating atmospheric effects recorded by Landsat-8/OLI data. It accounts for absorption, scattering, and refraction of atmospheric particles such as particulate matter and water vapor [7]. The improved cosine of the solar zenith angle (COST) method presented by Chavez [8] where, P λ is the dimensionless spectral reflectance value of surface water, π is a constant (3.14159265), L λsensor is the spectral radiance value, and L λhaze is the path radiance or upwelling atmospheric spectral radiance. d is the distance between the earth and the sun in astronomical units, and θ s is the solar zenith angle (°). ESUN λ is the solar spectral irradiance at to the top of atmosphere (TOA). The spectral radiance value (L λsensor ) at the satellite sensor's aperture (Wm -2 sr -1 μm -1 ) is calculated as follows: where, M λ is the band-specific multiplicative rescaling factor, A λ is the band-specific additive rescaling factor, and Q cal is the minimum quantized and calibrated standard product pixel value. Both M λ and A λ are provided in the Landsat 8 metadata file (MTL file).
where, L λhaze is the path radiance or upwelling atmospheric spectral radiance scattered in the direction of the sensor entrance pupil and within the sensor's field of view, L λmin is the minimum spectral radiance, and L λ,1% is the spectral radiance value of the darkest object on each band of the Landsat 8 and can be calculated as follows: The theoretical radiance of a dark object is then computed, under the assumption that dark objects have 1% or smaller reflectance [8,9].

Statistical methods
In this study, Pearson's correlation analysis was utilized to determine the linear relationship and calculate the correlation between two variables in order to characterize the relationship between various TM where, X band is the corrected reflectance value, X band is the mean of the corrected reflectance value, Y WQP is the in situ WQPs data, and Y WQP is the mean of the in situ WQP data. Then, a linear multiple regression analysis was conducted for all WQPs. The general formula for multiple regressions is as follows: where, WQP is the dependent variable and represents measured (or known) water quality parameters (chl-a, TSS, total phosphate and TKN) at study site k and X is the independent reflectance variable acquired from the Landsat-5/TM or Landsat-8/OLI images at study site k. The numbers represent the band number and a, b, c, d, and e are the model coefficients using both the measured water quality parameter value at a particular station and the known pixel reflectance values there, according to the least squares algorithm.

Results and Discussion
In this study, statistical techniques were also applied to find the most significant relationships between water quality parameters and reflectance values of visible band of TM and OLI data and their combinations [10][11][12][13][14][15]. Pearson's correlation was carried out between Landsat bands, and chlorophyll and turbidity concentrations to find the most significant relationships. Previous studies [12,14,[16][17][18][19] found a very significant correlation between chlorophyll concentration and water transparency and the relationships among the visible bands. As regards to chlorophyll, the prominent scattering-absorption features of chl-a include strong absorption between 450-475 nm (blue) and at 670 nm (red), and reflectance reaches to peak at 550 nm (green) and near 700 nm (NIR). The applicability of reflectance peak near 700 nm and its ratio to the reflectance at 670 nm to retrieve chl-a in turbid waters was tested by Gitelson [20]. By increasing the amounts of dissolved inorganic materials, the peak of visible reflectance relocates from green band to red band [21]. Alparslan et al. [22] obtained the amount of turbidity from Band1, Band2, Band3, Band4, Band5 and Band7 of Landsat-5 TM Satellite Image. He et al. [23] used a combination of Landsat TM bands 2, 3, 6 and 7 to correlate with the in situ turbidity measurements. In this study, the stepwise multiple linear regression analysis was applied on all the visible bands and their combinations to select significant variables. Visible band and their ratios constructed the independent variables and p values greater than 0.1 were considered as a limit for factor removal ( Table 2).
As for chl-a, in dry season the selected band by the statistical analysis were Blue/Red, Blue/NIR, Green/Red, Red/Blue, and NIR/Blue ratios, and in wet season were Red, Green/Red, Red/Green, and Red/ NIR ratios. Therefore, the functional model is: Wet season: Blue Green The variability of TSS concentration was also investigated using the same procedure, incorporating visible bands and their ratios as independent variables in the stepwise multiple linear regression analysis. The following functional model was selected from the stepwise variable selection procedure for dry and wet season: Wet season: The statistical values given in Table 3 were obtained, as a result of the regression analysis computation. The R 2 values, which show the correlation between the measured values and the estimated values from the satellite image, prove that the first four bands and their ratios of Landsat satellite image are well capable of being used in the measurement of chl-a and turbidity concentrations. The model coefficients are given in Table 4 to compute chl-a and turbidity at anywhere on the lake surface, based on the pixel reflectance values. Maps given in Figure 2 were made using directly the satellite data of the Florida Bay's entire surface, based on the equalities in Table 4.

Chlorophyll-a
The results of correlation analysis between landsat bands and chl-a concentration varied from -0.52 (Blue/NIR band ratio) to 0.60 (NIR/Blue band ratio) in dry season, and from −0.64 (Red/NIR band ratio) to 0.65 (Blue/Green band ratio) in wet season (Table 2). Extracted equations based on regression analysis between chl-a and Landsat bands showed correlation coefficients of 0.86 and 0.66 in dry and wet seasons, respectively. Due to the existance of missing values in the monitoring sites dataset, 40 data points in dry season and 58 data points in wet season were used in the statistical analysis. Table 4 presents the multiple regression models constructed through equations 9 and 10. Among the different combinations of bands and band ratios, this research selected a multiple regression model and revealed the best significant relationships in order to compare the estimated chl-a through the Landsat TM and OLI data with in situ measurement data. The Durbin-Watson values should be greater than 1.5 and less than 2.5 to indicate that multiple linear regression data is free of first order linear auto-correlation. For chl-a, the Durbin-Watson values were 1.667 and 1.712 in dry and wet seasons, respectively, which lay in the accepted range.

Turbidity
The correlation between the Landsat bands and TSS ranged from -0.57 (NIR/Green band ratio) to 0.80 (Green band) in dry season, and from -0.42 (NIR/Red band ratio) to 0.46 (Red/NIR band ratio) in wet season ( Table 2). The strongest correlations (R 2 =0.74 with blue, 0.80 with green, and 0.73 with red band) at a significance level of p<0.05 were found from the correlation with single bands. Particularly in wet season, the NIR band was closely related to turbidity as seen by the high numerical value of Blue/NIR, Green/NIR, Red/NIR, NIR/Blue, NIR/Green, and NIR/ Red (Table 2). To construct multiple regression models, the bands and band ratios that indicated a good correlation with turbidity were selected. In dry season, four band and band ratios of blue, red, green/NIR, and red/green showed significant relationships (R 2 =0.84), and in wet season, four band combinations containing green/blue, green/NIR, red/NIR, and NIR/blue revealed significant relationships (R 2 =0.63) with turbidity. The Durbin-Watson values were   Table 3: Statistical values obtained from regression analyses for chl-a and turbidity.

Dry Season
Chlorophyll  1.937 and 2.052 in dry and wet seasons, respectively, which lay in the accepted range.
The normal probability-probability (P-P) plots were generated based on the standardized residuals. If the residuals are Normally Distributed the values should fall on the diagonal line of identity. Straight lines in Figure 3 indicated the normal distribution for the studied variables in both dry and wet seasons.

Nutrients (Total phosphate and total nitrogen)
Phosphorus and phosphate are closely related to some other parameters like phytoplankton [24,25], turbidity and total suspended matters (TSM), and Secchi disk transparency (SDT) [26], which is the basis for remote monitoring of TP dynamics [27]. TP is closely related to Chl-a concentration, and total suspended matter usually acts as a carrier for TP loading, thus, TP is also closely related to secchi disk with an exponential equation [25]. Remote estimation of total phosphorus (TP) and total phosphate has been investigated based on their high correlation with optically active constituents [10,25,28]. Although there is a possibility that TP may be indirectly correlated to remote sensing measurements, few studies have been conducted to estimate TP concentration using remotely sensed imagery and can be quantified using visual spectral bands.
Multispectral Landsat TM data have been widely used to monitor and map the TP spatiotemporal pattern in different regions [28][29][30]. Empirical statistical regression models were used to study the relationship between the concentration of phosphorus with other water quality indicators, such as secchi depth and chl-a concentration [28]. Song et al. [9] studied the correlation between TP and TM1, TM2, TM3, and TM4 from the Landsat 5, and found that each band had a correlation with TP of 0.62, 0.59, 0.55, and 0.51, respectively. Wu et al. [28] used a combination of TM1, TM3/TM2, and TM1/TM3 data to correlate chl-a concentration and SD measurements with TP  concentration. Also, Alparslan et al. [10] used the first four bands of Landsat 7-ETM satellite data to map total phosphate in Ömerli Dam, Turkey. Later, Alparslan et al. [22] using Band1, Band2, Band3, Band4, Band5 and Band7 of Landsat-5 TM Satellite Image obtained the amount of total phosphorus concentration. Lim and Choi [13] used bands 2, 3, 4, and 5 of Landsat-8/OLI, and constructed 3 multiple regression models by selecting both single bands and band ratios, and obtained significant correlation coefficients.
Also, there is still little literature with regard to estimate nitrogen concentration in waterbodies from remote sensing. Hood et al. [31] studied two unique optical characteristics of chlorophyll-a (chl-a), absorption and fluorescence, are strongly related with the nitrogen concentration. Also, Hanson et al. [32] proved the points regarding that the fluorescence of chl-a would be influenced by nitrogen and phosphorous concentrations. Additionally, Edwards et al. [33] showed that suspended sediment concentration (SSC), chl-a and colored dissolved organic matters (CDOM) are important substance sources of nutrients elements. Gong et al. [34] measured different concentrations of nitrogen and phosphorus using the reflectance spectra in the laboratory and found their special features by hyperspectral remote sensing technique. Their result showed the reflectance peaks at 404 and 477 nm, and phosphorus at 350 nm, for nitrogen and phosphorous, respectively, and developed a quantitative retrieval model for these two parameters. Karakaya and Evrendilek [35] applied Landsat 7 Enhanced Thematic Mapper Plus (ETM+) data to measure the concentration of nitrite nitrogen (NO 2 -N) and nitrate nitrogen (NO 3 -N) using best-fit  multiple linear regression (MLR) models as a function of Landsat 7 ETM+ and ground data in Mersin Bay, Turkey. Based on the pearson correlation between limnological data and Landsat bands and ratios, and also the relationship between these nutrients, and turbidity and chl-a (Table 5), different models for total phosphate and TN were developed in two wet and dry seasons.
The selected bands by the statistical analysis for the estimation of total phosphate in dry season were correlated with chl-a, Blue/ Red, Red/Blue, and NIR/Blue ratios, and chl-a, Blue/Green, Green/ Blue, Green/Red, and Green/NIR ratios in wet season. Therefore, the functional model is: The variability of TN concentration was also investigated using the same procedure, incorporating visible bands and their ratios as independent variables in the regression analysis, and the following functional model was selected for dry and wet season: The statistical values given in Table 6 were obtained, as a result of the regression analysis computation. The R 2 values, which show the correlation between the measured values and the estimated values from the satellite image, prove that the first four bands and their ratios of Landsat satellite image plus the values of chl-a and turbidity are well capable of being used in the measurement of total phosphate and TN. The model coefficients are given in Table 7 to compute total phosphate and TN at anywhere on the bay surface, based on both the pixel reflectance values and chl-a and turbidity concentrations. Maps given in Figure 4 were made using directly the satellite data of the Florida Bay's entire surface, based on the equalities in Table 7.

Total phosphate
The correlation of Landsat bands with total phosphate concentration ranged from −0.53 (Blue/Red band ratio) to 0.87 (chl-a) in dry season, and from −0.40 (Green/NIR band ratio) to 0.57 (chl-a) in wet season. In particular, total phosphate concentration displayed a significant relationship (R= 0.90 and 0.57 in dry and wet seasons, respectively)     with chl-a at a significance level of p<0.05 (Table 5). Multiple linear regression models constructed for total phosphate estimation through band combination by selecting both correlataed water quality parameters from ground data and imagery showing high correlation coefficients of 0.86 and 0.69 in dry and wet seasons, respectively. Due to the existance of missing values in the monitoring sites dataset, 40 data points in dry season and 37 data points in wet season were used in the statistical analysis. Table 8 presents the MLR models constructed through equations 13 and 14. Among the different combinations of bands and band ratios, this research selected a multiple regression model and revealed the best significant relationships in order to compare the estimated total phosphate through the Landsat TM and OLI data with in situ measurement data. The Durbin-Watson values were 1.691 and 1.912 in dry and wet seasons, respectively, which lay in the accepted range.

Total Nitrogen (TN)
The correlation between the landsat bands and TN from -0.70 (Blue/Red band ratio) to 0.74 (Green/Blue and Red/Blue band ratios) in dry season, and from -0.32 (Red/NIR band ratio) to 0.75 (chl-a) in wet season. In dry season, the blue and red bands showed the highest correlation, and in wet season NIR and its combinations were closely related to TN (Table 5). To construct multiple regression models, the highly correlated chl-a and TSS, and the bands and band ratios that indicated a good correlation with TN were selected. In dry season, chl-a and turbidity, and blue/green, blue/red, green/blue, and red/blue showed significant relationships (R=0.82), and in wet season, chl-a and total phosphorous, and three band combinations containing red, NIR, NIR/Red band combinations revealed significant relationships (R=0.82) with TN. The Durbin-Watson values were 1.585 and 1.945 in dry and wet seasons, respectively, which lay in the accepted range. The     normal probability-probability (P-P) plots for the studied variables in both dry and wet seasons indicated the normal distribution for the total phosphate and TN in both dry and wet seasons ( Figure 5).
After identifiying the regression functions for the four studied water quality parameters, and considering their spatial distribution that were mapped for the whole Florida Bay surface in two seasons ( Figures  2 and 3), different spatial and temporal variations were observe. The generated maps were reclassified taking into account different ranges in order to define the trophic conditions of Florida bay in two seasons and in three years of 2000, 2007, and 2015, which is summarized in Table 8.

Conclusion
The water supplies for domestic and industrial use, irrigated agriculture, and livestock and mining activities require continuous monitoring to make sure that the required standards and criteria are met. However, due to anthropogenic activities and industrial development, water quality has dramatically degraded. A combination of remote sensing, GIS, and traditional in situ sampling can lead to perform a better monitoring program for water quality parameters in various waterbodies.
In this study, bio-physical parameters associated with water quality in Florida Bay were investigated based on atmospherically corrected data. The principal objective of this study was to monitor and assess the spatial and temporal changes of four water quality parameters including turbidity, chlorophyll-a (chl-a), total phosphate, and total nitrogen (TN), by using the application of integrated remote sensing, GIS data, and statistical techniques. For this purpose, three dates of Landsat Thematic Mapper (TM) data in 2000 (February 13), 2007(January 31), and one date of Landsat Operational Land Imager (OLI) in 2015 (January 5) in the dry season, and three dates of TM data in 2000 (August 7), 2007 (September 28), and one date of OLI data in 2015 (September 2) in the wet season of the subtropical climate of South Florida, were used to assess temporal and spatial patterns and dimensions of studied parameters in Florida Bay, USA. The simultaneous observed data of four studied parameters were obtained from 20 monitoring stations and were used for the development and validation of the models. The optical bands in the region from blue to near infrared and all the possible band ratios were used to explore the relation between the reflectance of waterbody and observed data. The results of Pearson's correlation between limnological data and Landsat bands and ratios, and also the relationship between these nutrients, and TSS, turbidity, and chl-a in Lake Okeechobee and Florida bay showed better correlations of totall phosphate and total nitrogen with limnological data and imagery in Lake Okeechobee and Florida bay, respectively. This can also indicate that these two waterbodies are under the influence of which nutrient.
The predictive models to estimate chl-a and turbidity concentrations were developed through the use of stepwise multiple linear regression (MLR) and gave high coefficients of determination in dry season (R 2 = 0.86 for chl-a and R 2 =0.84 for turbidity) and moderate coefficients of determination in wet season (R 2 =0.66 for chl-a and R 2 =0.63 for turbidity). Values for total phosphate and TN were correlated with chl-a and turbidity concentration and some bands and their ratios. Total phosphate and TN were estimated using best-fit multiple linear regression models as a function of Landsat TM and OLI, and ground data and showed a high coefficient of determination in dry season (R 2 =0.74 for total phosphate and R 2 =0.82 for TN) and in wet season (R 2 =0.69 for total phosphate and R 2 =0.82 for TN). The MLR models showed a good trustiness to monitor and predict the spatiotemporal variations of the studied water quality parameters in Florida Bay.