Mapping Water Quality Parameters in Urban Rivers from Hyperspectral Images Using a New Self-Adapting Selection of Multiple Artificial Neural Networks

Protection of water environments is an important part of overall environmental protection; hence, many people devote their efforts to monitoring and improving water quality. In this study, a self-adapting selection method of multiple artificial neural networks (ANNs) using hyperspectral remote sensing and ground-measured water quality data is proposed to quantitatively predict water quality parameters, including phosphorus, nitrogen, biochemical oxygen demand (BOD), chemical oxygen demand (COD), and chlorophyll a. Seventy-nine ground-measured data samples are used as training data in the establishment of the proposed model, and 30 samples are used as testing data. The proposed method, which builds on traditional ANNs for numerical prediction, involves feature selection of bands, self-adapting selection based on multiple selection criteria, stepwise backtracking, and combined weighted correlation. Water quality parameters are estimated with a coefficient of determination R² ranging from 0.93 (phosphorus) to 0.98 (nitrogen), which is higher than the values (0.7 to 0.8) obtained by traditional ANNs. The mean percent of absolute error (MPAE), which ranges from 5% to 11%, is used rather than root mean square error to evaluate the prediction precision of the proposed model because the magnitudes of the water quality parameters differ considerably; MPAE therefore provides reasonable and interpretable results. Compared with other ANNs with backpropagation, this study proposes an auto-adapting method, assisted by the above-mentioned methods, to select the best model over all settings, such as the number of hidden layers, number of neurons in each hidden layer, choice of optimizer, and activation function. Different settings for ANNs with backpropagation are important to improve precision and compatibility for different data.
Furthermore, the proposed method is applied to hyperspectral remote sensing images collected using an unmanned aerial vehicle for monitoring the water quality in the Shiqi River, Zhongshan City, Guangdong Province, China. Obtained results indicate the locations of pollution sources.


Introduction
A fast and efficient computational method should be developed to quantitatively predict water contaminants because of the large area of contaminated water and the need for instant water monitoring.
The Shiqi River is affected by agriculture along the river banks, from which pesticide and fertilizer residues flow into the river. The fishery industry along the banks leaks fodder residues into the river. Some light industries, such as textiles, contribute contaminants to the river [40,41]. The Shiqi River is adjacent to an estuary, and countercurrent frequently occurs, causing waste to aggregate in the river [42]. Furthermore, the study area is under long-term contamination inspection by the local environmental protection department, for which effective supervision from point to area is important. The study area is relatively wide and open, enabling UAVs to work efficiently (more details about the UAV can be found in Section 2.2.3). The management and supervision of the Shiqi River require decision makers to be informed about its current situation and its neighboring environment, which can affect many economic sectors. A pilot area is selected for accurate detection of eutrophication-related substances, including nitrogen, phosphorus, and Chla, to investigate the effects of pollution on the dispersion and quantity of the above-mentioned water quality parameters [6,43].

Data Collection
This study will use the ground water surface reflectance and water quality parameter data to train and test an SSNN model, and then apply the new network to map the water quality parameters from UAV hyperspectral image data. Therefore, the above three types of data were collected and processed in different ways, which are described in the following sections. Section 2.2.1 will discuss the process of obtaining the ground water samples on 11 different routes to calculate the ground water surface spectral reflectance. Section 2.2.2 will demonstrate the process of the water parameter sampling, the method of storing the water parameter samples, and the experimental methods of measuring each of the water quality parameters. Finally, Section 2.2.3 will elucidate the instruments of this study, the process of obtaining the UAV hyperspectral image data, and the method of extracting the featuring bands from UAV hyperspectral image data and of transferring the featuring bands to input of the proposed model for estimation of water quality parameters.

Ground Water Surface Spectral Reflectance
The ground water surface spectral reflectance was collected using an ASD. The ASD used is FS HH 325-1075, with wavelengths ranging from 325 nm to 1075 nm. One calibrated reference board with known spectral reflectance is used to convert the water surface radiance (or digital number (DN)) to reflectance [44], which can be expressed as:

$$\rho_{water,\lambda} = \frac{L_{water,\lambda}}{L_{ref,\lambda}} \, \rho_{ref,\lambda} \qquad (1)$$

where $\rho_{water,\lambda}$ is the water surface reflectance at wavelength $\lambda$, $L_{water,\lambda}$ and $L_{ref,\lambda}$ are the measured radiance reflected by the water surface and the calibrated reference board under the same solar illumination, respectively, and $\rho_{ref,\lambda}$ is the known reflectance of the reference board at wavelength $\lambda$. We used spectral reflectance rather than remote sensing reflectance since no satellites and atmospheric correction were involved. Additionally, we followed the standard protocol for measuring radiance from the study of Ruddick et al. [45].

The operators went to the different points using a boat and measured the water surface vertically downward at moments without water waves or specular reflection. The operator held the ASD as close to vertical to the water surface as possible, under intense sunlight and without any shadow in the measurement area, and kept the distance between the ASD and the water surface at 0.8 m. Meanwhile, another operator took one 500 mL bottle of water at the location where the reflectance was measured. During this process, we measured the standard reference panel before measuring the water surface radiance every time we went to the next sampling point. As shown in Figure 1, we conducted ground measurements on 11 routes (A1-A4, B1-B4, C1, D1, and E1) containing a total of 79 points for training the model. Each point was measured five times, and the average reflectance was regarded as the final value.
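The radiance-to-reflectance conversion and the five-scan averaging described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function names, the three-band spectra, and all numeric values are hypothetical, and the conversion assumes the usual reference-board ratio (water radiance divided by board radiance, scaled by the known board reflectance).

```python
from statistics import mean


def reflectance(l_water, l_ref, rho_ref):
    """Convert one radiance scan to reflectance, band by band.

    Assumes the reference-board ratio: rho = L_water / L_ref * rho_ref.
    """
    if not (len(l_water) == len(l_ref) == len(rho_ref)):
        raise ValueError("all spectra must have the same number of bands")
    return [lw / lr * rr for lw, lr, rr in zip(l_water, l_ref, rho_ref)]


def average_spectra(spectra):
    """Band-wise mean of repeated reflectance spectra taken at one point."""
    return [mean(band) for band in zip(*spectra)]


# Hypothetical three-band example with five repeated scans at one point.
l_ref = [100.0, 120.0, 80.0]      # radiance of the reference board
rho_ref = [0.5, 0.5, 0.5]         # known board reflectance
scans = [
    [10.0, 24.0, 4.0],
    [11.0, 24.0, 4.0],
    [12.0, 24.0, 4.0],
    [13.0, 24.0, 4.0],
    [14.0, 24.0, 4.0],
]
point_spectrum = average_spectra(
    [reflectance(scan, l_ref, rho_ref) for scan in scans]
)
```

Averaging after the conversion, rather than before, keeps each scan paired with the board reading taken under the same illumination.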

Water Parameter Sampling and Measurement
A 500 mL bottle of water was collected at each site where the water surface reflectance was measured and kept in a shaded environment before laboratory chemical testing on the same day. In the laboratory test procedure, (1) the content of total phosphorus-containing substances, including dissolved phosphorus, particulate phosphorus, organic phosphorus, and inorganic phosphorus, was measured through ammonium molybdate tetrahydrate spectrophotometry using a 722S visible spectrophotometer with a precision of 0.01 mg/L [22]; (2) the content of nitrogen, including ammonia nitrogen (NH3-N), free ammonia (NH3), and ammonium salt (NH4+), was measured through Nessler's reagent spectrophotometry using a 722S visible spectrophotometer with a precision of 0.025 mg/L [46]; (3) COD was obtained through a dichromate method using a burette with a precision of 4 mg/L [47]; (4) BOD was measured through the dilution and seeding method using an SPX-250BSH-II biochemical incubator for BOD with a precision of 0.5 mg/L [8]; (5) turbidity, including suspended and colloidal particles in water, was measured through spectrophotometry using a 722S visible spectrophotometer with a precision of 3 nephelometric turbidity units (NTU); and (6) Chla was measured through spectrophotometry with a precision of 2 µg/L [48].

UAV Hyperspectral Image Collection
The equipment used to measure the water surface spectral reflectance is an ASD covering 325 nm to 1075 nm, with a total of 751 bands. The UAV used is a DJI M600, which has a loading capacity of 6 kg, a maximum flight height of 2500 m, a hovering precision of 0.5 m in the vertical direction and 1.5 m in the horizontal direction, a maximum speed of 18 m/s, and Lightbridge 2 as its high-quality digital image transmission system. The UAV hyperspectral imager is a Gaia Sky-mini, a push-scan imaging device with 270 wavelengths ranging from 401.81 nm to 999.28 nm and 12-bit depth, flown at a height of 120 m with 40 cm resolution. The ground reflectance data were calculated on the basis of the standard target for calibration before flying the UAV. Ground reflectance is calculated using Equation (1) [45]. Multiple nonlinear regression models [14] were established to transfer the UAV reflectance data to ground reflectance data for each of the featuring bands from the collected UAV and ground data [49][50][51]. Several calibrated reference panels with reflectance levels of 0.2, 0.4, and 0.6 were placed in the flight study area to calculate the water surface reflectance from the hyperspectral imager. The hyperspectral imager measured the panel images at the beginning and end of water surface data collection. The final water surface reflectance is calculated using Equation (1). Figure 1 shows the flight area image, which covers a total of 5.65 km². The dark area is the Shiqi River, which flows from north to south. The band values of the ASD and the hyperspectral imager differ, and the band range of the former covers that of the latter. After obtaining the ASD reflectance based on the protocols for measuring radiance and the radiance-to-reflectance transfer method [44,45], we projected the wavelengths from the ASD to the hyperspectral imager so that both have the same band centers and number of wavelengths.
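Projecting the ASD spectrum onto the imager's band centers could look like the sketch below. The source does not state the projection scheme, so linear interpolation is an assumption here, as are the function name `resample`, the 1 nm ASD grid, the evenly spaced imager band centers, and the synthetic spectrum.

```python
from bisect import bisect_right


def resample(src_wl, src_refl, dst_wl):
    """Linearly interpolate a spectrum onto new band centers.

    src_wl must be strictly increasing and cover every value in dst_wl.
    """
    out = []
    for wl in dst_wl:
        if wl < src_wl[0] or wl > src_wl[-1]:
            raise ValueError(f"{wl} nm is outside the source range")
        # Index of the first source band above wl, clamped to the last band.
        i = min(bisect_right(src_wl, wl), len(src_wl) - 1)
        x0, x1 = src_wl[i - 1], src_wl[i]
        y0, y1 = src_refl[i - 1], src_refl[i]
        out.append(y0 + (y1 - y0) * (wl - x0) / (x1 - x0))
    return out


# ASD grid: 325 nm to 1075 nm at an assumed 1 nm spacing (751 bands).
asd_wl = [float(w) for w in range(325, 1076)]
# Imager grid: 270 bands from 401.81 nm to 999.28 nm, assumed evenly spaced.
step = (999.28 - 401.81) / 269
imager_wl = [401.81 + k * step for k in range(270)]

# Synthetic ASD spectrum used only to exercise the function.
asd_refl = [0.0001 * w for w in asd_wl]
asd_on_imager_grid = resample(asd_wl, asd_refl, imager_wl)
```

After this step the ASD and imager spectra share the same 270 band centers, so they can be compared band by band.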
Then, we selected a total of 145 feature bands ranging from 404.0 nm to 894.3 nm through feature engineering, including correlation analysis, f regression, a χ² test with six degrees of freedom, and singular value decomposition (SVD) with the explained variance set to 99.99%, for extracting feature values using Equation (2). For the UAV hyperspectral image data, the reflectance at 270 wavelengths for each pixel could thus be reduced to featuring bands at 145 wavelengths. Given that atmospheric correction was not conducted on the image, we chose ground points to eliminate the difference between the ASD and hyperspectral imager reflectance [49,51]. Thus, the reflectance from the hyperspectral imager is refined with multiple nonlinear models, expressed as Equation (3), so that it is close to the reflectance measured from the water surface using the ASD; the refined reflectance is then used as input to the model for estimating each water quality parameter on a large scale.
In Equations (2) and (3), the matrix $[r_{i,j}]$, $i, j = 1, \ldots, n$, is the ASD reflectance, presented as a matrix that retains most of the reflectance from the original ASD reflectance data after denoising. The result of Equation (2) is the final featuring reflectance obtained through singular value decomposition, feature engineering, and flattening. $a_\lambda$, $b_\lambda$, and $c_\lambda$ are the coefficients at band $\lambda$. $\rho_{UAV,\lambda}$ is the reflectance from the hyperspectral imager at band $\lambda$, and $\rho_{ASD,\lambda}$ is the ASD reflectance at band $\lambda$, which is obtained through interpolation over the featuring wavelengths from the UAV. In situ measurements were collected on the same day as the water samples. Additionally, some studies used spectral reflectance instead of remote sensing reflectance to estimate water quality parameters [49][50][51][52], mostly with empirical and semi-analytical methods.

Figure 2 shows the method used to estimate the water quality parameters. First, the ground sample contains two parts, namely, ASD reflectance data and water quality parameters, which are used to establish the SSNN model. Second, the UAV hyperspectral image data are used as input to the nonlinear reflectance transfer model [49,51], which refines the data by transferring reflectance from the UAV to the ASD over the water surface measured by the ASD. Third, the transferred reflectance from the UAV is used in the established SSNN model for quantitative estimation of water quality parameters, and ArcGIS is used to generate thematic images.

The proposed SSNN model mainly consists of three parts, namely, ANN, linear regression, and feedback machine. The ANN part is based on traditional ANNs for numerical prediction and includes feature selection of bands, stepwise backtracking, and weight correlation. Linear regression is designed for tuning the final results. The feedback machine is dedicated to the self-adaption of the SSNN model, updating the settings for the ANN structure, such as the number of hidden layers, activation function, and the number of neurons of each hidden layer. The proposed method to monitor water quality related to ANN conducts numerical prediction on water quality parameters.
Some other methods, such as combined correlation weights and the feedback machine, are incorporated into traditional ANN to quantitatively improve prediction accuracy based on a previous study [2,13,18,53,54].
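The SVD-based feature extraction with a 99.99% explained-variance threshold and the per-band UAV-to-ASD reflectance transfer described in this section can be illustrated with the sketch below. It relies on several assumptions that the source does not confirm: explained variance is measured by squared singular values, the transfer model at each band is taken to be a quadratic with coefficients a, b, and c, and the function names and synthetic data are hypothetical.

```python
import numpy as np


def svd_reduce(refl, explained=0.9999):
    """Keep the fewest singular components that reach the variance share.

    refl is a (samples x bands) reflectance matrix. Returns the component
    scores, the retained right singular vectors, and the rank k.
    """
    u, s, vt = np.linalg.svd(refl, full_matrices=False)
    share = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(share, explained)) + 1, len(s))
    return u[:, :k] * s[:k], vt[:k], k


def fit_transfer(rho_uav, rho_asd):
    """Fit rho_asd ~ a * rho_uav**2 + b * rho_uav + c for one band.

    The quadratic form is an assumption made for illustration only.
    """
    a, b, c = np.polyfit(rho_uav, rho_asd, 2)
    return a, b, c


def apply_transfer(coeffs, rho_uav):
    """Map UAV reflectance to ASD-like reflectance with fitted coefficients."""
    return np.polyval(coeffs, rho_uav)


# Synthetic rank-2 reflectance matrix: 20 samples, 10 bands.
rng = np.random.default_rng(0)
refl = rng.random((20, 2)) @ rng.random((2, 10))
scores, components, k = svd_reduce(refl)

# Synthetic ground points for one band.
uav = np.linspace(0.02, 0.30, 15)
asd = 0.5 * uav**2 + 0.9 * uav + 0.01
coeffs = fit_transfer(uav, asd)
```

In a per-band workflow, `fit_transfer` would be called once for each featuring band using the ground points where both ASD and imager reflectance are available.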

Methodology
The training data of SSNN include the water surface reflectance and the content level of all contaminants at each point. Common ANNs with backpropagation (ANN-BPs) use only one setting for all data types and ignore the changes in water bodies, resulting in low precision and compatibility. The proposed method compares all ANN-BPs to select the best one. Backpropagation, stepwise backtracking, Pearson correlation, and cosine correlation are used in the SSNN model. In stepwise backtracking, the current learning rate is halved and the model is retrained when the error between the measured and predicted values at the current iteration step is larger than the error at the previous iteration step. If this condition occurs again, the halved learning rate is halved again; otherwise, the current learning rate is maintained [55]. When the learning rate is small, the current error is smaller than the previous error. However, convergence is slow when the initial learning rate is excessively small because no apparent changes occur between the current and previous steps, which makes stepwise backtracking suitable for this study. Figure 3 shows the basic structure of the SSNN model, improved from a traditional ANN, for predicting water quality parameters. The ANN obtains the results using Equations (4)-(6). The loss function is defined as Equation (7), and the final results of the SSNN model are obtained using Equation (8), where x is the input feature vector, w_n is the nth step weight vector, y is the ground-measured value vector, ŷ is the predicted value vector, x_n is the output vector of the nth layer, Θ denotes the set of parameters, and S denotes the set of data samples.
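The learning-rate halving in stepwise backtracking can be illustrated on a deliberately small problem. The sketch below fits a single weight by gradient descent and halves the learning rate whenever a step would increase the error; it is not the SSNN training loop, and the one-parameter model, function names, and data are hypothetical.

```python
def mse(w, xs, ys):
    """Mean squared error of the one-weight model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grad(w, xs, ys):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def fit_with_backtracking(xs, ys, lr=1.0, steps=200, min_lr=1e-12):
    """Gradient descent that halves lr when the error would grow."""
    w = 0.0
    prev_err = mse(w, xs, ys)
    for _ in range(steps):
        g = grad(w, xs, ys)
        candidate = w - lr * g
        err = mse(candidate, xs, ys)
        # Current error larger than previous error: halve and retry.
        while err > prev_err and lr > min_lr:
            lr /= 2
            candidate = w - lr * g
            err = mse(candidate, xs, ys)
        if err > prev_err:
            break  # no learning rate above min_lr reduces the error
        w, prev_err = candidate, err
    return w, lr


# Hypothetical data generated by y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w_fit, final_lr = fit_with_backtracking(xs, ys)
```

Starting from a large learning rate and halving on failure avoids the slow convergence of an excessively small initial rate while still preventing divergence.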
Various NNs with different numbers of hidden layers, numbers of nodes in each hidden layer, optimizers, and activation functions are compared to choose the best NN among them on the basis of root-mean-square error (RMSE), F statistic, t statistic, and R² value. The F statistic is used to compare the statistical models fitted to the dataset to ensure the fitness of the chosen model in terms of population [56], which can be defined as Equation (9):

$$F = \frac{\left(TSS_{\hat{y},y} - RSS_{\hat{y},y}\right)/p}{RSS_{\hat{y},y}/(n - p - 1)} \qquad (9)$$

where $TSS_{\hat{y},y}$ is the total sum of squares of $\hat{y}$ and $y$, $RSS_{\hat{y},y}$ is the residual sum of squares of $\hat{y}$ and $y$, $n$ is the number of samples, and $p$ is the number of features.

The F statistic plays an important and indispensable role in ANOVA. The F test is defined with the null hypothesis that model 2 does not fit the data significantly better than model 1. The t statistic is used to estimate the population mean from a sample distribution where the population standard deviation is unknown [57]. In this research, the proposed method uses the t statistic with the corresponding p-value to measure how far the mean of the predicted values deviates from the mean of the measured values. The p-value threshold for rejecting the null hypothesis is set to 0.05, where the null hypothesis is that the mean of the predicted values is equal to the mean of the measured values at the 95% confidence level. The difference between two independent samples is tested at the 5% significance level, where the first sample consists of the predicted values and the second of the measured values. The formula for the t statistic of two independent samples with unequal variance is obtained from Welch's t test, as shown in Equations (10) and (11):

$$g_{\text{Welch's } t\text{-test}}(\hat{y}, y) = \frac{\bar{\hat{y}} - \bar{y}}{\sqrt{\dfrac{s_{\hat{y}}^2}{n_{\hat{y}}} + \dfrac{s_{y}^2}{n_{y}}}} \qquad (10)$$

$$df_{\text{Welch's } t\text{-test}} = \frac{\left(\dfrac{s_{\hat{y}}^2}{n_{\hat{y}}} + \dfrac{s_{y}^2}{n_{y}}\right)^2}{\dfrac{\left(s_{\hat{y}}^2/n_{\hat{y}}\right)^2}{n_{\hat{y}} - 1} + \dfrac{\left(s_{y}^2/n_{y}\right)^2}{n_{y} - 1}} \qquad (11)$$

where $g_{\text{Welch's } t\text{-test}}(\hat{y}, y)$ is the t score for $\hat{y}$ and $y$, $df_{\text{Welch's } t\text{-test}}$ is the degree of freedom, $s$ is the standard deviation, and $n$ is the number of samples.
The null hypothesis is that the mean of the predicting model is equal to that of the idealized model, which is distributed as the real values with respect to one of the previously mentioned water quality parameters.
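The two test statistics discussed above can be computed with the standard library alone, as in the sketch below. It assumes the usual regression form of the F statistic, with the total sum of squares taken about the mean of the measured values, and the textbook Welch formulas for the t score and degrees of freedom. The function names and data are hypothetical, and the p-value lookup from the t distribution is intentionally left out.

```python
from math import sqrt
from statistics import mean, variance


def f_statistic(y_true, y_pred, p):
    """F = ((TSS - RSS) / p) / (RSS / (n - p - 1)) for p features."""
    n = len(y_true)
    y_bar = mean(y_true)
    tss = sum((y - y_bar) ** 2 for y in y_true)
    rss = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))
    return ((tss - rss) / p) / (rss / (n - p - 1))


def welch_t(a, b):
    """Welch's t score and degrees of freedom for two independent samples."""
    va = variance(a) / len(a)   # sample variance over sample size
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df


# Hypothetical measured and predicted values for one parameter.
measured = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
f_value = f_statistic(measured, predicted, p=1)
t_value, dof = welch_t(predicted, measured)
```

A t score near zero, as in this example, is what one would expect when the mean of the predicted values is close to the mean of the measured values.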
The fitted model is considered appropriate when its R² value is greater than 0.5; otherwise, it is considered inappropriate. The proposed method can choose a better prediction model based on the above-mentioned criteria. In addition, when a predicted value deviates far from the normal range of content levels of a contaminant, the model adopts Pearson correlation and cosine correlation to pull the value back toward that range, balancing the bias of each correlation method and considering the weight allocated to each correlation. The working mechanism used in this research is explained by the SSNN algorithm (details can be found in Algorithm A2 in Appendix A).
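The error measures used as selection and evaluation criteria can be written compactly as below. This is a hedged sketch: MPAE is assumed to be the mean of absolute errors relative to the measured values, expressed in percent, and the function names and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean


def rmse(y_true, y_pred):
    """Root-mean-square error, in the unit of the parameter."""
    return sqrt(mean((y - yh) ** 2 for y, yh in zip(y_true, y_pred)))


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - RSS / TSS."""
    y_bar = mean(y_true)
    tss = sum((y - y_bar) ** 2 for y in y_true)
    rss = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))
    return 1.0 - rss / tss


def mpae(y_true, y_pred):
    """Mean percent of absolute error (assumed form), in percent."""
    return 100.0 * mean(abs(yh - y) / abs(y) for y, yh in zip(y_true, y_pred))


# Hypothetical measured and predicted concentrations (mg/L).
measured = [2.0, 4.0, 5.0]
predicted = [2.2, 3.8, 5.0]
fit_is_appropriate = r_squared(measured, predicted) > 0.5
accuracy = 1.0 - mpae(measured, predicted) / 100.0
```

Because MPAE is relative to each measured value, it stays comparable across parameters whose magnitudes differ, which RMSE does not.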

Experimental Results
In this study, the content levels of phosphorus, nitrogen, COD, BOD, turbidity, and Chla range from 0.09 mg/L to 0.52 mg/L, from 0.09 mg/L to 5.37 mg/L, from 5.0 mg/L to 58.0 mg/L, from 1.0 mg/L to 13.9 mg/L, from 10 NTU to 97 NTU, and from 3 µg/L to 238 µg/L, respectively. The turbidity, Chla, BOD, COD, and nitrogen values of Figure 4j are the highest among the plots because the water samples were collected in fish-cultivating pools. Organic matter causes the high concentrations of BOD, COD, and nitrogen. Turbidity is high in the pools because they lack a good inlet and outlet for water exchange, which causes turbidity to increase rapidly. The concentrations of water quality parameters on other routes are relatively low because of water exchange and little domestic waste (additional details of the sampling and measurement of water quality parameters can be found in Table 1).
Remote Sens. 2020, 12, x FOR PEER REVIEW 12 of 31

Water samples were collected using a boat with the ASD, and a water quality parameter instrument obtained some samples on the river concurrently with the UAV taking hyperspectral images of the study area. Table 1 shows the mean and range of each parameter in the sample dataset.

Figure 4a shows a drop around 400 nm and frequent fluctuations, where turbidity, BOD, and COD have relatively low values. In Figure 4b, most points have higher reflectance than in Figure 4a, and the turbidity and BOD values are higher than in Figure 4a. In Figure 4c, a striking jump is observed near 400 nm compared with Figure 4a, where the turbidity values are similar to those in Figure 4b but higher than those in Figure 4a. A high jump in reflectance from 900 nm to 950 nm is observed, and the BOD values are lower than those at points without an apparent reflectance jump. In Figure 4d

Figure 5 shows the changes in accuracy, which is defined as 1-MPAE, as the number of training iterations increases from 100 to 1000 in steps of 100, and how the selected ANN-BP model outperforms the four other models. The selected model differs from the four other models in the number of hidden layers, number of hidden layer nodes, choice of optimizer, and choice of activation function. Models 1 to 4 and the selected model differ from each other within one graph and may not be the same as those in another graph because these models are the best models with respect to RMSE, F statistic, t statistic, and R² value. Models 1 to 4 in Figure 5a may be different from those in Figure 5b.
If no model is consistently superior to the other models with respect to all criteria (RMSE, F statistic, t statistic, and R² value), the models are compared in terms of RMSE first, followed by R² value, F statistic, and t statistic. Considering all criteria helps reduce the computational load and time of selecting the best ANN-BP model. As shown in Figure 5a-c, the selected ANN-BP model does not outperform the four other ANN-BP models between 100 and 400 iterations but gradually performs better than the others. After 600 iterations, the accuracy in Figure 5a-c becomes relatively stable, without apparent increase or decrease, and reaches a balanced condition.

Figure 6 shows that the closer the data are to the orange line, the better the selected ANN-BP model fits; the predicted values closely match the measured values when the ANN-BP model selected by the proposed SSNN is used. A linear relationship is established between the predicted and measured values, and the R² value is high in each plot, indicating a strong linear relationship between the predicted and measured values. Unlike the previously mentioned R², the R² values in each plot of Figure 6 are given by the linear model in the same plot and are based on the prediction with respect to measured values rather than on the prediction with respect to reflectance. The trend can be observed from the blue line in each plot.

Table 2 gives the evaluation criteria for choosing the selected ANN-BP model and the p-value corresponding to the t statistic. As shown in Table 2, turbidity and Chla generate the largest RMSE because their values are larger than those of the other parameters in magnitude and range (turbidity is measured in NTU). The smallest RMSE is observed for phosphorus because its range of values is smaller than the others in magnitude.
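The fallback ordering of the selection criteria (RMSE first, then R² value, F statistic, and t statistic) can be expressed as a lexicographic comparison. The sketch below is illustrative only: the `Candidate` record, the rule that a smaller absolute t score is better, and the numbers are assumptions rather than details taken from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    rmse: float     # lower is better
    r2: float       # higher is better
    f_stat: float   # higher is better
    abs_t: float    # assumed: lower |t| means closer means


def dominates(a, b):
    """True when a is at least as good as b on every criterion."""
    return (a.rmse <= b.rmse and a.r2 >= b.r2
            and a.f_stat >= b.f_stat and a.abs_t <= b.abs_t)


def priority_key(c):
    """Sort key: RMSE, then R2, then F statistic, then |t|."""
    return (c.rmse, -c.r2, -c.f_stat, c.abs_t)


def select(candidates):
    """Pick a model superior on all criteria, else use the priority order."""
    for c in candidates:
        if all(dominates(c, o) for o in candidates if o is not c):
            return c
    return min(candidates, key=priority_key)


# Hypothetical candidates where no model wins on every criterion.
models = [
    Candidate("A", rmse=0.10, r2=0.90, f_stat=300.0, abs_t=0.5),
    Candidate("B", rmse=0.10, r2=0.95, f_stat=250.0, abs_t=0.4),
    Candidate("C", rmse=0.12, r2=0.97, f_stat=400.0, abs_t=0.1),
]
best = select(models)
```

Here models A and B tie on RMSE, so the R² value decides between them, and model C is passed over despite its higher R² because RMSE takes priority.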
The null hypothesis of the F test is that model 2 does not fit the data significantly better than model 1. As shown in Table 2, a good ANN-BP model usually gives a large F statistic, and all models are compared with each other for only one water quality parameter. The t statistic is specified as Welch's t test for two independent samples assuming unequal variances, where the null hypothesis is that the mean of the predicting model is equal to that of the real model, which is distributed as the real values with respect to one of the water quality parameters. The confidence level is assumed to be 95%, and the significance level is 0.05. The null hypothesis is rejected when the p-value is smaller than 0.05, indicating that the result is statistically significant. The p-values in Table 2 are all greater than 0.05; hence, the null hypothesis is not rejected at the 95% confidence level, and the mean generated by each model is taken to be equal to that of the real model distributed as real values with respect to the corresponding water quality parameter.
The R² values are all greater than 0.5, indicating that more than 50% of the variance in the dependent variable can be predicted from the independent variable. In other words, more than 50% of the variance can be explained. The closer the R² value is to 1, the better the model fits the data.
Comparing a hyperspectral sensor that is far from the water body under detection with one that is closer to it helps to explain the need to reduce interference from intervening media, such as cloud and dust, even though the R² values obtained with the distant sensor are acceptable. The hyperspectral images in other studies were obtained from long-distance satellites through a highly expensive and lengthy image retrieval process that involves many interferences, such as reflection, refraction, and heterogeneous media.
In this study, the relationship between the featuring bands' reflectance and the content level of each water quality parameter is evaluated. Each graph in Figure 7 shows that the prediction with 5% deviation from the bands' reflectance performs best and gives the smallest RMSE compared with the two other deviations, namely, 10% and 15%. The prediction with 15% deviation from the bands' reflectance gives the largest error relative to the measured values. Figure 7 thus shows an apparent and strong relationship between the featuring bands' reflectance and the content level of each water quality parameter: the larger the deviation applied to the band reflectance, the lower the accuracy (the higher the RMSE) of the predicted content levels. The water quality parameters are derived using the SSNN method with the changes in reflectance.

This research uses 30 samples, collected after model training, as the testing dataset. Figure 8 shows the comparison of the predicted and ground-measured values. The proposed method accurately and quantitatively predicts nitrogen, COD, and Chla, indicating its generality and validity for predicting water quality parameters.

Table 3 gives the performance of different methods, including SSNN, a traditional single-layered ANN from Mohamad, and an empirical method from Liew et al., on the testing dataset of the entire area. The proposed method outperforms the other methods in terms of RMSE and MPAE. As shown in Table 3, nitrogen calculated by SSNN achieves the best result because its MPAE is the lowest. MPAE is given more weight than RMSE because it convincingly and effectively demonstrates the numerical prediction of the proposed method. More data should be collected from the entire area to ensure accurate numerical prediction of each water quality parameter. Therefore, more data will be collected to effectively investigate water quality in future studies.
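The reflectance deviation experiment reported for Figure 7 can be approximated by the sketch below, which scales every band by 5%, 10%, and 15% and records the resulting RMSE. The source does not specify how the deviation is applied, so uniform multiplicative scaling is an assumption, and `toy_predict` is a hypothetical stand-in for the trained SSNN model.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


def deviation_rmse(predict, spectra, measured, deviations=(0.05, 0.10, 0.15)):
    """RMSE of predictions after scaling every band by (1 + deviation)."""
    result = {}
    for d in deviations:
        shifted = [[r * (1.0 + d) for r in spectrum] for spectrum in spectra]
        result[d] = rmse(measured, [predict(s) for s in shifted])
    return result


def toy_predict(spectrum):
    """Hypothetical stand-in for a trained model: scaled mean reflectance."""
    return 10.0 * sum(spectrum) / len(spectrum)


# Hypothetical three-band spectra; measured values equal the clean output.
spectra = [[0.02, 0.04, 0.06], [0.03, 0.05, 0.07], [0.05, 0.06, 0.10]]
measured = [toy_predict(s) for s in spectra]
errors = deviation_rmse(toy_predict, spectra, measured)
```

With this stand-in model the RMSE grows with the deviation, matching the ordering described for the 5%, 10%, and 15% cases.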
As shown in Figure 8, the R² value of nitrogen is larger than that of the other parameters. However, some water quality parameters with high R² values may not have a low MPAE because the random sample size is small, which makes it difficult to verify the trend of all water quality parameters. The prediction using the proposed method works well for most of the water quality parameters, although the samples do not cover every point of the entire area, where pixel points are initially spaced 40 cm apart.
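The preference for MPAE over RMSE when parameters differ in magnitude can be illustrated with a small numerical sketch. The definitions below are assumptions, since this section does not state the formulas: RMSE is taken as the root of the mean squared difference, and MPAE as the mean of |predicted - measured| / |measured| expressed in percent. The sample values are invented for illustration and are not data from this study.

```python
import math

def rmse(measured, predicted):
    """Root mean square error, in the units of the measured parameter."""
    squared = [(p - m) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mpae(measured, predicted):
    """Mean percent of absolute error (assumed definition), in percent.

    Measured values must be nonzero.
    """
    ratios = [abs(p - m) / abs(m) for m, p in zip(measured, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

# Invented values: two parameters whose magnitudes differ by a factor of 100,
# each over-predicted by 10%.
low_measured = [0.1, 0.2, 0.4]
high_measured = [10.0, 20.0, 40.0]
low_predicted = [1.1 * m for m in low_measured]
high_predicted = [1.1 * m for m in high_measured]

for name, m, p in [("low-magnitude", low_measured, low_predicted),
                   ("high-magnitude", high_measured, high_predicted)]:
    print(f"{name}: RMSE = {rmse(m, p):.4f}, MPAE = {mpae(m, p):.1f}%")
```

In this toy case both parameters have the same MPAE, whereas their RMSE values differ by the same factor as their magnitudes, which is why RMSE is hard to compare across parameters.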
As previously mentioned, the ground ASD reflectance and water quality parameters were used as inputs to the SSNN model to establish the training model, and the UAV hyperspectral reflectance image was used as input to the trained SSNN model to predict the water quality parameters. Taking the small gouge marked by the rectangle in the UAV image in Figure 1 as an example, Figure 9 shows the resulting images of the estimated water quality parameters, with the three wavelengths of 480, 550, and 670 nm used for the RGB color. In Figure 9, the distribution of each water quality parameter can be easily observed, and the local environmental protection department can trace the distribution and the change of the content level of each water quality parameter over time to determine the source of pollution.
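The mapping step described above, in which a model trained on ground data is applied to every pixel of the UAV image, can be sketched as follows. This is only an illustration under stated assumptions: the image is represented as nested lists, `map_parameter` and `toy_predict` are hypothetical names, and `toy_predict` is a made-up stand-in, not the SSNN model.

```python
def map_parameter(cube, predict):
    """Apply a per-spectrum predictor to every pixel of a reflectance cube.

    cube: nested lists indexed as cube[row][col] -> list of band reflectances.
    predict: callable taking one spectrum and returning one parameter value.
    Returns a 2-D nested list with one predicted value per pixel.
    """
    return [[predict(spectrum) for spectrum in row] for row in cube]

def toy_predict(spectrum):
    # Hypothetical stand-in for a trained model: mean reflectance times a
    # made-up scale factor. It is NOT the SSNN model.
    return 10.0 * sum(spectrum) / len(spectrum)

# Invented 2 x 2 image with three bands per pixel.
cube = [
    [[0.02, 0.04, 0.03], [0.05, 0.05, 0.05]],
    [[0.01, 0.01, 0.01], [0.10, 0.08, 0.06]],
]
parameter_map = map_parameter(cube, toy_predict)
for row in parameter_map:
    print([round(value, 3) for value in row])
```

The output has the same row and column layout as the input image, so it can be rendered directly as a concentration map for one water quality parameter.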
Although Figure 9 shows only a part of the entire study area, namely the area enclosed by the red rectangle in Figure 1, its result is representative. The results show that places where people live or where factories produce leather and plastic are mostly contaminated, with high levels of turbidity, COD, BOD, and phosphorus. The featuring wavelengths can quantitatively and qualitatively explain the changes in the water quality parameters. Figure 4b-d, which are rich in Chla, fluctuate strongly in the range of 450 nm to 700 nm, and Figure 4c,h have relatively high reflectance in the range of 400 nm to 900 nm, corresponding to the changes of the above-mentioned water quality parameters over the featuring bands.

Discussion
In this study, we use the MPAE, RMSE, and R² as the criteria to determine whether the proposed model fits our data properly. Some other methods also report a relatively high R² of 0.94 [35,54,58], based on the predicted and measured values. Multiple stepwise regression analysis by Yu et al. gives a best R² of 0.98 and a worst R² of 0.60; the results fluctuate considerably, illustrating that their method is insufficient to explain different situations or cases [34]. Some empirical methods have used dominant wavelengths and obtain poor accuracy, with R² values below 0.7 on average [10,11,27]. In contrast to the hyperbolic equations used to describe biological processes in wetlands, the traditional prediction of the numerical values of COD is produced through linear regression [13,17]. Phuong et al. predicted BOD and COD removals through a traditional ANN method that used either COD or BOD as the input to predict the other [25], which was cost-inefficient and time-consuming. Furthermore, Mohamad's method provided R² values of 0.9, but the range of measured values is too narrow to be strongly representative, such that the variation of measured values can hardly be seen [18]. Mohamad input 24 bands to an ANN model to obtain one output correlated with the input, and the fitting process created a new mathematical function that predicts the content level of Chla from only one input, the ratio of the bands at 671 nm and 681 nm. However, this process lost much information from the other bands because only this ratio was considered, without extracting the featuring information from the rest of the initial 24 input bands; for example, 660 nm and 665 nm are sensitive to changes in the content level of Chla [30].
Moreover, the remote sensing images and the field samples in that study were not collected concurrently. Because hyperspectral reflectance changes frequently, it did not match the Chla sampling, which strongly biased both the correlation learned by Mohamad's ANN-BP model and the final mathematical function for predicting Chla.
The ANN method [54] by Alizadeh and Kavianpour was not generalized to other water quality parameters because some meaningful bands are neglected, and the simplified empirical equations lack theoretical support. The remote sensing in his study is highly expensive, with a lengthy process of image retrieval and many interferences, such as reflection, refraction, and heterogeneous media. The method proposed in this research avoids the expensive retrieval of hyperspectral images from satellites and the highly biased reflectance through concurrent UAV and ground sampling, and it avoids losing significant information from other featuring bands by using a generalized ANN-BP model, rather than a highly simplified mathematical function, for the final prediction. Firrao et al. proposed a method [31] that achieved 75 out of 105 correctly assigned frames; that is, 75 of 105 testing observations were correctly classified into one of three categories defined on the basis of certain parameter values. However, they did not consider the bands below 750 nm, which might contain useful and important information. Their method achieved poor accuracy for the content level of fumonisins because it only classified three class levels, with an R² value of 0.6 and a prediction accuracy below 0.8. The method proposed in this study provides a precise quantitative prediction of the water quality parameters at each pixel. In addition, it automates prediction and analysis for several water quality parameters, providing low errors and high R² values. For some of the above-mentioned water quality parameters, a previous study on Chla introduced a hybrid inversion method incorporating support vector machine, random forest regression [59], and other machine learning methods to predict Chla [6,60].
However, the hybrid inversion method did not consider all the information in hyperspectral remote sensing images, because some wavelengths with clearly different reflectance might need to be combined rather than used individually. Regional multiple stepwise regression ignores a large number of features, which adversely affects the final prediction given the importance of collinearity between bands. It uses approximately 20 variables combined as spectral indices of reflectance; screening the useful and significant combinations of these variables is time-consuming and complicated, and the method can predict only one water quality parameter [34].
Ryan and Ali proposed a method using partial least square regression to predict each water quality-related variable [29]. However, their method could not update the parameters of the partial least square regression to improve the prediction accuracy as the number of iterations increased. Its R² value (0.85) is derived from the equation fitted to the relationship between the measured and predicted values; although this relationship is strong, the MPAE is high, at approximately 30%. A physics-based method using the backscattering and absorption coefficients of water can be used to predict nitrogen and BOD related to water reflectance; the optical properties can be obtained solely from the sensor data without additional in-situ data, and the R² value exceeds 0.8 through linear regression. Liew et al. proposed a method [61] that required many tests on experimental samples and empirical experiments on combinations of different bands because of the large amount of data, which is time-consuming and expensive.
In Figure 9, a reasonable inference can be made that (a) shows a high level near the bank, where many dwellers pour domestic waste, such as detergent, paper, and cooking oil, into the river, causing a quick accumulation of turbidity [24]. Some of this waste remains suspended at the water surface and some settles at the bottom, making the water muddy, turbid, black, and nontransparent. Some leather-processing and textile factories are located near the bank, and some of them discharge industrial wastewater containing heavy metals, such as copper and lead, and dyestuffs, which are bioresistant organic pollutants and recalcitrant xenobiotic compounds that are difficult to degrade and are not exhausted on leather materials. Acid and direct trisazo dyes are dumped into the water through underwater pipes. A small accumulation of turbidity is observed at the bank because the domestic waste and industrial effluents need time and kinetic energy to be transported from the bank to the center of the river. Some chemical compounds containing heavy metals, which make the water black and turbid, cannot travel far from the bank because of their weight [62]. The same condition is observed for the high concentration of phosphorus (Figure 9d) because most domestic wastes, such as fertilizers and laundry detergents, contain phosphorus. As shown in Figure 9d, the concentration of phosphorus near the bank is higher than that far from the bank because laundry detergents and other phosphorus-containing products, such as soap, are frequently discharged near the bank in the residential area, causing a quick increase in phosphorus. Thus, the accumulation of phosphorus near the banks is consistently higher than that toward the center of the river, although phosphorus-containing materials may be lighter than metal-related materials [63].
A previous study by the local government showed that a large quantity of algae need oxygen to grow and also release oxygen, which contributes to the high concentrations [7,32] in Figure 9b,c. The concentration of each of these water quality parameters is high near the bank, as represented by the orange band, which is caused by the cultivation industry near the bank, and low everywhere else [64]. As shown in Figure 6b,c, COD and BOD have a similar distribution of concentration near the discharge outlet close to the bank because of the demands of cultivated aquatic plants or animals, and their concentration decreases from the bank toward the center of the river, forming a curved band of relatively dense concentration along the convex parts of the bank [36]. COD and BOD may be caused by aquatic animals because phosphorus has a relatively low concentration there, which allows the normal reproduction of aquatic animals rather than hampering it [32].
As shown in Figure 9e,f, the red color represents high concentrations of nitrogen and Chla. Their distributions are similar because nitrogen is needed to synthesize Chla, so a shortage of nitrogen stifles the synthesis of Chla, and because both are associated with algae that aggregate in clusters away from the bank [7,65]. The remains of laundry detergent and agricultural fertilizer poured into water outlets are discharged to the river by fishermen or local dwellers who live near the pools, causing algal growth associated with the high concentration of Chla [29,33]. The concentration distribution of phosphorus is similar to that of nitrogen and Chla in some parts near the waste discharge outlet because the rapid accumulation of phosphorus can cause water eutrophication [6,66]. Phosphorus and nitrogen are necessary for the growth of algae, and overgrowth of algae occurs in any area with high concentrations of both. Some detrimental chemical materials, such as copper sulfate used in agriculture or plastic-made materials, are discharged from two convex parts of the bank, thereby inhibiting the growth of algae near these parts of the bank [7]. Chemicals such as copper sulfate are nevertheless beneficial for the cultivation of aquatic animals, such as fish, because under regular dosage control they work as algaecides and pesticides that prevent some fish-related diseases [67]. Thus, the concentrations of BOD and COD near the convex parts of the bank are higher than in other areas, whereas the concentrations of nitrogen and Chla are lower near the convex parts of the bank and higher in other areas.
This contrast appears in Figure 9b,c because most of the oxygen demand comes from aquatic animals rather than from algae. Few aquatic animals can be observed because eutrophication lets algae proliferate and reduces the nutrients available to other organisms, such as fish and plankton; that is, algae deprive other organisms of necessary resources, such as oxygen and nitrogen. Therefore, the concentration distribution of BOD and COD differs from that of nitrogen and Chla [36]. Places far from dwellers and factories mostly show high contents of nitrogen and Chla, whose synthesis requires nitrogen, because the absence of dwellers and factories leaves algae more exposed to light, which is indispensable for plant growth, so they grow faster than algae with little exposure to light [65]. The results show that the local environmental protection department can find the sources of contamination and monitor the change in water quality of the Shiqi River in situ using the SSNN model. BOD and COD are measured through the spectrum at wavelengths ranging from 400 nm to 800 nm, which are covered in the range of bands in our research [8,13,36]. Nitrogen is measured in the spectrum ranging from 350 nm to 2500 nm [21], phosphorus changes noticeably in the spectrum ranging from 400 nm to 900 nm [23], and turbidity fluctuates strongly in the spectrum ranging from 400 nm to 850 nm. Chla is mainly affected in the spectrum ranging from 450 nm to 675 nm [29], which is mostly contained within our wavelengths ranging from 404.0 nm to 894.3 nm.
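How much of each quoted spectral range falls inside the sensor range of 404.0 nm to 894.3 nm can be checked with simple interval arithmetic. The sketch below uses only the ranges stated in the text; the function and variable names are hypothetical, and the covered fraction is a rough sanity check rather than a measure of prediction quality.

```python
SENSOR_RANGE_NM = (404.0, 894.3)  # wavelength range of our data, from the text

# Sensitive ranges (nm) as quoted in the text for each parameter.
PARAMETER_RANGES_NM = {
    "BOD/COD": (400.0, 800.0),
    "Nitrogen": (350.0, 2500.0),
    "Phosphorus": (400.0, 900.0),
    "Turbidity": (400.0, 850.0),
    "Chla": (450.0, 675.0),
}

def covered_fraction(param_range, sensor_range):
    """Fraction of param_range that lies inside sensor_range (0.0 to 1.0)."""
    lo, hi = param_range
    s_lo, s_hi = sensor_range
    overlap = max(0.0, min(hi, s_hi) - max(lo, s_lo))
    return overlap / (hi - lo)

coverage = {
    name: covered_fraction(rng, SENSOR_RANGE_NM)
    for name, rng in PARAMETER_RANGES_NM.items()
}
for name, fraction in coverage.items():
    print(f"{name}: {100.0 * fraction:.1f}% of the quoted range is covered")
```

A fraction below 1.0 only means that part of the quoted range lies outside the sensor range; whether the covered bands suffice for a given parameter is an empirical question addressed by the testing results.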
This study has some limitations, such as the limited data volume and the uncertainty of the model that transfers UAV reflectance to ASD reflectance. Some other methods [1,67] used hundreds of samples to build and test their models, but some of their water samples were prepared in the laboratory rather than collected in situ. In contrast, our water samples are in situ samples, which are time-consuming and expensive to analyze chemically. We took the samples along 11 different routes, and the collected samples are relatively representative because the sampling area covered meaningful parts of the whole area, including a heavily polluted area, a lightly polluted area, and an unpolluted area. Additionally, our proposed model is a self-adapting selection model that is able to select the best model regardless of the diversity of the data, and it performs well as long as the volume of data meets the basic requirement for establishing the model. In our study, 79 samples basically satisfied this requirement, given our constraints on funding and time. Nevertheless, with more water samples, the prediction model would be expected to outperform the current model in quantitative prediction accuracy. Currently, our fitted model fits our research area; with more diverse data, we expect that the proposed method will fit other areas as well because of the self-adapting selection ability of our model. Further study will continue with more sample data for better performance and more effective monitoring of changes in water quality.

Conclusions
In this study, the proposed SSNN is a general method for predicting water quality parameters, including phosphorus, nitrogen, BOD, COD, turbidity, and Chla. It improves on the conventional ANN-BP method, which has a fixed and simple structure that only fits its training data. Furthermore, the proposed method combines concurrent UAV and ground sampling and uses an improved ANN-BP to predict the content level of the above-mentioned water quality parameters under modified water reflectance. The hyperspectral image data need to be transferred piecewise to match the ASD ground reflectance data at each wavelength, since estimation of each water quality parameter based on reflectance has been established by the earlier studies cited in Section 2.2.3. Thus, the SSNN method can quantitatively and precisely predict the content level of each water quality parameter using the reflectance of the featuring testing bands. Compared with other studies, our proposed method is novel and predicts the quantitative content levels of water quality parameters from hyperspectral reflectance. Furthermore, the precision with which other studies predict the quantitative content levels of water quality parameters from remote sensing reflectance is lower than that in our study. The R² values of most auto-selected models are more than 0.9, and their MPAEs are less than 10% on the testing dataset, demonstrating that the predicting model fits the data well. The R² values obtained using the linear regression equation established based on the measured values rather than predicted values are higher than 0.98. Thus, the proposed SSNN method outperforms other methods in terms of universality and precision.
The SSNN algorithm, which incorporates other algorithms such as SplineIter (see Algorithm A1 in Appendix A), is a relatively general means of balancing the difference between the predicted and measured values: it matches the featuring bands' reflectance with the reflectance of the most similar training bands to control the predicted values via combined correlation (see Algorithm A3 in Appendix A). Compared with traditional ANN-BPs, this research proposes a method that, combined with the above, can select the best model with the best settings, such as the number of hidden layers, the number of neurons in each hidden layer, the choice of optimizer, and the activation function, because different data may fit different settings well. As the comparison with other studies in methodology and experimental results shows, this research estimates water quality parameters from a different and novel perspective and outperforms other methods in terms of diversity, universality, compatibility, and novelty because it provides higher accuracy, interpretability, and computational efficiency. Confronted with different hyperspectral reflectance data and precision requirements, our proposed method is able to self-tune to choose the best model for the given data; the only manual step is setting the threshold for each of the mathematical and statistical criteria. The proposed method is applied to hyperspectral remote sensing images collected by UAV for monitoring the water quality of the Shiqi River, Zhongshan City, Guangdong Province, China. The results indicate the locations of pollution sources.
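The idea of selecting the best settings subject to manually chosen thresholds can be sketched as an exhaustive search over candidate settings. This is a simplified illustration under stated assumptions, not the SSNN algorithm itself: it omits band selection, stepwise backtracking, and combined weighted correlation; `select_model` and `toy_evaluate` are hypothetical names; and `toy_evaluate` returns made-up scores in place of training and testing an ANN.

```python
from itertools import product

def select_model(grid, evaluate, min_r2=0.9, max_mpae=10.0):
    """Return (settings, scores) with the lowest MPAE among candidates that
    pass both thresholds, or None if no candidate passes.

    grid: dict mapping a setting name to a list of candidate values.
    evaluate: callable(settings) -> dict with keys "r2" and "mpae".
    """
    names = list(grid)
    best = None
    for values in product(*(grid[name] for name in names)):
        settings = dict(zip(names, values))
        scores = evaluate(settings)
        if scores["r2"] < min_r2 or scores["mpae"] > max_mpae:
            continue  # fails a manually set threshold
        if best is None or scores["mpae"] < best[1]["mpae"]:
            best = (settings, scores)
    return best

def toy_evaluate(settings):
    # Made-up scoring used only to exercise the search; a real evaluation
    # would train an ANN with these settings and score it on held-out data.
    penalty = abs(settings["hidden_layers"] - 2)
    penalty += abs(settings["neurons"] - 16) / 16
    penalty += 0.5 if settings["optimizer"] == "sgd" else 0.0
    penalty += 0.25 if settings["activation"] == "tanh" else 0.0
    mpae = 6.0 + 3.0 * penalty
    return {"r2": 1.0 - mpae / 100.0, "mpae": mpae}

grid = {
    "hidden_layers": [1, 2, 3],
    "neurons": [8, 16, 32],
    "optimizer": ["sgd", "adam"],
    "activation": ["tanh", "relu"],
}
best = select_model(grid, toy_evaluate)
print(best)
```

Here the thresholds `min_r2` and `max_mpae` play the role of the manually set criteria; tightening them can leave no admissible candidate, in which case the function returns `None`.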
Estimating water quality parameters from low-valued water reflectance is relatively difficult because the content level of the water quality parameters in the Shiqi River is relatively low. Water quality estimation from remote sensing hyperspectral data therefore relies on weak signals, and the high quality of the obtained UAV data should be ensured. The current experimental instruments may not meet this requirement. The sampling conditions should be appropriately controlled. The featuring wavelengths may not be specific, and parameter-controlled experiments should be considered. The hyperspectral data are mainly obtained from the Shiqi River, and the generalization of the proposed model should proceed as more ground data are accumulated.
For the results over the entire area, the local environmental protection department randomly collected some testing samples from the entire area rather than only from the study area. Future studies will focus on monitoring water quality and inversion of parameters. A cloud station for instant prediction of the content level of water-quality-related parameters can be established. An unmanned boat can be used to take water samples, measure the content level of the water quality parameters that must be tested, and transmit this information as samples to the cloud station, which can instantly compute the results and send them to users through Wi-Fi and Bluetooth. In future research, we will try to obtain more related data to build a more generalized and deeper model structure that fits a wider variety of UAV-borne hyperspectral reflectance data and may be applied to other water bodies.