Approbation of the texture analysis imaging technique in the wastewater treatment plant

Monitoring of effluent turbidity is essential to evaluate the coagulation process in wastewater treatment plants (WWTPs). A digital imaging system based on the texture analysis of flocs has been tested in a Norwegian municipal WWTP to predict changes in coagulation conditions and outlet turbidity. Principal component analysis (PCA) was applied to show that the textural features of floc images depend on the inlet wastewater parameters and coagulation conditions. Partial least squares regression (PLSR) was performed for the outlet turbidity prediction. The best model resulted in 86.6% prediction accuracy using two wastewater quality parameters (inlet flow and inlet turbidity) and four textural feature vectors retrieved from the images of flocs. Furthermore, this model produced fewer underestimated outlet turbidity values than the model that contained only wastewater quality parameters. A new term, the floc texture index (FTI), summarizes the textural features of floc images in a single variable (through a linear combination). This further simplifies the current multivariate dosage control system. Analysis of the plant data indicates that an FTI value below 6 corresponds to outlet turbidity values above 5 FNU. This can be used as an early warning system of coagulation failure. The results of these studies demonstrate the potential of the digital imaging system to improve an existing online coagulant dosing control strategy.

*Corresponding author: N. Sivchenko, Department of Mathematical Sciences and Technology, Norwegian University of Life Sciences, P.O. Box 5003-IMT, 1432 Aas, Norway. E-mails: nataliia.sivchenko@nmbu.no, nataliia.sivchenko@gmail.com
Reviewing editor: Claudio Cameselle, University of Vigo, Spain


ABOUT THE AUTHORS
The authors are members of Water, Environment, Sanitation and Health (WESH) research group at the Norwegian University of Life Sciences (NMBU), Faculty of Sciences and Technology (RealTek). WESH group focuses on water and wastewater related issues and is heavily involved in teaching and supervision of MSc and PhD programs in water engineering and technology. The group also has one of the largest externally funded research and educational project portfolios at RealTek, and collaborates with partners from EU, North America, Eurasia, Asia and Africa. It also has a number of Research, Development and Innovation projects with Norwegian partners. The main research and development areas of the WESH group are: process control and optimization of coagulation and biological treatment processes; membrane fouling and filtration processes; microbial water quality and risk assessment; decentralized wastewater systems; modelling of sewer systems.

PUBLIC INTEREST STATEMENT
In recent decades, researchers working with water and wastewater treatment have often tried to benefit from applying image analysis based systems to different processes. Such systems make it possible to monitor or even control processes, for instance wastewater coagulation, that are otherwise difficult to follow. In this paper, we test how a digital imaging system based on texture analysis of particles (flocs) performs in a full-scale Norwegian wastewater treatment plant. By retrieving the textural parameters from images of flocs obtained under different coagulation conditions in the plant, we were able to detect changes in wastewater inlet and outlet quality parameters. The prediction of outlet wastewater quality parameters (turbidity in this case) is necessary to provide efficient coagulant dosage control and/or to develop a coagulation failure alarm system. The results of these studies are to be used to improve the existing online coagulant dosing control system.

Introduction
Coagulation is a well-known and widely used water and wastewater treatment method to remove suspended solids, phosphates and other water impurities. Particles which aggregate during the coagulation/flocculation process are called flocs. The amount of coagulant (a chemical whose addition destabilises the suspension), its nature, its concentration and the mixing conditions are the main factors influencing floc aggregation and breakage. The size and surface properties of flocs formed under different coagulation mechanisms strongly influence their behaviour during the subsequent solid-liquid separation process (Bache & Gregory, 2007). Many researchers have studied different floc features, such as size distribution and fractal dimension (Chakraborti, Atkinson, & Van Benschoten, 2000; Li, Zhu, Wang, Yao, & Tang, 2006; Vahedi & Gorczyca, 2011). Some attempts have been made to create online systems for floc characterisation and monitoring. A method of online floc size evaluation based on nephelometric turbidity measurements was presented (Cheng, Kao, & Yu, 2008) and further developed into a nephelometric turbidimeter monitoring system (NTMS) (Cheng, Chang, Chen, Yu, & Huang, 2011; Yu, Chen, & Cheng, 2017). The photometric dispersion analyser (PDA) is an optical instrument often used to study the characteristics of aggregates (Burgess, Curley, Wiseman, & Xiao, 2002; Chou, Lin, & Huang, 1998; Wu, Wang, Hu, & Ye, 2013). Although many attempts have been made to study, characterise and control the particles aggregating during coagulation, there is still a gap in the applied knowledge of how to use floc features to optimize the coagulation process and to provide a cheap and robust dosage control system.
Nowadays the flow-proportional dosing concept is usually used for coagulant dosage control, while process optimisation is often based on results from jar tests and the operator's experience (Ratnaweera & Fettig, 2015). The streaming current detector (SCD) was evaluated (Dentel, Thomas, & Kingery, 1989a, 1989b) and tested in drinking water treatment plants (DWTPs) for automatic coagulation control (Critchley, Smith, & Pettit, 1990; Yavich & Van De Wege, 2013). The coagulant dosage strategy based on zeta potential measurements has been documented to be a promising control technique (Sharp et al., 2005, 2016). Advanced soft sensors and coagulation process control models employing artificial neural networks (ANN) have been tested in DWTPs (Baxter et al., 2002; Juntunen, Liukkonen, Lehtola, & Hiltunen, 2013; Valentin & Denoeux, 2001). Advanced dosing control systems based on multiple water quality parameters that can be measured online have proved successful both in DWTPs (Liu, 2016; Liu, Ratnaweera, & Song, 2013) and in wastewater treatment plants (WWTPs) (VA-Support, 2012). Application of such systems enables a reduction of coagulant consumption (i.e. minimises the operational costs), reduces the sludge volumes and maintains the desired removal of particles and phosphates (Manamperuma, Ratnaweera, & Rathnaweera, 2013; Manamperuma, Wei, & Ratnaweera, 2017). With the growing need to optimise wastewater treatment processes, the need arises for further development of intelligent, accurate and reliable online dosing control systems (Ratnaweera & Fettig, 2015).
In most cases the applicability of particle detection methods in the wastewater coagulation process has been limited to lab scale due to complicated and inaccurate measurements in the field and to hardware and software limitations. We propose a new approach to image analysis of flocs, which was previously tested in laboratory-scale batch tests (jar tests) (Sivchenko, Kvaal, & Ratnaweera, 2014, 2016). It is a comparatively easy method of image characterization, which is based on analysis of the whole image texture instead of the shape characteristics of each particle in the image. Such an approach simplifies the image analysis stage: there is no need to extract/segment and count the particles in the image, which was found to be problematic in wastewater applications (Sivchenko et al., 2016). Hence, it significantly simplifies the software to be developed. Furthermore, out-of-focus flocs, which are often present in the images, are not a problem for this texture complexity recognition method. Thus, cheap cameras could potentially be employed for the floc sensor development. This paper presents the applicability of the concept in continuous mode with real wastewater, with a view to using it as a dosage control technique for optimizing the coagulation process.

Wastewater treatment plant
Full-scale tests were conducted in the Frogn wastewater treatment plant (Drøbak, Norway) in September and October 2015. Frogn WWTP receives municipal wastewater from Drøbak city and the neighbouring area. The average inlet flow is 4,600 m³/day on days without snowmelt and/or precipitation. The test period included days within a long precipitation period. The maximum flow rate to the plant reached 18,845 m³/day on 18 September 2015.
Frogn WWTP is a mechanical-chemical precipitation plant. The treatment process consists of the following stages: screens, two parallel pre-sedimentation basins, three sequenced coagulation chambers with different velocity gradients, and two parallel sedimentation chambers. The plant also has a sludge dewatering and thickening system. The inlet and outlet water quality parameters are measured by online sensors and recorded (average values) at 15 min intervals. The data are available for observation in the plant's SCADA system. The retrieved water parameters included inlet wastewater flow (QIN), inlet pH (PHI), inlet turbidity (TUI), inlet conductivity (CNI), coagulant dosage (Dose), pH after coagulant dosage (PHO) and outlet turbidity (TUO). The plant operators perform daily sampling of inlet and outlet total phosphorus (total P). The summarised data of water quality parameters for the test period are given in Table 1. The coagulant used in the Frogn WWTP is polyaluminium chloride (ECOFLOCK 90, Feralco), 9 ± 0.3% Al by weight, with a density of 1,356 ± 25 kg/m³.
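As a rough illustration of what the stated coagulant properties imply, the following Python sketch converts a volumetric product dose into an aluminium dose. It assumes that the dose is expressed as µl of coagulant product per litre of wastewater and uses the nominal values 9% Al by weight and 1,356 kg/m³; the function name `al_dose_mg_per_l` is hypothetical and the tolerances (± 0.3% Al, ± 25 kg/m³) are ignored.

```python
def al_dose_mg_per_l(dose_ul_per_l, al_fraction=0.09, density_kg_per_m3=1356.0):
    """Convert a product dose in ul/l into an aluminium dose in mg Al/l.

    Assumption: the dose is the volume of coagulant product per litre of wastewater.
    """
    density_mg_per_ul = density_kg_per_m3 / 1000.0  # 1 kg/m3 equals 0.001 mg/ul
    return dose_ul_per_l * density_mg_per_ul * al_fraction


# Illustrative value only, not plant data: 100 ul/l of product
print(round(al_dose_mg_per_l(100.0), 3))
```

Under these assumptions, each µl/l of product corresponds to roughly 0.12 mg Al/l.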

Image acquisition and pre-processing
A special installation was designed to observe changes in flocs' structure in situ. The installation was set above the second flocculation chamber and consisted of the tube, peristaltic pump, acrylic cell for image acquisition, digital camera and computer (Figure 1). To minimise the potential danger of flocs breakage, the chosen tube was 3 cm in diameter, and the peristaltic pump was placed after the imaging cell. The water flow in the system was upstream and manually adjusted to approximately 40 l/h.
Images of flocs were taken continuously at a pre-set interval using the free remote camera control software DigiCamControl 1.2.0. The image capturing equipment used during the investigations was as follows: a digital single lens reflex (DSLR) Nikon D600 camera, a 105 mm Nikkor AF-S Micro 1:2.8 G ED lens (Nikkor, China) and a SpeedLite YN460 flash (Yongnuo, China). The size of the image-capturing zone in the cuvette was 3.3 × 10.3 cm. In order to image the flocs within the proper depth of field, a black metal stripe was placed in the centre of the cuvette, which also served as a background for the flocs. The background colour was chosen because wastewater flocs are greyish; a contrasting background makes the subsequent image analysis easier.
The obtained images have a resolution of 24.3 megapixels each. They were processed in the open source image analysis software ImageJ v.1.49 (Rasband, 1997-2016), which is based on plugins and macros. For each image, an area of 1380 × 3640 pixels (2.4 × 6.3 cm) was cropped after manual inspection of the area. Because of slight changes in lighting conditions during image acquisition, all images were pre-processed in order to have the same brightness intensity. We wrote an ImageJ plugin to mean centre the grey-tone values of the images.
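The brightness equalisation step can be sketched in a few lines of Python. This is not the authors' ImageJ plugin; it assumes 8-bit grey images stored as nested lists and shifts every image to a common mean grey value. The function name `mean_centre` and the target mean of 128 are illustrative assumptions.

```python
def mean_centre(image, target_mean=128.0):
    """Shift all grey values so that the image mean equals target_mean.

    Values are clipped to the 8-bit range 0..255 (assumed image depth).
    """
    pixels = [p for row in image for p in row]
    offset = target_mean - sum(pixels) / len(pixels)
    return [[min(255.0, max(0.0, p + offset)) for p in row] for row in image]


# Two tiny images with different brightness end up with the same mean
dark = [[10, 20], [30, 40]]
bright = [[110, 120], [130, 140]]
print(mean_centre(dark) == mean_centre(bright))
```

Only the mean is equalised here; contrast differences between images are left untouched, which keeps the texture information intact.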

Image analysis by Grey level co-occurrence matrix (GLCM)
GLCM is a common method for image texture measurement. It was previously applied successfully to laboratory-scale data (Sivchenko et al., 2016). The GLCM method was chosen among other texture analysis methods because it is simple to compute and does not require heavy programming for the sensor prototype to be developed.
The ImageJ plugin "GLCM Texture Too" v. 0.009 was used to obtain the GLCM feature vectors. The output was a vector of the following 9 parameters for each image: contrast, correlation, inverse difference moment (IDM), entropy, energy, homogeneity, prominence, variance, and shade. Hence, a data matrix of size 342 × 9 was obtained. The detailed description, explanation, and equations for the above GLCM texture features are given elsewhere (Conners, Trivedi, & Harlow, 1984; Haralick, Shanmugam, & Dinstein, 1973; Zheng, Sun, & Zheng, 2006).
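To make the GLCM idea concrete, the sketch below builds a symmetric, normalised co-occurrence matrix for one pixel offset and derives a few of the listed features from it. It is a minimal pure-Python illustration, not the ImageJ plugin: the function names, the choice of offset and the exact feature formulas (which vary between implementations, for example for energy and homogeneity) are assumptions.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalised grey level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """A subset of GLCM features; the formulas are common textbook variants."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean = sum(i * v for i, _, v in cells)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),  # angular second moment
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "variance": sum((i - mean) ** 2 * v for i, _, v in cells),
    }


# Tiny two-level image, horizontal neighbour offset
features = texture_features(glcm([[0, 0, 1], [0, 1, 1]], levels=2))
print(features["contrast"], features["homogeneity"])
```

Because the features are computed from the whole image, no segmentation of individual flocs is needed, which is the simplification the method relies on.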

Conjugation of two data sets
Three images for each 15 min interval were chosen as representative. The measured GLCM feature vectors were averaged over each set of 3 images and matched with the retrieved water quality parameters. According to the tracer tests conducted in Frogn WWTP, the outlet turbidity values were shifted by 45 min to account for the response lag between the coagulant injection point and the outlet from the sedimentation tank. After the removal of missing values and outliers, the resulting data-set contains 114 samples. The data include 81.6% of samples under normal operating conditions, 15.8% of samples under rainy weather conditions and 2.6% of samples which had high inlet turbidity values due to the periodic discharge from septic tanks to the inlet.
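The two matching steps described above, averaging the feature vectors of three images per 15 min record and shifting the outlet turbidity by 45 min, can be sketched as follows. The function names and the list-based data layout are assumptions made for illustration; both series are assumed to be sampled on the same 15 min grid without gaps.

```python
def average_triplets(feature_rows, group_size=3):
    """Average consecutive groups of feature vectors (one group per 15 min record)."""
    averaged = []
    for start in range(0, len(feature_rows) - group_size + 1, group_size):
        group = feature_rows[start:start + group_size]
        averaged.append([sum(col) / group_size for col in zip(*group)])
    return averaged


def align_with_lag(inlet_rows, outlet_values, lag_minutes=45, interval_minutes=15):
    """Pair each inlet record with the outlet value measured lag_minutes later."""
    shift = lag_minutes // interval_minutes  # 45 min equals 3 records
    usable = min(len(inlet_rows), len(outlet_values) - shift)
    return [(inlet_rows[k], outlet_values[k + shift]) for k in range(usable)]


# Six images with two features each give two averaged records
print(average_triplets([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]]))
# Five inlet records paired with the outlet value three records later
print(align_with_lag(["a", "b", "c", "d", "e"], [1, 2, 3, 4, 5]))
```

A fixed shift is only an approximation of the hydraulic lag, which is why records near the end of the series are dropped rather than padded.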

Multivariate statistical analysis and modelling
The resulting data matrix was processed in the statistical software The Unscrambler ® X 10.3 (CAMO Software AS, Norway) and in MATLAB ® using PLS toolbox 8.2 (Eigenvector Research, Inc., USA). Principal component analysis (PCA) was performed to find the relationships between the water quality parameters and the GLCM feature vectors. PCA is a statistical data analysis technique used to reduce the dimensionality of a data-set, to overview and describe the interrelationships among variables and to find so-called hidden structures in the data. Partial least squares regression (PLSR) was performed to predict outlet turbidity based on different combinations of water quality parameters and GLCM texture features.
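For readers unfamiliar with PLSR, the following pure-Python sketch fits a single-response PLS model (PLS1) with the NIPALS algorithm and reports the calibration R² (explained Y variance) after each factor. It is a simplified stand-in for the commercial software named above: variables are only mean-centred (no scaling), no cross-validation is performed, and the function name and toy data are assumptions.

```python
def pls1_calibration_r2(X, y, n_factors):
    """Fit PLS1 by NIPALS on centred data; return calibration R^2 after each factor."""
    n, m = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - col_means[j] for j in range(m)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                   # y residuals
    ss_total = sum(v * v for v in f)
    r2 = []
    for _ in range(n_factors):
        # Weight vector: direction in X with maximum covariance with y
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sum(v * v for v in w) ** 0.5
        if norm == 0.0:
            break  # nothing left to explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # X loadings
        q = sum(t[i] * f[i] for i in range(n)) / tt                         # y loading
        # Deflate X and y by the part explained by this factor
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        r2.append(1.0 - sum(v * v for v in f) / ss_total)
    return r2


# Toy data (not plant data): y depends linearly on two predictors
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]]
y = [2 * a + 0.5 * b for a, b in X]
print(pls1_calibration_r2(X, y, n_factors=2))
```

Each additional factor can only increase the calibration R², so validation on held-out data is needed to choose the number of factors in practice.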

Data overview
At the time when the tests were conducted, Frogn WWTP used the flow-proportional concept of coagulant dosage control with the option of manual dose adjustment. The plant now has an advanced dosage control system, DOSCON ® (DOSCON AS, Oslo, Norway), which uses the inlet wastewater quality parameters to calculate the optimal dose. A detailed description of the control strategy can be found elsewhere (Liu & Ratnaweera, 2017; Manamperuma et al., 2017). Although the coagulant consumption and the plant performance were significantly improved, the system still depends on the reliability of the online equipment functioning in a harsh environment. Figure 2 shows an example of the load changes during the 9-day observation period in September 2015. The first 3 days represent the typical wastewater flow variations during the day under normal (dry) weather conditions. On the fourth day, the rainy week started, and the wastewater load increased significantly, reaching a peak of 900 m³/h at the beginning of the ninth day. The corresponding flow-proportional coagulant dosages in μl/l are also marked in the figure. Images of exemplary floc structures that appeared under different coagulation conditions are shown for the days when image analysis observations took place. Visually, the flocs formed during normal operating conditions tend to be bigger than those formed during the rainy days. This can be explained by the change in inlet wastewater composition, when domestic wastewater was highly diluted by stormwater. During the rain events the average inlet flow to the plant increased 3 times, the average inlet turbidity decreased 2-2.5 times, the average inlet conductivity of the wastewater decreased 1.3 times and the average inlet pH decreased from 7.2 to 6.9. These changes in wastewater inlet parameters could result in a change of the properties and size of the aggregates (Bache & Gregory, 2007).
Even though the coagulant dosing control in the plant was based on the flow-proportional concept, the dosage of coagulant was lowered manually during the rainy period. The relatively high turbidity values of the outlet water from the sedimentation tank during the last two rainy days point to poor coagulation conditions with non-optimal dosages. Multivariate statistical analysis was employed to test whether the images of flocs appearing during coagulation reveal information about the coagulation conditions and are related to the wastewater characteristics, doses and effluent water quality.
https://doi.org/10.1080/23311916.2017.1373416

Principal component analysis
Results previously obtained at laboratory scale showed that the images of flocs are unique for different water conditions and that texture analysis methods have the potential to be used for further floc sensor development. Principal component analysis (PCA) applied to the full-scale data showed that the images obtained under different inlet wastewater parameters and coagulation conditions also contain unique information. A further PCA was performed on the water quality parameters (QIN, TUI, PHI, CNI, PHO, TUO), the GLCM textural features (9 variables) and the coagulant dosage in ml/s, in order to compare the resulting scores plot with the one described above. However, no large differences can be observed in terms of the sample classes. Samples corresponding to the rain events are more spread along PC2, which is associated with high inlet and outlet turbidity, high wastewater flow and coagulant dosage. Total (cumulative) explained variance for calibration: PC1 = 45.5%, PC2 = 73.6%, PC3 = 83.4%, PC4 = 89.6%; for cross-validation: PC1 = 38.3%, PC2 = 66.9%, PC3 = 76.9%, PC4 = 83.3%.

Floc texture index (FTI)
Figure 3 shows the potential for the sensor prototype to be developed. However, for the instrument to function as a sensor, it should preferably deliver a single output signal. Thus, the 9 GLCM feature vectors should be reduced to one variable. We introduce a new term, the floc texture index (FTI). FTI is a sum of four GLCM feature vectors:

$$\mathrm{FTI} = (\mathrm{Contrast} + \mathrm{Entropy} + \mathrm{Homogeneity} + \mathrm{Variance}) \times 10^{-2} \qquad (1)$$

The other GLCM feature vectors were excluded from the equation because they have very high (near 1) correlation coefficients with some of the included variables, as can be seen from the loadings in Figure 3. Since Variance has values in the hundreds, FTI was divided by 100 for simplification. The results of the FTI calculation for the 3 sample classes are presented in Table 2.
The calculated FTIs and the corresponding observed outlet turbidity values are presented in Figure 4. Frogn WWTP aims to keep the effluent turbidity (TUO) below 5 FNU. Hence, TUO values below 5 FNU are marked as orange triangles, while the values above 5 FNU are marked as orange open triangles. The lower limit for FTI was chosen to be 6. While FTI is above 6, the corresponding TUO is in most cases below 5 FNU, and vice versa. Three events are highlighted in the figure. The first red dashed box highlights the rain event. The images of flocs quantified as FTIs gave an early indication that the outlet turbidity would exceed the maximum desired level. Even though the TUO values were shifted in the data-set to correspond to the inlet wastewater quality and the dosed amount of coagulant, the flow through the treatment plant is dynamic and not an ideal plug flow, so the time lag between a flocculation chamber and the outlet from the sedimentation tank is sometimes longer than 45 min. In such cases, the early indication of changes in coagulation conditions by the images of flocs (FTI) is desirable and a significant advantage of the planned dosage control system. The 2a and 2b red dashed boxes highlight events when the coagulant dosages were not optimal, which resulted in an increase of effluent turbidity. The third event highlighted by a red dashed box corresponds to high inlet turbidity. Overall, FTI was able to indicate all non-optimal conditions of the coagulation process in this particular data-set.
With a bigger training data-set, it is potentially possible that FTI will have lower and higher limits to indicate changes in coagulation conditions and/or take action in raising or lowering the coagulant dosage.
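The FTI definition and the lower limit discussed in this section translate directly into a small early warning check. The sketch below follows the stated definition (sum of Contrast, Entropy, Homogeneity and Variance, scaled by 10^-2) and the lower limit of 6; the function names and the numeric feature values in the example are made up for illustration and are not plant data.

```python
def floc_texture_index(contrast, entropy, homogeneity, variance):
    """FTI as defined in the text: sum of four GLCM features scaled by 1/100."""
    return (contrast + entropy + homogeneity + variance) * 1e-2


def coagulation_warning(fti, lower_limit=6.0):
    """Flag a possible coagulation problem when FTI drops below the lower limit."""
    return fti < lower_limit


# Hypothetical feature values for two images
fti_ok = floc_texture_index(contrast=150.0, entropy=8.0, homogeneity=0.5, variance=541.5)
fti_low = floc_texture_index(contrast=90.0, entropy=7.0, homogeneity=0.6, variance=402.4)
print(coagulation_warning(fti_ok), coagulation_warning(fti_low))
```

If an upper limit is established with a larger data-set, the same check could be extended to a two-sided band.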

Effluent turbidity prediction
Different PLSR models were tested in order to get the best prediction of the effluent turbidity based on the inlet wastewater quality parameters and the images of flocs. DOSCON ® currently uses a multiparameter based feed-forward control strategy (Manamperuma et al., 2017). To strengthen the robustness and widen the applicability of the dosage control system, a soft sensor should be developed to predict the effluent turbidity. If the plant operator knows a few hours in advance that the outlet turbidity may exceed the maximum desired value, there is enough time to take action and adjust the coagulant dosage. Ideally, such a system is to be developed into a self-standing dosage control strategy.
The models with the highest explained variances are presented in Table 3. The response (Y) for all models was the effluent turbidity (TUO), while the X-matrix consisted of different variables. The prediction by only wastewater inlet parameters (QIN, TUI, CNI, PHI, Dose) resulted in 80.5% calibration Y variance explained by two factors. The simplest model, based on the inlet parameters QIN and TUI, resulted in 79.7% calibration Y variance explained by one factor. With the addition of the image analysis results (FTI), the calibration R² increased to 0.84. The best prediction model for TUO, with about 87% calibration Y variance explained by two factors, included both inlet wastewater quality parameters (QIN and TUI) and some GLCM feature vectors (Variance, Prominence, Correlation and Contrast).
Even though the addition of FTI or GLCM variables to the models resulted in only a 4 and 7% increase of explained Y variance, respectively, the main advantage of the image analysis supported prediction is a better estimation of the outlet turbidity values that exceed the target (more than 5 FNU). Figure 5 shows a comparison of two TUO estimation models. The first model (Figure 5(a)) is based only on wastewater inlet parameters and tends to underestimate TUO, which is acceptable for TUO values below 5 FNU and close to it. However, there are two samples with considerably high TUO (over 10 FNU) which were underestimated by the model (predicted values lower than 5 FNU). The addition of the GLCM feature vectors (Variance, Prominence, Correlation and Contrast) as predictor variables improved the estimation and resulted in fewer underestimated effluent turbidity values.
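The comparison criterion used here, how often a model predicts an acceptable turbidity when the observed value actually exceeds 5 FNU, can be expressed as a simple count. The sketch below is an illustration with made-up numbers, not the data behind Figure 5; the function name is hypothetical.

```python
def count_missed_exceedances(observed, predicted, limit=5.0):
    """Count samples whose observed TUO exceeds the limit while the prediction does not."""
    return sum(1 for o, p in zip(observed, predicted) if o > limit and p <= limit)


# Hypothetical observed TUO values (FNU) and predictions from two models
observed = [3.0, 11.2, 6.0, 12.5]
inlet_only = [2.5, 4.1, 5.5, 4.8]      # model with inlet parameters only
with_texture = [2.8, 9.0, 5.9, 7.5]    # model with added texture features
print(count_missed_exceedances(observed, inlet_only),
      count_missed_exceedances(observed, with_texture))
```

For an alarm application this count is arguably more informative than R² alone, because it targets exactly the samples where underestimation has operational consequences.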

Difficulties and further research needs
Some weaknesses of the described image analysis installation are that the employed camera is quite expensive and difficult to control properly. It is also limited by its battery charge, since it cannot be powered directly from the mains. The system was not able to work continuously, and this is the main reason the test data-set contains a rather low number of samples. In addition, the resulting images have a high resolution but require considerable storage space. Currently, all the calculations of GLCM features and FTIs are done on an external computer. However, we see the potential for the sensor prototype to be further developed. In the next stage, it should be a self-standing computational system with a cheaper camera, which can work continuously with pre-set settings.
Further research is necessary to develop a fully automated floc sensor prototype. The concept of the system will be based on automated acquisition of floc images and their texture analysis, with further matching of the resulting data to different mathematical models that define the optimal coagulant dose. The improvement of the existing on-line dosage control system is a key focus of this research.

Conclusions
The images of flocs give a sharp indication of the changes of inlet wastewater parameters and/or coagulation conditions. The images of flocs are unique for different wastewater qualities and coagulation conditions. GLCM textural features (quantified images of flocs) can distinguish and separate different wastewater coagulation conditions: normal, during the precipitation events, wastewater with the high inlet turbidity.
The floc texture index was introduced and calculated by summing four GLCM feature vectors: Contrast, Entropy, Homogeneity and Variance. FTI can be used as an early indication parameter of the changes in wastewater qualities and coagulation conditions which lead to an increase of effluent turbidity. However, further research is needed with bigger calibration and validation data sets.
Effluent turbidity values can be predicted from a few inlet wastewater parameters, flow and inlet turbidity, with R² = 0.79. The addition of data from processed floc images improves the outlet turbidity prediction to R² = 0.87.
The study shows the potential for developing a floc sensor prototype. The images of flocs may be used online for troubleshooting and to improve the existing coagulant dosage control system.