Predictive model for monitoring water turbidity in a subtropical lagoon using Sentinel-2A/B MSI images

ABSTRACT Ensuring prompt and effective water quality monitoring is increasingly important. Remote sensing has been shown to be an effective tool for simplifying and speeding up this process. The aim of this study is to develop an empirical model to map the spatial and temporal dynamics of turbidity in Mirim Lagoon, located in southern Brazil. To achieve this, Sentinel-2A/B MSI sensor data were combined with turbidity data collected in situ. The model was applied to monthly images (with cloud cover ≤ 20%) in 2019 and 2020 using the Google Earth Engine (GEE) platform. Mean turbidity values in the lagoon did not vary significantly, remaining between 30 and 75 NTU overall. However, there were differences in turbidity levels between the northern and southern regions of the lagoon in some months of the investigated years. By applying this methodology and analyzing the results, we were able to better understand the behavior of turbidity throughout the lagoon and gain insights into the quality of this important freshwater source.


INTRODUCTION
Water-quality monitoring must consider the biological, chemical and physical features of water that play an essential role in the survival of living beings and in maintaining water uses such as supply, irrigation and recreation (International Ocean Colour Coordinating Group, 2018). Monitoring methods traditionally comprise measurements taken in situ; however, these point-based data have limitations that are mainly associated with the spatial and temporal representativeness of quality parameters (Duan et al., 2013). Thus, Remote Sensing (RS) is a valuable resource for the environmental control of water, since it overcomes these limitations by enabling water-quality monitoring at wide spatial and temporal scales. According to Sagan et al. (2020), this technique relies on monitoring the concentration of parameters known as Optically Active Components (OAC) (Barbosa et al., 2019; Silva et al., 2008).
The presence of suspended sediments leads to increased reflectance in the visible and near infrared (NIR) spectrum, and turbidity is the optical response to this increase. In addition, it has a negative effect on the transparency of, and light availability within, water columns (Caballero et al., 2019). This water-quality parameter expresses the amount of particulate matter deriving from soil erosion in areas adjacent to water bodies and from their rocky bed, since erosion is often the main factor accounting for the transport of nutrients and contaminants in water bodies (Dörnhöfer & Oppelt, 2016). Temporal variations in turbidity levels may be linked to both weather patterns and human activities performed along the margins of water bodies (Luis et al., 2019). Therefore, it is recommended to conduct multi-temporal studies that use remote sensing to analyze how changes in turbidity can impact the optical properties of water columns.
Multispectral sensors, such as the MultiSpectral Instrument (MSI) aboard the Sentinel-2A and -2B (S2A/B) satellites, are a promising tool for monitoring turbidity in coastal and inland waters (Sebastiá-Frasquet et al., 2019). The S2A/B satellites acquire images every 5 days (temporal resolution), at spatial resolutions ranging from 10 m to 60 m and with 12-bit radiometric resolution (Hedley et al., 2012). In addition, advances in cloud computing provided by the Google Earth Engine (GEE) geospatial platform allow robust spatial and temporal analyses based on a ready-to-use, pre-processed data catalog (Gorelick et al., 2017). The use of Sentinel-2A/B satellite images, in conjunction with the cloud computing capabilities provided by GEE, has been demonstrated to be an interesting alternative for assessing OAC in large environmental studies (Bustamante et al., 2009; Caballero et al., 2019; Dörnhöfer & Oppelt, 2016; Fassoni-Andrade et al., 2017; Fraga et al., 2020; Garg et al., 2020; Liu et al., 2019; Potes et al., 2018).
This study analyzed the turbidity behavior in Mirim Lagoon, taking into account the significance of maintaining water quality for its multiple uses, as well as the potential benefits offered by advancements in cloud computing technology. The objective of this article was to assess the spatial and temporal distribution of turbidity in Mirim Lagoon through empirical modeling, utilizing Sentinel-2A/B MSI images in GEE.

Study site
Mirim Lagoon is located in the coastal plain of southern Rio Grande do Sul State, between Brazil and Uruguay. Its mean width is 20 km; its surface area covers 3,750 km², of which 2,750 km² lie in Brazilian territory and 1,000 km² in Uruguayan territory (Agência para o Desenvolvimento da Bacia da Lagoa Mirim, 2020); and its mean depth is approximately 4.5 m (Munar et al., 2017). Figure 1 shows the location of Mirim Lagoon and of the monitoring points, together with the 2019 land use and land cover map and the hydrography of the region. Fifteen points were sampled in northeastern Mirim Lagoon (Figure 1); this area is directly affected by the São Gonçalo channel and is a good representation of the lagoon's extent.
Mirim Lagoon water contributes to the quality of life of approximately 1 million people who live around it. Thus, this lagoon plays a key role as a freshwater reservoir (Oliveira et al., 2015), as well as in maintaining moisture in the wetlands of Taim Ecological Station. Farming prevails among the main land uses in the Mirim Lagoon watershed (Figure 1). The use of fertilizers on agricultural crops, population growth, and domestic and industrial waste deposition are among the main sources of pollutants in the lagoon, since they can worsen unbalanced nutrient availability and increase the risk of eutrophication. It should be noted that Mirim Lagoon is classified as a shallow waterbody due to its limited depth and its location in flat terrain. Water depth varies along the length of the lagoon: the greatest depths are found at the southeastern edge and the shallowest towards the São Gonçalo Channel (Bendô, 2019). Water levels can reach values below 3 m in the center of the lagoon. The northern part has a mean water depth of about 6 m, and the southern part of the lagoon shows depths above 9 m, with the exception of a section in the center of the lagoon, where average depths of around 5 m are found (Bendô, 2019).
In addition, agriculture is practiced in its lowland areas, a fact that makes this ecosystem more vulnerable to surrounding natural and anthropogenic variations, and even more susceptible to variations in its composition throughout the year (Scheffer, 1998; Trindade et al., 2009; Zambrano et al., 2009). The land in its surroundings is intensively used for agricultural and livestock purposes, which makes the water-quality status of this inland waterbody vulnerable to anthropogenic pollution (Janse et al., 2008; Scheffer, 1998).
The analysis conducted in this study is warranted by the size and significance of this regional and transboundary lagoon, situated in Southern Brazil and Northern Uruguay, and by its importance to the population of all 17 counties within its drainage basin on the Brazilian side. With an influence area spanning 29,250 km², which represents 47% of the total basin area of 62,250 km², Mirim Lagoon holds great importance (Agência para o Desenvolvimento da Bacia da Lagoa Mirim, 2020).

Data collection in the field and turbidity analysis
Water samples were collected at 15 different points, which were previously distributed in the northern region of Mirim Lagoon (Figure 1). Samples were collected on two different dates, under different hydrological and climatic conditions: October 16, 2018 (austral spring) and April 9, 2019 (austral fall). In situ collection started at 9:00 am on the day after the Sentinel-2 satellite passed over the investigated region, so sampling took place less than 24 hours after the overpass. Approximately two liters (2 L) of water were collected at a depth of 0.30 m at each collection point and kept under refrigeration. Subsequently, the samples were sent to the Water and Waste Laboratory of the bachelor's degree courses in Chemical Engineering and Chemistry Technician at the Federal Institute of Education, Science and Technology of Rio Grande do Sul (IFSul), Pelotas Campus, RS. Turbidity levels were measured with a Policontrol turbidimeter, model AP2000iRC/RS232, right after the water samples arrived at the laboratory. Water sample collection, storage for transport and analysis were performed as described in SMEWW method 2130B (American Public Health Association, 2005).

MSI/Sentinel-2 data
The Copernicus program developed by the European Space Agency (ESA) has adopted a multi-platform strategy to increase temporal resolution while maintaining other resolutions. Recently, the launch of the MultiSpectral Instrument (MSI) sensor aboard the Sentinel-2A and -2B satellites increased the temporal resolution from 10 to 5 days (Barbosa et al., 2019). The MSI sensor of the Sentinel-2 satellites comprises 13 spectral bands: four (4) bands have 10 m spatial resolution (SR), six (6) bands have 20 m SR, and three (3) bands have 60 m SR; radiometric resolution is 12 bits per pixel (European Space Agency, 2020). According to Martins et al. (2017), the MSI/Sentinel-2 configuration enables mapping small and irregular water systems at higher radiometric sensitivity, in order to distinguish biotic variables, and it offers a suitable temporal frequency of 5 days over the study area, which is an important feature for inland water studies.

Generating the empirical turbidity model
Images acquired on October 15, 2018, and on April 8, 2019, by the MSI sensor aboard the Sentinel-2 satellite were used to build the empirical turbidity model; they did not show clouds in the analyzed areas. To perform the turbidity analysis, the images were atmospherically corrected using the Sen2Cor algorithm. Sen2Cor is a processing algorithm that uses a combination of techniques to convert data from Level-1C reflectance (i.e., TOA, Top-Of-Atmosphere) into Level-2A reflectance (i.e., BOA, Bottom-Of-Atmosphere) to generate a corrected surface-reflectance product (Richter et al., 2011).
Multiple regression analysis was applied to generate the empirical model, using satellite bands (from the images matching the collection dates) as independent variables and turbidity (the parameter to be estimated) as the dependent variable. Reflectance values for the bands used to generate the empirical model were extracted from the aforementioned images. The adopted bands (and their ratios) were B2 (blue), B3 (green), B4 (red) and B8 (near infrared), all of which have 10 m spatial resolution.
To investigate data normality, a non-parametric Kolmogorov-Smirnov (K-S) test was performed at a 5% significance level (p < 0.05). Pearson's parametric correlation coefficient was used whenever the null hypothesis of normality was accepted (Hair Junior et al., 2009). The variable "turbidity", as well as factors such as individual band reflectances and band ratios, were used to analyze the correlation between in situ and reflectance data.
Bands and band ratios (B3, B4, B8, B3/B2, B2/B3, B4/B3) presenting normal distribution and satisfactory correlation to turbidity were selected. Predictive models used to estimate turbidity were developed based on the stepwise multiple linear regression (MLR) method. Stepwise regression is a step-by-step, iterative construction of a regression model in which independent variables are selected for use in the final model.
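The correlation screening described above can be illustrated with a minimal sketch. It assumes that reflectance values have already been extracted at the sampling points; the reflectance and turbidity values below are hypothetical placeholders rather than the study's data, and the helper `pearson_r` is an illustrative name, not part of the authors' workflow (which was carried out in R).

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical surface reflectance per band at five points (not the study's data).
samples = {
    "B2": [0.040, 0.042, 0.045, 0.047, 0.050],
    "B3": [0.060, 0.066, 0.071, 0.078, 0.084],
    "B4": [0.050, 0.057, 0.062, 0.070, 0.077],
}
turbidity_ntu = [24.0, 29.5, 33.0, 39.0, 44.5]  # hypothetical in situ values

# Candidate predictors: single bands plus band ratios.
predictors = dict(samples)
predictors["B3/B2"] = [g / b for g, b in zip(samples["B3"], samples["B2"])]
predictors["B4/B3"] = [r / g for r, g in zip(samples["B4"], samples["B3"])]

# Correlation of each candidate predictor with turbidity.
correlations = {name: pearson_r(vals, turbidity_ntu) for name, vals in predictors.items()}
for name, r in correlations.items():
    print(f"{name}: r = {r:.3f}")
```

In practice, a normality check would precede this step, as the text describes; it is omitted here to keep the sketch short.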

Turbidity model validation
The selected empirical model was validated using the Monte Carlo method (Pinto et al., 2016), which involved performing 100,000 iterations on random variables. The randomly generated values showed a normal distribution. The Monte Carlo method is a computational algorithm that uses random and repeated sampling to produce approximate results. The method involves generating random numbers for each primary variable according to a probability density function (PDF), which are then propagated through a mathematical measurement model. The method employs the concept of propagating probability distributions of input quantities (prior information), assuming a normal distribution for each input quantity. These distributions are then passed through the mathematical measurement model M times (where M is the number of iterations), resulting in a new distribution. According to the Monte Carlo method, the procedure for assessing uncertainty can be divided into three stages: input, processing, and output. The input data include (a) the mathematical measurement model; (b) the probability density function for each input quantity; and (c) the number of iterations (Pinto et al., 2016).
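The three stages described above (input, processing, output) can be sketched as follows. This is a minimal illustration under stated assumptions: the linear model coefficients and the mean and standard deviation of each input are hypothetical placeholders and are not those of the study's model, and `monte_carlo_turbidity` is an illustrative name.

```python
import random
import statistics

def monte_carlo_turbidity(model, inputs, n_iter, seed=0):
    """Propagate normally distributed inputs through `model` n_iter times."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    outputs = []
    for _ in range(n_iter):
        # Processing stage: draw one value per input quantity from its normal PDF.
        draw = {name: rng.gauss(mu, sigma) for name, (mu, sigma) in inputs.items()}
        outputs.append(model(**draw))
    return outputs

# Input (a): measurement model. Coefficients are hypothetical placeholders.
def linear_model(b3, b8):
    return 5.0 + 300.0 * b3 + 200.0 * b8

# Input (b): hypothetical (mean, standard deviation) of each reflectance.
inputs = {"b3": (0.070, 0.008), "b8": (0.040, 0.006)}

# Input (c): number of iterations M, matching the 100,000 used in the text.
simulated = monte_carlo_turbidity(linear_model, inputs, n_iter=100_000)

# Output stage: summarize the resulting distribution.
print(f"mean  = {statistics.fmean(simulated):.2f} NTU")
print(f"stdev = {statistics.stdev(simulated):.2f} NTU")
```

Because the model is linear and the inputs are drawn as independent normal variables, the simulated output is itself normally distributed, which is consistent with the bell-shaped histogram the text reports.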

Spatial and temporal mapping
After generating the empirical model to estimate turbidity in Mirim Lagoon, the spatial and temporal mapping of this parameter was extended to the entire lagoon. This procedure was carried out on the GEE platform, which is a programming interface that provides computational resources and free satellite data in the cloud, reducing users' need for robust local computing resources and data storage (Gujrati & Jha, 2018). MSI/Sentinel-2 images of the surface water of the entire Mirim Lagoon, obtained from the GEE platform, were used. The images used in this stage of the study were already available as surface reflectance data, which means that they had already been subjected to atmospheric correction using the Sen2Cor algorithm. Atmospherically corrected images of the Mirim Lagoon region are available in GEE from December 2018 onwards. For this reason, we chose to analyze data from 2019 and 2020.
First, for the spatial analysis of turbidity, the image of the lagoon with the lowest cloud cover in each month of the analyzed years was selected (the maximum accepted cloud cover was 20%). In total, 23 different images of Mirim Lagoon were used in the analysis, since no image taken in October 2019 met the cloud cover criterion (≤20%).
As a second step, all images (87 in total) with cloud cover lower than 20% in the investigated interval were selected for the temporal mapping of turbidity; a cloud mask was also applied to remove clouds from the images, and the empirical model was applied to all of them. In total, 999 random points were generated throughout Mirim Lagoon in order to extract turbidity values and perform the temporal analysis; the mean turbidity of these points was calculated for each image in order to plot a temporal turbidity graph for 2019 and 2020. Rainfall data extracted from the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) were also used to help understand the turbidity dynamics at the investigated site; CHIRPS provides daily rainfall data at 0.05° resolution by combining satellite imagery with data collected in situ (Funk et al., 2015). Daily CHIRPS data were aggregated into monthly data and extracted for the entire extent of Mirim Lagoon. They were then compared with the mean monthly turbidity obtained from the monthly means of all 87 selected images.
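The monthly selection rule can be sketched in plain Python. The scene list below is hypothetical, and `least_cloudy_per_month` is an illustrative helper rather than the authors' GEE script; in GEE the scene-level cloud percentage is available as an image property (for Sentinel-2, `CLOUDY_PIXEL_PERCENTAGE`), which is assumed here to have been read into `cloud_pct`.

```python
from datetime import date

def least_cloudy_per_month(scenes, max_cloud=20.0):
    """Return {(year, month): scene}, keeping the least cloudy scene per month."""
    best = {}
    for scene in scenes:
        if scene["cloud_pct"] > max_cloud:
            continue  # scene fails the cloud-cover criterion
        key = (scene["date"].year, scene["date"].month)
        if key not in best or scene["cloud_pct"] < best[key]["cloud_pct"]:
            best[key] = scene
    return best

# Hypothetical scene metadata (identifiers and percentages are made up).
scenes = [
    {"id": "a", "date": date(2019, 1, 4), "cloud_pct": 12.0},
    {"id": "b", "date": date(2019, 1, 19), "cloud_pct": 3.5},
    {"id": "c", "date": date(2019, 2, 8), "cloud_pct": 18.0},
    {"id": "d", "date": date(2019, 10, 6), "cloud_pct": 45.0},
    {"id": "e", "date": date(2019, 10, 21), "cloud_pct": 27.0},
]

selected = least_cloudy_per_month(scenes)
for (year, month), scene in sorted(selected.items()):
    print(year, month, scene["id"], scene["cloud_pct"])
```

A month in which every scene exceeds the threshold simply has no entry in the result, which mirrors how a month can drop out of a 24-month series.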
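The monthly comparison between turbidity and rainfall can be sketched as below. All values are hypothetical. The sketch also assumes that monthly rainfall is obtained by summing daily values, which the text does not state explicitly; `monthly_mean` and `monthly_total` are illustrative helper names.

```python
from collections import defaultdict
from datetime import date

def monthly_mean(records):
    """Average (date, value) pairs by (year, month)."""
    groups = defaultdict(list)
    for day, value in records:
        groups[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}

def monthly_total(records):
    """Sum (date, value) pairs by (year, month), e.g. daily rainfall in mm."""
    totals = defaultdict(float)
    for day, value in records:
        totals[(day.year, day.month)] += value
    return dict(totals)

# Hypothetical per-image mean turbidity (mean over the random points), in NTU.
image_means = [
    (date(2019, 4, 3), 49.0), (date(2019, 4, 18), 53.0),
    (date(2019, 8, 11), 40.0),
    (date(2020, 2, 7), 75.0),
]
# Hypothetical daily rainfall over the lagoon, in mm.
daily_rain = [
    (date(2019, 4, 2), 30.0), (date(2019, 4, 20), 50.0),
    (date(2019, 8, 5), 90.0), (date(2019, 8, 25), 70.0),
    (date(2020, 2, 14), 54.0),
]

turbidity_by_month = monthly_mean(image_means)
rain_by_month = monthly_total(daily_rain)

# Pair the two monthly series on their shared (year, month) keys.
for key in sorted(turbidity_by_month):
    print(key, turbidity_by_month[key], rain_by_month.get(key))
```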

Generation of the empirical turbidity model
The descriptive statistics of the turbidity data for the 30 water samples (N) were: mean turbidity of 34.82 ± 6.66 NTU, minimum of 23.10 NTU, and maximum of 45.80 NTU. Table 1 shows correlation values between turbidity data and band reflectance/ratio for those parameters that showed normality based on the nonparametric Kolmogorov-Smirnov (K-S) test (5% significance level). Bands B3 (green), B4 (red) and B8 (near infrared) and the band ratios B3/B2, B2/B3 and B4/B3 (where B2 is the blue band) were used, since they were capable of capturing the optical behavior of turbidity. All analyzed bands and band ratios presented significant correlation to turbidity (Table 1); thus, they were all used as input data to generate the empirical model.
The best model was based on the B3 (green) and B8 (near infrared) bands; it showed the best performance among all alternatives, with r = 0.836, r² = 0.700 (determination coefficient) and adjusted r² = 0.677. These values indicate the extent to which the selected model fits the data set, i.e., how representative the model is of the turbidity recorded in Mirim Lagoon water. The model had an RMSE of 3.588 NTU and an MAE of 2.597 NTU; its DW value was 1.622, according to which the regression residuals did not show autocorrelation.
The selected empirical multivariate regression model is presented in Equation 1. Figure 2a shows the scatter plot of observed versus predicted values for the turbidity model applied to Mirim Lagoon, RS.
(1)
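For readers who want to reproduce this kind of fit, the sketch below estimates a two-predictor linear model by ordinary least squares and computes r², adjusted r², RMSE, MAE and the Durbin-Watson statistic. It is a minimal illustration, not the authors' R workflow: the reflectance and turbidity values are hypothetical, the fitted coefficients are therefore not those of Equation 1, and RMSE is assumed here to be computed with n in the denominator.

```python
import math

def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_mlr(rows, y):
    """Ordinary least squares with intercept; returns [b0, b1, ..., bp]."""
    design = [[1.0, *row] for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def regression_metrics(rows, y, coefs):
    """Goodness-of-fit indices for a fitted linear model."""
    n, p = len(y), len(coefs) - 1
    pred = [coefs[0] + sum(c * v for c, v in zip(coefs[1:], row)) for row in rows]
    resid = [obs - est for obs, est in zip(y, pred)]
    mean_y = sum(y) / n
    ss_res = sum(e * e for e in resid)
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    r2 = 1 - ss_res / ss_tot
    return {
        "r2": r2,
        "r2_adj": 1 - (1 - r2) * (n - 1) / (n - p - 1),
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in resid) / n,
        # Durbin-Watson: values near 2 suggest no first-order autocorrelation.
        "dw": sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, n)) / ss_res,
    }

# Hypothetical (B3, B8) reflectance pairs and in situ turbidity in NTU.
rows = [(0.060, 0.030), (0.066, 0.036), (0.071, 0.033),
        (0.078, 0.041), (0.084, 0.039), (0.090, 0.047)]
turbidity_ntu = [24.0, 29.5, 31.0, 38.5, 40.0, 46.0]

coefs = fit_mlr(rows, turbidity_ntu)
metrics = regression_metrics(rows, turbidity_ntu, coefs)
print("coefficients:", [round(c, 2) for c in coefs])
print({name: round(value, 3) for name, value in metrics.items()})
```

The adjusted r² penalizes the number of predictors p relative to the sample size n, which is why it is slightly lower than r² in the values reported above (n = 30, p = 2).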

Validation
Turbidity data were satisfactorily validated with the Monte Carlo method (Figure 2b), which yielded a p-value of 0.99995. Mean turbidity was very similar between the in situ values (30 points) and the 100,000 simulations: the mean over the 100,000 iterations was 34.818 NTU, which is quite close to the in situ mean of the 30 samples (34.823 NTU).

Spatial and temporal mapping
Equation 1 was applied to map turbidity in Mirim Lagoon water, based on all 23 selected images. Figure 3 shows the spatial distribution of turbidity, from which the water-quality dynamics along the entire extent of Mirim Lagoon can be observed.
The spatial analysis of turbidity (Figure 3) showed that the lowest turbidity values were often found in Northern Mirim Lagoon, where it connects to the São Gonçalo channel. In the first 5 months of 2019, some specific monitoring points exhibited high turbidity, but mean turbidity did not vary significantly across the lagoon. From June/July to approximately November 2019, there was a brief increase in turbidity, mainly in the southern region of Mirim Lagoon. However, turbidity levels decreased in the subsequent months until April 2020. High turbidity (reaching 60 NTU or higher) over the entire area was observed in April and June 2020, when mean rainfall reached 86 mm and 157 mm, respectively. Medium-to-high turbidity was observed in the other months of 2020, although August 2020 recorded values relatively lower than those observed for the other months evaluated in the current study. Mean rainfall in that month was approximately 50 mm (Figure 4a).
RBRH, Porto Alegre, v. 28, e5, 2023
The annual variability in mean turbidity in Mirim Lagoon water can be observed in Figure 4b. Overall, values ranged from 30 NTU to 75 NTU. Monthly turbidity variability is shown together with mean monthly rainfall in 2019 and 2020 (Figure 4a). The correlation between the two variables was not significant, although a certain trend in their association can be observed overall. However, high turbidity values observed in some months (such as Feb/20 and Nov/20) cannot be explained by a direct increase in rainfall.
Considering the analysis of mean monthly turbidity and rainfall in the lagoon (Figure 4a), intense rainfall events in some months coincided with high turbidity values, such as in August 2019, when mean rainfall reached approximately 160 mm and turbidity reached 40 NTU. However, some increased-turbidity episodes were not associated with high monthly rainfall values. Rainfall in April 2019 reached 80 mm, whereas mean turbidity in the lagoon reached 51 NTU. The highest mean turbidity value observed in the lagoon was 75 NTU, in February 2020, when rainfall was 54 mm. Although an association between rainfall and increased turbidity is often expected, these episodes may be linked to other variables, such as effluent entry into the lagoon and drainage of water from surrounding crops, among others.

DISCUSSION
It is essential to understand turbidity dynamics in water bodies when analyzing water quality, as well as to better understand the phenomena that may interfere with, and affect, the behavior of this variable in the environment. Turbidity tends to increase water opacity and, consequently, to harm aquatic life (Garg et al., 2020). In addition, temporal variations in turbidity may take place due to fluctuations in climatic variables and to human activities along the margins of water bodies (Luis et al., 2019). Suspended sediments are the major contributors to turbidity; they are a potential contaminant and are linked to bacterial impact and light suppression, with effects on BOD, DO, and pH (Göransson et al., 2013).
After testing different bands and their combinations, the empirical multivariate regression model showing the best results was the one using bands B3 (green, 560 nm) and B8 (near infrared, 835 nm). It is worth noting that the generated model uses linear regression, which may result in some estimated turbidity values falling outside the calibration range. The incidence of suspended solids and, consequently, of turbidity in water bodies increases reflectance in the red and NIR regions of the electromagnetic spectrum (Chawla et al., 2020). In waters with a high concentration of inorganic particles, reflectance increases from the blue region and peaks in the red region (Fraga et al., 2020). Suspended solids increase reflectance in the visible and NIR regions, and the greater their concentration, the further the reflectance peak shifts towards longer wavelengths (Fraga et al., 2020).
Studies available in the literature have explored these properties in different ways to effectively monitor water turbidity based on RS. Red-region bands of the Landsat TM (630-690 nm), SPOT (610-680 nm) and MODIS (645 nm) sensors were used to monitor low turbidity levels ranging from 0.9 NTU to 15 NTU (Bustamante et al., 2009; Chen et al., 2007; Goodin et al., 1996). In addition to single bands, different band ratios are also used to measure turbidity. The MERIS sensor green (560 nm)/blue (413 nm) band ratio was used by Potes et al. (2012) to monitor turbidity ranging from 1 NTU to 60 NTU. On the other hand, the Landsat red (630-690 nm)/blue (400-500 nm) band ratio was used by Liversedge (2007) to monitor high turbidity ranging from 2 NTU to 997 NTU.
Based on the spatial analysis, it can be observed that the highest turbidity values were often recorded in Southern Mirim Lagoon over the months. This may be attributed to the fact that the Southern Mirim Lagoon region is narrower and shallower compared to the Northern region, which could result in higher sediment accumulation and, consequently, increased turbidity. Fraga et al. (2020) analyzed the dynamics of suspended solids in the lagoon and found lower concentrations in its Northern region, close to the Mato Grande Biological Reserve and to the São Gonçalo channel. Nunes et al. (2020) analyzed seasonal turbidity in the lagoon based on Sentinel-2A/B images by applying the NDTI spectral index. Tributary rivers (Figure 1) play an important role in the entry of sediments and pollutants into the lagoon, increasing its turbidity. Coradi et al. (2009) showed that tributary rivers are a relevant source of pollution into the lagoon, arising mainly from industrial, agricultural and port activities.
Although the correlation between the two variables was not significant, it is important to note that the relation between precipitation and turbidity is not straightforward, since it is influenced by the intensity and spatial distribution of precipitation (Göransson et al., 2013). The precipitation analysis was carried out as a way of trying to understand the dynamics and the relationship between these two variables. A deeper analysis considering precipitation over the lagoon's tributaries would help to better understand this dynamic, but it is beyond the scope of our study. Precipitation on the lagoon is an important variable to consider since it can promote the resuspension of sediments, increasing turbidity. Also, wind intensity and direction are important factors in particle suspension in large shallow lakes, and since wind is one of the dominant climatic drivers of circulation and water levels in the basin (Possa et al., 2022), the incidence of winds in the region can influence the spatial and temporal variability of turbidity. Turbidity values found in Northern Mirim Lagoon were lower than those recorded for the other monitored points. This outcome may be associated with the likely diluting effect of the water mass on solid particles, which would otherwise increase turbidity when present in larger amounts.
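The caution about estimates falling outside the calibration range can be made operational with a simple check. The sketch below uses the minimum and maximum in situ turbidity reported in the Results (23.10 and 45.80 NTU) as the calibration range; the estimated values and the helper `flag_extrapolated` are hypothetical and serve only as an illustration.

```python
# In situ turbidity range of the 30 calibration samples reported in the text.
CAL_MIN_NTU, CAL_MAX_NTU = 23.10, 45.80

def flag_extrapolated(estimates, low=CAL_MIN_NTU, high=CAL_MAX_NTU):
    """Pair each estimate with True if it lies inside the calibration range."""
    return [(value, low <= value <= high) for value in estimates]

estimates_ntu = [30.0, 51.0, 75.0, 18.5]  # hypothetical model outputs
for value, inside in flag_extrapolated(estimates_ntu):
    status = "within calibration range" if inside else "extrapolated"
    print(f"{value:5.1f} NTU: {status}")
```

Flagged values are not necessarily wrong, but they rest on the assumption that the linear relationship holds beyond the sampled conditions.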

CONCLUSIONS
The current study demonstrates the potential of using satellite data from the Sentinel-2A/B MSI sensor in combination with in situ measurements to map and analyze turbidity in a lagoon located in Southern Brazil. By utilizing cloud computing on the GEE platform, the study was able to generate a spatial and temporal assessment of turbidity in the Mirim Lagoon between 2019 and 2020. An empirical model was developed using linear regression analysis applied to the in situ measurements and MSI sensor bands. The model was validated through the Monte Carlo method, and its results showed a normal distribution with turbidity values mostly concentrated around the mean value of 34.818 NTU.
The spatial analysis revealed that turbidity values were generally lower in the northern region of the lagoon and higher in the southern region. This could be due to the fact that the southern region is narrower and shallower, leading to higher sediment accumulation and turbidity. The monthly dynamics of turbidity showed that there was a brief increase in turbidity from June/July to November 2019, primarily in the southern region of the lagoon. Turbidity decreased in the following months until April 2020.
The results of this study provide a better understanding of turbidity behavior in Mirim Lagoon and can be used as a basis for further research to map longer time series. Future studies could also explore the relationship between turbidity and external factors such as land use and cover, bathymetry, and climatological variables, which could provide further insights into the factors influencing water quality in Mirim Lagoon. Overall, this study demonstrates the potential of using remote sensing and cloud computing tools for water quality monitoring in complex aquatic environments such as lagoons.

Figure 1. Water-quality monitoring points in the north of Mirim Lagoon [data source: land use and land cover map from MapBiomas Pampa Trinacional (2020)].
Stepwise regression involves adding or removing potential explanatory variables one at a time and testing for statistical significance after each iteration. The best model was selected based on the following statistical indices: correlation coefficient (r), determination coefficient (r²), adjusted determination coefficient (adjusted r²), standard error of estimate (Error), F-test of overall significance (p < 0.05), Durbin-Watson test (DW), root mean square error (RMSE) and mean absolute error (MAE) (Fassoni-Andrade et al., 2017).
Figure 2b displays the histogram resulting from the application of the Monte Carlo method, which depicts the turbidity data generated. The results indicate a normal distribution of turbidity values, with values clustered around the mean (34.818 NTU). Subsequently, a Student's t-test (at the 5% significance level) was applied to compare the mean of the in situ data with the mean of the values obtained through the Monte Carlo iterations, based on Fraga et al. (2020). All statistical analyses were performed in R software.

Figure 2. (a) Scatter plot of observed and predicted values recorded by the turbidity model in all 30 analyzed points; (b) Histogram generated by applying the Monte Carlo method with 100,000 iterations.

Figure 3. Spatial distribution of turbidity for selected monthly images with cloud cover <20% recorded for 2019 and 2020, along the entire extent of Mirim Lagoon, RS, Brazil.

Figure 4. (a) Temporal distribution of turbidity in Mirim Lagoon water, based on all images showing cloud cover lower than 20% in 2019 and 2020; (b) Turbidity and monthly mean rainfall (in mm) values based on Climate Hazards Group InfraRed Precipitation with Station (CHIRPS) data recorded for the entire extent of Mirim Lagoon.

Table 1. Correlation between turbidity data and band reflectance/ratio.