Dive Into the Unknown: Embracing Uncertainty to Advance Aquatic Remote Sensing

Uncertainty is an inherent aspect of aquatic remote sensing, originating from sources such as sensor noise, atmospheric variability, and human error. Although many studies have advanced the understanding of uncertainty, it is still not incorporated routinely into aquatic remote sensing research. Neglecting uncertainty can lead to misinterpretations of results, missed opportunities for innovative research, and a limited understanding of complex aquatic systems. In this article, we demonstrate how working with uncertainty can advance remote sensing through three examples: validation and match-up analysis, targeted improvement of data products, and decision-making based on information acquired through remote sensing. We advocate for a change of perspective: the uncertainty inherent in aquatic remote sensing should be embraced, rather than viewed as a limitation. Focusing on uncertainty not only leads to more accurate and reliable results but also paves the way for innovation through novel insights, product improvements, and more informed decision-making in the management and preservation of aquatic ecosystems.


Introduction
Optical satellite remote sensing provides an unparalleled amount of aquatic ecosystem measurements. High temporal and spatial resolutions enable the study of water body characteristics and dynamics, such as temperature, color, and biogeochemical processes [1]. Satellite observations provide global coverage over long periods, making it possible to study changes in Earth's climate [2,3].
Uncertainty is inherent in all remote sensing measurements and models used to infer downstream products such as inherent optical properties or water column component concentrations [4,5]. Sources of uncertainty include random and systematic effects such as wind and wave motion, atmospheric variability, sensor noise, and assumptions made in retrieval algorithms ("Uncertainty: Theoretical Background" section). Quantification of uncertainty is necessary to correctly develop, interpret, and use remote sensing products.
However, remote sensing data and downstream products have often been shared without associated uncertainty estimates [4,6]. Recognizing and incorporating uncertainty in remote sensing studies can enhance the accuracy of interpretations, bolster their scientific relevance, and ultimately contribute to further scientific progress and breakthroughs ("Uncertainty: Practical Applications" section).
Recent years have seen a push to better understand and quantify uncertainty [7]. The primary quantity in aquatic remote sensing, the spectral remote sensing reflectance R_rs(λ), has been studied in detail, and its sources of uncertainty are increasingly well understood [8-10]. Similar studies have been performed for related products, including inherent optical properties [11,12] and phytoplankton chlorophyll-a concentration (Chla) [13,14], and for individual sources of uncertainty, such as calibration materials [15] and mathematical assumptions [16]. Important recent contributions to the field include the 18th report of the International Ocean Colour Coordinating Group [4] and the Fiducial Reference Measurements for Satellite Ocean Colour (FRM4SOC; https://frm4soc.org/ and https://frm4soc2.eumetsat.int/) project. Addressing uncertainty is a fundamental aspect of the European Space Agency's Climate Change Initiatives (ESA CCI) and NASA's forthcoming Plankton, Aerosol, Cloud, ocean Ecosystem (PACE) hyperspectral satellite mission. These current and previous efforts have focused on elucidating the nature of uncertainty, identifying its sources, and establishing methods for its assessment and propagation.
Nevertheless, the quantification and subsequent application of uncertainty remain limited within the broader remote sensing community. To highlight this gap, we conducted a survey of 100 research articles on aquatic remote sensing published in major journals in 2021-2023 (20  and Remote Sensing volumes 60 to 61). We found that 58% of the articles discussed uncertainty in their discussion or conclusion sections. However, only 35% of the articles integrated uncertainty into their studies by quantifying and accounting for it in input data (16%), assessing the impact of various uncertainty sources and adapting the methodology accordingly (24%), and/or offering an uncertainty estimate on the results (19%).
In this article, we highlight the significant benefits of embracing the uncertainty that is part of field data, retrieval algorithms, and final products. Embracing uncertainty means recognizing its inherent presence, actively incorporating it into research and decision-making processes, and leveraging it as a driving force for innovation and for gaining a deeper understanding of aquatic remote sensing models and products. The aim of this article is to shift the perception of uncertainty from being merely a side product to being an essential source of information that advances remote sensing. We begin by summarizing the theory of uncertainty in the "Uncertainty: Theoretical Background" section. We then provide three practical examples that illustrate how working with uncertainty can advance remote sensing validation, calibration and modeling, and decision-making ("Uncertainty: Practical Applications" section). Our findings, outlined in the "Concluding Remarks" section, carry broad implications not only for aquatic remote sensing but also for related fields such as land and atmospheric remote sensing and aquatic science in general.

Uncertainty: Theoretical Background
Sources of errors that cause uncertainty can be broadly divided into 2 classes, namely, systematic and random. Systematic errors affect the accuracy of a measurement, i.e., how much an estimated value differs from a "true" reference value [5]. In practice, there is no "ground truth," so a trusted model or instrument is chosen as the reference [17]. To ensure that a reference is trusted and verifiable, it should be traceable to objective criteria such as calibration standards and physical laws [4]. Systematic errors lead to incorrect interpretations of data. Random errors affect the precision of a measurement, i.e., the dispersion between multiple individual measurements of the same quantity, creating an uncertainty on the result [5].
Systematic errors, or biases, arise from unknowns that could theoretically be corrected if they were known [5,18]. Factors that are only partially understood and quantified are often termed known unknowns. Partially understood uncertainty sources include calibration errors [19], sensor drift over time [20], incorrect mathematical assumptions or simplifications [16,21], and properties of experimental materials [15]. Additionally, experiments can be affected by unknown unknowns, which are factors that were not anticipated or considered beforehand. For example, a previously unknown seasonal drift in satellite-derived R_rs(λ) has been found to affect downstream products by up to 50%; the origins of the drift are not yet entirely understood [22]. A combination of multiple simultaneous systematic errors from unknown origins may appear to be a random effect and present itself as an uncertainty.
Random errors stem from, for example, unpredictable or stochastic variations in the sample, measurement process, or data processing, causing uncertainty in individual values [5,18]. The uncertainty caused by random errors can often be reduced by averaging multiple replicate measurements. Sources of random error include inherent spatial or temporal variability in a sample [8,23], surface glint and wind effects [24], and thermal or photon noise in a sensor [25]. Human factors also introduce random errors, for example, by pouring slightly different volumes of liquid (say, 98 ml versus 102 ml) into a filtration device.
The uncertainty on a measured value characterizes the dispersion of the values that could reasonably be attributed to the measurand [5]. For example, if Chla in a sample is measured to be 3.0 mg m⁻³ with an uncertainty of 0.3 mg m⁻³ (10%), then one could reasonably attribute values between 2.7 and 3.3 mg m⁻³ to the measurand "Chla in this sample." Uncertainty is typically expressed in text using the ± notation (3.0 ± 0.3 mg m⁻³) or through a confidence interval (CI, 2.7 to 3.3 mg m⁻³). The ± notation typically signifies the 1-σ or 68% CI, while the latter notation is often used for the 2-σ or 95% CI, unless otherwise specified. It is important to note that, by definition, a CI does not have 100% coverage. For example, for a 95% CI, 1 in 20 samples should fall outside the CI. Uncertainty often follows a normal distribution (as in Fig. 1C), but this cannot always be assumed a priori. For example, logarithmic and threshold-based quantities often have asymmetric or multimodal distributions. Graphically, uncertainty is often represented as an error bar in scatter or bar plots and as a shaded area in line plots, although many variations exist [26].
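As a minimal sketch of how the two notations relate, the snippet below converts a 1-σ uncertainty into a central CI of a chosen coverage. It assumes a normal distribution, which, as noted above, cannot always be taken for granted; the function name `confidence_interval` is an illustrative choice.

```python
from statistics import NormalDist

def confidence_interval(value, sigma, coverage=0.95):
    """Return (low, high) of a central CI, assuming a normal distribution."""
    # z is the number of standard deviations enclosing the requested coverage
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return value - z * sigma, value + z * sigma

chla, sigma = 3.0, 0.3  # mg m^-3, the example values from the text

ci68 = confidence_interval(chla, sigma, coverage=0.6827)  # roughly 1 sigma
ci95 = confidence_interval(chla, sigma, coverage=0.95)    # roughly 2 sigma

print(f"68% CI: {ci68[0]:.2f} to {ci68[1]:.2f} mg m^-3")
print(f"95% CI: {ci95[0]:.2f} to {ci95[1]:.2f} mg m^-3")
```

For asymmetric or multimodal distributions, percentiles of the actual distribution would be a more appropriate basis for the interval than this symmetric construction.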
Uncertainty is commonly estimated as the standard deviation of multiple replicate measurements [5], for example, by taking and processing three separate water samples from the same location. This method probes random effects in both the sample and the processing, but is susceptible to systematic errors and true changes in the measurand between samples. Furthermore, replicate measurements necessarily multiply the required labor and expense. Alternatively, for well-characterized instruments or models, uncertainty estimates from previous studies may be used [27].
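The replicate-based estimate can be sketched in a few lines. The three Chla values below are hypothetical, and the standard uncertainty of the mean assumes independent replicates affected only by random errors.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical Chla replicates (mg m^-3) from three water samples at one location
replicates = [2.8, 3.1, 3.3]

best_estimate = mean(replicates)
sigma_single = stdev(replicates)                    # sample standard deviation (n - 1)
sigma_mean = sigma_single / sqrt(len(replicates))   # standard uncertainty of the mean

print(f"Chla = {best_estimate:.2f} ± {sigma_mean:.2f} mg m^-3 "
      f"(scatter of individual samples: {sigma_single:.2f})")
```

Neither number captures a bias shared by all replicates, such as a calibration error, which is why replicates alone cannot probe systematic effects.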
Covariance or correlated uncertainty occurs when multiple variables depend on each other or share a common source of error. For example, when measuring water temperature and salinity simultaneously using a thermosalinograph (TSG), sensor drift in the instrument may affect both measurements at once. As a result, not only is there a probability distribution of possible values for each individual variable, but there is also a distribution of pairs of values. Correlation may be positive, e.g., a higher temperature measurement suggests a higher salinity, or negative. When the correlation is significant, estimating the uncertainty for each variable independently does not represent the overall uncertainty in the measurements and the common sources of error must be included in the uncertainty propagation approach.
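To illustrate why correlation matters, the sketch below propagates uncertainty through a linear combination y = a·x1 + b·x2 of two correlated variables. The uncertainties and the correlation coefficient are hypothetical, and the helper name `sigma_linear_combination` is an illustrative choice.

```python
from math import sqrt

def sigma_linear_combination(a, b, sigma_1, sigma_2, rho):
    """Uncertainty of y = a*x1 + b*x2 for x1, x2 with correlation coefficient rho."""
    covariance = rho * sigma_1 * sigma_2
    variance = (a * sigma_1) ** 2 + (b * sigma_2) ** 2 + 2 * a * b * covariance
    return sqrt(variance)

sigma_1, sigma_2 = 0.10, 0.10  # hypothetical uncertainties of two co-measured quantities

results = {}
for rho in (0.0, 0.8):
    s_sum = sigma_linear_combination(1, 1, sigma_1, sigma_2, rho)
    s_diff = sigma_linear_combination(1, -1, sigma_1, sigma_2, rho)
    results[rho] = (s_sum, s_diff)
    print(f"rho = {rho:.1f}: sigma(x1 + x2) = {s_sum:.3f}, sigma(x1 - x2) = {s_diff:.3f}")
```

With a positive correlation, the uncertainty of the sum grows while that of the difference shrinks, so treating the two variables as independent would misstate both.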
Mathematical uncertainty propagation can be performed either numerically or analytically [5]. Numerical propagation techniques, such as the Monte Carlo method, repeat the same calculation many times with different input values drawn to match the uncertainty of the input data. Numerical methods are computationally expensive, but can capture complicated behavior and arbitrary uncertainty distributions [10]. Analytical propagation typically uses derivatives ∂y/∂x to express the sensitivity of a variable y to small changes in a variable x due to uncertainty. In the simple case of independent variables with uncorrelated uncertainties, this leads to the familiar sum-of-squares method shown in Eq. 1. For the more general case of multiple correlated variables, the Jacobian matrix J is used, as shown in Eqs. 2 and 3, with Σ_x, Σ_y the covariance matrices for multidimensional variables x, y. This analytical technique is exact for linear transformations and approximate otherwise. Analytical propagation has the benefit of computational speed and simplicity, as long as the relevant assumptions are met [19,23].
\sigma_y^2 = \sum_i \left( \frac{\partial y}{\partial x_i} \right)^2 \sigma_{x_i}^2 \qquad (1)
J_{ij} = \frac{\partial y_i}{\partial x_j} \qquad (2)
\Sigma_y = J \, \Sigma_x \, J^{\mathsf{T}} \qquad (3)
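The two propagation routes can be compared on a simple case. The sketch below propagates uncertainty through a band ratio y = R1 / R2, once with the sum-of-squares method for independent inputs and once with a Monte Carlo simulation. The reflectance values, their uncertainties, and the assumption of independent, normally distributed inputs are all hypothetical.

```python
import random
from math import sqrt
from statistics import stdev

def band_ratio(r1, r2):
    return r1 / r2

# Hypothetical reflectances (sr^-1) with 1-sigma uncertainties
r1, s1 = 0.0050, 0.0003
r2, s2 = 0.0040, 0.0002

# Analytical: sum of squares of (partial derivative * input uncertainty)
dy_dr1 = 1 / r2
dy_dr2 = -r1 / r2**2
sigma_analytical = sqrt((dy_dr1 * s1) ** 2 + (dy_dr2 * s2) ** 2)

# Numerical: repeat the calculation with inputs drawn from their distributions
rng = random.Random(42)  # fixed seed for reproducibility
samples = [band_ratio(rng.gauss(r1, s1), rng.gauss(r2, s2)) for _ in range(50_000)]
sigma_monte_carlo = stdev(samples)

print(f"ratio = {band_ratio(r1, r2):.3f}")
print(f"analytical sigma  = {sigma_analytical:.4f}")
print(f"Monte Carlo sigma = {sigma_monte_carlo:.4f}")
```

For small relative uncertainties such as these, the two estimates should agree closely; larger uncertainties or strongly nonlinear algorithms are where the Monte Carlo result would be expected to diverge from the linearized one.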
In the context of numerical models, uncertainties can be further categorized into aleatoric and epistemic uncertainties [28]. Aleatoric uncertainty arises from the inherent randomness or stochastic nature of the underlying processes being modeled, such as natural variability or measurement noise. This type of uncertainty is generally irreducible, even with improved knowledge or additional data. Epistemic uncertainty, on the other hand, stems from incomplete knowledge or understanding of the processes being modeled, such as inaccurate model parameterizations, insufficient data, or unrealistic model assumptions. Epistemic uncertainty can be reduced as one's understanding of the system improves or as more data become available.

Uncertainty: Practical Applications
There are many practical applications of uncertainty estimation and propagation in addition to the most basic application, i.e., knowing the range of values that may be attributed to the measurand ("Uncertainty: Theoretical Background" section). A brief overview of various applications that have been demonstrated in the literature is provided in Table.

Example: Validation and match-up analysis
Match-up analysis is a core process in model and instrument development and validation. Datasets are compared point-by-point using statistics like the median absolute deviation, R², and point-by-point accuracy to estimate their agreement and find patterns therein. However, the value of a match-up analysis is significantly reduced when information about the uncertainty in individual data points is lacking.
It is challenging to determine whether the data being compared have significant differences without knowing the uncertainty associated with them. For instance, when comparing in situ and satellite-based measurements of remote sensing reflectance R_rs or derived products like Chla (Fig. 1A), it is common to find differences of 30 to 80%. At first glance, this might appear to be a substantial discrepancy. However, understanding the uncertainties related to the measurements often reveals that these differences fall within the CIs.
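One simple way to judge whether a match-up difference is significant is to express it in units of the combined uncertainty. The sketch below assumes independent, normally distributed uncertainties; the in situ and satellite values are hypothetical.

```python
from math import sqrt

def normalized_difference(value_a, sigma_a, value_b, sigma_b):
    """Difference between two independent estimates, in units of its combined 1-sigma uncertainty."""
    return (value_a - value_b) / sqrt(sigma_a**2 + sigma_b**2)

# Hypothetical match-up of Chla (mg m^-3)
in_situ, sigma_in_situ = 5.0, 0.5
satellite, sigma_satellite = 7.0, 2.1  # 30% uncertainty

relative_difference = (satellite - in_situ) / in_situ
z = normalized_difference(satellite, sigma_satellite, in_situ, sigma_in_situ)

print(f"relative difference: {relative_difference:.0%}")
print(f"normalized difference: {z:.2f} sigma")
print("consistent within ~95% CI" if abs(z) < 2 else "significant difference")
```

In this example, a 40% difference amounts to less than one combined standard deviation, so the two measurements are statistically compatible despite the large relative difference.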
Figure 1A highlights the importance of taking uncertainty into account when analyzing observations and performing model regression.By employing a weighted regression method that considers the uncertainties in the observations, the model's accuracy can be significantly enhanced compared to using an unweighted regression approach that disregards these uncertainties.
When deriving a model, neglecting the uncertainty on the input measurement data can reduce the accuracy of the obtained model by biasing it toward outliers. Measurements that are influenced by random factors and that deviate from the actual value can cause a model to be skewed either positively or negatively. Being aware of the uncertainty allows for the identification and filtering of these outliers or the assignment of lower weights in regression analysis, avoiding the need for arbitrary methods such as sigma-clipping, which can introduce additional human errors. This concept also applies to extensive datasets with numerous data points that exhibit considerable uncertainties due to random errors [21]. Similar considerations exist in time series analysis [29].
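The effect of weighting can be sketched with a straight-line fit in which each point is weighted by the inverse of its variance. The data below are hypothetical, with one deviating point that carries a large uncertainty; this is a generic weighted least-squares illustration, not the specific regression behind Fig. 1A.

```python
from math import sqrt

def weighted_linear_fit(x, y, sigma):
    """Least-squares fit of y = intercept + slope * x with weights 1 / sigma^2.

    Returns intercept, slope, and their 1-sigma uncertainties. The parameter
    uncertainties are only meaningful if `sigma` holds realistic absolute values.
    """
    w = [1 / s**2 for s in sigma]
    S = sum(w)
    Sx = sum(wi * xi for wi, xi in zip(w, x))
    Sy = sum(wi * yi for wi, yi in zip(w, y))
    Sxx = sum(wi * xi**2 for wi, xi in zip(w, x))
    Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = S * Sxx - Sx**2
    slope = (S * Sxy - Sx * Sy) / delta
    intercept = (Sxx * Sy - Sx * Sxy) / delta
    return intercept, slope, sqrt(Sxx / delta), sqrt(S / delta)

# Hypothetical match-up data; the last point deviates but is also very uncertain
x = [1, 2, 3, 4, 5]
y = [1.1, 1.9, 3.2, 3.9, 8.0]
sigma = [0.2, 0.2, 0.2, 0.2, 3.0]

# Unweighted fit: every point gets the same weight
_, slope_unweighted, _, _ = weighted_linear_fit(x, y, [1.0] * len(x))
# Weighted fit: uncertain points contribute less
_, slope_weighted, _, sigma_slope = weighted_linear_fit(x, y, sigma)

print(f"unweighted slope: {slope_unweighted:.2f}")
print(f"weighted slope:   {slope_weighted:.2f} ± {sigma_slope:.2f}")
```

The unweighted fit is pulled toward the uncertain point, while the weighted fit stays close to the trend defined by the well-constrained points, without any manual outlier removal.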
Quantifying the uncertainty on best-fitting parameters and validation statistics themselves enables a fair and robust evaluation and comparison of models. In general, this can be done through numerical methods such as bootstrapping [23,30]. For some common statistics, such as Pearson r correlation and median absolute deviation (MAD), analytical formulas exist. As an example, in a comparison between R_rs measurements taken with 2 smartphone cameras, we found that they agreed significantly better in R_rs band ratios (typical difference CI, 2.3 to 3.7%) than in single-band absolute R_rs (CI, 3.8 to 8.2%) [23].
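A percentile bootstrap for a validation statistic can be sketched as follows. The match-up pairs are synthetic, the statistic (median absolute percent difference) is an assumed stand-in for the statistics used in [23], and the function names are illustrative.

```python
import random
from statistics import median

def median_absolute_percent_difference(pairs):
    """Median of |b - a| / a over all (a, b) match-up pairs, in percent."""
    return 100 * median(abs(b - a) / a for a, b in pairs)

def bootstrap_ci(pairs, statistic, n_resamples=2000, coverage=0.95, seed=0):
    """Percentile bootstrap CI: resample pairs with replacement and recompute the statistic."""
    rng = random.Random(seed)
    estimates = sorted(
        statistic(rng.choices(pairs, k=len(pairs))) for _ in range(n_resamples)
    )
    lower_index = int((1 - coverage) / 2 * n_resamples)
    upper_index = int((1 + coverage) / 2 * n_resamples) - 1
    return estimates[lower_index], estimates[upper_index]

# Synthetic match-ups: a reference R_rs and a second measurement with 5% random scatter
data_rng = random.Random(1)
reference = [0.002 + 0.0002 * i for i in range(40)]
pairs = [(r, r * (1 + data_rng.gauss(0, 0.05))) for r in reference]

point_estimate = median_absolute_percent_difference(pairs)
low, high = bootstrap_ci(pairs, median_absolute_percent_difference)

print(f"typical difference: {point_estimate:.1f}% (95% CI, {low:.1f} to {high:.1f}%)")
```

Reporting the statistic together with its CI makes it possible to say whether two instruments or models differ significantly, rather than comparing bare point estimates.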
A common and important goal in validation studies is to attain closure, meaning different methods provide compatible results [4]. For example, R_rs can be determined from above-water measurements of the water-leaving radiance L_w and downwelling irradiance E_d (R_rs = L_w / E_d) or from in-water measurements of the absorption coefficient a and backscattering coefficient b_b (R_rs ∝ b_b / (a + b_b)) [12]. Closure is achieved when the matched-up results from these independent methods agree. The advantages of embracing uncertainty in match-up analysis, as described above, also apply to closure experiments and make it possible to quantitatively determine the degree of closure.

Example: Targeted improvement through uncertainty analysis
Uncertainty propagation facilitates the development of uncertainty budgets, allowing for the quantification of uncertainty in downstream products that arises from various individual input variables. This budget can then be used to improve the downstream product by targeting the "worst offenders," those input variables that contribute the most to the output uncertainty [6,8].
As an example, we consider the measurement of Chla through fluorometry for in situ validation data, as shown in Eq. 4. Here, V_f is the sample volume, V_ex is the extraction volume, F_o and F_a are the fluorometer readings before and after acidification, and F_m and F_s are calibration constants [31].
The uncertainty in each input parameter can be propagated into σ_Chla, the uncertainty in Chla, by applying the sum-of-squares method shown in Eq. 1, giving Eq. 5. For illustration, in an experiment performed in 2019 at the Darling Marine Center (Maine, USA), we measured F_o = 680 ± 2, F_a = 395 ± 2, V_ex = 0.0052 ± 0.0001 L, and V_f = 0.2880 ± 0.0005 L with empirically determined calibration factors F_m = 1.95 ± 0.05 and F_s = 0.32 ± 0.02. Applying Eqs. 4 and 5 provides a concentration of Chla = 3.39 ± 0.24 μg L⁻¹.
The corresponding uncertainty budget (Fig. 1B) contains the relative contribution of each term in Eq. 5 to the overall uncertainty. The contributions depend on both the input uncertainties σ_Fm, … and the scaling factors ∂Chla/∂F_m, … as in Eq. 1. Importantly, Fig. 1B shows that in our experiment, the uncertainty in Chla was dominated (90.1%) by the calibration factors F_m and F_s, with only 9.9% coming from measurement uncertainty. In practical terms, this implies that enhancing the calibration process is more effective in reducing uncertainty in Chla than repeated or more precise sampling. Additionally, this outcome can be achieved only through quantitative propagation of uncertainty across all variables, rather than relying on replicate Chla measurements, as replicates do not probe calibration factors.
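An uncertainty budget of this kind can be sketched numerically. The code below assumes that the Chla equation takes the widely used acidification-method form Chla = F_s · F_m / (F_m − 1) · (F_o − F_a) · V_ex / V_f; this form is an assumption made for illustration and may differ in detail from the equation used in the experiment. Derivatives are approximated by central finite differences, and the helper names are illustrative. With the rounded input values quoted above, the result is close to, but not necessarily identical to, the reported concentration and budget.

```python
from math import sqrt

def chla_fluorometric(F_o, F_a, V_ex, V_f, F_m, F_s):
    """Assumed acidification-method equation for Chla (illustrative form)."""
    return F_s * (F_m / (F_m - 1)) * (F_o - F_a) * V_ex / V_f

# Measured values and 1-sigma uncertainties quoted in the text
values = {"F_o": 680, "F_a": 395, "V_ex": 0.0052, "V_f": 0.2880, "F_m": 1.95, "F_s": 0.32}
sigmas = {"F_o": 2, "F_a": 2, "V_ex": 0.0001, "V_f": 0.0005, "F_m": 0.05, "F_s": 0.02}

def variance_terms(func, values, sigmas, rel_step=1e-6):
    """Per-input terms (dy/dx_i * sigma_i)^2 of the sum-of-squares method."""
    terms = {}
    for name, sigma in sigmas.items():
        step = abs(values[name]) * rel_step
        up = dict(values, **{name: values[name] + step})
        down = dict(values, **{name: values[name] - step})
        derivative = (func(**up) - func(**down)) / (2 * step)
        terms[name] = (derivative * sigma) ** 2
    return terms

chla = chla_fluorometric(**values)
terms = variance_terms(chla_fluorometric, values, sigmas)
total_variance = sum(terms.values())
sigma_chla = sqrt(total_variance)

# Relative contribution of each input to the total variance
budget = {name: term / total_variance for name, term in terms.items()}
calibration_share = budget["F_m"] + budget["F_s"]

print(f"Chla = {chla:.2f} ± {sigma_chla:.2f} ug/L")
for name, share in sorted(budget.items(), key=lambda item: -item[1]):
    print(f"  {name}: {share:.1%}")
print(f"calibration factors combined: {calibration_share:.1%}")
```

Under these assumptions, the calibration factors account for roughly nine-tenths of the variance, in line with the conclusion drawn from Fig. 1B.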
The same process can be applied, analytically or numerically, to other variables and downstream products. Analytical propagation is best suited for relatively simple algorithms, such as those based on band ratios or line height. On the other hand, numerical techniques are typically employed for more complex algorithms, including those that involve atmospheric correction and those targeting downstream products, such as primary production. For instance, a recent study found that when propagated through an atmospheric correction algorithm, the primary sources of uncertainty in retrieving R_rs from satellite data were Rayleigh scattering and water variability [8]. These findings challenged earlier knowledge, which suggested that aerosol optical thickness (AOT) and aerosol type were the main drivers of uncertainty in R_rs retrieval.
For numerical models, it is possible to better understand their predictive uncertainty by separating the relative contributions into aleatoric (data-driven) and epistemic (model-driven) uncertainty [28]. Both types of uncertainty can also be recognized in the fluorometry example discussed above. Aleatoric uncertainty embodies random variations that can be attributed to factors such as fluctuations in sample and extraction volumes, often due to human influences. Conversely, epistemic uncertainty signifies the unknown unknowns. These are elements, like the impact of other chlorophyll pigments (e.g., Chlb), that are not independently detected by the fluorometer, yet may introduce bias into the final results.
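One common way to separate the two contributions, sketched below, applies the law of total variance to an ensemble of models that each predict a value and a variance. This is a generic decomposition and not necessarily the method used in [28]; the ensemble predictions are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance

# Hypothetical ensemble for one pixel: (predicted Chla, predicted variance) per member
ensemble = [(6.8, 1.2), (7.4, 1.0), (7.1, 1.4), (6.5, 1.1), (7.7, 1.3)]

means = [m for m, _ in ensemble]
variances = [v for _, v in ensemble]

aleatoric = mean(variances)    # average predicted variance (noise in the data)
epistemic = pvariance(means)   # spread between members (disagreement between models)
total = aleatoric + epistemic

print(f"prediction: {mean(means):.2f} ± {sqrt(total):.2f} mg m^-3")
print(f"aleatoric share: {aleatoric / total:.0%}, epistemic share: {epistemic / total:.0%}")
```

A large epistemic share suggests that more training data or a better model structure could help, whereas a large aleatoric share points to noise in the observations themselves.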
A similar case of targeted improvement through understanding uncertainty is found in the development of atmospheric correction algorithms. The methodology pioneered by Gordon and Wang in 1994 [32] introduced a degree of uncertainty associated with spectral bands in the blue region. Subsequent work identified aerosols as a source of this uncertainty and sought to mitigate it by conceiving and incorporating novel aerosol models [33]. However, in spite of these enhancements, a considerable level of uncertainty in blue-band R_rs measurements of ocean and coastal waters persisted. A reinterpretation of these uncertainties, as proposed in [8], could pave the way for future advancements in atmospheric correction techniques, provided a suitable model is identified and developed.

Example: Decision-making
Environmental managers and policymakers often rely on remote sensing data in their decision-making, for example, on the trophic state of a water body [1]. Policies such as the American Clean Water Act and European Water Framework Directive use the trophic state, which ranges from oligotrophic (low in nutrients) through mesotrophic and eutrophic (high in nutrients) to hypereutrophic (excess in nutrients), to define norms for ecological and human-centric water quality. The monitoring and control of man-made eutrophication of oligo- or mesotrophic waters due to fertilizer and wastewater excess runoff is an important component of water management [34], for which remote sensing provides data on wide spatial scales and with fast response times [1].
A common proxy for estimating trophic state in remote sensing is Chla, which is closely related to phytoplankton biomass and thus to nutrient load. The Organisation for Economic Co-operation and Development (OECD) standard defines the boundary between mesotrophic and eutrophic at 8.0 mg m⁻³ [34]; other standards use similar values. Thus, if a Chla of 7.0 mg m⁻³ is observed in a water body, it is designated as mesotrophic (Fig. 1C).
However, the uncertainty in the observed Chla provides important additional information. The uncertainty on remote sensing Chla estimates for inland waters is typically 30% to 80% [14]. Assuming an optimistic uncertainty of 30%, with a normal distribution, the observation described above results in a Chla of 7.0 ± 2.1 mg m⁻³. Approximately 32% of the associated probability density overlaps with the eutrophic range (Fig. 1C), indicating a significant probability that the water body is in fact eutrophic.
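The overlap with the eutrophic range can be computed directly from the cumulative distribution function. The sketch below uses the values from the text and the same normal-distribution assumption; the function name `probability_above` is an illustrative choice.

```python
from statistics import NormalDist

def probability_above(value, relative_uncertainty, threshold):
    """Probability that the true value exceeds `threshold`, assuming a normal distribution."""
    distribution = NormalDist(mu=value, sigma=relative_uncertainty * value)
    return 1 - distribution.cdf(threshold)

# Observed Chla of 7.0 mg m^-3 with 30% uncertainty; OECD boundary at 8.0 mg m^-3
p_eutrophic = probability_above(7.0, 0.30, threshold=8.0)

print(f"P(eutrophic) = {p_eutrophic:.0%}")
```

The same function could be evaluated per pixel to turn a Chla map into a map of exceedance probabilities. For larger relative uncertainties, a normal distribution assigns noticeable probability to negative concentrations, so a different distribution may be more suitable there.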
Incorporating uncertainty in this manner offers a more natural representation of water body status, as trophic conditions can change rapidly, and this probability helps model such fluctuations. The information obtained through working with uncertainty proves valuable in both short- and long-term contexts. In the short term, it alerts water managers to the possibility of a water body becoming eutrophic, enabling them to take appropriate action and preventing false positives.
In the long term, incorporating uncertainty estimation is crucial for detecting trends and understanding changes in water body conditions. A more comprehensive understanding of uncertainty enables researchers and decision-makers to better identify and quantify the impacts of climate change, land use changes, and other anthropogenic pressures on aquatic ecosystems [6]. When evaluating long-term trends, accounting for uncertainty aids in distinguishing between true trends and random fluctuations or noise present within the observations. Ensemble prediction methods are specifically suited for modeling long-term trends, as they consider multiple sources of uncertainty, including different assumptions, initial conditions, or model structures. They provide a robust assessment of trends by exploring a range of scenarios and capturing the effects of different drivers and potential future conditions. Ensemble methods generate a distribution of possible outcomes, which can be used to quantify uncertainties associated with trends and facilitate risk-based decision-making. Policy- and decision-makers can use this information to weigh the potential benefits and consequences of different actions and develop strategies that account for uncertainty, thereby reducing the risks associated with extreme or unexpected events [35].

Concluding Remarks
Although uncertainty is inherent in all aspects of aquatic remote sensing, it often remains unaddressed. In this article, we have highlighted the consequences of neglecting uncertainty and demonstrated the advantages of embracing it, as exemplified in remote sensing validation, targeted improvements of models and instruments, and the decision-making process.
In match-up analysis, including validation studies and model regression, accounting for uncertainty enables more accurate estimation of the level of closure between datasets and models, as well as more accurate derivation of models and a thorough estimation of their performance. Uncertainty budgets enable targeted improvement of models and instruments by providing information on the main sources of uncertainty and error, thus highlighting which aspects (e.g., field or calibration data) should be improved with the highest priority. Last, incorporating uncertainty in decision-making processes leads to more informed and robust management strategies, for example, enabling policymakers and environmental managers to better assess the status of water bodies and respond to potential issues more rapidly and effectively by employing fuzzy logic.
In conclusion, uncertainty in aquatic remote sensing should not only be acknowledged but also be embraced as a source of information and a driver of innovation and progress. It stimulates us to challenge our assumptions, refine our models, and enhance calibration methodologies. It is crucial to be aware that the journey to understanding and effectively managing uncertainty can be a long-term endeavor, often taking many years to discover, validate, and incorporate a new method or approach. Many sources of uncertainty in environmental science, particularly in aquatic remote sensing, derive from natural variability. Encountering these uncertainties presents inherent difficulties and necessitates specific solutions, such as the application of the vicarious calibration procedure for ocean color sensors. Through consistent recognition, quantification, visualization, and effective communication of uncertainty, we bolster the reliability and robustness of remote sensing products. Therefore, the philosophy of working with uncertainty, rather than against it, should be central to aquatic remote sensing research and product development, but should also be understood as a process requiring patience, perseverance, and time.

Fig. 1. Three examples illustrating the benefits of embracing uncertainty. (A) Improved match-up analysis and regression through the application of weights based on uncertainty (resulting in 2.5× lower error) ("Example: Validation and match-up analysis" section). (B) Targeted refinement of data products by quantifying the contributions of each input to the overall uncertainty ("Example: Targeted improvement through uncertainty analysis" section). (C) Better decision-making by considering multiple scenarios and employing fuzzy logic ("Example: Decision-making" section).
Figure 1 depicts three specific example cases, which are discussed in detail in the "Uncertainty: Practical Applications" section.
from in-water measurements of the absorption coefficient a and backscattering coefficient b_b: R_rs ∝ b_b / (a + b_b)

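Table. Overview of common applications for uncertainty in aquatic remote sensing and adjacent fields.
Application | References
[...] | [13]
Intercomparison between instruments | [23,39]
Distinguishing between sources of variability | [9,28]
Finding directions for future research |
Weighted policy- and decision-making | [6]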