Statistical Characterization of Environmental Hot Spots and Hot Moments and Applications in Groundwater Hydrology

10 Environmental hot spots and hot moments (HSHMs) represent rare locations and events that exert disproportionate influence over the environment. While several mechanistic models have been used to characterize HSHMs behavior at specific sites, a critical missing component of research on HSHMs has been the development of clear, conventional statistical models. In this paper, we introduced a novel stochastic framework for analyzing HSHMs and the uncertainties. This framework can easily incorporate heterogeneous features in the spatiotemporal domain and can 15 offer inexpensive solutions for testing future scenarios. The proposed approach utilizes indicator random variables (RVs) to construct a statistical model for HSHMs. The HSHMs indicator RVs are comprised of spatial and temporal components, which can be used to represent the unique characteristics of HSHMs. We identified three categories of HSHMs and demonstrated how our statistical framework are adjusted for each category. The three categories are (1) HSHMs defined only by spatial (static) components, (2) HSHMs defined by both spatial and temporal (dynamic) 20 components, and (3) HSHMs defined by multiple dynamic components. The representation of an HSHM through its spatial and temporal components allows researchers to relate the HSHM’s uncertainty to the uncertainty of its components. We illustrated the proposed statistical framework through several HSHM case studies covering a variety of surface, subsurface, and coupled systems.

based on statistical formulations has been identified as a major gap in the current HSHM literature (Bernhardt et al., 2017;Arora et al., 2020).
Statistical approaches offer multiple advantages for furthering the HSHM concept. First, statistical 75 approaches can develop common formulations that integrate biogeochemical and hydrogeological knowledge from multiple HSHMs studies. Once developed, these formulations can be readily applied to identify HSHMs at similar sites. Second, statistical approaches can easily incorporate categorical indicators that represent spatial heterogeneity and quantify the uncertainty of HSHM occurrences tied to these features. Such approaches can be used as predictive tools to estimate future occurrences of HSHMs, and provide an alternative to computationally-expensive high-80 resolution mechanistic models. This would greatly aid decision-makers in identifying scenarios (e.g., changes in the climate or in environmental conditions) that increase risks associated with the occurrence of HSHMs phenomena.
Statistical concepts and models have been widely applied in hydrology and hydrogeology, including but not limited to modeling flow and contaminant transport, quantifying subsurface heterogeneity and the associated uncertainties, developing strategies for site characterization, and providing informative priors for ungauged 85 watersheds. For example, Rubin (1991) described a Lagrangian approach to obtain the spatial and temporal moments of contaminant concentrations in the subsurface. These statistical moments were deemed both necessary and sufficient to define the probability distribution of contaminant concentrations over space and time, and thus, quite useful for quantifying HSHMs. In a similar manner, statistical moments can be used to characterize the occurrences of HSHMs.
Statistical terms, such as concentration mean and variance, concentration cumulative density function (CDF), 90 exceedance probabilities, and exposure time CDF also provide significant guidance to assess the environmental risks associated with HSHMs . Although there is a lack of conventional statistical approaches in current HSHM studies, we believe it is feasible and valuable to develop statistical formulations to characterize HSHMs dynamics.
Successful characterization of HSHMs through physically-based models or statistical approaches relies on 95 experts' knowledge of a site, intensive field characterization, and possibly continuous field sampling to provide the data to develop and validate these approaches. Understandably, intensive site characterization and long-term sampling can be quite challenging due to the associated costs and efforts. Thus, it is necessary to develop approaches that could simplify but still effectively and efficiently represent the underlying structure of HSHMs. In this regard, indicator statistics, defined by the Bernoulli distribution, can be useful, on two counts. First, it is suitable for modeling bimodal 100 situations. For example, a situation where an event might or might not take place. Indicators are also appealing in applications because of the sparsity of the Bernoulli probability model. Indicator statistics have previously been applied to model flow and transport phenomena in groundwater (Rubin and Journel, 1991), where indicators were used to model the spatial distribution in a sand-shale formation. Rubin (1995) applied an indicator spatial random function to model contaminant flow and transport in bimodal heterogeneous formations. Ritzi et al. (2004) developed 105 a hierarchical architecture to represent the spatial correlation of permeability in cross-stratified sediment using indicator statistics. Wilson and Rubin (2002) and Bellin and Rubin (2004) used indicator statistics that describe whether particles were captured by sampling points to characterize the level of aquifer heterogeneity. These studies suggest that the simplification of the system's structure through indicator formulation significantly lower the number https://doi.org/10.5194/hess-2020-343 Preprint. Discussion started: 15 July 2020 c Author(s) 2020. CC BY 4.0 License. of measurements needed, and thus reduce the costs associated with site characterization, while maintain sufficient 110 information for modeling flow and contaminant transport. In addition, indicator formulation is useful in that it allows to aggregate multiple variables (e.g., all HSHM relevant variables) into a single random variable. Instead of characterizing the full distributions of each parameter, indicator formulation only requires knowledge of the critical condition for relevant parameters. Such indicator RV will take a value of 1 if the critical conditions are met, regardless of the original distribution for the parameters. These advantages are further explored in section 2 and 3. With indicator 115 formulations for HSHMs, researchers can focus on identifying the most relevant parameters for HSHMs quantification, which can significantly reduce the efforts and costs required for intensive site characterization.
In this study, we developed a statistical framework to quantify HSHMs occurrences and uncertainties. The developed statistical framework can help determine HSHM-occurrence probabilities under user-defined scenarios. It can also be used for estimating future occurrences of HSHMs. Based on the mechanisms that drive HSHM 120 occurrences, we determined three categories of HSHMs: (1) those triggered only by spatial (static) contributors, (2) those triggered by both spatial (static) and temporal (dynamic) contributors, and (3) those triggered by multiple dynamic contributors. Within each category, cases from existing studies were used to illustrate the procedures for constructing the statistical formulations. We focused specifically on HSHMs applications in groundwater, where we derived analytical solutions for the statistical formulation of HSHMs and analyzed the probabilities of HSHM 125 occurrences and their corresponding levels of uncertainty using synthetic case studies.

Statistical formulation of hot spots and hot moments
HSHMs represent rare places or events with increased hydrobiogeochemical rates or fluxes that are significantly elevated above the background condition, thus exerting disproportionate influences over an ecosystem's dynamics. We define ( * , * ) as the jointly distributed RVs for HSHMs, and * and * represent the spatial 130 components of hot spots and temporal components of hot moments, respectively. An indicator random variable, ( * , * ), is used to represent whether the pair ( * , * ) is an HSHM or not. If there exists a pair of ( * , * ) that satisfies the critical conditions of an HSHM, ( * , * ) = 1, and the pair ( * , * ) represents the location and time of the HSHM.
Following the original definition by McClain et al. (2003), in our method, ( * , * ) can take the value 135 of 0 or 1, depending on the concentration or reaction rate measure at ( * , * ), respectively: where ( , * ) ( , * ) are the concentration and reaction rate at the position and time * , respectively. ℎ ℎ represent the concentration and reaction rate thresholds, respectively. Defining indicators with 140 concentration, or reaction rate depends on the target of HSHM. Similar definitions can be introduced based on the regulatory limits or the interest of the investigator, using the mean concentration or the solute mass within the volume The critical values, ℎ and ℎ , are key to an effective application of the above framework and should be determined based on the specific scenario. For example, for contaminants that are associated with significant 145 environmental or health risks (e.g., nuclear waste or a cancerous substance), ℎ = 0 can be used so that the HSHM will be triggered as soon as there is the presence of such contaminants. As an alternative, a limit in the total accumulated mass within hot spots may be set, such as suggested by EPA (USEPA, 2001), but in this case the definition (1) of the indicators should be modified. For water quality parameters, ℎ = can be assigned, where represents the maximum concentration limit for a specific solute. Alternatively, ℎ = * can be used in cases 150 where * is chosen based on the experts' domain knowledge. This approach requires that such decisions be made before deriving any solutions to determine HSHM occurrences.
Given the definition of ( * , * ), we observe that ( * , * ) follows a Bernoulli distribution, such as ( * , * )~(< ( * , * ) >), where < . > is the operator indicating the ensemble mean of the indicator represented as a random variable. An important characteristic of the Bernoulli distribution is that all the 155 statistical moments of the RV ( * , * ) can be expressed as a function of the ensemble mean < ( * , * ) >.
For example, the variance is given by Characterization of the spatiotemporal distribution of ( * , * ) requires the incorporation of the mechanisms that govern the development and occurrence of HSHMs. However, the direct quantification of < ( * , * ) > can be difficult in both time and space domain. Thus, to facilitate this undertaking, we propose to 160 decompose ( * , * ) into a Type-A (static) indicator random variable-( * )-and a Type-B (dynamic) indicator random variable-( * , * ). Definitions of the Type-A and Type-B contributors are as follows:  Type-A (Static) Contributors. This category covers discrete spatial elements (and their associated critical states) that could trigger an HSHM once they come into contact with Type-B contributors (see discussion below). Critical states are the range of values needed to trigger an HSHM (either in standalone mode or when 165 coupled with Type-B contributors).
 Type-B (Dynamic) Contributors. This category covers dynamic variables (and their associated critical states) that could trigger an HSHM once they come into contact with Type-A contributors. This category includes, for example, mass transport variables. It also includes changes in local hydrological and environmental conditions (e.g., water table fluctuations). The displacement of solutes in the subsurface 170 (trajectories and travel times) from below-and above-ground processes are prime examples of Type-B contributors.
As an example, naturally reduced sediments (Type-A contributor) occurring next to the river corridor at the Rifle site were identified as carbon export hot spots (Arora et al., 2016;Wainwright et al., 2015). Studies showed that these hot spots were triggered when temperature conditions (Type-B contributor) varied in the subsurface, resulting 175 in a 170% increase in groundwater carbon export from the floodplain site to the river (Arora et al., 2016). In another example, topographic features, such as the backslope of the lower montane hillslope (Type-A contributor) within the East River Watershed (Hubbard et al., 2018), were considered denitrification hot spots, which can have a significant https://doi.org/10.5194/hess-2020-343 Preprint. Discussion started: 15 July 2020 c Author(s) 2020. CC BY 4.0 License.
impact on the watershed-scale nitrogen loss pathway. These hot spots were often triggered by spring snowmelt and storm events (Type-B contributor). 180 Both indicators of the Type-A and Type-B contributors assume a value of either 0 or 1. If one of these indicators takes a value of 1, it can be viewed as an HSHM contributor. However, for an HSHM to occur, both indicators must have a value of 1 at the same location and time. This idea can be expressed as follows: (2) In Eq.
(2), ( ( * , * ) = 1| ( * ) = 1) is the probability of observing a dynamic HSHM within * , at time * conditional to the fact that * is a static hotspot and ( ( * ) = 1| ( * , * ) = 1) is defined similarly. Based on the mechanisms of HSHMs, we can classify HSHMs into three different categories as discussed below. These categories can be used to guide the application of the above statistical framework in a variety of complex HSHM 190 scenarios, and they can also be used to develop analytical or numerical solutions for both static and dynamic indicators.
Furthermore, the three categories provide guidance on using indicator approaches for both transport-driven and biogeochemically-driven HSHMs, as discussed by Vidon et al. (2010).

HSHMs induced by type-A (static) indicators
In this section, we consider HSHMs that are defined by static indicators only ( Figure 1a). This list can include zones of high, persistent concentration and reactivity that are due to the subsurface or the ecosystem's unique 200 hydrological and biogeochemical properties. For example, the accumulation of contaminants in the subsurface (e.g., the high nuclide concentration in the subsurface at the Hanford site) could lead to the evolution of persistent, high reactivity zones. An aquifer's reactivity is another example that could distinguish certain regions with high reactivity compared to surrounding areas (Loschko et al., 2016). Such high reactivity spots (hereafter denoted as * ) can be characterized by static indicator RVs due to the persistence of high concentration or reactivity. The static indicators 205 are defined as follows: where * represents the conditions needed to trigger a hot spot at * , and ( * ) represents the corresponding local conditions at * . https://doi.org/10.5194/hess-2020-343 Preprint. Discussion started: 15 July 2020 c Author(s) 2020. CC BY 4.0 License.

HSHMs induced by type-A (static) and type-B (dynamic) indicators 210
HSHMs can also result from dynamic processes encountering specific conditions at * (Figure 1b). This is the situation described by Eq. (2), where the static indicators are determined first, and then used jointly with the dynamic indicators for complete HSHM characterization. For example, Bundt et al. (2001) concluded that preferential flow paths are biological hot spots for soil microbial activities. Preferential flow paths in such cases are candidate hot spot locations ( * ). Meanwhile, dynamic factors, such as snowmelt, control contaminant transport via the preferential 215 flow paths, and thus, they determined the hot moment component. The duration of these events presents the temporal component of the HSHM.
For an HSHM induced by both static and dynamic indicators, the static locations are selected first, based on their HSHM-related properties. After this, we can focus on characterizing the HSHM dynamics as they relate to the relevant locations. A selected location, * , could become an HSHM site based on characteristics defined through the following 220 static and dynamic indicators, respectively: where * represents the critical conditions needed to characterize a hot moment, and ( * , * ) represents the local 225 condition at * and * . The statistical model of ( * , * ) can be expressed using the statistical models of and , as shown in Eq. (2).

HSHMs induced by multiple type-B (dynamic )indicators
Various dynamic processes could jointly evolve into an HSHM ( Figure 1c). Unlike the previous scenarios where static locations can be determined through known characteristics provided by geophysical or other types of 230 data, HSHMs can also emerge due to the confluence of dynamic processes. This situation is described in Eq. (7). For example, Gu et al. (2012) analyzed how streamflow fluctuations could trigger a nitrogen HSHM. In their example, the dynamics of the streamflow and groundwater controlled the transport and mixing of the chemical reactants, thus triggering the occurrences of HSHMs. For this case, the static locations of * are determined by the confluence of multiple dynamic processes. 235 We can consider the case where an HSHM is predicated on dynamic processes, , where , ( * , * ) is the dynamic indicator representing the action (or inaction) of at * and time * . The hot spot location * is determined by the confluence of all dynamic processes at time * . These dynamic processes are not necessarily independent. Therefore, generally, the statistical model for the comprehensive dynamic indicator (which covers all dynamic contributors) assumes the following form: 240 [ ( * , * ) = 1] = [ ,1 ( * , * ) = 1, … , , ( * , * ) = 1].
In situations where the various dynamic contributors can be viewed as independent (e.g., Destouni and Cvetkovic, 1991)-where the reactants travel via different paths-then, assuming independence, we can state that https://doi.org/10.5194/hess-2020-343 Preprint. Discussion started: 15 July 2020 c Author(s) 2020. CC BY 4.0 License.
Here, the mean of the dynamic indicator becomes 245 If * is a hot spot, then Eq. (9) also defines < ( * , * ) >. However, if * is not a hot spot, then we need to resort to coupled statistical modeling, as suggested by Eq. (2).

Examples of the statistical formulation of HSHMs with case studies
In this section, we selected numerous examples from published research to present how our approach can be 250 used to derive statistical representations for the HSHMs investigated in these studies. We grouped these studies into three categories based on the similarities of their underlying HSHM mechanisms, as described in section 2. We also characterized the environmental risk levels and impacts based on their target HSHMs. https://doi.org/10.5194/hess-2020-343 Preprint. Discussion started: 15 July 2020 c Author(s) 2020. CC BY 4.0 License.

HSHMs triggered by static contributors only
In this section, we use Wainwright et al. (2015) as an example to illustrate our process to construct ( * , * ) following Eq. (3), where an HSHM is triggered by static contributors only (section 2.1). NRZs within floodplain environments at the Rifle site are considered biogeochemical hot spots because they represent elevated 260 concentrations of uranium, organic matter, and geochemically reduced minerals and they have been found to contribute to significant carbon fluxes to the atmosphere and to local rivers (Arora et al., 2016). Due to its characteristics, we considered the spatial distribution of an NRZ to be a static-mechanism-based hot spot. Wainwright et al. (2015) used geophysical data (e.g., induced polarization) to map the distribution of an NRZ at the subsurface level. They found that the phase shift ( ) from the induced polarization data of the NRZ was within [4.5, 5] , 265 compared to non-NRZ locations at ⊆ [1, 3.5] . Thus, can be used to construct the static indicator with a critical condition of [4.5, 5] . Therefore, Other static attributes, including but not limited to elevation, hydraulic conductivity, and resistivity, can also be used to define the critical conditions to construct the static indicator for hot spots through Bayesian conditioning. 270

HSHMs occuring when dynamic contributors coincide at locations defined by static contributors
The second case we present here utilizes Eq. (4)-(6), where HSHMs are triggered when dynamic contributors coincide at hot spots determined by static contributors. Here, we present the case investigated by Duncan et al. (2013), where riparian hollows representing less than 1% of the total catchment area contributed to more than 99% of the total denitrification within the watershed. In their study, the denitrification rates peaked during the base flow (midsummer) 275 period, when the riparian hollows were partially oxygenated and the hydrologic fluxes were at a minimum. The site was considered to have low inorganic N availability, and thus, nitrate was supplied via nitrification. The highest rates of denitrification were therefore tied to nitrification and the partially aerated conditions. The static indicator needs to be constructed based on the microtopographical features within the riparian zone. Specifically, the topographic wetness index (TWI) (Beven and Kirkby, 1979;Sørensen et al., 2006) was used in 280 Duncan et al. (2013) to delineate the riparian hollows from other riparian locations. Terrain analysis indicated a TWI threshold value of 6.0 and 8.0 for riparian hollows under wet and dry conditions, respectively, whereas 4.8 and smaller TWI values corresponded to other riparian locations (e.g., hummocks). Thus, the static indicator can be constructed using the TWI values within the riparian zone to determine the hot spot locations-the hollows. Hence, Multiple dynamic processes control the denitrification rate at the riparian hollows. As examined by Duncan et al. (2013), a partially aerated condition ( 2 > 5%) is needed to support nitrification, which supplies the nitrate for denitrification. As quiescent, non-storm periods during base flow favor the coupled nitrification-denitrification mechanism, this is another key process that needs to be represented by a dynamic indicator. Although Duncan et al. https://doi.org/10.5194/hess-2020-343 Preprint. Discussion started: 15 July 2020 c Author(s) 2020. CC BY 4.0 License.
It is noted that these dynamic processes are not statistically independent. Usually, when one condition is met (e.g., base flow conditions), other conditions may consistently be satisfied (e.g., the transport of nitrogen in riparian 300 hollows). Alternatively, numerical modeling approaches are more feasible to construct the dynamic indicators based on the critical conditions at riparian hollows ( * ), where we could directly target 2 fluxes using a Monte Carlo approach. The statistical formulation used here is constructed specifically for the mechanisms described by Duncan et al. (2013). Thus, the detailed threshold limits could change under other denitrification HSHMs cases, such as the case presented in Hill et al. (2000), who focus on desert landscapes, or the one by Harms and Grimm (2008), where 305 the monsoon season is influential for the nitrogen transport. Nonetheless, the general formulation of HSHMs using indicators is still applicable.

HSHMs occuring when multiple dynamic processes converge in space
HSHMs can also be triggered by the confluence of multiple dynamic processes that lead to the convergence of complementary reactants at * . Accumulation of complementary reactants is mobilized and transported via 310 different hydrologic flowpaths. They converge at hot spot locations and trigger hot moments during the mixing.
Following the statistical framework developed in this study, Eq. (7) to (9) are suitable for this condition. In order to illustrate how the dynamic indicators are constructed, we consider here the case reported by Gu et al. (2012), where high biogeochemical activity was observed at the interface of groundwater and surface water during the stream stage fluctuations, which resulted in significant in-stream denitrification and 3 − removal. 315 In their study, hot spots form around the near-stream-riparian subsurface during river stage fluctuations, where active biogeochemical reaction (e.g., denitrification) requires both 2 depletion and the simultaneous presence In order for the hot moments to be significant, the stream-riparian zone should also be microbially active. Based on 330 these conditions, the dynamic indicators can be constructed as follows: [ ( * , * ) = 1] = [ , ( * , * ) = 1, , ℎ ( * , * ) = 1], where , ( * , * ) represents the dynamic process induced by the hydrologic conditions (e.g., stage fluctuation), and , ℎ ( * , * ) represents the dynamic process controlled by the transport and accumulation of chemical reactants.
Based on the critical values or ranges, we formulate the indicators as follows: where , ( * , * ) is the value that the indicator assumes in the ℎ realization and N is the total number of simulations.
Overall, our choices of the three studies should not limit the generalizability of the indicator statistics approach for deriving statistical formulations for HSHM applications. The critical conditions chosen to construct the 345 indicators are determined solely on the findings from these selected studies, and they will vary under different scenarios.

HSHM applications in groundwater hydrology
Processes occurring within the subsurface are important factors leading to HSHM occurrences. Among others, these processes include the migration of groundwater carrying reducing substrates, nuclear waste transport 350 within the subsurface, the accumulation and transport of dense non-aqueous phase liquid (DNAPL) and other biogeochemical processes. Some current modeling approaches that focus on subsurface HSHMs assume simplified hydrologic structures (e.g., homogeneous and isotropic domains) in quantifying contaminant fate and transport in the subsurface. However, such an assumption neglects the effect of the heterogeneity in the subsurface, leading to the underestimation of the uncertainties in the HSHM occurrences. Thus, in this section, we focus on HSHM applications 355 in groundwater hydrology, with a particular emphasis on spatial variability in the subsurface. Specifically, we consider (15) https://doi.org/10.5194/hess-2020-343 Preprint. Discussion started: 15 July 2020 c Author(s) 2020. CC BY 4.0 License. several situations often encountered in groundwater contamination studies and present the indicator statistical formulations of HSHMs. With these results, we can determine the probability of HSHMs occurrences in the subsurface at a given time and space. Further, we are able to determine how spatial variability influences HSHM occurrences and how this is translated into environmental health risks. 360

Importance of spatial variability in the subsurface
The heterogeneous structure of hydraulic conductivity leads to significant variability in the contaminant transport in the subsurface, which further results in the heterogeneity of biogeochemical cycling, such as the development of NRZs, reactive facies, and heterogeneity in aquifers' reactivity (Li et al., 2010;Loschko et al., 2016;Sassen et al., 2012;Wainwright et al., 2015). 365 Figure 2 demonstrates the uncertainty associated with HSHMs by looking at the flow fields in twodimensional log-hydraulic conductivity ( = ( )) fields with streamlines resulting from a uniform mean head gradient, left to right. The three panels differ in terms of the variance, 2 , of the log-conductivity. The covariance function used for generating the fields is exponential and isotropic. 2 is shown to have a profound impact upon the conductivity field. As the variance increases, regions of high and low log-conductivity emerge, creating preferential 370 flow paths bypassing the low conductivity zones as shown by particle trajectories. At smaller variance (i.e., 2 = 0.1), particles mainly travel along the mean flow direction with very limited departure from the mean trajectory, which are the straight lines connecting the left and right boundaries. In this situation, the arrival times of solute particles to critical locations (i.e., * ))are predictable. With large variances (i.e., 2 = 2), the streamlines assume a very irregular, hard-to-predict geometry, and we can observe the emergence of flow channels, where particles can move fast, next to 375 stagnant flow regions. Arrival times become more uncertain, because the exact geometry of the streamlines is hard to predict unless the Y field is known deterministically. However, in another realization of the field, the situation may be totally different, resulting in significant uncertainties in predicting the particle travel times. Thus, spatial variability of log-conductivity is a major uncertainty-inducing factor, and by extension, obviating the need for stochastic modeling of HSHMs in situations where the associated processes and attributes are subject to uncertainty. In the 380 following sections, we will present illustrative examples to analyze how subsurface spatial variability influences < ( * , * ) >, including variance and anisotropy ratio of the log-conductivity. https://doi.org/10.5194/hess-2020-343 Preprint. Discussion started: 15 July 2020 c Author(s) 2020. CC BY 4.0 License. Figure 2. Illustrative example of a heterogeneous log-hydraulic conductivity field and solute particle transport. Black lines represent simulated particle travel paths. A left to right hydraulic gradient of 0.1 is applied. Mean of log-385 conductivity is set at -3. Note color scales for log-conductivity are consistent in all three panels.

Single-particle within *
Consider the case of a point source release of non-reactive tracer originated from ( , 0 ). The dynamic indicator depends on a particle being within * at time * or not. If local (pore scale) dispersion is neglected, the 390 dynamic indicator is defined as follows: (17) Given that the particle does not change its volume while traveling. The expected value of this dynamic indicator at * is therefore: where ( * ) ( | 0 , 0 ) is the probability distribution function (pdf) of the particle's trajectory at * (Dagan and Nguyen, 1989;Rubin, 2003). Other situations may be addressed by using the same framework. For example, for an instantaneous injection within a source volume 0 , the ensemble mean of the dynamic indicator assumes the following form: (19) 400

Concentration-based within *
When considering local dispersion, or in case of a reactive tracer, the condition that the particle is inside the volume * does not suffice to define the dynamic indicator and a concentration threshold ℎ should be introduced: In the absence of local dispersion and for a reactive solute decaying at a (spatially) constant rate , the 405 ensemble mean assumes the following expression (Cvetkovic and Shapiro, 1990): where 0 is the initial concentration and [•] is the Heaviside step function. The ensemble mean (21) is the product of the probability that the particle assumes a concentration larger than the threshold at * (given that reaction rate is constant, this probability is either 0 or 1) and the probability that at the same time * the particle is within the hot spot 410 * . In other words, Eq. (21) expresses the fact that a particle inside * contributes to the hot moment only if its concentration is greater than the threshold. Equation (21) can be generalized to the cases of instantaneous injection into a source of volume 0 , as discussed before for the non-reactive case. For other complex situations, such as that in which is spatially variable and complex reaction networks, the ensemble mean of the indicators can be addressed by Eq. (16) in a Monte Carlo framework. 415 https://doi.org/10.5194/hess-2020-343 Preprint. Discussion started: 15 July 2020 c Author(s) 2020. CC BY 4.0 License.

Assessing the duration of hot moment and probabilities
The probability that the hot moment persists over the interval [ 1 , 2 ] at * can be formally computed as follows: < ( * , 1 , 2 ) >= ( 1 , * ) ( 2 | 1 , * ), where ( 1 , * ) is the probability that the particle is inside * at time * = 1 and ( 2 | 1 , * ) is the probability that 420 the particle is still inside * at time * = 2 , provided that at time 1 , it was also inside * . If the particle exits * during interval [ 1 , 2 ], this time interval will not be qualified as hot moment; and thus the probability computation needs to ensure the particle stays within * during the entire time interval.
This probability reduces further if in horizontal and vertical transverse directions * is much larger than the respective integral scales, because the probability of observing negative longitudinal velocity components (i.e., along the mean 455 flow field) is much smaller than in the transverse directions (Bellin et al., 1992) and vanishes as formation heterogeneity reduces.

Illustrative example and indicator formulation
Following sections 4.1 and 4.2, we present here synthetic case studies that demonstrate the statistical formulation of the indicators using methods developed in stochastic hydrogeology. The choice of the synthetic case studies does not limit our approaches to broader applications where stochastic modeling with Monte Carlo simulations 475 are applicable. In most applications, the locations of hot spots ( * ) are determined by static indicators, such as riparian hollows (Duncan et al., 2013), reactive facies (Sassen et al., 2012), and NRZs (Wainwright et al., 2015). The static indicator is constructed according to the corresponding critical conditions provided by ancillary data such as topography, remote sensing, and/or geophysical data. Hence, in this case, assuming the boundaries of * are determined by a static indicator, we consider a hot spot ( * ) to be confined within the following volume: 1 ≤ 1 ≤ 480 1 ′ ; 2 ≤ 2 ≤ 2 ′ ; 3 ≤ 3 ≤ 3 ′ .

Figure 3. Configuration of the synthetic case study
Given this case, the hot moment will be triggered only when the contaminant particle is found within * . The probability of finding the contaminant particle within * is given by 485 where m denotes the space dimensionality. Equation (33) defines the dynamic indicator for this case. If * is already identified as a hot spot location, then Eq. (33) provides < ( * , * ) >. Otherwise, the static indicator should be incorporated to determine the boundaries of * in order to compute < ( * , * ) > as shown in Eq. (10) where 490 geophysical data is used to identify the spatial context of * . If we also assume steady, uniform in the average flow with mild heterogeneity of the log hydraulic conductivity field with Gaussian displacement pdf-then we can compute < ( * , * ) > analytically using the following equation: The form of the displacement variances is controlled by the spatial distribution of the hydraulic conductivity in the subsurface. Equations (A4)-(A6) of the appendix show the displacement variances for an axisymmetric exponential covariance function of the log-conductivity (A3). 505

Implications for HSHMs
In the following sections, we present the results from the case study described in section 4.3. Specifically, in section 4.4.1 and 4.4.2, we explore how heterogeneity of log-hydraulic conductivity influences the probability of HSHM occurrences. To make results as general as possible, lengths are made dimensionless with respect to the integral scales ( ℎ in the two horizontal directions and in the vertical one) and time with respect to the following advective 510 time scale: ℎ / , where is the mean velocity). In the following, we explore the effect of the remaining parameters, i.e. the anisotropy ratio = and the variance of the log-conductivity 2 , on the emergence of HSHM. We placed * along the mean trajectory at (21 , 0, 0) with dimensions as (2 , 2 , 2 ). The dimensions of the hot spot are therefore of two integral scales in the three coordinate directions ( 1 , 2 , 3 )and is placed at a dimensionless distance of 21 from the point source.  Isotropic heterogeneity ( = 1 and the particle moments given by Eqs. (A7) and (A8)) was considered to investigate the dependence of < ( * , ) > on 2 with results presented in Figure 4. = / ℎ is the 520 dimensionless time. At early time (e.g., < 5), larger probability < ( * , ) > is observed with increase in 2 .
At intermediate time, i.e., at times comparable with the mean travel time = 21, < ( * , ) > is inversely proportional to 2 . At late time (e.g., > 40), the largest < ( * , ) > occurs at intermediate 2 . We observe that 2 regulates the timing of the peak in < ( * , ) > , which is located in the proximity of the mean travel time, = 21, for weak heterogeneity, and shifts towards earlier times as 2 increases. 525 These effects relate to the relationship between travel times (from the source to * ) and 2 . The key point to note is that 2 controls the spread of the travel time around the mean travel time. Larger variance enhance channeling effects (Fiori and Jankovic, 2012;Moreno and Tsang, 1994, also in Figue 2), which in turn enable earlier arrival times.
But at the same time, large 2 also leads to the low-conductivity zones. Streamlines of the solute tend to bypass low hydraulic conductivity zones, however, the small amount of solute that actually penetrates these zones by advection 530 and diffusion gets trapped for long time before being released and this results in an extended tailing with low concentration and therefore low < ( * , ) >. Thus, with an increase in 2 , we notice an increase in the probability to observe both increasingly earlier and increasingly delayed arrival times, which widens the probability distribution. On the contrary at small variance, particles deviate little from the ensemble mean trajectory, because of the small contrast in conductivity between high and low conductivity zones. This results in small particle spreading 535 and travel times that differ only slightly from the mean travel time ( = 21), and a probability distribution less spread around the mean, where the peak is observed.
In summary, hydraulic conductivity contrast between low and high conductive lithofacies increases with 2 leading to the emergence of organized high conductivity pathways sneaking through surrounding low conductivity zones with the latter acting as "trapping" elements. This causes the emergence of both early and late arrival times. 540 Early arrival times are controlled by the connected high conductivity pathways and the late arrival times are influenced by the low conductivity zones, which act as low-release reservoirs for solutes. The discussion here (accompanying Figure 5) focuses on the impact of the anisotropy ratio in the correlation structure ( , defined above) on the HSHM probabilities. The anisotropy ratio, , provides an indication about the persistence of the log-conductivity ( ) in the various directions. The spatial correlation model used here for demonstration is that of axis-symmetry, which is common to sedimentary formations (Dagan, 1989;Rubin, 2003), 550 with providing the ratio between the persistence of in the vertical ( 3 ) direction, represented by , and the ones on the horizontal plane ( 1 − 2 ), represented by . In unconsolidated sedimentary formations, is typically smaller than by as much as one order of magnitude, due to the different time scales of the depositional process taking place in the horizontal and vertical directions, which leads to thin and elongated lithofacies and consequently to a much smaller persistence of values in normal to the horizontal plane (Miall, 1985(Miall, , 1988Ritzi et al., 2004). 555 Figure 5 compares < ( * , ) > between formations defined by different anisotropy ratios and different 2 . It shows that we have two factors to consider when explaining the differences in < ( * , ) >. First factor, as discussed earlier, is the expansion in the range of travel times due to increase in 2 . With larger variance, we observe higher probabilities for departure of the travel times away from the average. The anisotropy ratio adds a compounding factor. To understand its effect, we should recall the analyses of lateral displacement variances of solute 560 particles moving in heterogeneous formations (cf., Dagan, 1989, and Eq. A4 to A6 here), showing that smaller leads to smaller lateral (both vertical and horizontal) displacement variances, implying smaller probabilities for lateral departures from the mean flow trajectory. Smaller limits lateral spreads, and increase the probability of particle to enter * , sooner or later, and to trigger HSHM. The effect could also be viewed as a channeling effect of sorts: https://doi.org/10.5194/hess-2020-343 Preprint. Discussion started: 15 July 2020 c Author(s) 2020. CC BY 4.0 License. smaller implies blocks of small aspect ratio (i.e., long but thin elements), which provide fast tracks for particles 565 when defined by high values, while blocking lateral spreads when defined by low values.
There are a few additional things to note here for completeness. First, * in the present analysis is located downstream from the source, along with the mean trajectory of the solute displacement. We expect different results in situations where * is positioned at an offset with respect to the mean flow direction. Second, we note that the analytical models used to compute the displacement statistics are formally limited to smaller variance ( 2 < 1), 570 although they are shown to provide good approximations for large variances (Bellin et al., 1992).Third, the stochastic formulation provides the theoretical and computational formalism for conditioning the probabilities on in-situ measurements (Copty et al., 1993;Ezzedine and Rubin, 1996;Hubbard et al., 1997;Maxwell et al., 1999;Rubin, 1991a;Rubin and Dagan, 1992) as well as on information borrowed from similar sites (Li et al., 2018;Cucchi et al., 2019). 575

Discussion and Summary
In this study, we developed a general stochastic framework that could be used to characterize the spatiotemporal distribution of environmental Hot Spots Hot Moments (HSHMs), with groundwater applications. The stochastic formulation is built around the following principles:  The HSHMs are defined as random variables and the goal is to derive their stochastic distribution in terms 580 of the relevant processes and attributes.
 HSHMs processes cover the dynamic components of the HSHMs. An example could be the transport of solutes and reactants. HSHMs attributes refer to the static components of the HSHMs, e.g., in situations related to the nitrogen cycles, attributes could represent pyrite concentrations or naturally-reducing zones.
HSHMs could be defined through the confluence of a variety of contributors, both static and dynamic. 585  The processes and attributes are modeled as stochastic processes and random variables, respectively, based on the underlying physics.
 The static contributors are modeled stochastically using geostatistical space random functions.
 The dynamic contributors are modeled stochastically using probability distribution functions derived from the underlying mathematical-physical models. 590  Several HSHMs categories are defined, based on the contributing factors, as follows: HSHMs defined by dynamic contributors only, HSHMs defined by static contributors, and most commonly, HSHMs requiring the coupling of static and dynamic contributors. The HSHMs stochastic formulations are expressed in terms of the stochastic formulations of the relevant contributors.
 We provided a detailed review of multiple HSHMs and showed how they relate to our definitions. 595 The framework we proposed in this study is advantageous in that it allows to calculate the uncertainty associated with HSHMs based on the uncertainty associated with its contributors. Additionally, it provides a formalism, well established by Bayesian theory, for conditioning the HSHM probabilities on in-situ measurements as well as on information borrowed from geologically and otherwise similar sites. https://doi.org/10.5194/hess-2020-343 Preprint. Discussion started: 15 July 2020 c Author(s) 2020. CC BY 4.0 License.
We demonstrated our proposed approach through applications in the area of subsurface transport and 600 hydrogeology, focusing on the impacts of subsurface heterogeneity on HSHMs. We analyzed, quantitatively, how subsurface heterogeneity of the conductivity field control the HSHM statistics, for example, the time expected for the probability of the HSHM to occur to reach a-priori set thresholds or time to peak probability.
Lastly, as mentioned both here and in previous studies, statistical methods for quantifying the occurrences of HSHMs and the associated uncertainties are needed to advance our understanding of the mechanisms that cause 605 HSHMs, as well as to enhance our ability to predict HSHMs and manage their consequences.