A stochastic stability equation for unsteady turbulence in the stable boundary layer

The atmospheric boundary layer is particularly challenging to model in conditions of stable stratification, which can be associated with intermittent or unsteady turbulence. We develop a modelling approach to represent unsteady mixing possibly associated with turbulence intermittency and with unresolved fluid motions, called sub‐mesoscale motions. This approach introduces a stochastic parametrisation by randomising the stability correction used in the classical Monin–Obhukov similarity theory. This randomisation alters the turbulent momentum diffusion and accounts for sporadic events that cause unsteady mixing. A data‐driven stability correction equation is developed, parametrised, and validated with the goal to be modular and easily combined with existing Reynolds‐averaged Navier–Stokes models. Field measurements are processed using a statistical model‐based clustering technique, which simultaneously models and classifies the non‐stationary stable boundary layer. The stochastic stability correction obtained includes the effect of the static stability of the flow on the resolved flow variables, and additionally includes random perturbations that account for localised intermittent bursts of turbulence. The approach is general and effectively accounts for the stochastic mixing effects of unresolved processes of possibly unknown origin.


INTRODUCTION
Turbulence in stably stratified conditions is characterised by an unsteady behaviour caused by the interactions of processes at multiple scales. The non-stationarity leads to difficulties in representing turbulent diffusion in nocturnal and stable boundary layers (SBLs), and formulating accurate parametrisation schemes represents an ongoing challenge for atmospheric models (Holtslag et al., 2013;Sandu et al., 2013). Strong wind or weakly stable conditions can typically be represented through adaptations of the commonly used Monin-Obukhov similarity theory (MOST); however, low wind speed or very stable situations are still poorly understood. When the wind is too weak to sustain turbulence, the flow becomes controlled mainly by gravity waves (Zilitinkevich and Calanca, 2000;Zilitinkevich, 2002), density currents, wind gusts, or other types of motions leading to localised shear acceleration on the so-called sub-mesoscales (Sun et al., 2004;Mahrt, 2014;Sun et al., 2015;Mortarini et al., 2017). The unsteady forcing of turbulence by sub-mesoscale motions results in intermittent turbulent bursts that can be decoupled from the surface (Acevedo et al., 2016), thus breaking down assumptions that form the basis of surface-based MOST modelling. Intermittency is not exclusively a consequence of sub-mesoscale motions; indeed, global intermittency can occur despite the absence of external forcing due to the presence of internal gravity waves in the free atmosphere (Zilitinkevich, 2002;Ansorge and Mellado, 2014). A cyclic behaviour of turbulence can result from the competition between a strong surface cooling leading to an increase of thermal stability and the resulting shear generation of turbulence due to low friction (van de Wiel et al., 2002).
The reasons for irregular mixing in SBLs are still the subject of research, and a unifying framework has yet to be found (Mahrt and Bou-Zeid, 2020). A general weak scaling of turbulence with the wind speed and stratification is observed in very stable contexts, which may be partly due to non-stationarity associated with shear generation by sub-mesoscale motions, or in other words to externally forced intermittency (Mahrt, 2014). Numerous field studies have indeed highlighted examples of generation of intermittent turbulence by wave-like motions, density currents, low-level jets, Kelvin-Helmholtz instabilities, and other sub-mesoscale motions (Einaudi and Finnigan, 1993;Sun et al., 2002;Sun et al., 2004;Poulos et al., 2010;Sun et al., 2012;Mortarini et al., 2017). In addition, the non-stationary forcing by sub-mesoscale processes can occur on time-scales that are just above the largest turbulent scales, preventing a scale gap between turbulence and its forcing (Vercauteren et al., 2016;Vercauteren et al., 2019a;Boyko and Vercauteren, 2020). The absence of a scale gap in turn prevents the possibility of reaching a quasi-equilibrium of turbulence (Mahrt and Bou-Zeid, 2020). Yet, classical closure models assume the turbulence to be in equilibrium with its forcing by the mean flow.
Developing more appropriate parametrisation schemes for the SBL hence entails at least two challenges, also summarised by Mahrt and Bou-Zeid (2020): one is to account for mixing and transport due to the intermittent behaviour of turbulence and to non-stationary, unresolved sub-mesoscale motions; the other is to account for departures from statistical equilibrium of the turbulence itself. Both aspects are intertwined by nonlinear feedback processes, leading to additional uncertainty in the resulting unresolved mixing that needs to be parametrised in numerical weather prediction (NWP) models. Turbulence closures used in operational NWP models typically enhance turbulent diffusion in stable conditions beyond what can be justified from observations. This approach is often justified by the need to incorporate intermittent behaviour, but it is known to be detrimental for the representation of the SBL (Sandu et al., 2013). Alternative approaches are needed to account for unsteady turbulence and the resulting model uncertainties.
An attractive solution to represent the uncertainty of unresolved processes, initially proposed for weather forecasting, is to use stochastic parametrisations. Stochastic parametrisations have had a tremendous impact on probabilistic weather prediction (Palmer, 2019), the most important one being to have increased reliability and skill of forecasts. Stochastic parametrisations represent atmospheric processes as a combination of a predictable deterministic and an unpredictable stochastic component. Developments in stochastic parametrisations have, for example, targeted convective clouds, exploiting cloud-resolving simulations for parametrisation development (Sakradzija et al., 2015;Sakradzija and Klocke, 2018;Christensen, 2020;Fleury et al., 2022). One key aspect in the development of stochastic parametrisations is to include uncertainty in a physically meaningful way. For this, the parameter dependence of stochastic models on the resolved-scale dynamics plays an essential role. In essence, the idea is to uncover scaling of the parameters of a stochastic model with resolved-scale dynamics. An empirical identification of a scaling relationship is in fact central to the derivation of MOST, which used field measurements to uncover universal functions of atmospheric stability that can describe turbulent fluxes in the surface layer. MOST is often described as a semi-empirical approach, since physical principles are used to identify the appropriate dimensionless variables between which to search for scaling relationships. The development in this work will follow a similar semi-empirical rationale.
This work seeks to generalise MOST by enabling an explicit treatment of uncertainty of the fluxes to be modelled. Such a stochastic framework has been suggested as a useful solution to represent unsteady mixing in the very stable boundary layer (Mahrt, 2014;Nappo et al., 2014;Calaf et al., 2022). A data-driven approach towards stochastic modelling was followed by Kang et al. (2015), who sought to classify the diversity of sub-mesoscale motions as a way to obtain a statistical description and improve their treatment in numerical models. As a step towards developing an appropriate stochastic model for unsteady turbulence impacted by sub-mesoscale motions, Vercauteren and Klein (2015) suggested a data-driven approach to study interactions between turbulence and sub-mesoscale motions. These and subsequent analyses exploited data classification approaches combined with multiscale data analyses to shed light on the physical drivers that control mixing and transport in unsteady flows (Vercauteren et al., 2016;Vercauteren et al., 2019a;Boyko and Vercauteren, 2020).
Those observational studies have proven insightful to better understand unsteady turbulence, but they have not been able to deliver a stochastic model for unsteady turbulence in the SBL. A prototype stochastic parametrisation of the SBL was introduced by Abraham et al. (2019), also based on results of data analyses, but its focus was on transitions between weakly and strongly stable regimes in the SBL and not on the parametrisation of unsteady turbulence. To bridge the previous analyses of unsteady turbulence to parametrisation development, Boyko et al. (2022) devised a method to uncover the physical scaling of parameters of a stochastic model using observations. The method is a model-based clustering approach that builds upon the method of Horenko (2010a) used in Vercauteren and Klein (2015) to quantify the impact of sub-mesoscale motions on turbulence. It extends the model-based clustering approach such that the non-stationary model used to represent the time evolution of the observations is a time continuous stochastic differential equation (SDE). This extension relies on a data-driven approach to model multiscale complex systems suggested by Krumscheid et al. (2015). Based on numerical experiments, Boyko et al. (2022) demonstrated that the method could uncover scaling of the SDE model parameters with slow modulating variables, based solely on observed time series. The approach therefore provides an ideal framework to derive stochastic subgrid-scale models that are modulated by mean-flow, resolved variables. The stochasticity of the model provides a means to include the model uncertainty in the parametrisation, and this uncertainty can depend on the large-scale forcing according to uncovered scaling relationships.
In this work, the method will be used to devise a stochastic extension to MOST. The model will then include uncertainty related to intermittency, to non-stationary mixing impacted by sub-mesoscale motions, and to the lack of statistical equilibrium of unsteady turbulence. A critical choice in the application of the model-fitting method is that of the variable that will be modelled as a stochastic process. Since MOST actually entails a data-driven component in the definition of the universal stability functions (Foken, 2006), the stability correction modelled in MOST as a function of a dimensionless scaling parameter representing atmospheric stability is the most natural candidate for a stochastic extension to MOST. Following this rationale, this article will target the stochastic modelling of the stability corrections to be used in MOST, while the physical basis of MOST will be kept unchanged.
Keeping the physical basis of MOST in the model means that a gradient-diffusion closure is still assumed, which may not be valid in the SBL, even when the stability correction, and hence the mixing length, is stochastic. However, these types of closure models are in use in NWP, and this extension is a first approach to model the variability of mixing resulting from sub-mesoscale motions and from turbulence intermittency. The stochastic parametrisation could replace the use of long-tail stability functions that are typically used in MOST to enhance the turbulent diffusion in high-stability regimes (Sandu et al., 2013). Instead, it enables the representation of localised bursts of turbulence through a stochastic model. This feature is important, as localised events can, for example, trigger regime transitions in the SBL (Lan et al., 2022). The inclusion of intermittent events in closure models may have important impacts on the representation of the SBL.
The article is organised as follows. The dataset used to study the potential of stochastic stability corrections is presented in Section 2. Section 3 presents the temporal scales of dynamics that are considered in the modelling process, along with an introduction of the stability correction term that will be modelled. Section 4 introduces the stochastic perturbation approach that is used to derive a stochastic stability equation (SSE). The optimal form of the SSE is analysed in Section 5, which also presents the estimation of the model coefficients, based on the observational data considered. The closed-form model obtained is finally tested in Section 6, highlighting very promising results to represent unsteady turbulence statistics as a function of atmospheric stability. The article concludes with a discussion in Section 7.

DATASET
The dataset considered consists of high-resolution eddy-covariance data collected during the Fluxes Over Snow Surfaces Phase II (FLOSS2) field programme, which took place between November 20, 2002, andApril 4, 2003, in the North Park region of Colorado, near Walden. High-quality wind measurements sampled at 60 Hz are available for the analysis. The site is locally flat with variable bush height from 0.2 to 0.5 m (Mahrt and Vickers, 2005). During the field campaign, a thin layer of snow covered the ground for about 20 days, ensuring lasting stable stratification. The stably stratified conditions are selected by calculating the diurnal cycles of the heat flux and extracting the nocturnal period as the hours with negative heat fluxes when averaged over the duration of the campaign. Seven Campbell CSAT3 sonic anemometers recorded three wind velocity components and temperature at heights of 1, 2, 5, 10, 15, 20, and 30 m. The instruments were attached to a rigid truss tower that allows air masses to pass through. The dataset was quality controlled following Vickers and Mahrt (1997) and carefully prepared for the following analysis by us.
Humidity correction of the sensible heat flux is not performed. This study focuses mainly on modelling turbulent momentum transport, so the turbulent heat flux is of secondary importance. The dataset is known for a temperature bias at slow temporal scales; however, this is of minor importance for the stable periods (Mahrt, 2011). Despite the temperature bias, the dataset is of excellent quality and provides nearly continuous time series, an essential aspect for the data analysis methods applied here.
The total dataset contains 132 days of continuously recorded time series. In individual periods, the instruments were covered with ice and therefore did not register any records, resulting in long data gaps (on the order of several hours). The days with these long periods were removed, leaving 102 days for the analysis. The remaining shorter minute-scale data gaps (likely quality control results, corresponding to less than 1% of the data) were linearly interpolated in time. The interpolation increases the continuity required for the finite-element method (FEM)-H 1 -regularisation-SDE parametrisation framework developed by Boyko et al. (2022) and used for the subsequent model fitting. The velocity vector is double-rotated (Lee et al., 2004) in the mean wind direction using a moving window approach. The direction of the mean wind for the rotation transform is calculated from the 30 m sonic anemometer using a moving average on a scale of 1 hr.

INTRODUCING SUB-MESOSCALE DEVIATIONS IN THE STABILITY CORRECTION
Atmospheric models rely on parametrisations to estimate turbulent transfer coefficients for momentum and heat and other scalars. A broadly used parametrisation is a turbulence kinetic energy (TKE) closure, also denoted as a turbulence model of order 1.5 (Stull, 1988). In this closure, the evolution of the TKE is represented via a prognostic equation, and the computation of the eddy diffusivities of momentum and heat relies on a turbulent length-scale, which is typically given a diagnostic formulation (Baas et al., 2018). The length-scale formulation includes the effect of static stability through stability functions, and improvements are needed in the SBL context (Gryanik et al., 2020). The present analysis targets the development of a stochastic extension for the stability functions, so as to account for the observed variability and intermittency of turbulent fluxes in the SBL. The temporal scales of the dynamics are key to the analysis and are thus introduced here, before introducing the type of turbulence parametrisation considered in the study.

Defining turbulent scales, sub-mesoscales, and mean scales
To study sub-mesoscale perturbations of the mean-flow dynamics, a triple decomposition of the measurements is used. Figure 1 shows a sketch introducing the variables with their associated scale. The data are essentially decomposed in a mean scale, a sub-mesoscale, and a turbulence scale following the strategy of Boyko and Vercauteren (2020); Calaf et al. (2022).
For this study of the FLOSS2 dataset, the long-time average (⋅) is set to 1 hr. The long-time average should be considered as the time-scale at which a logarithmic profile can be approximately observed. In a modelling context, this scale is assumed to be represented with some appropriate Reynolds-averaged Navier-Stokes (RANS) formulation. This means we assume that the mean shear is sufficient to maintain a certain amount of background (potentially weak) continuous turbulence connecting the boundary layer to the ground throughout and that the intermittency to be modelled will produce deviations from this background profile. More rigorous considerations would require to test for a strictly logarithmic scaling of the mean wind profile, as done in Boyko and Vercauteren (2020). This study found this condition to require a 3 hr averaging for the FLOSS2 dataset, whereas an additional jet scale was found between 1 and 3 hr to have enough energy to produce ground-sheared turbulence but insufficient to contribute to the logarithmic wind-speed profile up to a height of 30 m. However, to avoid further complications in the modelling of jet dynamics, the 1 hr mean is assumed to be adequately represented with some appropriate RANS formulation, and the stochastic modelling will target deviations around the corresponding wind profiles.
The definition of the short-time average scale, meant to include turbulence but exclude sub-mesoscale motions, would ideally relate to the integral scale of turbulence. Based on previous multiresolution analysis for the considered dataset (Boyko and Vercauteren, 2020), the short-time average(⋅) is set to 3 min. The integral scale of turbulence depends on the Reynolds number Re (Pope, 2000). In contrast, the uncertainty of the integral scale likely increases with increasing Richardson number Ri because, in the strongly stable regime, the sub-mesoscales begin to dominate over the mean scale in the generation of turbulence (see Boyko and Vercauteren (2020);, sec. 4.3). The 3 min period is dataset specific and has an averaged value -F I G U R E 1 A sketch of the scales and associated variables defined for the stable conditions of the Fluxes Over Snow Surfaces Phase II dataset to identify the stochastic stability equation. For strong wind conditions, the inertial subrange extends into the defined scale band between 1 hr and 3 min. For stable conditions and weak winds, this scale band is filled with sub-mesoscales. The turbulence averaging scale of 3 min is larger than the inertial subrange because it is estimated based on an entire dataset with averaged multiresolution decomposition, and therefore the turbulence scale may differ locally. Re, Reynolds number; Ri, Richardson number; u, v, horizontal wind components; , sub-mesoscale motion; U, mean horizontal wind speed; u * , friction velocity; T, potential temperature; z, height coordinate; overlines, tildes, and carets indicate long-time average, sub-mesoscale average, and short-time average respectively [Colour figure can be viewed at wileyonlinelibrary.com] that suits night-time. From the definition of the long-and short-time averages, it follows that a scale band corresponding to sub-mesoscale motions is defined between 1 hr and 3 min. The horizontal velocity components in this band are referred to asũ and hṽ.
Based on the defined averaging times, the triple decomposition is done with a rolling time average using the sampling frequency of 60 Hz, thus providing higher sampling frequency for subsequent analyses than more traditional block averaging procedures. After applying this averaging, the data is downsampled to 60 cycles/hr, which is the highest frequency content of the averaged signal. More details about the procedure is given in (Boyko and Vercauteren, 2020), sect. 3.1).

The turbulence parametrisation investigated and MOST
Broadly used turbulence closure schemes in NWP models are those based on a TKE closure, also called 1.5-order closure models. A RANS with a turbulence closure model of order 1.5 (Stull, 1988) is formulated for the SBL by prescribing, among other things, the kinematic eddy-viscosity coefficient, which models the momentum diffusion due to turbulent eddies: where is a modelling constant, l m is the turbulent mixing length for momentum, and e is the TKE. To achieve a complete closure, a model has to be specified for l m (Mellor and Yamada, 1982). The mixing length is typically expressed in terms of a local gradient Richardson number (Cuxart et al., 2006). In addition, the eddy heat conductivity coefficient K h responsible for the heat diffusion due to turbulence should be specified to achieve a complete closure.
In this study, we seek to model the l m stochastically as a function of the dimensionless local gradient Richardson number where the temporal averaging scale of the potential temperature T and the horizontal wind components u, v are specified next. The gravitational acceleration is g, and T 0 denotes the absolute temperature. In the following, the stochastic modelling focuses only on the coefficient K m by assuming a constant value for the turbulent Prandtl number Pr t = K m ∕K h . Following Mahrt and Vickers (2003), the mixing length l m can be expressed as the ratio of friction velocity u * and wind gradient, both considered on a Reynolds average scale. Here, the goal is to model localised excursions of the mixing length from its Reynolds average value, where the average value relates to the mean shear. Hence, the l m to be studied is defined based on the friction velocity estimated on the short-time scale,û * = (û ′ w ′ 2 +v ′ w ′ 2 ) 1∕4 , and on the mean shear as obtained from the long-time average (see Figure 1 for the definition of the averaging scales): In Equation (3), = 0.4 is the Von Kármán constant, z is the height coordinate, f is the momentum stability correction function used in MOST, and U = is the mean horizontal wind speed, where the overline denotes the 1-hr moving average. The rationale behind this definition is that the work seeks to model deviations from a background logarithmic profile, or, in other words, uncertainty around an average mixing length, which is dictated by the mean shear. The driving hypothesis is that sub-mesoscale motions and more generally intermittency induce transient changes in the mixing length due to departure of the turbulence from statistical equilibrium. The function f is determined from the flux-gradient relationships derived in MOST and scales the amount of turbulent mixing with a dimensionless stability in a turbulence closure scheme of order 1.5 (the type of scheme considered in this study). In the following, only the mixing length for momentum is modelled stochastically. The observable representing the needed momentum stability correction process for the dataset analysed is obtained from Equation (3) as To represent the lack of statistical equilibrium in an extended parametrisation, the uncertainty induced by the short-scale deviations introduced in is modelled with an SDE.

MODELING THE STABILITY-DEPENDENT UNCERTAINTY IN MOST
The objective is to establish an SDE for the temporal evolution of the variable , which is denoted as the SSE. The resulting parametrisation is a stochastic extension of MOST, and it thus includes model uncertainty. The parameters of this new stochastic equation scale with the Ri, an encouraging result that is uncovered in the study. The functional form of the equation and the functions that map the Ri to the parameter values are specified in the modelling process. The functional form is derived from MOST, and the scaling with Ri is uncovered through the parameter estimation using the model-inference methodology developed by Boyko et al. (2022). In the following steps, the strategy is formulated for determining the functional form of SDE.
1. Substituting the classical MOST function f with an ordinary differential equation (ODE) under the condition that the steady state of the ODE closely follows f . 2. Converting the ODE into an SSE by perturbing one of its parameters. 3. Fitting the proposed SSE with nonstationary parameters to the FLOSS2 data and identifying the scaling of the parameters with the Ri.
To begin identifying the model structure, we consider two characteristic regimes of the SBL; namely, the weakly stable regime and the strongly stable regime. In the weakly stable regime, the mean wind drives turbulent diffusion due to mechanical shear at the surface. In the strongly stable regime, turbulence is intermittent (Zilitinkevich, 2002) and partially modulated by sub-mesoscale motions (Vercauteren and Klein, 2015; Boyko and Vercauteren, 2020).

Two limiting regimes
Though a critical Richardson number Ri c = 0.25 is often used in studies of stably stratified turbulence as an indication for flow laminarisation, turbulence is known to persist beyond such a threshold (Galperin et al., 2007). At high values of the Ri (i.e., high stability), a plethora of phenomena occur on partly overlapping scales. Because of that, the velocity gradients used in the definition of the Ri may be associated with different processes (Boyko and Vercauteren, 2020). Disentangling the multitude of processes is ambiguous and, as a result, the Ri c derived from time-averaged quantities may include different processes (Mahrt and Vickers, 2006;Mahrt, 2010). This ambiguity may explain why turbulence is sustained much beyond the Ri c . Existing stability functions that parametrise the SBL in some operational models are examined next. The most straightforward stability function f (Ri) is linear and provides a simple tuning procedure to reduce the mixing length scale for increasing stability. Here, we are focusing on the functional relationship between stability correction and the Ri as it is reported to be used operationally (Cuxart et al., 2006;Sandu et al., 2013). The functions significantly limit turbulent mixing near Ri c . However, some models apply a mild cut so that some mixing prevails beyond the critical value (see Figure 2a).

F I G U R E 2
Comparison of different stability correction functions with the steady state of the ordinary differential equation (ODE) proposed in Equation (5). The Fluxes Over Snow Surfaces Phase II dataset is partitioned according to the wind threshold 8 m⋅s −1 at 30 m height. The colour encodes the height of the measurements. (a) Strong wind data in which the classical scaling f (Cuxart et al., 2006) is present up to the critical Richardson number Ri c marked by the vertical red dashed line. (b) Weak wind data that do not fully follow the scaling and are modelled in the study. (c) The steady-state solutions of the three ODEs defined through Equation (5) with the nonlinearity parameters. In all panels,Ri is calculated with an averaging scale of 3 min, and is as defined in Equation (4) Following the comparative study of Cuxart et al. (2006), Figure 2a,b shows several functions: the European Centre for Medium-Range Weather Forecasts model, the UK MetOffice model, and the Meteorological Service of Canada model. Sandu et al. (2013) provide a historical overview of the functional forms considered in the European Centre for Medium-Range Weather Forecasts model. They all obey a similar structure with some modifications considering turbulence cut-off characteristics.
In Figure 2, the stability correction of the FLOSS2 experiment is shown for all measurement heights, estimated based on 3 min averages following the definition in Equation (4). The correspondingRi is calculated with an averaging scale of 3 min. The vertical gradients are estimated locally by applying a least-squares fit of the log-linear equation following (Nieuwstadt, 1984). The data are separated using the mean wind threshold 8 m⋅s −1 , taken at a height of 30 m. This separation is chosen arbitrarily, and its sole purpose is to illustrate uncertainties inherent to the weak-wind regimes and motivate the need for stochastic modelling. The classical scaling theory applies in the strong wind or weakly stable regime (see Figure 2a), whereas intermittent and unsteady turbulence leads to a large observational scatter in the scaling relationship in the weak-wind or strongly stable regime (see Figure 2b). This observation relates to the existence of a minimum wind speed below which turbulence cannot be sustained ( Van de Wiel et al., 2012). The functions shown in Figure 2a,b have been included in the figures as obtained from the literature and are not adapted to the data under consideration. No specific partitioning of the data into the individual regimes is performed, and no further preprocessing is done beyond that described in Section 2 and the averaging described in Section 3. Good extraction of periods with constant mean winds reduces the scatter in the plots. Such preprocessing to exclude non-stationary periods is unnecessary in the proposed stochastic parametrisation since the procedure models the variability inherent to non-stationary flows. To capture that variability (see Figure 2b) and the transition into the non-stationary regime that occurs for higher Ri, the intention is to model the non-stationary and nonlinear time series of the variable with an SDE. Introducing a time dependence in the model will enable the representation of temporal intermittency.
To extend the classical functional scaling relationships to the envisioned temporal SDE model, the steady-state is first estimated for three alternative candidates of nonlinear ODEs, which will be randomised next. These ODEs are defined such that the equilibrium solution evolves according toRi closely following the classical stability functions. The three hypothetically suitable equations arė Equation (5) forms the basis for parameter estimation and provides a reasonable collection of models to test with the clustering method (see Boyko et al. (2022)). The variations of the parameter b are introduced here to show that the ODE obeys the traditional scaling theory in the stationary, unperturbed limiting case. This scaling of the steady-state solutions to the three ODEs withRi can be visualised in Figure 2c.
In the next step, the effects of sub-mesoscale motions are included as perturbations in Equation (5). It is known that the importance of sub-mesoscale motions for turbulence generation becomes relevant at low wind speeds and large values of Ri (Vercauteren et al., 2016;Boyko and Vercauteren, 2020). The following section motivates a structural form of an SSE by following a perturbation ansatz, considering local perturbations of the Ri value. The derived structural form will serve as a basis to fit an SSE with unknown parameters from the observations, as will be presented in Section 5.

Parameter perturbation ansatz
The stochastic parameter perturbation approach and the resulting structural change in Equation (5) require a foundation, and the following hypothesis is investigated. Intermittency and sub-mesoscale motions induce temporally local perturbations in the stratification and that should impact the local gradient Ri. In particular, the perturbations of the velocity profile in the scale band bounded by the largest turbulent eddy and the mean scale should correlate with the fluctuation of the Ri in the same scale band. In what follows, we examine whether the variance of the Ri correlates with the sub-mesoscale wind variance. This relationship provides a crudely simplified motivation of the parameter perturbation approach introduced in the following. The temporal averaging scales are discussed next.
The temporal evolution of the gradient Richardson number at a scale of 3 min is denoted withRi. It refers to the scale of the largest turbulent eddy on average, as estimated through multiresolution decompositions of the fluxes (Boyko and Vercauteren, 2020). The variability of Ri is measured as a variance by calculating the running variance at the scale of 1 hr. This is a convenient operation since by construction the variableRi has no fluctuations shorter than the period of 3 min, and provides an approximation to the fluctuation intensityRi ≔RiRi of the Ri related to the frequency band between 3 min and 1 hr.
Next, we consider the sub-mesoscale wind components, denoted byũ andṽ. These velocity components are separated with a bandpass filter using the discrete wavelet transform as presented by Boyko and Vercauteren (2020), selecting the wind oscillations in the scale range between 3 min and 1 hr. The non-dimensional variable quantifying the variability of the sub-mesoscale motions is defined as̃( where U is used to non-dimensionalise the running 1 hr variance of the sub-mesoscales. The factor 2 averages the two components. A correlation analysis shows that an increase in the variance of the Ri is concomitant with an increased relative variance of the sub-mesoscale wind velocity. Indeed, ln(̃) relates linearly to the natural logarithm of theRi: Ri ln ≔ ln(Ri) ln(Ri). A scatter plot of these variables for the height of 10 m is shown as an example in Figure 3a. The Pearson (assuming linearity) and Spearman (detecting monotonic nonlinear scaling) correlation coefficients (Rodgers and Nicewander, 1988;Scott, 2015) are used to identify the nonlinear relationship and show a value of ≈0.8 over the entire measurement height (see Figure 3b). Conditioning the data on the value of the mean wind in Figure 3a shows that, especially for the strongly stable regime (U < 3 m⋅s −1 ), the value of the two variables̃and Ri increases. Hence, the variability of theRi parameter for the strongly stable regime (see Figure 3a) arises in concert with the variability of the sub-mesoscale motions̃. This observation supports the perturbation approach used to motivate the SSE discussed later herein.
Assuming that sub-mesoscale motions act permanently and randomly in the boundary layer, the parameter is perturbed asR where the valueRi is perturbed from its mean Ri (the overline denotes the 1 hr mean) with an independent and identically distributed random process t . This process mirrors the effect of random turbulent mixing assumed to be related to the sporadic processes in the sub-mesoscale band (see Figure 1). The intensity of perturbation is denoted with p and is modelling the variablẽ(see Figure 3). The disturbance type in this study is limited to pure white noise (independent and identically distributed). One could instead wish for a process whose energy spectrum is non-flat (coloured noise), band limited, and has a particular smoothness. The decomposition approach in Equation (7) is the simplest possible and can be replaced by a more reasonable modelling hypothesis in future developments. The selected parameter perturbation is included in Equation (5). Thereby, the process t is defined as time increments of the Wiener process dW(t). The SSE takes the following general structure: where is the stochastic correction of the turbulent mixing length under the influence of stratification and of the random external perturbations of intensity p . The stochastic forcing term p dW(t) is a multiplicative type obtained by inserting Equation (7) into Equation (5). Equation (8) is different from its unperturbed form, Equation (5), and therefore it would be wrong to interpret the parameters in the same way. After parameter perturbation, the interpretation of howRi changes the steady state of the solution is altered by the noise. To investigate this, one can perform a stochastic bifurcation analysis (Stuart and Ord, 1994). Such study is of secondary importance because the parameters of an SDE whose form is motivated by this perturbation ansatz will be estimated from data, with the goal of uncovering scaling of the SDE coefficients with Ri. The task of the model-based clustering method (see Boyko et al. (2022)) is to resolve the non-stationarity and identify the hidden relationships of the fitted SDE parameters with the resolved scales. Accordingly, the SSE model to be estimated is as follows: where the total time-varying parameter vector (t) = [ (t), (t), p (t)] is subject to parametrisation via a data-fitting procedure. The nonlinearity parameter , defined in Equation (8), is constant and set to 2 for the estimation procedure, reducing the total number of free parameters. This choice was made after rejecting model alternatives, as presented in Appendix A. In this model, (t) represents the suppression rate of the turbulent mixing length l m ∼ 1∕ due to processes occurring above the mean scale, (t) is the production rate of the turbulent mixing, which becomes relevant at larger values of Ri, and p (t) is the perturbation intensity of the turbulent mixing assumed to result from random subgrid processes not resolved by the RANS model (e.g., sub-mesoscales). The data-driven estimation of the SSE parameters and exploration of their scaling with Ri will be investigated in Section 5.2.

Summary of the model estimation procedure
A series of studies (Horenko, 2010a;Metzner et al., 2012;Pospisil et al., 2018) introduced and developed an efficient non-parametric model-based clustering framework that proved to be a successful analysis tool in atmospheric sciences (Horenko, 2010b;O'Kane et al., 2013;Franzke et al., 2015;Vercauteren and Klein, 2015;O'Kane et al., 2017;Boyko and Vercauteren, 2020;Quinn et al., 2021), among other fields of research. The approach is based on two main assumptions. A non-stationary time series that one wishes to model is assumed to be represented by a statistical model with time-dependent parameters. To enable estimation of the statistical model parameters, the fluctuation time-scales of the parameters are assumed to be much longer than the fluctuation time-scale of the time series. In the current SBL modelling context, this means we assume that the fluctuations within an SBL flow regime occur on a much faster time-scale than the time-scale of transitions between flow regimes. In brief, the clustering method expresses the non-stationary time series as a combination of K > 1 stationary submodels that alternate in time, with the difference between each of the models being their parameter values. Each parameter set of the Kth model is constant for an a priori unknown period of time, which, along with the corresponding model parameters, are determined through a machine-learning procedure.
All the examples cited thus far relied on time-discrete statistical models to represent observed time series, with model parameters changing in time. Boyko et al. (2022) extended these approaches to enable the estimation of a non-stationary, time-continuous SDE model such as Equation (9). The extension was based on the combination of the model-based clustering approach of Horenko (2010a), with an SDE parameter estimation approach presented by Krumscheid et al. (2015). The hypothesis followed in this article, and motivated by the perturbation ansatz presented in Section 4.2, is that the uncertainty in the required stability correction will depend on the flow regime. The transitions between flow regimes are assumed to be slow, and within each flow regime the stability correction with its uncertainty is modelled with an SDE of the type of Equation (9). After clustering the time series into several locally stationary SDE models -that is, with fixed coefficients in Equation (9) -scaling of the learned model parameters with the Ri will be investigated. Indeed, based on numerous numerical examples, Boyko et al. (2022) showed that the model-based clustering approach successfully uncovered a priori hidden physical scaling of parameters in the stochastic model using observations.
The main idea of the model estimation procedure is briefly summarised next to introduce the required notation, and the interested reader is referred to Boyko et al. (2022) for a detailed description. In essence, the idea is to identify K sets of model parameters for the SDE to be fitted, Equation (9). This SDE has three unknown parameters, so that sets of three values will be identified and each denoted by a vector k . The K sets of model parameters are * = [ 1 , … , K ] and the procedure identifies a time-dependent model affiliation vector * (t) = [ 1 (t), 2 (t), … , K (t)], where the value of k (t) indicates the probability to be in a given model at a given time t. The functions are unknown and are estimated in an optimisation procedure. This procedure estimates the time-dependent parameters that minimise the total functional L N , where L N quantifies the misfit between the model and the data: and Ω and Ω are a feasible set of solutions for the model parameters and their time sequence. The total functional L N is formed based on a maximum likelihood estimator, where the likelihood function depends on the structure of the SDE model via a conditional probability transition density p associated with the SDE model structure: The transition density p is in general unknown for the SDE model structure considered but is approximated based on the procedure described by Krumscheid et al. (2015). The observed data is assumed to be generated by the considered non-stationary SDE model. In Equation (11) the index i iterates through all available observations N, and index k iterates through the defined number of submodels K. The functional L N is further regularised to ensure a well-posed problem: where the added term is a penalisation based on the smoothness of k and 2 is a hyperparameter, controlling the persistence of the identified states in the k . Large values force the method to identify long and persistent states with strongly elongated and overlapping transitions. Low values of 2 will identify noisy and unclear regimes. The strategy to find a suitable estimate for 2 as well as the algorithm to find the solution are given in . Finally, the total functional L N obeys the following constraints: These conditions ensure a well-behaved solution and demonstrate that k are a partition of unity and can thus be interpreted as the probability of model k being active at a given time.

Local parameter scaling
To achieve a closed-form model for a RANS closure, the unknown time-dependent parameters of the SSE, Equation (9), (t), (t), p (t) are hypothesised to relate to the Ri formed from the local gradients of wind and temperature at the resolved scale (marked with an overline and equal to a time average of 1 hr in this study). To test this hypothesis, the objective is to determine the following scaling functions for the parameters of the SSE: The assumption that such functions exist is sensitive to the choice of the model structure in Equation (9). For that reason, close attention is given in Appendix A to determine a pool of suitable model structures. Furthermore, the number of free parameters (here three) demands to be small to avoid overparametrisation. The stability correction phi is estimated independently at each height, and an SSE is fitted for each height with the method introduced in Section 5.1. The model estimation is performed on a reduced dataset and cross-validated with an excluded period. The proposed scaling functions in the following are designed to parametrise the entire stability range continuously.
The splitting of the dataset into training and validation could not be achieved with a standard percentage rule (Xu, 2018). The main reason for this is that the set of highly intermittent regimes (which are of interest) is present at the beginning of the dataset. Since the research interest is in modelling these regimes, one must choose the validation dataset wisely. For the identification study here, ≈5% of the dataset is used as validation. The data selected consist mainly of intermittent regimes, as the goal is to validate the impact of the stochastic modelling strategy on these dynamics. This removes the most important data from the training. A rough estimate is that ≈20% of the highly intermittent regimes is removed from the training data.
After applying the model-based clustering method, the scaling of the model parameters with Ri is examined using the estimated regime classifier. The cluster-averaged value of the Ri is calculated using the following equation for each height and cluster individually: where * k is the estimated model affiliation function from the vector * obtained with the FEM-H 1 -SDE method (see Section 5.1). Equation (18) can be recognised as a weighted arithmetic mean, where the weights are the model affiliation function due to the imposed constraints.
The scaling of the model parameters was also investigated with the mean local wind and temperature TA B L E 1 The estimated hyperparameters (K opt , 2 opt ) for each height of the model structure Equation (9 gradients in different combinations. This investigation did not result in an appropriate parametrisation, and hence is unreported. The clustering procedure models the observed time series with K distinct locally stationary SDEs. Table 1 summarises the estimated hyperparameters for this study, selected according to the procedures detailed in Boyko et al. (2022).
The optimal number of clusters K is selected in this study based on having the best trade-off between the functional cost value and the information gain. A summary can be found in Figure B.1 in Appendix B. According to this figure, the resolution of non-stationarity with five clusters is sufficient, except for the height of 1 m, where six clusters are found to be optimal. Subsequent analysis suggests that this number is rather too high, as the affiliation function shows more suppressed transition regimes than clustering with K = 3, 4 (not shown). Nevertheless, the following results are based on K = 5.
The estimation of the parameters of Equation (9) and their subsequent parametrisation are shown in Figure 4 for all the measurement heights. The results show a sufficiently reliable scaling where the entire parameter range is described by a continuous function of Ri (see Table 2). The parameters (t), (t), and p (t) are found to be independent of height (at least up to 30 m). The parameter values of the nonlinear term, Equation (16), seem to have a larger scatter. A sensitivity study has shown that a slight change in the slope of the parametrised function leads to only a marginal change in the curves for the moments of the probability density function (PDF) of (see Figure 5). However, the greatest sensitivity is observed in the parametrisation of the linear term, Equation (15) (see Figure 4a). Changes in this parameter have a large impact on the scaling properties of the expected value of the PDF of .
Studying the function in Figure 4a one may wonder why the hyperbolic tangent was chosen, considering that the function for Ri > 1 does not obey the trend of the data. Several variants of different polynomials were also tested to find a more straightforward function, but they failed to accurately capture the scaling in the essential range Ri ∈ [0, 01, 1]. The selected function is a compromise that ensures a good fit below Ri < 1. Prioritising this range of stability is also justified by the fact that the data used to train for Ri > 1 were relatively sparse. The parameter values of the scaling functions Equations 15-17 are determined by least-squares estimation. The important target is to obtain a PDF of with an expected value equal to one when the number Ri approaches zero, to be matching the neutral stability case in this limiting condition. The regression analysis naturally yields the correct values, so this condition is satisfied (see Figure 4) without any additional modification of the parameter estimates.
The diffusion parameter, Equation (17), can be tuned to control the scaling of the noise. Since the intensity of the noise reflects the activity of the sub-mesoscale motions according to the perturbation ansatz in Section 4.2, the ability to prescribe it is beneficial when incorporating the SSE into a turbulence model.
The identification study achieves the goal of finding a simple parametrisation of the SSE through the uncovered scaling functions (Λ, , Σ) given in Table 2, leading to the proposed SSE:

VALIDATING THE SSE
The identified SSE, Equation (19), is validated in two different ways. Comparison of the conditional PDF of between data and model shows us how adequately the latter captures the long-term statistics. Simulating the sample evolution of the SDE and comparing it with the evolution of the measured provides insight into how adequately it reflects non-stationarity, and thus interference, in a localised space-time domain. The given SDE validation tests aim to examine the performance of the identified model to recognise its weaknesses. The tests are also intended to provide a basis to support the embedding of the SDE model in a RANS closure.

Stationary PDF
For a fixed parameter Ri, the SSE, Equation (19), will reach a stationary behaviour after some time. This behaviour is described by a time-independent PDF p s ( , Ri) of the stationary stochastic process . Although p s ( , Ri = const) is fixed, individual realisations of this process continue to fluctuate, so that individual observations (t 0 , Ri) and (t 0 + Δt, Ri) belong to the density p s ( , Ri). Stationary statistical properties, such as the expected value E[ ] and the most probable value M[ ], can be calculated directly from p s . The exact method used to calculate the stationary density p s ( , Ri) from the parametrised SSE is described in Boyko et al. (2022), chap. 4). The difference is a function of Ri and is defined as the measure of global intermittency for the SBL. The most significant parameters affecting the objectivity of the measure I are the two averaging scales in Equation (4). However, this choice is determined to some extent by the multiresolution analysis in the previous study by Vercauteren et al. (2019b). The measure I is such that in the neutral state, I ≈ 0 and p s ( , Ri) represents a Gaussian distribution. As Ri increases, the expected value diverges from the mode, the measure I increases, and the shape of p s ( , Ri) changes toward an extreme value distribution. The conditional density of the data is computed as a reference to evaluate the parametrisation using the scaling functions in Table 2. Figure 5 (left upper panel) shows the conditional probability density p s estimated with a kernel density estimation method using the training data. All seven heights are used for this purpose. The resulting p s of the model is shown with solid contour lines. For comparison, the stability function of the Meteorological Service of Canada atmospheric model is plotted with a dashed black line. The value of the Ri c is shown with a dashed red line. The expected value and distribution's mode are plotted above the Ri to perceive the distribution type better. The colour of the data points indicates the height. A characteristic divergence of the mode and expected value is present after the critical value of Ri.
The stationary distribution of the model is shown in Figure 5 (left bottom panel) and is largely similar to the distribution found in the data. The main difference is how the expected value deviates from the mode. In the data, both statistical measures stay close to each other over a longer range of the Ri, and then, at the critical value, they diverge abruptly. According to the SSE model, the divergence is more uniform across the Ri. It is important to note that the conditional PDF of the data was also examined separately for each measurement height. It was found (not shown) that some individual heights exhibit the same type of moment divergence as given by the SSE model. Therefore, the discrepancy can be attributed to the averaging effect over height. It is also likely that the logarithmic spacing of the measurements introduces some bias, as the region with higher wind gradients is sampled more frequently than the upper boundary layer.
The right panel in Figure 5 compares the distributions of observed and modelled probability density of for four fixed Ri values, represented by the vertical dashed lines in the panels on the left. The neutral, pre-critical, and post-critical regimes are adequately captured. The distributions for 0.05 < Ri < 0.2 of the model have a more pronounced right tail of the distribution. The left tail of the distribution is captured quite well. Attempts were made to better represent the model's distribution; for that, the scaling functions were slightly modified but without significant improvement. Therefore, the solution found by estimating the scaling functions using the least-squares estimation method is optimal.
The model hence captures the structure of the distribution found in the data. The characteristic limit state Ri → 0 obeys a Gaussian distribution with expected value 1 and decreasing variance, which is also perceived by the reduced dispersion. It can be seen in Figure 5 that as the Ri c is approached from the left, the expected value begins to diverge from the mode of the distribution. Owing to this divergence, the time evolution in a prediction of stability correction deviates from its equilibrium over time, so that the coefficient of turbulent mixing exhibits intermittent mixing. The expected value starts to decrease after the Ri c (see Figure 5), which implies a relative increase in turbulent mixing. This behaviour is not expected and represents the largest difference from traditional scaling. Such a large contrast is created by non-stationary events that disturb the mean wind profiles. These events are included when estimating the SDE.
The noise intensity of the stochastic model increases with the Ri (see Figure 4c). As shown in the multiscale analysis performed by Boyko and Vercauteren (2020) for the same dataset, the sub-mesoscale motion band has a specific energy limit. Therefore, the noise intensity associated with the sub-mesoscale motions should also be limited. Consequently, it is reasonable to expect that when using the stochastic model -see Equation (9) -the intensity of the noise could exceed a suitable limit representative of sub-mesoscale motions. After exceeding this threshold, the noise intensity should be physically related to larger, potentially resolved scales and should hence be suppressed entirely. Further research is needed to investigate this hypothesis.

Out-of-sample forecast
The simulated time evolution of is validated on data excluded from training. The set-up of the numerical experiment is to compute sample paths of the SSE by providing the evolution of the Ri estimated from data. To do this, one needs the gradients of mean wind and temperature. These are obtained through Reynolds averaging applied on a time-scale of 1 hr, separating the mean from the fluctuations u i = u i − u ′ i and T = T − T ′ . Then, at each time step, a least-squares fit of the log-linear equation following (Nieuwstadt, 1984) is performed: where f (z) denotes U = √ u 2 + v 2 or T. Local gradients are easily obtained via The predictor variable is estimated from the data as where the overbar highlights the averaging scale of 1 hr.
The Ri is then used to solve the SSE, Equation (19), with the identified parameter functions Λ, , and Σ (Table 2).
Since the SSE is solved over a vertical column of 30 m, corresponding to the height extent of the FLOSS2 measurements, the sampling procedure of the Wiener process requires the choice of a correlation length-scale. This length-scale is estimated from the measurements and implemented accordingly. All details for this procedure, as well as for the numerical implementation, can be found in Boyko et al. (2022), chap. 4). The evolution of the profiles within the validation dataset is shown in Figure 6, using the evolution of Ri calculated from the data as input for the model in Equation (19) and three different noise intensities. The predictor variable Ri varies on a time-scale slower than 1 hr. Since the traditional scaling law for predicting is a static function, its time evolution is also slower than 1 hr ( = 1 + 12Ri). Solving Equation (19) yields a with stochastic fluctuations much faster than 1 hr. Therefore, for comparison, the modelled variable is temporarily averaged with a moving window of length 1 hr ( ). Figure 6a shows the evolution of the Ri, and Figure 6b shows the corresponding evolution of for several nights of the validation dataset. The maximum value of Ri in the forecast is limited to 10. The colour bar in Figure 6a is set so that the white areas indicate the critical value Ri c = 0.25. Accordingly, the red colour indicates the neutral conditions and the blue colour indicates the highly stratified conditions. The colour bar in Figure 6b is set so that the white areas denote = 1, which corresponds to neutral conditions. Accordingly, the blue colour denotes the suppressed turbulence level ≫ 1, and the red regions represent the enhanced mixing due to turbulence < 1. Different flow conditions are marked with a red frame as an example. It can be seen that for the intermittent conditions with high values of Ri (see Figure 6a), the variable (see Figure 6b) shows extreme mixing events.
Recall that represents the relative deviation of turbulent mixing as a function of the mean wind gradient. For the intermittent conditions in Figure 6, the gradient of the mean wind is lower than for the other two marked regimes.
The solution in Figure 6c-e differs by a model parameter s , representing the sub-mesoscale motion's intensity. This parameter is defined in Table 2 in the function that parametrises the scaling of the noise process in the SSE. The identification study (see Section 5.2) showed that s = 0 is optimal based on the training dataset. However, the parameter s = 0 is not optimal for the validation dataset, since the observations show a slightly weaker sub-mesoscale activity. Decreasing the value of s reduces the intensity of the sub-mesoscale perturbations, as visualised in Figure 6d,e. Figure 6d shows a slightly adjusted value of s = −0.07, which lowers the intensity of the extreme events to a reasonable level for all three regimes considered.
A further decrease in the value of the parameter s provides an even stronger elimination of fast perturbations (see Figure 6e). This limit is interesting because one can wonder how the model behaves in the regimes of pre-critical stability. To analyse this question, one can examine the evolution of given by traditional scaling (cf. Figure 6e,f). It is assumed that the stochastic model in the limiting case without small-scale perturbations should converge against the classical scaling for Ri < Ri c . Comparing the weakly and strongly stable conditions (Figure 6e,f), we find that of the stochastic model resembles the classical scaling prediction. It also captures the main tendency by evolving from the weakly stable to the strongly stable regime. However, it appears that classical scaling performs favourably with respect to observations in the weakly and strongly stable regimes (cf. Figure 6f,b). The comparison suggests that the stochastic model is not sensitive enough to the vertical gradient of the Ri in the limiting case of low small-scale perturbation. This is because the expected value of the invariant measure starts to fall after criticality (see Figure 5). However, if we compare the intermittent regime (see Figure 6e,f), the stochastic model with parameter s = −1 yields a finite turbulence suppression level, whereas the classical theory predicts the absence of turbulence.
In summary, the stochastic model captures the full range of regimes in the night-time boundary layer exceptionally well. The model reproduces the spatio-temporal patterns in the data. In addition, the intensity of the sub-mesoscale motions can be adjusted by changing a parameter value. The adjustment of this parameter can be controlled and optimised for a given dataset. The interested reader is referred to Boyko et al. (2022), chap. 4) for a quantitative approach to adjust this parameter.

SUMMARY AND CONCLUSIONS
In this paper, an SSE is introduced and validated to parametrise intermittent turbulence in the SBL. The approach accounts for the uncertainty of the turbulence parametrisation due to intermittency, to mixing by unresolved sub-mesoscale motions or due to the lack of equilibrium of turbulence under unsteady forcing. The classical MOST approach is extended to a stochastic model following a parameter perturbation approach. The stochastic model parameters are found to scale with the local Reynolds average gradient Ri (see Figure 4), which allows application in a RANS turbulence model. The results show that the stochastic model is suitable over the entire observed stability range and captures non-stationarity of turbulence. It adequately captures the invariant density and hence quantifies the uncertainty of the stability correction variables. The statistics of the stability correction are non-Gaussian, reflecting the intermittent behaviour. The stochastic formulation accommodates both the short-term intermittent behaviour and the long-term averages and could potentially replace the traditional use of long-tail stability functions. In this study, only one dataset was analysed. A broader validity of the stochastic equation is questionable and should be explored in other field experiments. The important question to investigate is: How general is the functional form of the stochastic model? More idealised studies, for example using wind tunnels or numerical experiments, could provide alternative controlled case studies. If a particular parametrisation design is fixed, the identification of the parameters can be made directly by adjusting the PDF moments of the process , thereby avoiding use of the clustering method to fit a new model.
The uncovered scaling of the SSE parameters with atmospheric stability is an encouraging result as it provides a closed-form uncertain parametrisation of turbulence. It can be readily implemented in a RANS (or NWP) model by implementing the proposed SSE instead of using a classical stability function. The resulting stochastic closure scheme can account for mixing and transport due to non-stationary, unresolved sub-mesoscale motions and for departures from statistical equilibrium of the turbulence itself, by lumping those in the uncertainty of the SSE. It could thus present a way forward for dealing with the complexities of the unsteady SBL in NWP or climate models.
The implementation of the stochastic parametrisation in an NWP model requires the definition of spatial coherence of the random process, and discussion on these aspects can be found in Boyko et al. (2022). In its suggested form, the approach cannot tackle the modelling of systematic sub-mesoscale flow driven by local surface features, or the systematic impact of internal gravity waves in the free atmosphere as occurs in long-lived SBLs (Zilitinkevich, 2002). Further developments could build upon existing studies that explore the scaling of MOST with terrain features, and explore the scaling of the stochastic model parameters with such additional terrain features. A promising route would be to combine the stochastic extension of MOST to the generalised approach of Stiperski and Calaf (2023), where turbulence anisotropy is found to be a key missing variable in MOST. The impact of terrain and flow complexity on anisotropy and on uncertainty could be evaluated jointly, leading to a stochastic parametrisation that could consider both terrain features and unsteady turbulence. Model cost J rescaled to [0.01 1.01] dϕ = ( a 1 − a 2 ϕ) dt + a 3 ϕ dW ( t ) dϕ = (1 + a 1 ϕ − a 2 ϕ 5 / 4 ) dt + a 3 ϕ dW ( t ) dϕ = (1 + a 1 ϕ − a 2 ϕ 3 / 2 ) dt + a 3 ϕ dW ( t ) dϕ = (1 + a 1 ϕ − a 2 ϕ 2 ) dt + a 3 ϕ dW ( t ) dϕ = (1 + a 1 ϕ − a 2 ϕ 3 ) dt + a 3 ϕ dW ( t )

F I G U R E A1
Selection of the best-fitting model from a considered pool of models based on the model fitness function; see Equation (A1). The likelihood value is plotted against the number of clusters used to resolve the time-varying parameters. The y-axis is rescaled for better illustration. The model structure with quadratic nonlinearity performs best over almost the entire range of clusters used. The simplest model structure, d = (a 1 − a 2 ) dt + a 3 dW(t), is not shown because its fitness value is significantly higher than the value of the displayed models [Colour figure can be viewed at wileyonlinelibrary.com] The vertically averaged value of is examined to identify the optimal model structure: where z 30 = 30 m is the height of the highest measurement and z 1 = 1 m is the height of the lowest. Once the model structure is established in the following analysis, the clustering is repeated for different heights and presented in Section 5.2. Figure A1 shows the change in model likelihood according to Equation (A1) over the number of clusters used up to 10. This is more than sufficient to understand which model structure is the most appropriate. The y-axis shows the scaled model cost according to Equation (A1). The scaling between 0 and 1 is subject to the absolute minimum and maximum values and is done here for a visual purpose. The lower the value, the better the model approximates the data.
The results in Figure A1 show that, within the range considered, the best model has multiplicative noise and quadratic nonlinearity. The worst model has linear structure and additive noise, where the absolute fitness value is so insufficient that it is not even considered in the figure due to the significant cost values. Moreover, according to Figure A1, the inclusion of cubic nonlinearity leads to an increase in the cost value. Reducing to linear drift while retaining multiplicative noise also does not improve performance.
The sensitivity study formed is a plausibility check for the derivation of the model structure in Equation (9). The data-driven testing of different model structures confirms that the quadratic nonlinearity and multiplicative noise, Equation (9), is the correct candidate for the simplest possible model structure within the dataset considered.
Finally, it should be noted that an attempt was made to simplify the quadratic model so that the parameter a 1 (t) scales linearly with Ri. In addition, when identifying the parameter scaling, an attempt was made to express one of the parameters by another to obtain a model with effectively two parameters. None of these approaches lead to a reasonable stochastic closure.

APPENDIX B. ESTIMATION OF NUMBER OF CLUSTERS
The estimation of the number of clusters needed to model the stability correction is done based on a trade-off between minimisation cost and gain in information. A summary of the results is shown in Figure B.1, showing that the resolution of non-stationarity with five clusters is sufficient.