On a probabilistic evolutionary approach to ocean modelling: From Lorenz-63 to idealized ocean models

In this study we develop an alternative way to model the ocean reflecting the chaotic nature of ocean flows and uncertainty of ocean models -- instead of making use of classical deterministic or stochastic differential equations we offer a probabilistic evolutionary approach (PEA) that capitalizes on the use of probabilistic dynamics in phase space. The main feature of the data-driven version of PEA proposed in this work is that it does not require to know the physics behind the flow dynamics to model it. Within the PEA framework we develop two probabilistic evolutionary methods, which are based on probabilistic evolutionary models using quasi time-invariant structures in phase space. The methods have been tested on complete and incomplete reference data sets generated by the Lorenz 63 system and by an idealized multi-layer quasi-geostrophic model. The results show that both methods reproduce large- and small-scale features of the reference flow by keeping the probabilistic dynamics within the reference phase space (the phase space of the reference flow). The proposed approach offers appealing benefits and a great flexibility to ocean modellers working with mathematical models and measurements. The most remarkable one is that it provides an alternative to the mainstream ocean parameterizations, requires no modification of existing ocean models, and easy to implement. Moreover, it does not depend on the nature of input data, and therefore can work with both numerically-computed flows and real measurements from different sources (drifters, weather stations, etc.).


Introduction
The modern ocean modelling utilizes a wide spectrum of tools ranging from observations to using comprehensive ocean models.Most of the latter are based on deterministic or stochastic differential equations, and use both the physics-and data-driven paradigms.The majority operate in physical space (e.g., [7,1,3,6]), while some, umbrellaed under the recently proposed hyper-parameterization approach (e.g., [10,13,12,11]), take advantage of working in phase space.In this study we develop an alternative way to model the ocean reflecting the chaotic nature of ocean flows and uncertainty of ocean models -instead of making use of classical deterministic or stochastic differential equations we offer a probabilistic evolutionary approach (PEA) that capitalizes on the use of probabilistic dynamics in phase space.In this study we develop a data-driven version of PEA the main feature of which is that it does not require to know the physics behind the flow dynamics to model it.It is achieved by its data-driven nature and by shifting the focus from the physical to the reference phase space (the phase space of the reference flow).The reference flow can be a numerical solution (generated by an ocean model), observational data, or combination of both.Within the PEA framework we develop two probabilistic evolutionary methods based on probabilistic evolutionary models using quasi time-invariant structures in the reference phase space.
The PEA offers appealing benefits and a great flexibility to ocean modellers working with mathematical models and measurements: (1) it requires no modification of existing ocean models, (2) easy to implement, and (3) does not depend on the nature of input data, i.e. it can take not only numerically-computed flows as input data but also real measurements from different sources (drifters, weather stations, etc.), or combination of both.Most remarkably, the PEA provides an alternative to the mainstream ocean parameterizations.Namely, instead of running long high-resolution simulations of ocean models one can generate its relatively short coarse-grained version (that retains the nominally-resolved, resolved on the coarse grid, flow features) and use it as input data for the PEA, thus replacing computationally-intensive ocean models with a way faster probabilistic evolutionary models.In other words, the PEA addresses the eddy-parameterization problem from a different angle: it shifts the focus from the physical space to the phase space of the model and consider the inability of the low-resolution model to reproduce the nominally-resolved flow structures as the persistent tendency of the phase space trajectory representing the low-resolution solution to escape the reference phase space.
First, we explain the probabilistic evolutionary approach and how to build probabilistic evolutionary models, and show how they work on the example of the Lorenz 63 system with complete and incomplete data sets.Then, we apply them to multi-layer quasi-geostrophic (QG) flows with complete and incomplete reference data.It is worth mentioning that this work is intended as a proof of concept, therefore we deliberately reduce the technicalities beyond the PEA to a bare minimum, while focusing the attention on the key points of the approach.

The probabilistic evolutionary approach (PEA)
The probabilistic evolutionary approach is a new approach to ocean modelling that capitalizes on the chaotic nature of ocean dynamics by taking advantage of using the probability distribution of states in the reference phase space as opposed to making use of deterministic or stochastic differential equations.By construction, fluid dynamics models can be divided into two classes: deterministic and stochastic.In deterministic models, the dynamics is determined by a set of deterministic differential equations, i.e. the transition from a state x i to a state x i+1 is unique (Figure 1a).Stochastic models expands the class of deterministic models by introducing noise that does not lead to the unique state x i+1 but to a cloud of possible states neighbouring state x i+1 .In other words, stochastic models describe the transit from a state x i to a neighbourhood of state x i+1 (Figure 1b), where the size of the neighbourhood depends on the noise amplitude and the way it is included in the model (additively, multiplicatively, or otherwise).
The probabilistic evolutionary approach offers a different point of view: infinitely many states (x i1 , x i2 , . ..) can be reached from a state x i , and the transition to a particular state is defined by a transition probability function, P, which assigns a probability to any possible transition (Figure 1c).Note that probabilistic evolutionary models do not describe the evolution of a probability function, the evolution in these models is governed by a probability function.In the data-driven version of PEA proposed in this study, the transition probability function is calculated from available reference data as the current stage changes (i.e., on the fly).In the physics-driven PEA, we envisage that the transition probability function can be defined from the physics of the studied phenomenon.Another class of models that naturally follows from the stochastic and probabilistic ones is the probabilistic-stochastic evolutionary models in which every possible state is replaced with a cloud of states neighbouring it (Figure 1d).This gives rise to probabilistic evolutionary models with noise.In this study we are focused on the data-driven PEA, where the transition probability function is calculated locally from available reference data for every transition from one state to another, i.e. a new state of the probabilistic flow evolution is defined by the likelihood of reference states neighbouring to the current state of the probabilistic evolutionary model.Within the PEA framework, the probabilistic nature of the flow evolution implies that even very unlikely (rare) events are expected to occur once in a while thus echoing our observations of extreme weather and climate events.More importantly, it allows the probabilistic trajectory to cover regions of the reference phase space that are not presented in the reference data set, but can potentially happen.
It would be helpful to remind that a system of ordinary differential equations can be geometrically interpreted as a vector field in the phase space of equation ( 1); here, the prime denotes a time derivative.The direction of the vector field at a given point x is determined by the vector F(x) for ∀x ∈ R n .Once F(x) is known, it can be used to calculate a new position of point x in the phase space.This idea is used in the hyper-parameterization method "Advection of the image point" [10,11].
The PEA works in a different way.In the data-driven version of PEA the analytical form of equation ( 1) is not available.The only reference data available to the PEA is a numerical solution of (1), i.e. the reference solution, or observations if one works with data from weather stations, satellites, etc.Therefore, instead of directly calculating F(x) at a give point x, the PEA computes a new vector, G(y), by sampling from the transition probability function, P, based on the joint probability distribution of states neighbouring to the current state in the reference phase space (Figure 2).Thus, the probabilistic evolutionary model can be written as follows: with S being an operator sampling from the joint probability distribution, i.e. it returns a point in phase space given a probability.
We have developed two methods within the PEA framework.These methods use different probability distributions to calculate the directional vector G(y) for the probabilistic evolutionary model.The first method calculates directional angles and the length of G(y) based on its joint probability distribution function computed from available reference data.Hence the probabilistic evolutionary model for the probabilistic solution y(t) is given by where U(y(t)) is the neighbourhood of probabilistic solution y(t), C S is the transformation from Cartesian to spherical coordinates (used to compute the angles and lengths of reference vectors), P a is a transition probability function based on the joint probability distribution of directional angles and lengths of the reference vectors neighbouring to the current state y(t) in the phase space of (1), i.e. the vectors F(x(t)) x(t)∈U (y(t)) ; C −1 S is the inverse of C S used to compute G(y) in the Cartesian space.The second term on the right hand side of equation ( 3) is a nudging term: where η is a nudging strength, M is the number of nearest (in l 2 norm) to the solution y(t) points over which the averaged reference solution x(t) is computed.
The second method does not use the joint probability distribution of the directional angles and lengths of reference vectors to compute G(y).Instead, it computes G(y) from the joint probability distribution of the coordinates of reference vectors neighbouring to the current state y(t) in the phase space of (1).Hence, the probabilistic evolutionary equation reads as follows: where P c is a transition probability function based on the joint probability distribution of the coordinates of the reference vectors neighbouring to the current state y(t) in the phase space of (1), i.e. the vectors F(x(t)) x(t)∈U (y(t)) .To sample from P a and P c , we use the sampling operator S based on the inverse transform sampling method [4].We also tried the rejection sampling but did not observe that much of a difference.
Probabilistic nudging.The form of the nudging term used in the probabilistic evolutionary models (3) and ( 5) is governed by our desire to keep the probabilistic evolutionary model as simple as possible.Different metrics and forms of the nudging term can be used instead (for example, adaptive [13] or probabilistic nudging).
The idea behind probabilistic nudging is also based on using the probabilistic evolutionary machinery.However, instead of computing the joint probability distribution of the vectors F(x(t)) x(t)∈U (y(t)) as above, we compute the joint probability distribution of the states x(t) themselves.Thus, the probabilistic nudging term can be written as where P c is a probability function based on the joint probability distribution of the coordinates of the reference states x(t) neighbouring to the current state y(t) in the phase space of (1), i.e. the states x(t) x(t)∈U (y(t)) .We do not study how the probabilistic nudging performs in this work and leave it for the future research.
On the optimal choice of parameters.Note that the neighbourhood U(y(t)) in equations ( 3) and ( 5) is computed as N (and M for the nudging term) nearest (in l 2 norm) to the solution y(t) points.The neighbourhood can be computed differently, and the way it is computed affects the solution.The optimal value of N and M (as well as η) for a given reference solution can be computed by solving the following optimization problem min where F is a problem-specific function, and T is the length of the reference solution x(t).For example, F can be defined as a norm of the difference between the reference and probabilistic solutions.Our choice of N , M , and η is driven by our measure of goodness (to keep the probabilistic solution y(t) within the reference phase space).This measure is used because it allows the probabilistic solution to evolve in the neighborhood of the reference phase space, since the failure to do so results in a wrong flow dynamics typically shown by low-resolution ocean models.Studying optimal strategies of computing the neighbourhood and its size as well as the nudging strength η is a topic beyond the scope of the present paper.
As an example, we consider the Lorenz 63 system [5]: with x(t) = (x(t), y(t), z(t)), and σ = 10, β = 8 3, ρ = 28.As an initial condition, we take x(t 0 ) = (−8.6,−12.4,21.0) to make sure the solution is close to the Lorenz attractor (Figure 3a).Along with the solution of the Lorenz system, we compute the probabilistic solutions to equations ( 3) and ( 5) with N = 10 (Figures 3b,c); we have also tested the probabilistic evolutionary methods for N = 5 and report that the results are qualitatively the same (not shown).Note that all probabilistic solutions are computed without nudging (i.e., for η = 0), as it is not required to reproduce the Lorenz attractor; however, nudging will play an essential role in simulations of the QG model discussed in Section 3. As seen in Figures 3b,c, the probabilistic solution stay within the same region of the phase space as the reference solution of the Lorenz system, despite that only the first half of the reference solution is available (i.e., t ∈ [0, 100]), and the method runs out of the sample over the second half, i.e. for t ∈ [100, 200].This is important, as for the probabilistic evolutionary approach the measure of goodness is how close the probabilistic solution is to the reference phase space.
Incomplete reference data.As the proposed probabilistic evolutionary approach is intended to work with ocean models, it is instructive to test its methods on incomplete reference data sets which are not uncommon in ocean modelling, especially when using comprehensive ocean models.We consider three test cases: (1) gappy dynamics, (2) holey attractor, and (3) disjoint wings.
Gappy dynamics.For this test we generate two data sets that are used as reference data for both methods.Namely, we take the Lorenz solution over the time period [0, 100] and keep every second and every fourth point of the solution, thus retaining only 50% and 25% of the original reference data, respectively.We set N = 10 and present the results in Figure 4; the results for N = 5 are qualitatively the same (not shown).As we see in Figure 4, both methods keep the probabilistic solution on the Lorenz attractor, despite the reference solution is missing a significant portion of data.Holey attractor.This test is harder compared to the first one, as we cut out some regions of the reference dynamics by making three holes of radius 4 in the attractor itself (Figure 5a).As in the first test, both probabilistic evolutionary methods (Figures 5b,c) keep the solution on the attractor.More importantly, the methods restore the dynamics on the attractor as if there are no holes.Disjoint wings.It is the hardest test in the series, as we cut the attractor into two disjoint sets (Figure 6a) thus simulating a significantly corrupted data set; the cut width is 2. As seen in Figures 6b,c, both probabilistic evolutionary methods not only recover the attractor but also restore the dynamics in between the wings where the reference solution is unavailable.On a detrimental role of nudging.We did not use nudging in the probabilistic evolutionary methods to model the Lorenz system, as it is not necessary to reproduce the reference dynamics.But, within the context of the QG model discussed below we will see the beneficial effect of nudging on the flow dynamics.However, it should be noted in advance that nudging can also play a detrimental role when using out of place.As an example, we take the Lorenz system and demonstrate how improper use of nudging can affect the solution.As seen in Figure 7, the nudging strength has a significant effect on the solution.Namely, for a relatively strong nudging the probabilistic solution cannot properly develop on the attractor, and the whole dynamics is confined in a narrow strip.It is a simple but illustrative example of how one should use caution to properly adjust the nudging strength to get a good solution.Summing up our findings for the Lorenz 63 system, we conclude that the proposed probabilistic evolutionary methods have strong potential for modelling geophysical flows.It may seem that the first method is somewhat inferior to the second one, as the former gives the trajectories which do stay in the reference phase space, but can experience rapid changes of the direction that may lead to undesirable effects in geophysical flows; in fact, these changes can be mitigated by using a shorter time step (not shown).Despite this seeming disadvantage we does not disregard the first method and test it too on the QG model, as this effect might be of minor or no influence within the context of QG dynamics.
The PV anomaly and stream function are related through the system of elliptic equations: with the stratification parameters s 1 = 4.22 ⋅ 10 −3 km −2 , s 2 = 1.41 ⋅ 10 −3 km −2 ; chosen so that the first Rossby deformation radius is Rd 1 = 25 km.
System ( 9)-( 11) is augmented by the integral mass conservation constraint [8]: by the periodic horizontal boundary conditions set at eastern, Γ 2 , and western, Γ 4 , boundaries and no-slip boundary conditions u set at northern, Γ 1 , and southern, Γ 3 , boundaries of the domain Ω.
The QG equations ( 9) can be recast in the form of equation ( 1) as follows: where the right hand side F defines the vector field used to evolve q; F is computed with the central finite difference in time from the available reference data.The only difference with the Lorenz system (8) is the vector F and the advected quantity q; note that the dimension of phase space is defined by the number of degrees of freedom used to discretize the equation in space.Thus, the analogue of equations ( 3) and ( 5) for the QG equations reads: y ′ (t) = G(y) + N (q(t), y(t)), y(t 0 ) = q(t 0 ).
For the purpose of this study both high-and low-resolution solutions are needed.We compute these solutions on uniform grids of size 513 × 257 and 129 × 65 over a period of 4 years after a 10-year initial spin up.In order to test the probabilistic evolutionary methods in different regimes, we use both the highresolution solution and its point-to-point projection onto the coarse grid 129 × 65; the coarse-graining is of little importance to the probabilistic evolutionary approach, and any other method (e.g., interpolation schemes, spatial averaging, or filters) can be used.The high-resolution solution is needed to study to what extent the probabilistic evolutionary methods can be used as an alternative to high-resolution ocean simulations, whereas its low-resolution projection is to compare the performance of the methods with the low-resolution solution computed on the coarse grid with the QG model.We should also note that these solutions are denoted as x(t) in equation ( 1), while their corresponding probabilistic solutions are denoted as y(t) in equations ( 3) and (5).It is an important comparison which will show whether the probabilistic evolutionary methods can reproduce flow features that are presented in the low-resolution projection but are missing in the low-resolution QG solution.For the purpose of this study, it is enough to consider the first layer PV anomaly, as it is much more energetic than the second layer and full of both large-and small-scale flow features.
In order to demonstrate the ability of the probabilistic evolutionary methods to reproduce nominallyresolved on the coarse grid flow features, we take only the first two years of the 4-year long high-resolution solution, coarse-grain it onto the grid of size 129 × 65 and then use it as a reference solution (Figure 8a).As for the Lorenz 63, we firstly apply the methods without nudging.
The build-up effect.As seen in Figures 8b,c, at the very beginning the probabilistic solution reproduces both large-scale flow structures (two zonally-elongated jets) as well as small-scale vortices and meanders along the jets of the reference solution (Figure 8a).It is because the probabilistic solution remains in the reference phase space.But, after a short period of time the build-up effect takes over and makes the probabilistic solution drift away from the reference phase space, and eventually to settle to a constant in time field (40and 60-day solutions already show almost no change in time).It takes only 60 days for the solution to "freeze" down to a point of no use.For more details on the build-up effect we refer the reader to [11].In order to avoid the build-up effect we use the nudging methodology.We set the nudging parameter as η = 0.1 (it is not the only choice) in equations ( 3) and ( 5), and present the results in Figure 9.As seen in Figures 9b,c, both methods reproduce the large-scale reference flow structures (two zonally-elongated jets) as well as small-scale vortices and meanders along the jets of the reference solution (Figure 9a).However, the coarse-grid QG model cannot reproduce the large-scale flow structures not to mention the small-scale ones (Figure 9d).It is also important to remark that the rapid change of the trajectory computed with the first method (which we observed for the Lorenz system) does not seem to reveal itself in the QG dynamics, and both methods give qualitatively the same results.Therefore we further use the second method as it is somewhat faster than the first one.Incomplete reference data.Incomplete observational data are typical in ocean modelling especially when running comprehensive ocean models.Obviously, such data cannot be directly used for numerical modelling, as they may include undefined values, missing parts of data records, or combination of both.There are different interpolation methods and reanalysis data to overcome the problem.As the probabilistic evolutionary approach is intended to work with comprehensive ocean models, it would be instructive to study its performance on incomplete, raw reference flows, i.e. without engaging interpolation or reanalysis.For doing so, we take the second method and consider similar to the Lorenz system test cases (Figure 10): (1) gappy trajectory, (2) holey dynamics, and (3) disjoint space.We use the same 2-year long reference solution as above, while running the probabilistic evolutionary model for four years.In the gappy trajectory test case we remove every second (Figure 10a) and every fourth (Figure 10b) point from the original 2-year long reference solution thus retaining only 50% and 25% of the reference data.As seen in Figures 11a,b, the probabilistic evolutionary solution reproduces the nominally-resolved reference flow structures way beyond the time period over which the reference data is available.
The holey dynamics test case is harder than the previous one as we remove a vast region of the reference dynamics.Namely, the reference solution contained in the sphere of radius r = 0.979 centered at its time mean has been excluded from the reference solution thus making voids in different parts of the reference trajectory; note that only half of the reference solution remained after this resection.As with the previous test case, the probabilistic evolutionary method restores (Figure 11c) the nominally-resolved flow structures of the reference solution.
The disjoint space is the hardest test case in the series, as it cuts the phase space into two disjoint regions (divided by a gap of width 0.02).Despite that, the probabilistic evolutionary solution is still able to evolve in the reference phase space and reproduce nominally-resolved reference flow features (Figure 11d).9a), (a)/(b) probabilistic evolutionary dynamics for the gappy reference solution with every second/fourth point removed (Figure 10a,b), (c) probabilistic evolutionary dynamics for the holey reference solution (Figure 10c), (d) probabilistic evolutionary dynamics for the disjoint reference solution (Figure 10d).and a 4-year time-average (last column); the nudging strength is η = 0.1.All solutions are given in units of [s −1 f −1 0 ], where f 0 = 0.83 × 10 −4 s −1 is the Coriolis parameter.Note that the probabilistic evolutionary method reproduces both large-and small-scale flow features even for significantly incomplete reference data sets.
Although the probabilistic solution reproduces both large-and small-scale reference flow structures for significantly corrupted data sets, the results in Figure 11 clearly show that it is overheated, i.e. significantly larger in the absolute value than the reference solution.It happens because the incomplete reference data works as a repeller thus pressing the probabilistic trajectory out of the reference space.On the one hand, it might be considered a weakness of the proposed approach, while on the other hand an option to explore vaster regions of the reference phase space, and thus study probable reference solutions that can potentially be simulated with the reference model.In order to keep the probabilistic evolutionary solution in the reference phase space (and thus make its amplitude closer to the reference one), we crank up the nudging strength (Figure 12).It can be done either manually (as in this study) or automatically with the adaptive nudging [13].The stronger nudging makes the amplitude of all probabilistic solutions smaller by keeping them closer to the reference phase space, although we adjusted the nudging strength individually for each case.Based on the PEA performance for the incomplete reference solutions, we conclude that the PEA works well even with significantly corrupted reference data.This is an appealing features not only for ocean modellers working with models but also for those working with measurements.Long simulations.It might seem from the results above that simulations with the PEA can only be twice as long as the reference solution, thus setting the upper limit for PEA simulations.In what follows, we demonstrate how the PEA works on longer time-scales.We take the second method and run it for 8 years, while using the same 2-year long reference solution (Figure 13).As seen in the figure, the method reproduces nominally-resolved flow features (large-scale jets, small-scale vortices, and meanders along the jets) of the reference solution.This, once again, ensures that the PEA can model ocean flows far beyond the reference data set.The high-resolution simulation.As the PEA is proposed as an alternative to modelling the ocean, it would be instructive to assess it on high-resolution data as well.In order to do it, we take a 2-year long high-resolution QG solution (computed on the grid 513 × 257) as a reference solution, and study how the second method performs (Figure 14).As with the low-resolution reference solution, the method performs equally well and reproduces the nominally-resolved reference flow features.

Conclusions and discussion
In this work we have proposed a probabilistic evolution (PEA) approach to ocean modelling that capitalize on the chaotic nature of ocean dynamics by taking advantage of using the probability distribution of neighbourhood states in the reference phase space as opposed to making use of deterministic or stochastic differential equations.A new state of the model is determined by the likelihood of the states neighbouring to the current state.The probabilistic nature of the flow evolution implies that even very unlikely (rare) events are expected to occur once in a while thus echoing our observations of extreme weather and climate events.
Within the PEA framework we have developed two probabilistic evolutionary methods.These methods have been tested on the Lorenz 63 system and showed that both methods reproduce the Lorenz attractor even for significantly corrupted reference data sets.In addition, we have showed how the nudging strength can influence the probabilistic solution.Being assured in the PEA potential for modelling geophysical flows, we have considered an idealized ocean model (two-layer quasi-geostrophic model configured for a horizontally periodic flat-bottom channel) and showed that a non-eddy-resolving solution can be significantly improved towards the reference eddy-resolving solution.Within the context of QG dynamics we have demonstrated the build-up effect and its detrimental consequences on the probabilistic solution, and how to avoid it with using the nudging methodology.We have also studied how the probabilistic evolutionary models work on incomplete reference data and demonstrated that they reproduce nominally-resolved on the coarse grid reference flow features even for significantly corrupted reference data sets.In addition, we have demonstrated that the PEA works very well over long time periods (8 years in our case) even for short reference solutions (2-year long).Our results show that the probabilistic evolutionary approach performs equally well for both low-and high-resolution reference solutions.
The appealing advantages of the probabilistic evolutionary approach are: (1) it requires no modification of the ocean model; (2) easy to implement; (3) it can take not only the reference solution as input data but also real measurements from different sources (drifters, weather stations, etc.), or combination of both; (4) it is ready out of the box for generating ensembles of solutions, (5) copes with significantly corrupted data sets, (6) performs equally well for both low-and high-resolution reference data, and reproduces nominallyresolved reference dynamics; (7) works over long time-scales without degradation of the probabilistic solution (even for short reference records) thus operating well beyond the reference data range.
All this offers a great flexibility to ocean modellers working with comprehensive ocean models and measurements, and allow us to expect that the proposed approach has strong potential for the use in the context of primitive equations which we plan to approach in the future research.

Figure 1 :
Figure 1: Shown is a change of state in (a) deterministic models, (b) stochastic models, (c) probabilistic evolutionary (PE) models, (d) probabilistic-stochastic evolutionary (PSE) models; state and their neighbourhoods are denoted by black dots and grey discs, function P(x i k ) is a transition probability function for state x i k .

Figure 2 :
Figure 2: Schematic of the data-driven PEA.A state y of the probabilistic evolutionary model (2) neighbouring to the reference vectors F(x i ), i = 1, 2, 3, 4. The neighborhood U (y) of y is denoted by the grey disk, and S is an operator sampling from P.

Figure 3 :
Figure 3: Shown is (a) the solution of the Lorenz system (8) for the time interval t ∈ [0, 200], (b) and (c) the probabilistic solutions of (3) and (5), respectively.The probabilistic solutions use only the first half of the reference data (i.e., t ∈ [0, 100]), and over the second half, t ∈ [100, 200], the probabilistic evolutionary methods work out of the sample.Both probabilistic solutions stay in the reference phase space, and reproduce the Lorenz attractor.

Figure 4 :
Figure 4: Shown is (a) the reference solution (over t ∈ [0, 100]) with every second point retained, (b)/(c) the first/second probabilistic solution over t ∈ [0, 200]; (d)-(f ) are the same as (a)-(c) but for the reference solution with every forth point retained.The axes are the same as in Figure 3.

Figure 7 :
Figure 7: Shown is the dependence of the Lorenz solution (computed with the second probabilistic evolutionary method) on the nudging strength η for t ∈ [0, 200].The results for the first method are qualitatively similar (not shown).

Figure 8 :
Figure 8: Shown are snapshots of the top layer PV anomaly: (a) the reference solution (computed on the 513 × 257 grid and then projected onto the 129 × 65 grid), (b)/(c) probabilistic solution computed with the first/second method.All solutions are given in units of [s −1 f −1 0 ], where f 0 = 0.83 × 10 −4 s −1 is the Coriolis parameter.Both methods reproduce nominally-resolved on the coarse grid flow features but over a short period of time the build-up effect kicks in and arrests the dynamics.

Figure 9 :
Figure 9: Shown are snapshots of the top layer PV anomaly: (a) the reference solution (computed on the 513 × 257 grid and then projected onto the 129 × 65 grid), (b)/(c) probabilistic solution computed with the first/second method, (d) modelled solution computed with the QG equations (9) on the coarse grid 129 × 65, and a 4-year time-average (last row); the nudging strength is η = 0.1 for both probabilistic solutions.All solutions are given in units of [s −1 f −1 0 ], where f 0 = 0.83 × 10 −4 s −1 is the Coriolis parameter.Note that the probabilistic evolutionary methods use only the first 2 years out of the 4-year long reference solution.Both methods reproduce nominally-resolved on the coarse grid flow features, while the coarse-grid QG solution results in complete failure to reproduce even large-scale jets not to mention nominally-resolved features.

Figure 10 :
Figure 10: Shown is a 2D projection of the dynamics in the reference phase space (two coordinates from the multi-dimensional reference phase space of the top layer PV anomaly) for the original 2-year long reference solution used in the QG simulations above, (a)/(b) gappy trajectory with every second/fourth point retained, (c) holey dynamics (the reference solution contained in a sphere of radius r = 0.979 centered at its time mean is removed), (d) disjoint space (the reference phase space is cut into two disjoint regions; the cut width is 0.02).

Figure 11 :
Figure 11: Shown are snapshots of the top layer PV anomaly for the incomplete reference solution (Figure 10): (top row) the original 2-year long complete reference solution (as in Figure9a), (a)/(b) probabilistic evolutionary dynamics for the gappy reference solution with every second/fourth point removed (Figure10a,b), (c) probabilistic evolutionary dynamics for the holey reference solution (Figure10c), (d) probabilistic evolutionary dynamics for the disjoint reference solution (Figure10d).and a 4-year time-average (last column); the nudging strength is η = 0.1.All solutions are given in units of [s −1 f −1 0 ], where f 0 = 0.83 × 10 −4 s −1 is the Coriolis parameter.Note that the probabilistic evolutionary method reproduces both large-and small-scale flow features even for significantly incomplete reference data sets.

Figure 13 :
Figure 13: Shown are snapshots of the top layer PV anomaly: (a) the reference solution, (b) probabilistic solution computed with the second method, and a 8-year time-average (last column); the nudging strength is η = 0.1.All solutions are given in units of [s −1 f −1 0 ], where f 0 = 0.83 × 10 −4 s −1 is the Coriolis parameter.As with shorter runs, the probabilistic evolutionary method reproduces the nominally-resolved reference flow features (both large and small scales) over the period of 8 years.

Figure 14 :
Figure 14: Shown are snapshots of the top layer PV anomaly: (a) the reference solution (computed on the 513 × 257 grid), (b) probabilistic solution (computed with the second method on the grid 513 × 257), and a 4-year time-average (last column); the nudging strength is η = 0.1.All solutions are given in units of [s −1 f −1 0 ], where f 0 = 0.83 × 10 −4 s −1 is the Coriolis parameter.Note that the probabilistic evolutionary method reproduces the nominally-resolved reference flow features (large-scale jets and small-scale vrotices).