Computation of rare transitions in the barotropic quasi-geostrophic equations

We investigate the theoretical and numerical computation of rare transitions in simple geophysical turbulent models. We consider the barotropic quasi-geostrophic and two-dimensional Navier-Stokes equations in regimes where bistability between two coexisting large-scale attractors exist. By means of large deviations and instanton theory with the use of an Onsager-Machlup path integral formalism for the transition probability, we show how one can directly compute the most probable transition path between two coexisting attractors analytically in an equilibrium (Langevin) framework and numerically otherwise. We adapt a class of numerical optimization algorithms known as minimum action methods to simple geophysical turbulent models. We show, that by numerically minimizing an appropriate action functional, in a large deviation limit, one can predict the most likely transition path for a rare transition between two states. By considering examples where theoretical predictions can be made, we show that the minimum action method successfully predicts the most likely transition path. Finally, we discuss the application and extension of such numerical optimization schemes to compute rare transitions observed in direct numerical simulations, experiments and to other, more complex, turbulent systems.

We investigate the theoretical and numerical computation of rare transitions in simple geophysical turbulent models. We consider the barotropic quasi-geostrophic and two-dimensional Navier-Stokes equations in regimes where bistability between two coexisting large-scale attractors exist. By means of large deviations and instanton theory with the use of an Onsager-Machlup path integral formalism for the transition probability, we show how one can directly compute the most probable transition path between two coexisting attractors analytically in an equilibrium (Langevin) framework and numerically otherwise. We adapt a class of numerical optimization algorithms known as minimum action methods to simple geophysical turbulent models. We show, that by numerically minimizing an appropriate action functional, in a large deviation limit, one can predict the most likely transition path for a rare transition between two states. By considering examples where theoretical predictions can be made, we show that the minimum action method successfully predicts the most likely transition path. Finally, we discuss the application and extension of such numerical optimization schemes to compute rare transitions observed in direct numerical simulations, experiments and to other, more complex, turbulent systems.

INTRODUCTION
Many turbulent flows related to climate dynamics undergo sporadic random transitions [1]; after long periods of apparent statistical stationarity close to one of the dynamical attractors they spontaneously switch to another dynamical attractor. In recent years, there have been increasing evidence that indicates that the ocean circulation has multiple attractors [2] corresponding to different regimes of the thermohaline circulation, driven by salinity and temperature differences between the poles and the equator. The transition between such attractors may be related to Dansgaard-Oeschger events [2,3]. Transitions between two attractors (bistability) is also observed at large scales of ocean currents, for instance for the Kuroshio [4,5]. The importance of possible bistability and abrupt transitions has been emphasized many times including for the planetary atmosphere [6][7][8][9][10][11], where planetary jets may have a huge impact on abrupt climate change [8,12,13].
Random transitions in turbulent flows are also extremely prevalent in astrophysics, geophysics, in laboratories and industrial applications. For instance the Earth's magnetic field reversal is a transition between two turbulent attractors just like in magneto-hydrodynamics experiments [14]. Bistability is also observed in two-dimensional turbulence simulations and experiments [15][16][17], Rayleigh-Bénard convection cells [18][19][20][21], and dozens of other three-dimensional fluid flows show this kind of behavior (see for instance [22] and [17] for more references).
Stochastic resonance [23,24] has been advocated as a possible mechanism for the abrupt transitions between glacial and inter-glacial periods, and in relation to bistability of climate dynamics and time varying forces (i.e. the Milankovitch cycles [25]). The hypothesis of stochastic resonance is debated [23,26] because of the disparity between simple models of only a few degrees of freedom that are used conceptually [23,24], and more complex models. Stochastic resonance however remains a very interesting possibility. In order to address such issues one should study the attractors and the dynamics of the rare transition between attractors in a hierarchy of models from the simplest to more complex ones used by the climate community. For these complex climate models, that genuinely reproduce the turbulent nature of the Earth's atmosphere and ocean dynamics, such a task is currently inconceivable and seems unreachable is the foreseeable future using direct numerical simulations. The reason is the rarity of the transitions, and the computational complexity of these models. The main aim of this paper is make a step in the direction of this challenge by studying bistability and the associated transitions in turbulent dynamics using tools that will allow one compute transitions in more complex systems in the near future.
These rare transitions are essential phenomena as they correspond to drastic changes of the complex system behaviour. Moreover, they can not be studied using conventional tools. They contain dynamics occurring on multiple and extremely different timescales, usually with no spectral gap. This prevents the use of classical tools from dynamical system theory. The theoretical understanding of these transitions is an extremely difficult problem due to the complexity, the large number of degrees of freedom, and the non-equilibrium nature of many of these flows. Up to now, there has been an extremely limited number of theoretical results, where analysis has been limited to analogies with models of very few degrees of freedom [27], or to specific classes of systems that can be directly related to equilibrium Langevin dynamics [28]. For this reason, the use of non-equilibrium statistical mechanics in order to study those dynamics is necessary.
The main problem is in how to develop a general theory for these phenomena? When a complex turbulent flow switches at random from one subregion of the phase space to another, the first theoretical aim is to characterize and predict the observed attractors. This is already a non-trivial task as no picture, based on a potential landscape, is available. Indeed, this is especially tricky when the transition is not related to any symmetry breaking. An additional theoretical challenge is in being able to compute the transition rates between attractors. It is also often the case that most transition paths from one attractor to another concentrate close to a single unique path, therefore a natural objective is to compute this most probable transition path. In order to achieve these goals, it is convenient to think about the framework of large deviation theory, in order to describe either, the stationary distribution of the system, or in computing the transition probabilities of the stochastic process. In principle, we could argue that from a path integral representation of the transition probability [29], and the study of its semi-classical limit in an asymptotic expansion, with a well chosen small parameter, we could derive a large deviation rate function that would characterize the attractors and various other properties of the system. When this semi-classical approach is relevant, one expects a large deviation result, similar to that obtained through the Freidlin-Wentzell theory [30]. If this notion is correct, then this would explain why these rare transitions share many analogies with phase transitions in statistical mechanics and stochastic dynamics with few degrees of freedom.
Therefore with this in mind, we consider the simplest turbulent systems that exhibit random transitions between multiple coexisting attractors. The quasi-geostrophic model with stochastic forces in simple enough to be studied from first principle, in the framework of statistical mechanics and large deviation theory. Moreover this model is relevant to describe some aspects of the largest scales of turbulent geophysical fluid dynamics. The model shares many analogies with the two-dimensional Euler and Navier-Stokes equations [31]. These systems include the onelayer quasi-geostrophic model and its subsidiary the two-dimensional Navier-Stokes equations. For instance, in [17], the authors observed rare transitions between two quasi-stable large-scale flow configurations, namely a dual-band zonal jet and a vortex dipole for the stochastically forced two-dimensional Navier-Stokes equations in the limit of weak noise and dissipation. Moreover, multiple zonal jet configurations were observed as dynamical attractors in the quasi-geostrophic model for the same set of parameters [32]. These examples provide the necessary motivation to try and understand rare transitions between two attractors for geophysical fluid flows. The goal is to develop a theory that will be able to predict the most likely transition path between two attractors without having to resort to direct observations of rare events in Nature, computationally expensive numerical simulations or costly experimental setups.
To this end, we develop a non-equilibrium statistical mechanics description for the prediction of the most probable rare transitions between two coexisting attractors. By considering the transition probability of all the possible transition paths between the two states as a Feynman path integral, we apply a saddle-point approximation, in an appropriate limit characterizing the rarity of these transitions, in order to determine the path that yields the greatest contribution to the transition probability. We decompose the problem into two sub-classes: equilibrium and non-equilibrium. Through an equilibrium hypothesis, we are able to make direct analytical predictions from the path integral formalism for the most probable transition path. In this case any transition away from an attractor will become rare in the limit of weak noise. Alternatively, the non-equilibrium problem is more complex. In many cases, we are resorted to numerically computing the most probable rare transition through numerical optimization techniques. We outline an appropriate algorithm for use in turbulent models considered here and show that the numerical predictions agree with theoretical results when obtainable.
The layout of this manuscript is as follows: In section we discuss the class of turbulence models (the barotropic quasi-geostrophic and two-dimensional Navier-Stokes equations) and detail the path integral formalism for the Freidlin and Wentzell (instanton) approach. In section , we overview recent theoretical results of these models in a purely equilibrium Langevin setup. In such cases, rare trajectories can be directly computed by considering relaxation (deterministic) trajectories of a corresponding dual dynamics. By considering a simple example where a first-and second-order phase transition occurs through a bifurcation of a tri-critical point, we show that the predicted rare trajectories agree with new direct numerical simulations of the system for a transition between two zonal jets. Section details an numerical optimization algorithm used to compute the most probable rare transition in the barotropic quasigeostrophic and two-dimensional Navier-Stokes equations in both the equilibrium and non-equilibrium regimes. In section we apply the numerical method from the previous section to several examples of rare transitions in geophysical flows where analytical predictions can be made. Moreover, we consider an important generalized example of bistability in geophysics: a non-equilibrium rare transition between two distinct zonal jets with topography. We show how the numerical optimization algorithm predicts a rare transition that remains in the set of zonal jet states, thus greatly simplifying the accompanying theory. Finally, we conclude in section discussing the relevance of the equilibrium and non-equilibrium setups, the advantages and disadvantages of the numerical procedure, and the possible extension of this method to more complex turbulent systems.

THE BAROTROPIC QUASI-GEOSTROPHIC AND TWO-DIMENSIONAL NAVIER-STOKES EQUATIONS
The most simple turbulent model relevant to bistability in geophysical fluid dynamics is arguably the stochastically forced one-layer barotropic quasi-geostrophic model inside a periodic domain of size D = [0, L x ) × [0, L y ) with aspect ratio δ = L x /L y : where ω, q, v and ψ are the vorticity, potential vorticity, the non-divergent velocity and the streamfunction respectively.
Both ω and ψ are periodic functions in the two spatial directions and are related by ω = ∆ψ. The velocity field is recovered using the streamfunction representation that automatically satisfies the incompressibility condition ∇ · v = 0. The topography if defined through the function h(r). If we set h ≡ 0, then the barotropic quasi-geostrophic equation (1) reduces to the two-dimensional Navier-Stokes equation with linear friction and hyper-viscosity. Due to the double periodicity, it is convenient to consider a Fourier representation of the potential vorticity: where e k (r) = exp (ik · r) / (L x L y ) 1/2 is the orthonormal Fourier basis for a doubly periodic domain. We introduce a stochastic noise η, defined as a sum of random noises where η k are independent, white in time real random noises, such that , and f k is a complex noise spectrum with randomized phases for each Fourier mode k. Consequently, we can define the noise correlation as where C(r, r ′ ) the noise correlation matrix can alternatively be represented in Fourier space in terms of the complex noise spectrum: C(r) = k C k e k (r) = k |f k | 2 e k (r). The noise amplitude σ can be associated to the energy injection rate through the normalization of the noise spectrum by where k = |k| is the wavenumber of the wave vector k.
Assuming that the topography satisfies the condition D h(r) dr = 0, the barotropic quasi-geostrophic model (1) on a doubly periodic domain, conserves an infinite number of dynamical quantities, namely the energy and an infinite number of Casimirs where s(q) is any smooth function of the potential vorticity q. On the other hand, a common choice of topography is the beta-plane approximation h(r) = βy, which results in the barotropic quasi-geostrophic model defined upon a beta-plane. This model is widely used as a simple model for atmospheric and ocean flows, where the curvature of the Earth is approximated by a beta-plane [33]. Making the beta-plane approximate results in only the quadratic Casimir s(q) = q 2 /2 (as well as the energy (5)) being conserved. In any case, the barotropic quasi-geostrophic model on a beta-plane, the one-layer quasi-geostrophic model, including the two-dimensional Navier-Stokes equations conserve two sign definite quadratic invariants: the energy (5) and the enstrophy C 2 [q] = (1/2) D q 2 dr (where q = ω for the Navier-Stokes case). Due to the presence of two quadratic invariants, a simple phenomenological argument [34] shows that energy will flow to large scales, while the enstrophy travels towards small scales. The implications of the inverse energy transfer is of paramount importance in atmospheric and ocean flows. Restriction imposed by finite sized domains, lead (in the inertial limit) to the condensation of energy at the largest scales, leading to self-organization of the flow into large-scale coherent structures on a background of random turbulent fluctuations. The explicit form of these structures depends explicitly on the boundary conditions and on the noise correlation. For periodic boundary conditions, coherent structures in the inertial limit of the Navier-Stokes (Euler) equations can take the form of a vortex dipole or zonal jets [31], whilst only zonal jets are observed in large β regimes of the barotropic quasi-geostrophic model [35,36]. Using the definition of the energy (5), and the equation of motion (1) we can derive an equation for the energy balance in the system. By taking the scalar product of (1) with the streamfunction ψ and integrating, and applying Itô's lemma on the noise, we arrive at where H = D ψ(−∆) n ω dr corresponds to the dissipation of energy via the hyper-viscosity term. Assuming the system has achieved a non-equilibrium steady state such that the system reaches an energy balance between the injection σ and the dissipation, and by further assuming that the majority of the energy is concentrated at the largest scales (meaning that it is reasonable to neglect energy dissipation through the hyper-viscous term H), we can perform a non-dimensionalization in order to fix the mean energy density to be of order one. By enforcing that the mean energy density be unity, i.e. that E/L 2 = U 2 = L 2 /τ 2 = 1, where τ is now the characteristic energy turnover time at the domain scale L, from the energy balance equation (7) and assuming steady state conditions (∂E/∂t = 0), we can estimate the typical energy turnover timescale as τ = (2αL 4 /σ) 1/2 . Then by non-dimensionalizing with respect to this timescale, we define new non-dimensional variables as t ′ = tτ , ω ′ = ωτ , h ′ (r) = h(r)τ , α ′ = ατ , ν ′ = ντ /L 2n , and η ′ = ηL 2 τ 1/2 , resulting in the barotropic model in non-dimensional form: From this moment on, we will deal with the non-dimensionalized quasi-geostrophic equations (8) with all primes dropped.

Dynamics of the statistically steady state
For turbulent regimes where the flow is dominated by the presence of large-scale coherent structures, we consider the inertial limits of the barotropic quasi-geostrophic equations (8): ν < α < 1. The type of coherent structures observed are dependent on the topography h(r). Much work has been done in the inertial (Euler) limit, where h(r) = 0. Here, the main approach has been to advocate the time-scale separation between the inertial dynamic and the slow dissipative and noise dynamics. Then the invariant measure will concentrate close to the attractors of the two-dimensional Euler equations. A set of attractors can be found using equilibrium statistical mechanics in the form of the Miller-Roberts-Sommeria theory through an energy-Casimir variational problem [37,38]. The theory predicts either the formation of a large scale vortex dipole or of a two band zonal jet. The appearance of the vortex dipole is associated to the degeneracy of the two smallest eigenfunctions of the Laplace operator of the square geometry. For h(r) = 0, the picture is a little more complicated. If we consider the barotropic model on a beta-plane, where h(r) = βy, then the relevant physical parameter that determines the type of coherent structures observed is β. For β < 1, we have the usual Euler (or Navier-Stokes) equations, where we expect to observe the invariant measure to concentrate close to vortex dipoles or zonal jets [31]. For rectangular domains, stable parallel flows are formed along the largest scales, where two jets appear in opposite directions. For β > 1, the β-effect dominates, which tends to stabilize the parallel flows in the e x (zonal) direction (corresponding to the smallest eigenfunction). Then, possible steady state solutions to the barotropic equations with more than two jets can be observed. A rough estimate to the number of jets observed can be made by considering the ratio of the domain size L to the Rhine's scale given by L β = 1/βτ . However, it must be stated that this is only an approximate measure as the structure of the noise correlation will also contribute. Moreover, many cases of multiple steady state solutions of the barotropic equation with differing number of jets has been observed for the same sets of parameters [32]. These multiple states are assumed to be linearly stable for the unforced (with or without dissipation) dynamics. Consequently, when one introduces stochastic fluctuations by the addition of a noise, the dynamical attractors become meta-stable, and one may expect to observe rare transitions between several attractors.

The large deviation and instanton approach
Transitions between coexisting attractors or the appearance of uncommon large-scale flows are rare events. There are many ways in which one can study these rare events in general. However, one of the most promising is the large deviation and instanton approach. This strategy relies on the description of the transition probability of observing a transition between two states in terms of a Feynman path integral derived from the statistical properties of the noise [39]. For simplicity, one usually assumes that the system is driven by a white in time noise, however there has been attempts to generalize the formalism to include coloured noises [40]. Detailed mathematical derivations of the transition probability can be found in classical textbooks [29,30]. The final result is an Onsager-Machlup path integral [41,42] over all possible transition trajectories from a state q 0 to a state q T occurring in time T , where each transition is weighted according to some action functional A: Here, deviations from the zero noise (deterministic) relaxation trajectory is represented by a penalty function defined through an action functional A (0,T ) [q]. The action is the time integral of the Lagrangian associated to the dynamical equations: For the barotropic quasi-geostrophic equations (8), the Lagrangian is explicitly where C −1 (r, r ′ ) is the formal inverse of the noise correlation, such that D C(r, For rare probabilities, the path integral is a Laplace integral and one can often perform a saddle-point approximation around the global minimum of the action functional to get a leading order approximation to the transition probability. This estimate will be based on the action of the trajectory that globally minimizes the action functional. This global minimizer will be the most probable transition path going from state q 0 to q T in time T . At its most simple, it will consist of the most probable fluctuation path out of the initial attractor to the edge of the basin of attraction of a neighouring attractor (known as an instanton), and then the relaxation to the second attractor. Mathematically, this is defined as When the saddle-point approximation is valid, the transitions are rare and are clustered around the instanton path. As global minima, the most probable paths are critical points of the action functional (10), and satisfy the corresponding Euler-Lagrange equations whereq = ∂q/∂t. The Euler-Lagrange equations (13) can be re-expressed in terms of an instanton Hamiltonian H[q, p] for canonical variables q and p = δL/δq: The instanton Hamiltonian should not be confused with the energy E. The instanton Hamiltonian is a quantity that remains conserved by the dynamics of the most probable transition path (instanton and relaxation). Therefore, H becomes an extremely useful quantity for numerical purposes where it can be used to verify whether a transition path is a critical point of the action functional (10). For the quasi-geostrophic dynamics (8) the instanton Hamiltonian H is given by Then, explicitly the Euler-Lagrange equations become subject to the boundary conditions q(r, 0) = q 0 and q(r, T ) = q T . The Euler-Lagrange equations (16) are also known as the instanton equations as the most probable transition path will satisfy them. It should be make clear that any transition path that is a critical point of the action functional may satisfy the Euler-Lagrange equations (16), not only the most probable one (12). On general grounds, one should expect multiple solutions to the instanton equations, hence leading one to compare their respective action values in order to determine the most likely path. It should be emphasized that Eqs. 16 are only valid in the Freidlin-Wentzell limit of the transition being rare, where the saddle-point approximation is valid. A straightforward study of rare transitions, through direct numerical simulation of the governing equations is nearly always impracticable. This is mainly a complexity problem, due to the large number of degrees of freedom involved for genuine turbulent flows, and the extremely long time between two successive transitions. The path integral approach provides a way to systematically determine the most likely transition path between two attractors. Through the action functional A with the a priori given attractors one can predict the most probable rare transition by considering the local action minimizers or by solving the instanton equations with appropriate boundary conditions. Theoretically, this problem is also extremely difficult, and for turbulent flows, can only be achieved in the simplest of circumstances (see section ). Alternatively one can resort to numerical approaches. Numerical algorithms exist that compute the most probable transition paths by iteratively converging towards local action minimizers (see section ), or by directly solving the boundary value problem associated to the instanton equations (16) (for instance see [43]).
In the cases where the dynamics are in equilibrium, i.e. that they satisfy Langevin dynamics, every transition out of an attractor can be rare. This allows for the direct computation of rare transitions through deterministic relaxation trajectories in a related dual system [28], as will be discussed in the next section.

EQUILIBRIUM LANGEVIN DYNAMICS OF THE TWO-DIMENSIONAL QUASI-GEOSTROPHIC EQUATIONS
This section contains a brief overview of recent theoretical work [28] on the Langevin dynamics of the quasigeostrophic equations. We introduce the Langevin dynamics description for the quasi-geostrophic equations, where a specific relationship between the noise correlation and the kernel of the potential force invokes the Langevin property. By specifying a particular structure of the potential term, we consider a Langevin dynamics with a first-order phase transition between coexisting zonal flow attractors. Through the equilibrium Langevin dynamics theory, we can analytically predict the most probable rare transition paths between two attractors by considering an effective potential landscape and relaxation (unforced) trajectories of a dual dynamics. This also yields an Arrhenius law for the transition probability. The final part of this section is dedicated to the direct numerical simulations of the bistable system considered in [28], verifying numerically that the theoretical predictions hold.

The two-dimensional quasi-geostrophic Langevin equations
The Langevin formalism was previously considered for the two-dimensional quasi-geostrophic and Euler equations in [28]. We explain why the two main hypotheses of Langevin dynamics (Liouville property and conservation of the potential related to the transversality condition) are verified when the kernel in front of the gradient part and the noise autocorrelation are identical.
The Langevin dynamics associated to the quasi-geostrophic equations in a periodic domain D = [0, 2πδ) × [0, 2π) with aspect ratio δ given by with potential G. The stochastic force η is a Gaussian process, white in time, with correlation function . The topography h(r) is such that D h (r) dr = 0 and we also introduce a new parameter γ that will control the relative strength of the noise to the potential term. For the Langevin description to be correct, the potential G must consist of conserved quantities of the inviscid (α = 0) dynamics of (8). Moreover, the deterministic equations for Langevin dynamics (Eqs. (17) for α = 0) essentially correspond to a transport equation by a non-divergent velocity field, leading to a Liouville property for the nonlinear advection term v · ∇q. A more detailed discussion of the Langevin assumptions and results can be found in [28].
As with the quasi-geostrophic equations (8), the equilibrium quasi-geostrophic dynamics conserve the energy E and an infinite number of Casimirs C s given by Eqs. (5) and (6) respectively for the deterministic (α = 0) dynamics.
Then, the correct choice of the potential G for Langevin dynamics will consist of a combination of these conserved quantities:

Reversed dynamics and the relaxation equation
For the two-dimensional Euler or quasi-geostrophic equations, the time-reversed dynamics defined as q r (t) = I[q(T − t)] also satisfies a Langevin dynamics through a set of symmetries with the relevant involution operator I[·] corresponding to a time reversal being Then, the dual process is given by (17) We observe that for the two-dimensional Euler equations (h = 0), the dual dynamics (20) agree with the original dynamics (17) if the potential G is even. Then we conclude that the dynamics are time-reversible and detailed balance is verified. If however, G is not even or that h = 0, then the dynamics are not time-reversible and the original dynamics are conjugated to another Langevin dynamics where h has to be replaced by −h and G by G [−q]. In this case, detailed balance is not verified.
For Langevin dynamics, the instantons from one attractor to a saddle are given by the reverse of the relaxation paths of the corresponding dual dynamics. The relaxation paths are simply the deterministic trajectories of the Langevin dynamics (17) with γ = 0. Therefore, for the barotropic quasi-geostrophic equations the equation for the relaxation paths are Eq. (21) are known as the relaxation equations. They provide the means to directly compute, through deterministic means, the instanton trajectories from an attractor to a saddle by considering the relaxation paths, defined by (21) from the saddle to the attractor.
The Energy-enstrophy ensemble and physical dissipation A special case of Langevin dynamics are when the potential is given by the following form This structure is referred to as the potential enstrophy ensemble (when β = 0), the enstrophy ensemble (when β = 0 and h = 0), or generally as the energy-enstrophy ensemble. The properties of the corresponding invariant measures have been discussed on a number of occasions, starting with the works of Kraichnan [44] in the case of Galerkin truncations of the dynamics, and for some cases without discretization, see for instance [45] and references therein. For specific choices of the potential G and of the noise correlation C, the friction term can also be identified with a classical physical dissipation mechanism. For instance, if C(r, r ′ ) = ∆δ(r − r ′ ), and the potential takes the form of (22), then the dissipative term on the right hand side of (17) is which leads to a diffusion type dissipation with viscosity α and a linear friction with friction parameter αβ. Such a linear friction can model the effects of three-dimensional boundary layers on the quasi two-dimensional bulk vorticity, that appear in experiments with a very large aspect ratio, rotating tank experiments, or soap film experiments.
The fact that for the enstrophy ensemble, the quasi-potential is simply the enstrophy, the relaxation and fluctuation paths can be easily computed explicitly in many scenarios, as is discussed in [31].
For the majority of the other cases, the dissipative term on the right hand side of (17) cannot be identified as a microscopic dissipation mechanism nor as a physical mechanism. There is however another possible interpretation of this kind of friction term. As explained in [46], entropy maxima subjected to constraints related to the conservation of energy and the distribution of vorticity, are also extrema of energy-Casimir functionals. By analogy with the Allen-Cahn equation in statistical mechanics, that uses the free energy as a potential, it seems reasonable to describe the largest scales of turbulent flows as evolving through a gradient term of the energy-Casimir functional. Such models have been considered in the past (see, for example [47,48] and references therein). At this stage, this should be considered as a phenomenological approach, as no clear theoretical results exist to support this view.

Phase transitions and instantons between zonal flows in the equilibrium quasi-geostrophic equations
In order to fully determine the quasi-geostrophic Langevin dynamics (17), we need to specify the topography function and the potential G. Given the infinite number of conserved quantities for the quasi-geostrophic dynamics, there are many possible choices. We are interested in the description of the phenomenology of phase transitions and instanton theory in situations of first-order transitions. Therefore, we will illustrate such a phenomenology through an example originally discussed in [28].
We begin by choosing a topography given by h (r) = H cos (2y) on a periodic domain [0, 2πδ) × [0, 2π), such that and consider the potential with energy E (Eq. (5)), with β the inverse temperature and where C is the Casimir functional where we assume that a 6 > 0 and β = −1 + ǫ.
In [28], it was shown that for the potential (25), with small β and ǫ > 0 we expect to observe a first order phase transition. When H = 0, a bifurcation occurs for β = −1 (ǫ = 0) which can be easily checked (see [49]). This bifurcation is due to the vanishing of the Hessian at β = −1 (ǫ = 0). As discussed in many papers [31,[49][50][51], for the quadratic Casimir functional C 2 = D q 2 /2 dr, the first bifurcation involves the eigenfunction of −∆ with the lowest eigenvalue. If we assume that the aspect ratio δ < 1, then the smallest eigenvalue is the one corresponding to the zonal mode proportional to cos (y). Because we are interested by transitions between two zonal states, we assume from now on that δ < 1.
For non-zero, but sufficiently small, H there will still be a bifurcation for ǫ close to zero. This is the regime that we wish to consider. The null space of the Hessian is spanned by eigenfunctions cos (y) and sin (y), therefore as a consequence, for small enough ǫ and H, we expect that the bifurcation can be described by a normal form involving only the projection of the field q onto the null space. It was shown in [28], that by tackling the problem perturbatively, assuming that ǫ ≪ 1, H 2 ≪ 1 and a 6 H 2 ≪ ǫ, the Langevin dynamics can be described by an effective potential given by with G given at leading order by around the potential vorticity field of The fact that G is a normal form for small enough ǫ, a 6 , and H, implies that the gradient of G in the directions transverse to q = −A cos (y) − B sin (y) are much steeper than the gradient of G.
We observe that the term proportional to A 2 − B 2 2 breaks the symmetry between A and B. Its minimization imposes that A 2 = B 2 . Then either A = B, or A = −B. If we take into account that minimizing with respect to A 2 + B 2 will give only the absolute value of A, we can surmise that we will have four equivalent non-trivial solutions: with φ i taking one of the four value − 3π 4 , − π 4 , π 4 , 3π 4 , with |A| minimizing The reduced potential G is plotted in Figure 1 for the case ǫ > 0. The structure has four non-trivial attractors due to a breaking of the symmetry imposed by the topography h(y) = H cos (2y). In Figure 2, we show the potential vorticity across y for the two attractors, the corresponding saddle in between, and the topography.
Through the equilibrium hypothesis, we know how to describe and compute the the instantons corresponding to the phase transitions between zonal flows. They are none other then the reversed trajectories for the relaxation paths for the dual dynamics. The corresponding equation of motion for the relaxation paths for the dual dynamics for the quasi-geostrophic dynamics has then been derived in section .
In the present example, the potential G is an even function, see Eq. (26). Also, we remark, that over the set of zonal flows v = U (y)e x , the nonlinear term of the quasi-geostrophic equation vanishes: v [q + h] · ∇q = 0. As a consequence, when the instanton remains a zonal flow, the fact that h has to be replaced by −h has no consequence and hence the dynamics will be time-reversible. Let us now argue that the instanton is actually generically a zonal flow.
We assume for simplicity that the stochastic noise is homogeneous (invariant by translation in both directions). is the zonal part of the correlation function, and C m = C − C z the non-zonal or meridional part.
As the nonlinear term of the two-dimensional Euler equations identically vanishes, the relaxation dynamics has a solution among the set of zonal flows. If C z is non-degenerate (positive definite as a correlation function), then relaxation paths will exist through the gradient dynamics where q = q(y) is the zonal potential vorticity field. Moreover, as argued previously, the fact that G (28) is a normal form for small enough, a 6 , and H, implies that the gradient of G in directions transverse to q = −A cos (y) − B cos (y) are much steeper than the gradient of G. As a consequence, at leading order the relaxation paths will be given by the relaxation paths for the effective two-degrees of freedom G. Then, from (27), (28), and (33) we obtain that, at leading order, the dynamics of A and B are given by with c = −αδ 2π 0 C z (y) cos (y) dy, where we recall that G is given by Eq. (28). From this result the relaxation paths are easily computed. Using the fact that fluctuation paths are time reversed trajectories of relaxation paths, instanton are also easily obtained. One of the resulting relaxation paths (blue curve) and one of the instantons (red curve) are depicted in Figure 1 Figure 3.
For the Langevin dynamics formalism, the stationary probability distribution is known a priori and is given by where Z is a normalization constant. At a formal level, this can easily be computed by writing the Fokker-Planck equation for the evolution of the probability density functional. Then the property that P s is stationary readily follows from the Liouville theorem and the fact that G consists of the conserved quantities of the deterministic dynamics. Subsequently, the transition rate k for rare transitions between two attractors are going to be given by an Arrhenius law of the form where C is an order one prefactor and ∆G = G[q saddle ] − G[q attractor ] is the potential difference between the saddle and the initial attractor.

Direct numerical simulations of rare transitions between coexisting attractors
To verify the theoretical predictions of rare transitions in the equilibrium case, we perform direct numerical simulation of the Langevin example discuss above.
We numerically solve the Langevin dynamics of the quasi-geostrophic equations given by (17) using a pseudospectral spatial discretization scheme of resolution 64 × 128 on a periodic domain [0, 2πδ) × [0, 2π) with δ = 1/2. Due to aliasing errors from the quintic nonlinearity associated to potential G, we fully de-alaising courtesy of an 2/6th rule [52]. We time integrate the system using a second order Runge-Kutta method with time step dt = 2 × 10 −3 . For simplicity we choose a white in time noise with correlation C(r, r ′ ) = δ(r − r ′ )/Z, where Z is the normalization constant defined through condition (4). The use of such a noise correlation results in an ultraviolet divergence at high Fourier harmonics which may result in numerical instabilities. Therefore, to avoid a possible enstrophy bottleneck we include hyper-viscous dissipation to the right-hand side of Eq. (17), see Eq. (37). The addition of this extra dissipation breaks the equilibrium hypothesis on a general basis. However, the dissipation only acts of the extreme high harmonics with little effect to the dynamics of the largest scales. Therefore, we expect little deviation to the theoretical (equilibrium) prediction. The numerical equation of motion is v =e z × ∇ψ, ω = ∆ψ, q = ω + h(r), As an additional check to the predictions of subsection we perform a relaxation of the system (37) from arbitrary initial conditions with ν = γ = 0. From the effective potential landscape, the system should converge to the attractors of G given by (30). By starting at four different regions of phase space, we indeed find the predicted attractors of G plotted in Figure 4. We observe that the theoretically predicted attractors (black dashed curves) overlay perfectly to the numerically founds attractors (coloured curves).  From the numerical perspective, we initialize the system beginning from one of the attractors and time step the system until we observe a transition to a neighbouring saddle. We expect that if we are in a sufficiently weak noise limit γ ≪ ∆G, then the transition in the direct numerical simulation will remain close to the theoretically predicted instanton (red curve in Figure 1). Therefore, we utilize the following parameters in our numerical simulation: α = 1 × 10 −1 , ν = 1 × 10 −13 and γ = 5 × 10 −2 .
In Figure 5 we plot the numerically observed transition onto the contour plot of the effective potential G. We observe a relatively noisy transition (dashed blue curve) up to the saddle from the initial attractor. We see a lot of the fluctuations at the base of the potential well were the gradients are small. As the transition progresses up the potential well towards the saddle we observe better agreement to the theoretically predicted transition (solid red curve). We expect that much better agreement would be seen if we chose a smaller γ however with the penalty of a far rarer transition.

NUMERICAL OPTIMIZATION OF THE ONSAGER-MACHLUP ACTION FUNCTIONAL FOR THE BAROTROPIC QUASI-GEOSTROPHIC EQUATIONS
In this section we develop a numerical algorithm that computes local action minimizers of the Onsager-Machlup action functional defined in section . These local action minimizers are candidates for the most probable transition paths between two states. Unfortunately, these numerical optimization techniques are usually unable to distinguish between local minimizers and global ones. Therefore, using numerical schemes that are based on minimization of the action functional may not lead to the most likely transition path. However, one may devise strategies to check to a certain degree whether a local or global minimum is obtained by perturbing found minima to see if they minimize to an alternative path. If multiple minima exist, then comparison of the total action can be made. As already stated, the numerical prediction of rare events without brute force simulations, observation or experiments is important. The advantages of the action minimization methods are in that they can be applied to non-gradient systems in both equilibrium and non-equilibrium cases.
Alternative strategies exist to compute rare transitions, such as string methods [53,54], the nudged elastic band method [55], eigenvector-following-type methods [56], the dimer method [57] and obtaining direct solutions of the instanton equations [58]. However, many of these methods can not be applied to turbulent systems in general.
The numerical scheme that we implement here is based on the adaption the minimum action method to turbulent systems. In essence, the procedure uses a series of iterative estimates of transition paths until it finds a local minimum of the action functional A. This numerical method is applicable to both the equilibrium and non-equilibrium cases and can easily be extended to consider more complex turbulence problems.

The minimum action method
The minimum action method, is a class of numerical optimization procedures that determine local minimizers of functionals. One of the key properties of the algorithm is that it can be applied to systems that do not provide a priori an energy potential landscape. This makes it ideal to study rare transitions in turbulence problems. It has already found many applications in the use of computing most probable transition paths in low dimensional gradient systems [59,60], rare transitions in the the Kuramoto-Siavashinksky equation [61] and the Kardar-Parisi-Zhang equation [62]. In this section, we will outline an algorithm for the standard minimum action method, however there exists many more advanced versions they may be useful in the future. These include algorithms that provide adaptive re-meshing known as adaptive minimum action methods [60,63] or one that use an arc length parameterization of time in order to compute infinite time transition paths, known as geometric minimum action methods [64,65].
The generic strategy of the minimum action method algorithm is to begin with an initial estimate of the most probable transition path between two states. Then, with the use of variations of the action functional with respect to the transition trajectory, improvements in the form of iterations to the initial guess can me made that subsequently reduces the action. This iterative process is continually repeated until the series of estimates converges to a local minima of the action functional.
The main complexity of this method is in determining how one should improve each guess so that the action is reduced. There are various strategies one may use, such as applying Newton's method [66], which uses information about the first and second variations (Hessian) of the action functional or quasi-Newton methods, such as the popular Broyden-Fletcher-Goldfarb-Sahnno (BFGS) scheme that iteratively approximates the Hessian without the need to compute it directly. Newton's method is a relatively expensive procedure especially for high dimensional minimization problems where the computation of the Hessian is difficult. Subsequently, quasi-Newton methods have been favoured in the community and has been successfully applied to the minimum action method but for only in situations of low dimensional gradient systems [59,62].
We found that for turbulence problems, where we have to deal with a large number degrees of freedom, even quasi-Newton methods are expensive. Therefore, we are resorted to relying on methods based solely using the first variation of the action functional. The simplest method that falls into this category is the method of steepest descent, where a descent direction d is taken in the direction of the local anti-gradient of the action functional, i.e. d = −δA/δq. Usually, these methods can have poor convergence rates when the potential energy landscape consists of long narrow valleys, where the minimization procedure leads to zig-zagging across the narrow valley rather than along it. To improve convergence in these situations, one can use the nonlinear conjugate gradient method, which uses knowledge about previous descent steps to avoid crossing back and forth across potential valleys [66]. It is with this in mind that we utilize a nonlinear conjugate gradient method for our problem.
For out notation, we label each iteration with a superscript such that the n th estimate of the most probable transition path is labeled as q n for n = 0, 1, 2, . . . . The initial guess is denoted as q 0 . Each new estimate for the most probable transition path q n+1 is computed from the previous guess q n by taking an appropriate descent step of size l n in the descent direction d n : The descent direction d n is obtained through a nonlinear conjugate gradient method [66]. In general, for nonlinear conjugate gradient methods one takes the descent direction as The parameter β n is known as the nonlinear conjugate gradient parameter which determines to what level the current descent direction should depend on the previous descent direction. There are various ways of computing β n , but we found that the most optimal was to use the standard Fletcher and Reeves formula where where || · || is an appropriate norm. Due to the finite number of degrees of freedom associated to the numerical discretization, there can only be a finite number of orthogonal descent directions. Therefore, it will become important to occasionally reset the nonlinear conjugate gradient parameter when to consecutive descent directions are far from being orthogonal. To achieve this, we set β n = 0, resulting in standard steepest descent step when δA δq n , δA where (·, ·) is the inner product associated to the norm || · ||. The step length l n is chosen such that we obtain the the greatest reduction in the action functional. In order to ensure that d n corresponds to a descent direction (that it results in the reduction of the action), the step length l n must satisfy the strong Wolfe conditions [66]. Fortunately, there exists standard line search algorithms in order to determine the largest step length that satisfies the strong Wolfe conditions. Therefore, we implement the line search algorithm 3.5 of chapter 3 in [66].
The minimization is continuously performed for each iteration until the estimate of the most probable transition trajectory q n is within some tolerance, say ǫ of being a solution of the Euler-Lagrange equations (16). This is checked by halting the algorithm if the solution satisfies the condition δA δq n < ǫ. (42) Numerical discretization of the action functional In order the numerically minimize the action functional, we must first discretize the action in both time and space. Due to the periodicity of our domain, it is natural for us to consider the Fourier harmonics as the standard basis. In time, we approximate the transition on a uniform grid of N t + 1 points along the interval [0, T ]. All spatial derivatives are computed in Fourier space with the nonlinear terms computed in physical space using the 2/3rds dealiasing rule (the standard pseudo-spectral method [52]). Derivatives in time are achieved by applying the second-order central finite difference scheme upon a staggered grid labelled by {j + 1/2} for j = 0, . . . , N t − 1, where time is parameterized by t j = j∆t for j = 0, . . . , N t and ∆t = T /N t .
In this respect, the transition trajectory is fully represented by the set of Fourier amplitudes {q k,j } ≡ {q k (j∆t)} given by Eq. (2), for j = 0, . . . , N t and k = (2πn x /L x , 2πn y /L y ) for n x,y ∈ {−N x,y /2, . . . , N x,y /2 − 1} where N x × N y is the spatial resolution. Due to the reality of the potential vorticity q, the Fourier harmonics satisfy the condition q k = q * −k at every point in time. Using this convention, we define the numerically discretized action functional A[q] as where we have used the notationq k,j+ 1 2 ≡ (q k,j+1 − q k,j ) /∆t to denote the time derivative defined on the staggered grid. As a consequence, we must also compute the linear and nonlinear terms on the same grid. To do this, we average the contribution of neighbouring points by simple interpolation q k,j+1/2 = (q k,j+1 + q k,j ) /2.
To compute the first variation of the action functional δA/δq, we express the action in terms of its Lagrangian: where the variations of the Lagrangian are explicitly given as and δL δq k,j+ 1 Notice that the expressions p, (v · ∇p) and ∆ −1 [(∇q × ∇p) · e z ], are only defined on the staggered grid indexed by {j + 1/2}. Consequently, our notation is defined as (v · ∇p) k,j+ 1 2 = v j+ 1 2 · ∇p j+ 1 2 k . To evaluate these quantities back onto the original grid we interpolate the quantities by p k,j = (p k,j+ 1 2 + p k,j− 1 2 )/2. Finally, after some straighforward mathematics, we arrived and the numerical expression for the first variation of the action function with respect to q: where the time derivative of p is the standard central finite difference expressionṗ k,j = (p k,j− 1 2 − p k,j+ 1 2 )/∆t.

NUMERICAL PREDICTIONS FOR THE MOST PROBABLE RARE TRANSITIONS
To show that the minimum action method is suitable for the prediction of rare transitions in turbulent models, we consider a series of examples that verifies the algorithm. In this section we will begin by considering the over-damped limit of the barotropic quasi-geostrophic dynamics where the nonlinearity is assumed to be absent. We follow with an example that satisfies the equilibrium hypothesis of section in a regime of a single global dynamical attractor. Through this example, we check the numerically obtained transition agrees with the prediction made through the equilibrium theory. Finally, we consider a geophysical based example of considering a rare transition between two distinct zonal jet configurations modelled by the quasi-geostrophic equations. In all cases, we show good agreement of the numerical prediction to analytical predictions.
The over-damped limit The over-damped limit α, ν ≫ 1 of the barotropic quasi-geostrophic equations corresponds to dynamics that are dominated by dissipative effects. Therefore, we can make the assumption that the nonlinearity is sub-dominant and can be neglected. Moreover, for simplicity, we assume also that the topography is absent h = 0. We remark that this limit, although unphysical in reality provides a simple way of verifying the minimization procedure of the minimum action method. Absence of the nonlinearity reduces the system to a linear problem allowing for a theoretical treatment. Indeed, the instanton equations (16) become a series of linearly independent differential equations for each Fourier amplitude that can be straightforwardly solved. The over-damped dynamics are given as Due to the linearity of Eq. (48a), the dynamics of the Fourier representation of the vorticity field ω can be represented as a series of uncoupled Ornstein-Uhlenbeck processes for each Fourier amplitude ω k . This linearity means that the instanton equations can be directly solved yielding a theoretical prediction for a rare transition between two states for a transition time T . By working in the Fourier representation, the instanton equations can be reduced to series of second-order, linear boundary value problem for each Fourier amplitude Notice that the noise correlation |f k | drops out of the instanton equations, meaning that the trajectory is independent on the noise-this is a consequence of the decoupling of each Fourier amplitude from each other. Eqs. (49) can be readily solved with solution with where for we have used the notation β k = α + νk 2n . As one can observe from solution (50), the most probable rare transition corresponds to an exponential decay from the initial state with rate β k followed by an exponential increase with the same rate to the final state. In essence, the transition wants to decay to the zero state with the relaxation defined by the dissipation rate β k for each Fourier mode. Once decayed, the trajectory will transition to the final state with an exponential rate governed again by the dissipation rate. It should be noticed, that for large transition times T , the majority of the transition will result in the state being close to zero, with most of the dynamics occurring at the beginning and at the end of the transition on a timescale defined by the dissipation rate β k The corresponding Lagrangian for the theoretically predicted rare transition (50) is given by The Lagrangian (51) quantifies how much momentum is required from the noise to push the transition to the final state. Therefore, it is an important quantity for the rare transition characterizing the effect of the noise along the transition and also yields the action upon time integration.
We now test the minimum action method and compare the numerically obtained transition path to that predicted by the theory. We apply the numerical method to the over-damped system defined above. We select the initial and final states to be ω 0 = [cos(x) − (2/5) sin(x) + (1/5) cos(y) + (3/5) cos(x + y) − (4/5) sin(2y − x)]/E and ω T = [(1/2) cos(x) + (2/5) sin(x) + (3/5) cos(3y) − (1/5) cos(2y − x) + (1/5) sin(2y − x)]/E appropriately normalized through E to give unit energy density. The two states are displayed in Figure 6 and were chosen so that they contain a large number of modes. We perform the minimization with an initial trajectory defined through linear interpolation between the two boundary states in time. We use N x = N y = 16 Fourier modes and a temporal grid of N t = 101 points for a periodic spatial domain of size L x = L y = 2π and time domain of length T = 10. The dissipation parameters that we use are α = 1 × 10 −1 and ν = 5 × 10 −2 with n = 1. We choose a noise correlation that represents a Gaussian white noise with C(r − r ′ ) = δ(r − r ′ )/Z where Z is the normalization constant to ensure relation (4) holds.
Displayed in Figure 7 (left) is the time evolution of absolute value of each Fourier mode in the transition, and (right) the complex phase space of the transition of each Fourier mode. In both plots, the theoretical predictions arising from Eq. (50) are overlaid by the black dashed curves. We observe excellent agreement between the numerical and theoretical results. We observe a slight discrepancy to the numerical data in the time evolution of mode k = (−1, 2), however this is certainly down to numerical resolution close to the cusp where the transition goes from exponential decay to growth near t = 5. In Figure 7 (right), we quite clearly observe that the transition quickly decays to the zero state for each Fourier amplitude before transitioning to the final state. In Figure 8 we plot the time evolution of the Lagrangian for the numerical prediction and compare to the theoretical result given by (51). We observe excellent agreement to the theory and see that the majority of the Lagrangian is appears at later times where the transition needs to be pushed against the dissipation to reach the final state. We conclude that the numerical minimization for the over-damped system yields the expected results predicted through the Freidlin-Wentzell theory. However, we stress that this example ignores the effect of the nonlinear advection term of the quasi-geostrophic equations, which we will discuss in the next subsections.

Equilibrium instanton starting at zero
In this subsection, we consider applying the minimum action method to an example that satisfies the equilibrium hypothesis of section . Such an examples will allow for the direct comparison to the predictions made in section verifying, not only the numerical optimization algorithm but also the equilibrium theory. Our setup will be the following: we will consider a transition beginning at the zero state and transitioning to another, non-zero, state. What is essential is that we compute this transition in the equilibrium regime where there is only one global dynamical attractor, the zero state. This is important because we wish to compare the numerical prediction to the solution defined through the relaxation from the time-reversed transition in the corresponding dual dynamics defined through the relaxation equation (21). The criterion of zero being the only attractor is important as this comparison can only be made if the transition remains in the same basin of attraction to that of the attractor in which the transition starts. (Transitions that occur across several basin of attractions will have to be compared to a theoretical transition composed of several instantons and relaxation trajectories corresponding to each attractor and saddle the transition passes through.) By considering a setup with only one attractor, then all possible non-zero states will be within the basin of attraction of ω = 0 and the transition can be cmopared to one instanton prediction through the relaxation equations.
In general determining a transition from zero to an arbitrary state ω T analytically is difficult. However, by ensuring the equilibrium hypothesis holds will helps us in this regard. We know from section that the rare transition from an attractor to any state within the basin of attraction of that attractor will be the time-reversed relaxation path of the corresponding dual system. Therefore, by considering the relaxation path Eq. (21), we will recover the instanton: the most probable infinite time fluctuation path. Unfortunately, due to the numerical discretization of the minimum action method, we are unable to ascertain the infinite time transition path. However, one would expect that if T is sufficiently large enough then the two transitions should be relatively close. Therefore, we will consider a sequence of transitions for increasing T and show convergence to the instanton.
The equilibrium setup is as follows: we consider a noise spectrum that is uniform in Fourier space, i.e. corresponding to a Gaussian white noise with a correlation C(r, r ′ ) = δ(r−r ′ )/Z and a potential that is proportional to the enstrophy measure G ∝ ω 2 /2. This is important for three reasons, i) this corresponds to linear friction in the two-dimensional Euler equations meaning that the model is realistic in some sense, ii) this potential and the noise correlation satisfy the equilibrium hypothesis of section , and iii) the quadratic form of the potential implies that there is only one minimum corresponding in this case to the zero state ω = 0.
We use a Fourier resolution of N x = N y = 16 Fourier harmonics in the minimum action method and compare a series of minimizations with increasing transition time T . In each realization we ensure that we have sufficient temporal resolution by using a fixed grid spacing of T /N t = 10 −1 .
In Figure 9 we show the vorticity distribution of the final transition state (left) and the time evolution of the transition energy E (right). Notice, that from the energy balance equation (7), we expect an exponential decrease of the energy with rate 2α. This is exactly what is observed from the relaxation trajectory (black dashed line in Figure 9). Moreover, observe that the numerical minimization predictions also agree with this decay rate initially. The discrepancy at later times is a consequence of the minimization procedure only dealing with transition of finite transition time T , such that the energy must vanish in finite time. This is also supported by the observation that for increasingly longer transition times results in better agreement to the expected energy decay. Of course one expects, and is indicated by the numerics, that this agreement will be exact in the limit of T → ∞. We plot the complex phase space trajectories for each Fourier mode in Figure 10. Again, we observe gradual convergence to the theoretical infinite transition time prediction computed through the relaxation equation. Notice the complex behaviour of the transition associated to the nonlinear nature of the evolution. Finally, in Figure 11 we plot the instanton Hamiltonian for the numerical minimum action predictions for the various transition times T .
We observe fairly good stationarity of the Hamiltonian across the time evolution for transition times T . Notice that the value of the Hamiltonian decreases with increasing transition time T . We expect that the value of the instanton Hamiltonian should decrease with increasing transition time T . Through this example, we have verified the numerical predictions for the most probable rare transition from zero to an arbitrary state using the minimum action method agrees with the theoretical prediction made using the equilibrium hypothesis of section . Through this, we have also independently confirmed that the predictions of rare transitions in equilibrium cases can be predicted through relaxation dynamics of a corresponding dual system.
A non-equilibrium geophysical example: A rare transition between two zonal flow states In this subsection we will consider a more general example which is of huge interest to the geophysical community. Namely, a rare transition between two zonal jet configurations for the barotropic quasi-geostrophic equations with topography and forced by statistically homogeneous noise. This example does not verify the equilibrium hypothesis of section .
The importance of this example is based upon its practicality. The rare transition between two zonal flow states is something that arises in Nature, for instance in ocean currents and atmospheric dynamics. Moreover, the existence of multiple zonal jet attractors in geophysical models has also recently been observed [32], meaning that rare transitions between then in the presence of stochastic fluctuations is an important and viable problem.
Mathematically, the problem is an intriguing one as the set of all zonal states in the periodic barotropic quasigeostrophic equations (q(r, t) ≡ q(y, t)) forms a vector space of steady state solutions of the dynamics where the condition v · ∇q = 0 is always satisfied. By considering the transition between two zonal flow states, there always exists a critical point of the action (a solution of the Euler-Lagrange equations) that remains in the vector space of zonal flows, as long as the noise is non-degenerate in the zonal direction. As discussed below, this zonal critical point of the action verifies simple equations enabling us to make analytical predictions. We stress however, that it is not granted that this zonal critical point is an action minimizer or even a local action minimizer. In this section, for a specific example, we will use the action minimization algorithm to check that this zonal rare transition is actually a local minimizer.
First, let us begin by investigating the theoretical problem. Consider the rare transition between two generic zonal flows (without loss of generality we assume the zonal direction to be x), e.g. q(r, 0) = q 0 (y) and q(r, T ) = q T (y) with topography only varying across y: h(r) = h(y). As mentioned previously, we study the transitions between two zonal flows that occur through other zonal states. Then, the equations for the rare transition are: q * (y, t) = γ(t) [q T (y) − q 0 (y)] + q 0 (y), (52a) γ(0) = 0, γ(T ) = 1. (52b) To find the structure of the parameterized path γ(t), we insert ansatz (52) into the instanton equations (16). Due to the ansatz requiring the transition to remain through zonal steady states, all nonlinear terms identically vanish in the instanton equations. Then these Euler-Lagrange equations simplify to where ω * (t) = γ(t) [q T (y) − q 0 (y)] + q 0 (y) − h(y). The reason for expressing (53) in terms of the vorticity ω and not the potential vorticity q is that the topography h only appears in the definition of the boundary states and not the equation of motion itself. Then, we can straightforwardly solve (53) for each Fourier amplitude as the system is linear in ω. Subsequently, writing the solution in terms of the Fourier amplitudes for ω is more transparent with where e ky = exp (ik y y) /L 1/2 y . We have represented the transition in terms of ω * ky (t) = q * ky (t) − h ky , where h ky are Fourier coefficients of the topography h(y) = ky h ky e ky , and β ky = α + νk 2n y . Solution (54) can be transformed back into the solution for the potential vorticity using q * ky (t) = ω * ky (t) − h ky . What should be noticed is that the transition (54) is reminiscent of the over-damped solution presented in subsection . This is because the zonal-zonal transition occurs through the vector space of zonal flows and the nonlinearity vanishes throughout the transition. Therefore, one can think of zonal-zonal transition as the same as the over-damped solution, or in terms of an Ornstein-Uhlenbeck process with the transition exponentially diffusing across steady states. What is also interesting, is that the solution does not depend on the type of topography, as long as it is defined along y only.
The explicit expression for the Lagrangian for trajectory (54) is given by where C z is the zonal part of the noise correlation function defined by (32). To perform the numerical minimization, we select two zonal flow states given by ω 0 = [cos(y) − (2/5) sin(y) + (4/5) cos(3y) − (3/5) sin(3y) + 2 cos(4y)] /E and ω T = [cos(y) − sin(y) − (3/2) sin(2y) + (4/5) cos(3y) − (4/5) sin(3y)] /E, where E is the appropriate normalization constant. Both of these states are displayed in Figure 12. We use a linear friction coefficient of α = 1 × 10 −1 and normal viscosity (n = 1) with coefficient ν = 5 × 10 −2 . We consider a transition occurring over a time of T = 10 with a temporal resolution of N t = 200 grid points. Our Fourier resolution is N x = N y = 16 on a periodic square domain of size L x = L y = 2π. For the noise, our only conditions are that it is homogeneous and non-degenerate. Therefore, we choose a noise spectrum of the form with k f = 3 and Z being the normalization constant to ensure condition (4) is satisfied. The profile of the forcing is shown in Figure 13. As one can observe, the noise is isotropic and peaked at wavenumber k = 3 with a Gaussian profile around this peak. In Figure 14 we plot the time evolution of the zonal flow transition from the initial to the final states in terms of an Hovmöller diagram. A more illustrative comparison to the theory can viewed in Figure 15 which shows the time evolution of the zonal Fourier amplitudes of the numerically predicted transition (left) and also the transition of each mode in the complex phase space (right). Overlaid in both of the plots in the black dashed curves are the theoretically predicted transition paths from Eq. (54). We observe excellent agreement between the numerical data and theory, indicating that the minimum action method has located a local action minimizer of the action that coincide with the theoretical zonal critical point.
As additional checks, we present the time evolution of the Lagrangian (55) and instanton Hamiltonian (15) in Figure 16 with the theoretical predictions overlaid in black. Again, we observe that the theory agrees with the numerical data. The Lagrangian indicates that most of the effort is in pushing the transition to the final state near the end of the transition. The constant value of the Hamiltonian throughout the transition is another indicator that the minimum action method has found a local minimum.
As can be observed from the expression of the Lagrangian for this setup (55), the amplitude of the noise plays an essential role in determining which Fourier modes contribute to the Lagrangian and hence the action. An important remark in this example is that the noise correlation will not have a direct effect on the shape of the transition, this being due to the nonlinear terms vanishing, but will be essential for determining the specific value of the action corresponding to the rare transition. This value will measure the rarity of the transition and its likelihood of it being observed.
The example above illustrates how one expects a rare transition between two zonal flow states will occur. We predict that the transition will remain through the vector space of zonal flows with the structure of the transition being independent on the non-degenerate noise correlation. A natural next step would be to try and observe such a transition in direct numerical simulations of the quasi-geostrophic equations in regimes like the ones observed in [32] where multiple zonal jet have been observed as dynamical attractors.
The above result on the zonal-zonal transition can be generalized in the context of a rare transitions between two  steady states that are formed from eigenfunctions of the Laplacian operator ∆ with the same eigenvalue λ where −∆q = λq. This is because the set of states constructed by eigenfunctions of the Laplacian also form a vector space of steady state solutions with v · ∇q = 0. Consequently, if both states, initial and final, are constructed with the same sets of eigenfunctions with identical eigenvalues then we expect a similar result as above.

CONCLUSIONS
We have adapted a numerical optimization algorithm called the minimum action method and applied it to a simple model of two-dimensional geophysical turbulence. We have shown, using specific examples, that such an algorithm can be used to compute the most probable rare transitions between two states in cases of bistability in turbulent systems. Using equilibrium theory derived in [28], we showed how the numerically predicted transition agreed with those computed through the relaxation equations of the corresponding dual system when the equilibrium hypothesis holds. Furthermore, we considered a more general problem of computing the most probable rare transition between two different zonal flow configurations where the equilibrium hypothesis does not hold; an important example of relevance to geophysics.
The minimum action method is a viable way to compute rare events in simple turbulent models. It is straightforward to extend this method to more complex turbulent models such as magneto-hydrodynamics where rare transitions between different magnetic field polarizations can be observed [14]. Moreover, natural extensions to the algorithm proposed here could be of benefit, such as arc length parameterization of time [65], adaptive discretization [60], or parallelization [63].
Clearly, the next step in this approach is to compare the action minimizers with observed transitions in both experiments and direct numerical simulations. Indeed, this is one of the current directions of future work. Beside this direct comparison, many work is still to be done both at the theoretical and at the practical level in order to actually assess when minimum action methods alone will be enough to describe rare transitions. This is certainly true when we are in a Freidlin-Wentzell regime, as discussed in subsection , however for most turbulence models there is no clear criteria developed yet to assess when a rare transition in a turbulent flow is actually in the Freidlin-Wentzell regime. This is an important question that should be addressed both from a theoretical and an empirical point of view.
This work is a step in a long term program that is aimed at developing the tools to compute rare transitions and their probabilities in complex turbulent flows. Our ultimate aim is to be able to make these computations for models that are relevant for climate dynamics. Of course much is still to be achieved in this direction before climate applications can be really considered. However it is important to stress that no approach currently allows to reliably compute rare transition in climate problems.