Computing Coherent Sets using the Fokker-Planck Equation

We perform a numerical approximation of coherent sets in finite-dimensional smooth dynamical systems by computing singular vectors of the transfer operator for a stochastically perturbed flow. This operator is obtained by solution of a discretized Fokker-Planck equation. For numerical implementation, we employ spectral collocation methods and an exponential time differentiation scheme. We experimentally compare our approach to the more classical method by Ulam that is based on integration of the exact transfer operator.


Introduction
Many fluid flows at the onset of turbulence exhibit regions which disperse slowly with time (so called coherent sets [10,13]) while other regions disaggregate comparatively quickly. Often, the boundary of a slowly dispersing region can be associated to a lower-dimensional object (a so called Lagrangian coherent structure [17,15,16]) which serves as a transport barrier for Lagrangian particles within a coherent region [11].
Based on the related concept of almost invariant (resp. metastable) sets in timeinvariant dynamical systems [5,8], recently a framework has been proposed for the computation of coherent sets via singular vectors of a certain smoothed transfer operator which describes the evolution of probability densities on phase space under the given dynamics [7]. Numerically, this operator can be approximated by a Galerkin [22,5] method via evaluating the flow map explicitly by time integration of the vector field. Depending on the type of approximation space chosen, a diffusion operator has either to be applied explicitly or is already implicitly present in the discretization ( via numerical diffusion).
In this manuscript, we propose to compute coherent sets by directly solving the Fokker-Planck equation instead. More precisely, instead of computing the evolution of the basis of our approximation space under the deterministic dynamics and then applying diffusion, we directly compute the evolution of this basis under the stochastic push forward operator given by the solution operator of the Fokker-Planck equation. This advection-diffusion equation can efficiently be discretised using spectral collocation (cf. also [9]). In order to deal with aliasing in the case of dominating advection we use a skew symmetric form of the advection term and in order to deal with stiffness in time due to the Laplace operator we employ an exponential time differentiation (etd) integrator.
As a key advantage of our method we only need to sample the vector field at each time instance on a fixed grid of rather coarse resolution. In particular, we do not need to integrate trajectories of (Lagrangian) particles and no interpolation of the vector field to points off the grid.

Problem statement
We consider a time-dependent ordinary differential equationẋ = b(t, x), b : R × X → R d , on some bounded domain X ⊂ R d . We fix some initial and final time t 0 , t 1 ∈ R and assume that the vector field b is continuous and locally Lipschitz w.r.t. x for all t ∈ [t 0 , t 1 ], such that the associated flow map Φ = Φ(·, t 0 , t 1 ) : X → X is uniquely defined. In order not to obscure the key ideas we restrict to the case of X being a hyperrectangle and furthermore b being periodic in x and divergence free, i.e. the flow map Φ being volume preserving.
Roughly speaking, we would like to compute a set A 0 ⊂ X which disperses slowly under the evolution of Φ, i.e. which roughly retains its shape while being moved around by the flow Φ. A such set A 0 will be called coherent [13].
Inspired by the associated notion of an almost invariant (resp. metastable) set in the autonomous setting [5], we start formalizing this request by asking for a pair of sets A 0 , A 1 ⊂ X such that A 0 will approximately be carried to A 1 by Φ in the sense that where m denotes Lebesgue (i.e. volume) measure. Evidently, with A 1 = Φ(A 0 ) we obtain κ(A 0 , A 1 ) = 1 for any A 0 ⊂ X, so this is not a well defined problem yet. In fact, (1) does not impose any condition on the geometries of the sets A 0 and A 1 . In particular, the image set A 1 = Φ(A 0 ) might be stretched and folded all over the domain X -but this is not the type of coherent set we have in mind. Froyland [7] observed that κ(A 0 , Φ(A 0 )) is not close to 1 for every set A 0 any more as soon as one artificially adds some random perturbation to the dynamics (cf. [5,4] for related ideas in the autonomous context).

Computing coherent sets via diffusion at initial and final time
More formally, Froyland proposes the following approach [7], see also [12]. We will employ the transfer operator (resp. Frobenius-Perron operator or push forward ) P associated to Φ. This operator describes how densities u : X → [0, ∞) are evolved by Φ: If a set of points in X is initially distributed according to some density u then after flowing these points forward with Φ they will be distributed according to Pu. In our case of Φ being volume preserving, P : L 1 = L 1 (X, m) → L 1 is given by In the sequel, we restrict our attention to L 2 = L 2 (X, m) ⊂ L 1 as we want to use the canonical scalar product on L 2 . We are going to add an ε-small random perturbation to the flow map Φ via complementing the action of P by a diffusion operator D ε : L 2 → L 2 , Figure 1: Graphical illustration of the action of the perturbed transfer operator P ε as the composition of diffusion operators at initial and final time and the transfer operator P.
where α ε : X → [0, ∞) is a bounded kernel with X α ε (x)dx = 1 and α ε → δ 0 in a distributional sense as ε → 0. Here, we use a diffusion with bounded support of radius ε, namely α ε (x) = 1 Bε(x)∩X /m(B ε (x) ∩ X). With P and D ε , we finally define the evolution operator [7] P ε : cf. Figure 1. Note that P ε is stochastic, i.e. positive and P ε 1 X = 1 X , where 1 X denotes the characteristic function on X, since we assume Φ to be volume preserving. Further, with this choice of α ε , P ε is compact and as the vector field b is divergence-free has a simple leading singular value σ 1 = 1 [5,7]. In order to compute coherent sets, note that if κ(A 0 , and look for some pair (A 0 , A 1 ) of sets which maximizes this quantity. We get where c 0 = m(A c 0 )/m(A 0 ) and c 1 = m(A c 1 )/m(A 1 ) and since the functions f := c 0 1 A 0 −c −1 0 1 A c 0 and g := c 1 1 A 1 −c −1 1 1 A c 1 statisfy f 2 = g 2 = 1 and f, 1 X = g, 1 X = 0, we obtain, by relaxing to arbitrary functions f, g with zero mean, This problem is much easier to solve than ( ), where we need to maximize over characteristic functions. As for fixed f ∈ L 2 and P ε f, 1 X = 0, we obtain where V = {f ∈ L 2 : f, 1 X = 0} is the orthogonal complement of span 1 X .
1 At P ε 1 At P ε Figure 2: In contrast to the approach using diffusion at initial and final time cf. 1, in our approach diffusion is built into the model of the dynamical system.
This operator norm is given by σ 2 (P ε ) [7], the second largest singular value of P ε and the maximizing function is f = v 2 , the associated right singular function. As v 2 is an approximation to c 0 1 A 0 − c −1 0 1 A c 0 the common heuristics is to identify an associated coherent set by where θ ∈ R is some appropriately chosen threshold [5], [8]. Correspondingly, u 2 = P ε v 2 is the associated left singular function and To sum up, we can compute a pair A 0 , A 1 of coherent sets via computing the second singular value and its corresponding singular vectors of a slightly perturbed Frobenius-Perron operator. Often a low order spatial discretisation such as Ulam's method is used [26], which is a Galerkin projection on indicator functions of a partition of X into boxes X i . Usually P is then computed numerically via sampling K test points x i in every box X i , compute Φ(x i ) and count how many fall in B j : The method automatically adds sufficient numerical diffusion so that we can actually directly employ the unperturbed operator P. However, with increasing resolution this numerical diffusion decreases which results in all singular values approaching the value one, cf. [19,7].

Computing coherent sets via time-continuous diffusion
Instead of explicitly applying diffusion at the beginning and the end of the time interval under consideration, we propose to incorporate a small random perturbation continuously in time, i.e. instead of considering a deterministic differential equation we now use the stochastic differential equation in order to define the flow map Φ. Here, (B t ) t≥0 is d-dimensional Brownian motion and ε > 0. Since we assume b(t, ·) to be Lipschitz, X to be bounded and b to be periodic in x, for any initial condition ξ ∈ X, (5) has a unique continuous solution x in the sense of [24], Thm. 5.2.1.
The transfer operator P ε associated to this stochastic differential equation is given by the solution operator of the parabolic Fokker-Planck equation Appropriate boundary conditions are chosen (e.g., periodic or homogeneous Neumann boundary conditions), so that for all u, w ∈ L 2 (X) in the domain of L ε holds: More precisely, P ε u 0 = u(t 1 , ·), where u is the solution to (6) with initial condition u(t 0 , ·) = u 0 .
A proof of this claim is given in the appendix. Note that with Schauder's theorem also P ε, * is compact. Since we assumed b to be divergence free, we also have: it follows that 1 X is a steady state of (6), and consequently P ε 1 X = 1 X .
For the adjoint operator P ε, * we test with u ∈ L 2 (X): where we have used that P ε is integral conserving: thanks to the integration-by-parts rule (7). Hence X u = X P ε u for all u ∈ L 2 (X), cf. [21]. By the Riesz representation theorem, we can conclude that also P ε, * 1 X = 1 X . For proving positivity of P ε , we consider the evolution of the negative part u − (s, , we have, using the integration-by-parts rule (7), Hence if u(t 0 , ·) is non-negative, u(t, ·) is non-negative for all t > t 0 , as the norm of its negative part does not increase. To show positivity of P ε, * , let two non-negative functions u, w ∈ L 2 (X) be given. Then P ε, * w, u = w, P ε u ≥ 0 by positivity of P ε . For any fixed non-negative w, this relation holds for all non-negative u. This implies that P ε, * w is non-negative.
Proof. With the same argument as in (8) we obtain for solutions to (6): This implies that u(t, ·) 2 is non-increasing in time. Moreover, if the initial datum u is not constant on X, its gradient does not vanish almost everywhere, and hence u(t, ·) 2 decreases on a small time interval (t 0 , t 0 + τ ). Thus P ε u 2 < u 2 , and u cannot be an eigenfunction of P ε corresponding to the eigenvalue 1. With the same argument the constant function is the only eigenfunction of P ε, * corresponding to the eigenvalue 1.
Hence the leading singular value σ 1 = 1, which is an eigenvalue of P ε, * P ε is isolated.
Hence P ε is compact and doubly stochastic with isolated, simple leading singular value σ 1 . So we are in a similar setting as in section 3 and apply the same constructions.
Remark 1. These considerations also work for a non-conservative vector field b and corresponding probability measure µ = m by suitably normalizing the operator P resp. P ε , cf. [7].

Discretisation
In order to approximate the transfer operator P ε , we choose a finite dimensional approximation space V N ⊂ L 2 (X) and use collocation. As the Fokker-Planck equation (6) is parabolic and since we assume the vector field b(t, ·) to be smooth for all t, its solution u(t, ·) is smooth for all t > t 0 (see e.g. [6], Ch. 7, Thm. 7). To exploit this, we choose V N as the span of the Fourier basis Note that dim(V N ) = N d . Choosing a corresponding set {x 1 , . . . , x M } ⊂ X of collocation points (typically on an equidistant grid), the entries of the matrix representation P ε of P ε are given by where k ∈ Z d , k ∞ ≤ (N − 1)/2 and j = 1, . . . , M . For P ε , we then compute singular values and vectors via standard algorithms. Note that we might choose M ≥ N , i.e. more collocation points than basis functions. This turns out to be useful since we expect the maximally coherent sets to be comparatively coarse structures which can be captured with a small number of basis functions.
Solving the Fokker-Planck equation. In order to compute P ε ϕ k in (9) for some basis function ϕ k ∈ V N we need to solve the Fokker-Planck equation (6) with initial condition u(t 0 , ·) = ϕ k . This can efficiently be done in Fourier space via integrating the Cauchy problem is the Fourier transform of v ∈ V N . Note that the differential operators in Fourier space reduce to multiplications with diagonal matrices, while F and F −1 can efficiently be computed by the (inverse) fast Fourier transform.
Aliasing. One problem with this formulation is the possible occurence of aliasing. Aŝ u andb are trigonometric polynomials of degree N , the multiplication F −1 (û)b in the advection term leads to a polynomial F −1 (û)b of degree 2N that cannot be represented in our approximation space V N . The coefficients of degree ≥ N of this polynomial act on the coefficients of lower degree, leading to unphysical contributions in these. This shows up in high oscillations and blow ups (see [2], Ch. 11) in the computed solution. One way to deal with this problem is to use the advection term div (bu) = 1 2 div (bu) + 1 2 (b∇u) .
The spectral discretization of the left-hand side is not skew symmetric, but the discretization of the right-hand side is [27]. This leads to purely imaginary eigenvalues of the resulting discretization matrix and hence to mass conservation. Consequently, for the unperturbed operator (ε = 0), the resulting matrix has eigenvalues on the unit circle. However, this approach has to be used carefully as even though the solution does not blow up, it might still come with a large error, e.g. small scale structures may be suppressed. If the system produces such small scale structures the grid has to be chosen fine enough to resolve them.
Time integration. For low resolutions, the time integration of the space discretized system can be performed by a standard explicit scheme. For higher resolutions, the stiffness of the system due to the Laplacian becomes problematic and a more sophisticated method must be employed. Here, we use the exponential time differentiation scheme [3] for the space discretized system. The etd-scheme separates the diffusion term L = ε 2 2 D, where D is the discretized Laplacian, from the advection term N (u, t) = − div F(F −1 (û)b(·, t)), where b is evaluated via spectral collocation. The system can hence be written as Via multiplying (10) with e −tL and integrating from t 0 to t 1 we obtain with h = t 1 −t 0 . A numerical scheme is derived by approximating the integral in (11), e.g. by a Runge-Kutta 4 type rule, resulting in scheme called etdrk4. To this end note that D is a diagonal matrix. We use the version in [20], which elegantly treats a cancellation problem occurring in a naive formulation of etdrk4 by means of a contour integral approximated by the trapezoidal rule.
Extraction of coherent sets For the extraction of coherent sets several methods can be used. For the decomposition into exactly two coherent sets a simple thresholding or an a posteriori line search can be used [7,8], for the decompositition of the domain into n coherent sets the first n singular vectors should be considered (see [5,25,18] for the autonomous case) and post processed via a simple clustering heuristic, e.g. k-means, see [1,14]. We here focus on the computation of singular vectors. 6 Numerical Experiments 6.1 Quadruple gyre The first numerical example is a two dimensional flow (cf. Fig. 3), an extension of the well known double gyre flow, given byẋ x, y) = π sin(πf (t, x)) cos(πf (t, y))∂ x f (t, y) and f (t, x) = δ sin(ωt)x 2 + (1 − 2δ sin(ωt))x. We fix δ = 0.25, ω = 2π, t 0 = 0, t 1 = 10.25, h = 0.205 (i.e. 50 time steps) and choose ε = 0.02 in such a way that the six largest singular values of P ε roughly equal those obtained from Ulam's method (without explicit diffusion).
We use M = 15 collocation points and N = 5 basis functions in each direction and compute the first four singular values and right singular vectors of P ε . As shown in Fig. 4 (top row), they nicely reveal the gyres in their sign structures. The computation of P ε takes less than a second, the computation of all singular values and -vectors less than 0.01 seconds 1 . For comparison, in the bottom row of Fig. 4, we show the same singular vectors computed via Ulam's method (without explicit diffusion) on a 32 × 32 box grid using 100 sample points per box. Here, the computation of the transition matrix takes around 5 seconds, the computation of the six largest singular values resp. vectors less than 0.2 seconds.

A turbulent flow
We now turn to a case where the vector field is only given on a discrete grid. Here, the approach proposed in Section 4 is particularly appealing if we chose the grid points as collocation points (resp. a subset of them): In contrast to methods which are based on an explicit integration of individual trajectories (as Ulam's method), no further interpolation of the vector field is necessary. Furthermore, depending on the initial point of a trajectory, where v denotes the velocity field, p the pressure, and ν > 0 the viscosity. Via introducing the vorticity w := ∇ × v, the Navier-Stokes equation in 2D can be rewritten as vorticity equation where the pressure p cancels from the equation. We can extract the velocity field v from the streamline function ψ via v 1 = ∂ y ψ and v 2 = −∂ x ψ. The equation can be integrated by standard methods, e.g. a pseudo spectral method as proposed in [23]. For a first experiment, we choose an initial condition inducing three vortices, two with positive and one with negative spin, as initial condition: We solve (12) on a grid with 64 collocation points in both coordinate direction. For the computation of coherent sets we chose n = 16 basis functions and N = 32 collocation points in both directions, as well as t 0 = 0 and t 1 = 20. We hence use only every second For comparison, we show the same singular vectors computed via Ulam's method (right) on a 32 × 32 box grid using 100 sample points per box which were integrated by Matlab's ode45. Here, we need to interpolate the vector field between the grid points using splines (i.e. using interp2 in Matlab). This computation also took 35 seconds.
For a second experiment, we use a turbulent initial condition by choosing a real number randomly in [−1, 1] from a uniform distribution at each collocation point. For the coherent set computation, we restrict the time domain to [t 0 , t 1 ] = [20,40] since then the initial vector field has smoothed somewhat, cf. Fig. 6. Here, we choose M = 64, i.e. more collocation points than in the examples before as the vector field lives at smaller scales, N = 16 and ε = 10 −3 which is of the same order as ν. The results are non-obvious pairs of coherent sets, cf. Fig. 6. The maximally coherent set indicated by the second singular vectors describes the vortex in the upper left region. The computation took 100 seconds.
For comparison, we show the same singular vectors computed via Ulam's method on a 32 × 32 box grid using 25 sample points per box. Here, we need to interpolate the vector field between the grid points using splines (i.e. using interp2 in Matlab). Since the vector field is turbulent, using Matlab's ode45 for the vectorized system is infeasible. We therefore choose a fixed time-step of h = 0.01, such that the result does not seem to change when further decreasing h. This computation also took roughly 100 seconds.

Octuple gyre
Finally, based on the quadruple gyre, we construct the following flow in R 3 with eight gyres, given by the equationsẋ = g(t, x, y) − g(t, x, z) y = g(t, y, y) − g(t, y, x) z = g(t, z, x) − g(t, z, y) on the 3-torus X = [0, 2] 3 with g and the other parameters from Section 6.1. By construction the dynamics of this system exhibits eight gyres in each quadrant of the cube [0, 2] 3 . The cross sections of this vector field at, e.g. {x = 0.5} and {x = 1.5} are again given by Fig. 3.
In Fig. 7 we show the second to fourth left singular vectors, computed using 16 collocation points and 4 basis functions in each direction with ε = 0.1. Interestingly, none of the obvious gyres is identified by the second and the third singular vectors, but two coherent sets with 'centers' at [1,1,1] and [2,2,2]. This is unexpected as this set is not encoded in the vector field on purpose. Starting with the fourth singular vector, the gyre centers are identified as coherent. In Fig. 8 we show the corresponding right singular vectors at time t 1 = 10.25. The computation time is 30 seconds.

Conclusion and future directions
We proposed to compute transfer operators in time-variant flows fields by a direct integration of the associated Fokker-Planck equation using spectral methods. In particular, this approach does not require to integrate Lagrangian trajectories, which is particularly beneficial if the underlying flow field is only given by (a grid of) data. However, the  spectral method described here is restricted to periodic domains. While it might be possible to treat cubical domains via pseudo spectral methods, a different approach for more complicated domains will be needed.
Further investigations will deal with the question of how to use other (spectral) bases such that the resulting transfer matrix is sparse, how to incorporate the information at intermediate times and how to apply the basic idea to other types of diffusion.