Lagrangian ocean analysis: Fundamentals and practices

Lagrangian analysis is a powerful way to analyse the output of ocean circulation models and other ocean velocity data such as from altimetry. In the Lagrangian approach, large sets of virtual particles are integrated within the three-dimensional, time-evolving velocity fields. Over several decades, a variety of tools and methods for this purpose have emerged. Here, we review the state of the art in the field of Lagrangian analysis of ocean velocity data, starting from a fundamental kinematic framework and with a focus on large-scale open ocean applications. Beyond the use of explicit velocity fields, we consider the influence of unresolved physics and dynamics on particle trajectories. We comprehensively list and discuss the tools currently available for tracking virtual particles. We then showcase some of the innovative applications of trajectory data, and conclude with some open questions and an outlook. The overall goal of this review paper is to reconcile some of the different techniques and methods in Lagrangian ocean analysis, while recognising the rich diversity of codes that have and continue to emerge, and the challenges of the coming age of petascale computing.

regions. That is, we are interested in mapping out pathways of seawater 7 motion, since the transport of seawater and its tracer content, as well as the 8 pathways and timescales for that transport, are key facets in how the ocean 9 plays a role in climate and marine ecology. for Eulerian methods, which make direct use of ocean velocity fields on their 15 native grids. 16 The second approach makes exclusive use of the Lagrangian perspective 17 of fluid dynamics (e.g., Bennett, 2006). This method employs an ensemble of 18 virtual (passive) Lagrangian particles of zero spatial extent whose trajectories

41
Our presentation is aimed at graduate students, though any large-scale 42 oceanographer or mathematician with an interest in virtual particle analysis 43 could use this paper as a starting point. In that sense, this paper is intended 44 as an accompanying paper to Griffies et al. (2000), which provided an intro-   (Lumpkin and Pazos, 2007). Each drifter is geo-located every 6 hours and has a randomly assigned colour.
floats (e.g., Swift and Riser, 1994). Many observations remain inherently 53 Lagrangian, such as the trajectories of surface drifters shown in Figure 1 54 (Lumpkin and Pazos, 2007), the subsurface Argo floats (Lebedev et al., 2007; 55 Ollitrault and Rannou, 2013), and the tracking of fish larvae (Paris et al.,56 2013a) and turtle hatchlings (Scott et al., 2014). 57 Lagrangian analysis through virtual particle tracking within OGCMs began 58 in the 1980s, on small-scale structures, with studies on a theoretical box-model 59  as well as a model that incorporated hydrographic data 60 and realistic topography . The Lagrangian framework 61 of these small-scale examples was then applied to the velocity-field output 62 of basin-scale, three-dimensional numerical experiments. Examples include 63 regional deep ocean circulation (Fujio and Imasato, 1991), western boundary 64 currents (Imasato and Qiu, 1987), fronts (Pavia and Cushman-Roisin, 1988) as nutrients (e.g. Chenillat  Sebille et al., 2015), jellyfish (e.g. Dawson et al., 2005), icebergs (e.g. Marsh 78 et al., 2015), surface drifters (e.g. Kjellsson and Döös, 2012b), oil droplets 79 (e.g. Paris et al., 2012), eel (e.g. Baltazar-Soares et al., 2014), pumice (e.g. 80 Jutzeler et al., 2014) and many more. 81 The ocean circulation covers an enormous range of scales and regions. As Hydrodynamics (e.g. Cummins et al., 2012). While advances in this field 93 have been made in large scale oceanography, both for sub-components of 94 ocean models (e.g. Bates et al., 2012) and for fully Lagrangian ocean models 95 (Haertel and Randall, 2002;Haertel and Fedorov, 2012), this topic is not the 96 focus of this review. Instead, we focus on Lagrangian diagnostic methods to 97 identify oceanic pathways. 98 The Lagrangian framework for analysing pathways is complementary to 99 the analysis of tracers. One of the key differences is the computational cost.  fields, although the latter has its own advantages, including a more natural 118 alignment with the treatment of advection and diffusion within models.

119
Finally, another great advantage of Lagrangian particle experiments is 120 that particles can be advected, at least in offline mode when velocity fields are 121 stored, backwards in time. This reverse-time analysis allows one to investigate 122 where water masses found within a model at a certain location come from.  reference frame that is moving with an infinitesimal fluid particle (equivalently 142 a "fluid parcel"). Fluid motion is thus the accumulation of continuum particle 143 motion. The fluid particle framework that forms the basis for Lagrangian 144 kinematics offers a powerful conceptual picture of fluid motion (e.g., Salmon, 145 8 1998; Bennett, 2006), with this picture taken as the basis for Lagrangian 146 methods of analysis.

147
Eulerian kinematics is a complement to Lagrangian kinematics. The

148
Eulerian approach is based on describing fluid motion in a reference frame 149 that is fixed in space. Eulerian kinematics is the basis for most numerical  The motion of a classical point particle is described by knowledge of its 156 position vector, X(t), which provides the position of the particle at time 157 t. As the particle moves, it traces out a curve in space referred to as a 158 trajectory. When describing N discrete particles, we add a discrete label 159 to each of the particle positions, X (n) (t). For continuum matter, such as 160 seawater, the discrete label n becomes a continuous vector, X(a, t), with 161 a = X(t = t 0 ) a common (though not necessary) choice. In general, the label 162 vector, a, is referred to as the material coordinate (e.g., Salmon, 1998), since 163 this coordinate distinguishes between infinitesimal particles comprising the 164 continuum.

165
A fluid particle is conceived of as a microscopically large collection of 166 many molecules, whose velocity is formally determined as a mass weighted 167 mean of the velocity of the individual molecules (i.e., barycentric velocity 168 as defined in Section II.2 of DeGroot and Mazur (1984) and Section 1.9 of 169 Salmon (1998)). Alternatively, by making the continuum hypothesis, we 170 dispense with molecular degrees of freedom, so that a particle is considered 171 a macroscopically small material fluid volume, treated as a mathematical 172 continuum and labelled by the material coordinate a. For an incompressible 173 fluid, the fluid particle has constant volume; however, its constituents do 174 not remain fixed, as they are generally exchanged with adjacent particles 175 through mixing, thus changing the particle's tracer content (e.g., water, salt, 176 2 The top and bottom faces of grid cells are generally moving, since the general vertical coordinates defining these surfaces need not be static. For example, these cell faces may be defined according to constant pressure, constant potential density, or constant rescaled ocean depth. 9 nutrients), as well as altering its heat, all the while maintaining a constant 177 volume.

178
The velocity of a fluid particle is the time derivative of the trajectory, 179 computed with the material coordinate held fixed. The mathematical connec-180 tion between Lagrangian and Eulerian descriptions is enabled by equating the 181 particle velocity crossing a point in space, X(a, t) = x, to the fluid velocity 182 field at that point The relation (1) provides a starting point for Lagrangian fluid analysis. Note 184 that the resulting fluid particle trajectories are sometimes called material 185 pathlines in the fluid mechanics literature (e.g., Aris, 1962;Batchelor, 1967 Note that, when trajectories are dispensed with (as in the Eulerian descrip-190 tion), we recover the more succinct expression for the material time derivative Consequently, the volume of a material fluid particle remains constant (i.e.,

215
A streamtube is a bundle of streamlines, so that streamtube sides are 216 parallel to the velocity (see e.g. Figure 3.6 in Kundu et al., 2012) wheren is the outward normal at the respective end, and dA the corresponding 225 area. By construction, v ·n = 0 on the streamtube sides, so the sides do 226 not contribute to the balance in equation (6). Hence, the volume transport 227 entering one streamtube end equals to that leaving the other end. Furthermore, 228 the area of the streamtube is inversely proportional to the local normal velocity.

229
The transport constraint (6) holds regardless of whether there is diffusive end (see also Section 2.5).

235
The above properties make streamtubes useful for understanding the 236 circulation in a steady incompressible fluid. In particular, they provide 237 the mathematical basis for Lagrangian analysis methods that tag particles 238 with volume transport (e.g. Eckart, 1948;Welander, 1955   In so doing, we also introduce the residual mean velocity.
where the transport tensor J has units of squared length per time. It is 282 convenient to split the transport tensor into the sum of a symmetric and 283 anti-symmetric tensor The symmetric tensor, K, has components satisfying 7 This tensor corresponds to diffusion so long as it is positive definite. The 286 anti-symmetric tensor, A, corresponds to skew diffusion or equivalently to 287 advection (e.g., Middleton and Loder, 1989;Griffies, 1998).

288
Given the decomposition of the transport tensor (8), we find it useful to 289 write the tracer equation in the form where defines the residual-mean velocity and is known as the eddy-induced velocity 8 . Notably, the eddy-induced velocity 293 is non-divergent due to the anti-symmetry property Consequently, the tracer equation (10) can be written in the flux-form Since both v and v † are divergence-free, one can define a streamtube in a    we use the identity so that where we introduced the drift velocity The drift velocity generally has a non-zero divergence One notable case where ∇ · v drift = 0 is isotropic diffusion with a constant diffusivity; e.g., molecular diffusion. Molecular diffusion is generally not relevant for large-scale ocean models, as models (and large-scale observations) do not resolve down to the Kolmogorov scales. Hence, large-scale models make use of the far larger, and flow dependent, eddy diffusivities.

329
There is yet another way to consider tracer transport pathways using where N is the total number of particles, x i is the particle position, and W is 343 a mapping kernel function (dimensions inverse volume) that maps the particle where Ω is the integral volume in three dimensions. and 2).

365
When using an ocean model, we distinguish between two techniques of

377
As an alternative to velocities generated by OGCMs, we may use observation- time. In this section, we detail these aspects. 3.2. Temporal integration of the virtual particle trajectory equation 393 When the nth virtual seawater particle is located at the point X (n) (t) = x, 394 we can update its position by time stepping the velocity equation (1) 395 where we dropped the trajectory super-script n to simplify notation. Note     of Eq (21). One popular method is the 4 th order Runge-Kutta scheme (e.g., 438 Butcher, 2016), where information of the (interpolated) velocity field at four 439 increments between time steps t n and t n+1 is used.

440
The fourth order Runge-Kutta method is a member of a family of inte-  When working with stored velocity data, as when virtual particle tra-452 jectories are computed offline, temporal interpolation is usually required.

453
Interpolation is needed because the interval between consecutive stored ve-454 locity fields is generally longer than the time step, ∆t, used to advance the    Streamtube-based volume transport is reversible, so that backward inte-531 grations can be performed to track the origin of a given volume. It is for these 532 reasons that practitioners of discrete streamtube methods generally do not 533 introduce diffusion (or stochastic noise) when computing particle trajectories.

534
Rather, the method is focused on determining volume transport pathways 535 defined from the resolved or the residual mean flow.  Gardiner (1985), Jazwinski (1970), and Kloeden and Platen (1992), as well 562 as the oceanographic review by Visser (2008 In this equation, X i (t) are components of the tracer trajectory vector X(t), which means that information on the probability density of the trajectory 578 X(t) at time t is sufficient to make predictions at later times. Non-Markovian 579 models require information at earlier times, which is generally impractical.

580
The presence of the Wiener process means that integrating the equation using 581 deterministic calculus does not produce a unique solution. We make use The Itô calculus used here is but one mathematical approach for realizing a unique solution to a SDE (e.g., Gardiner, 1985). Stratonovich and Itô-backward approaches offer alternative stochastic integration methods, and they can also be used to derive stochastic particle models (Gräwe et al., 2012;Shah et al., 2011;Spivakovskaya et al., 2007a,b). We focus on the Itô calculus as it is well known to physicists, as is the corresponding Fokker-Planck equation. Furthermore, the drift, a i (t, X), of an Itô SDE represents the mean of the stochastic particle tracks. Finally, the well known Euler scheme (see equation (28) below) is a straightforward numerical approximation of the Itô SDE, whereas this A cloud of particles will estimate the probability density P (t, x) for the 585 stochastic tracer trajectories. Use of an Itô stochastic process X(t) ensures 586 that the probability density function evolves according to the following Itô 587 form of the Fokker-Planck or forward Kolmogorov equation with We can relate the Fokker-Planck equation (24) to the Boussinesq form of the 590 tracer equation (16), so that 13 The corresponding SDE for the trajectory is given by scheme cannot be used to discretize a Stratonovich or an Itô-backward SDE. 13 The tensor elements σ ik (t, X) are not uniquely determined by the diffusion tensor K. However, all choices consistent with the relation 2 K ij = σ ik σ jk result in statistically identical diffusion processes.
In this equation, ∆W k (t) is a Gaussian random variable with zero mean 599 and variance ∆t, generated via a random generator. The accuracy of the 600 Euler scheme is O(∆t 1/2 ) in the strong sense; i.e., for approximating the  There are methods to compute trajectories directly from a SDE for many 608 applications (e.g., Kloeden and Platen, 1992). Trajectory computation di-  The second approach to adding the effects of diffusion and unresolved 629 physics to particles is to 'ad hoc' find an SDE that matches the statistics -e.g. with either observations or particles simulated in finer-resolution models. This 632 approach has been developed by Griffa (1996) and further by Berloff and

636
A hierarchy of Markov models is considered, whereby the stochastic 637 term is added to either particle displacement (zeroth-order Markov model,638 corresponding to uncorrelated eddy velocity field), the particle velocity (first-639 order model, accounting for correlations of the velocity) or the particle 640 accelerations. In most cases, the first-order model is found to best approximate 641 the oceanic mesoscale turbulence introduced by coherent eddies.

642
In the first-order Markov model (multiplicative noise), stochastic noise 643 is used to modify the present position of a particle when updating to a new 644 position, in which case the trajectory equation (21) can be written as where is a random number. Notably, the application of noise in this manner 646 does not ensure that X(t + ∆t) results from time stepping a divergence-free 647 velocity. For that purpose, we consider an alternative approach, whereby we 648 introduce a stochastic divergence-free velocity 649 We can ensure ∇ · v noise = 0 by introducing a stochastic vector streamfunction, 650 so that for each each grid cell we have Since the stochastic velocity remains non-divergent, this approach offers a   (e.g., Durran, 1999). This implementation is intrinsically free of if-statements.

736
However, free-slip boundary conditions require further adaptation.   For most applications, the raw particle trajectories output by Lagrangian 782 analysis codes need to be further processed to help answer scientific questions.

783
In this section, we overview ways in which Lagrangian particle trajectories 784 can be used and analysed to improve our understanding of ocean circulation 785 and dynamics.  The single-particle diffusivity stems from the seminal work of Taylor 794 (1921). It quantifies the ensemble-mean rate of particle dispersion from an 795 initial location, so that we have In this equation, X(t) is the Lagrangian virtual particle trajectory, and 797 V (t) = dX(t)/dt is the Lagrangian particle velocity.    the estimation of single-particle statistics must be further refined to account 805 for non-stationarity and inhomogeneities of the underlying Eulerian field 806 (Davis, 1983(Davis, , 1985(Davis, , 1987(Davis, , 1991. Under the assumption that the velocity field   Double-particle statistics builds upon the works of Batchelor (1952) and 827 Bennett (1987). The relative diffusivity (the time rate of the mean square 828 pair separation) is where the sum is over all pairs of particles (m, n). At times longer than  The single-and double-particle diagnostics derived from simulated trajec-839 tories may be used for the following.     : a) Probability with that a 1 • x1 • bin spanning the whole depth range is occupied by a particle during the considered time span. The probability for each bin has been obtained by counting the number of particles occupying this bin at each time step, summing up this particle counts over the whole integration period and then dividing it by the total number of recorded particle counts for all bins. Thus, the sum of the probabilities of all bins yields 100%; b) Probability that a particle occupies a particular bin at least once during the considered time span. In this case the probability for each bin has been obtained by counting the number of different particles occupying this bin and dividing by the total number of particles. Thus, the probability for each bin can range between 0 and 100%. The Lagrangian analysis was performed with the ARIANE tool using the 3D 5day-mean velocity fields from the high-resolution model INALT01 (Durgadoo et al., 2013).
the highest probabilities, highlighting the most probable spreading pathways 958 along the major currents and via mesoscale eddies. But even between the 959 AC and ARC there is a region with comparable particle position counts.
960 Figure 4b reveals that this is not due to a particularly strong circulation 961 feature transporting many particles, but rather due to the recirculation of 962 fewer particles.

963
One consideration in the choice of bin resolution is aliasing. If either the 964 grid resolution is too fine or the period of particle position updates is too long, 965 trajectories may pass through more than one histogram bin within a given steering (Davis, 1998). Finally, an alternative to binning was proposed by 1004 Koszalka and LaCasce (2010). Rather than grouping the Lagrangian data in 1005 bins of fixed size, they grouped a fixed number of nearest-neighbor particle 1006 positions together using a clustering algorithm.

1018
The age of a parcel of ocean water, described by numerous particle trajec-    and a hybrid approach utilising on-node openMP, GPU, or MIC threading 1159 will be required on next-generation architectures to obtain peak performance.
Task-based parallelism, if implemented for OGCMs, may provide at least 1161 a partial solution. However, at present, no definitive framework or "best 1162 practice" has been adopted.

1163
Several OGCMs already have on-line particle diagnostics (see section A.2), 1164 yet no general library for coupled Lagrangian particle tracking exists so far. As    Figure 7: Illustrating how water particles follow the water cycle in the Earth System. We emphasize here the global ocean conveyor circulation in a pole-to-pole, meridional-vertical plane, coupled to selected and idealized atmospheric circulation in the same plane, omitting land for convenience [terrestrial processes such as evapo-transpiration, storage and runoff are also part of the full water cycle]. Individual water particles are represented by color-coded boxes, advected quickly/chaotically through the atmosphere, and slowly/steadily through the ocean. Particles are stored on a wide range of timescales: in clouds (hours-days); in sea ice (seasons-years); in the ocean (years-centuries); in ice sheets/shelves (centuriesmillennia). In the ocean, colour coding identifies selected water masses and advection thereof: Antarctic Bottom Water (AABW); North Atlantic Deep Water (NADW); Labrador Sea Water (LSW); Antarctic Intermediate Water (AAIW); North Pacific Intermediate Water (NPIW); Subantarctic Mode Water (SAMW). In the atmosphere, colour-coding distinguishes vapor and liquid phases. Highlighted processes involve phase change or ocean-atmosphere exchange: ocean surface heating/cooling; evaporation; condensation; precipitation; sea ice freezing/melting; ice shelf basal melting. Highlighted processes internal to the ocean transform water particle density: deep convection; entrainment; enhanced abyssal mixing; eddy stirring; weak background mixing.  Secondly, for water to be consistently traced between components of the 1263 climate system, water must be conserved between them. This is for example  In this appendix, we provide further background to the different community 1321 codes, both for offline and online particle tracking, listed in Tables 1 and 2. Following the methodology proposed by Döös (1995) and taken up by 1343 Blanke and Raynaud (1997) to take advantage of such a scheme, water volume 1344 transfers between selected control sections can be assessed with great accuracy. of two-dimensional velocity fields, in particular for oceanic current datasets.

1392
The source code is freely available and distributed under a GPL license upon 1393 direct request to the authors (d' Ovidio and Nencioli).

1394
The package provides routines to compute particle trajectories and La- Note that the vertical density gradient is assumed to be constant, but the 1578 horizontal one is not, so that the isopycnal surfaces are not flat.

1579
The concentration satisfies the following initial value problem: To solve this problem with the stochastic model given by equation (27), one 1581 needs to decompose the diffusion tensor K in the form 2 K ij = σ ik σ jk . Using dY (t) = a y dt + √ 2σ yx dW x (t) + √ 2σ yy dW y (t), where the drift coefficients a x , a y and a z are given by where the diffusivity tensor K ij is symmetric and positive definite. The 1613 latter partial differential equation may be transformed into a Fokker-Planck 1614 equation in which HC (rather than C) should be viewed as the unknown: where the drift velocity is (Heemink, 1990) 1616 The first two terms on the right-hand-side of equation (44) are equivalent 1617 to those used in three-dimensional models, whilst the last one is specific to 1618 depth-integrated models.

1619
If the last term in (44) is not taken into account in a Lagrangian model, 1620 then particles might tend to concentrate into the shallowest areas, which clearly 1621 is unphysical and may lead to erroneous conclusions (e.g. Spagnol et al., 2002).

1622
A test case was designed by Deleersnijder (2015)

1627
Somewhat similar developments are made when designing a one-dimensional, cross section-averaged transport model. Such models are often used to simulate transport processes in elongated domains such as rivers or estuaries (e.g. Everbecq et al., 2001;Hofmann et al., 2008). In this case all the variables and parameters depend on time and the along-flow coordinate x. If S, u and C denote the cross-sectional area, the cross-section-averaged velocity and the cross-section-averaged concentration, respectively, then the one-dimensional counterparts to equations (41)-(44) are is the drift velocity, and K is the along-flow diffusivity.

1629
In depth-and section-averaged models, the diffusion term is rarely meant 1630 to represent turbulent diffusion per se. Instead, it is essentially the effect of 1631 shear dispersion (e.g. Young and Jones, 1991) that is to be parameterized, i.e. 1632 the impact on the resolved (reduced-dimension) processes of the combined 1633 effect of sheared-advection and turbulent diffusion in the transversal direction.

1634
As a consequence, the diffusivity coefficients are often significantly larger than 1635 those that would be used in a three-dimensional model of the same domain. 1636