Statistical topology of the streamlines of a two-dimensional flow

Recent experiments on mucociliary clearance, an important defense against airborne pathogens, have raised questions about the topology of two-dimensional (2D) flows. We introduce a framework for studying ensembles of 2D time-invariant flow fields and estimating the probability for a particle to leave a finite area (to clear out). We establish two upper bounds on this probability by leveraging different insights about the distribution of flow velocities on the closed and open streamlines. We also deduce an exact power-series expression for the trapped area based on the asymptotic dynamics of flow-field trajectories and complement our analytical results with numerical simulations.


Introduction
The study of the properties of random flows is long established in the field of fluid modeling, particularly in the area of turbulence [1,2,3]. Dating back at least to Kolmogorov's 1941 studies, [4,5], the use of statistical ensembles to model characteristics of interest in fluid flows has achieved widespread success. In the cases where it may not be possible or desirable to specify detailed structure of a fluid flow, the statistical methods allow the averaging over the irrelevant features in order to interrogate quantities of interest. Another method to abstract away irrelevant details is the use of the topology of streamlines. In an ideal non-viscous flow, the topology of streamlines remains invariant in time in Eulerian coordinates. This means that the properties of the streamlines can provide useful information about the flow: for example, helping to determine minimal energy states for a flow subject to topological constraints [6,7].
Statistical topological features of random functions have also been studied from a more abstract perspective [8], as well as for the purpose of understanding the features of the Cosmic Microwave Background [9,10].
In two dimensions, the topological question is relatively simpler, as there are two available streamline topologies: 'open' and 'closed.' Understanding the topology of twodimensional flows is essential for important biological applications. For instance, one key biophysical mechanism employed in the body's defense to respiratory pathogens is mucociliary clearance [11]. In humans and many other animals, the airway is covered by a thin, quasi-2D layer of mucus, which acts as a barrier to airborne pathogens. This mucus layer is actively 'cleared' by a layer of ciliated cells. The beating of these cells generates a quasi-static two-dimensional flow field in the mucus. This flow is directed [12], but exhibits significant spatial structure due to the disordered arrangement of ciliated cells [13,14] and the properties of the beating itself [15,16]. It has been shown that an increased disorder in this system is associated with pathological outcomes in humans [13]. Recent work by Ramirez-San Juan et al. [17] has investigated the characteristics of disordered flows in the airway system. These authors show that the proportion of closed streamlines in the flow negatively impacted the ability to clear viral particles, which became 'trapped' in their orbits rather than being advected out of the system. Providing a precise characterization of the statistics of trapping in 2D flow may bring clarity about the magnitude of this effect. This might provide insight into the pathologies caused by increasing trapping.
There is a strong connection between the study of random flows and the study of magnetic fields in disordered materials. The zero-divergence condition present in both incompressible flows (due mass conservation) and magnetic fields (due to the absence of magnetic monopoles) means that the formal techniques used for one problem often also work for the other. In this context, the problem of identifying the trapped area in a magnetic field is closely related to certain variations of the Hall effect [18,19,20]. Electrons moving in a disordered material tend to follow magnetic field lines. As magnetic bias is provided, more and more streamlines become open, leading to a non-zero net flow. Isichenko [18] studied the proportion of topologically open versus topologically closed streamlines in a biased magnetic field as a function of the control parameter γ = µ/σ, where µ is the magnitude of the mean field and σ 2 = σ 2 x + σ 2 y is the sum of the componentwise variances. According to [18], the critical point is γ = 0, and near this point the fraction a of the area inside closed streamlines (the 'trapped area') scales as 1 − a ∝ γ 4/7 . The scaling exponent 4/7 for this un-trapped area is related to the critical exponents of percolation theory [18,21,22,3,23] through the relation c = 1/D h = ν/(1 + ν), where D h = 1 + 1/ν is the cluster-hull dimension and ν = 4/3 is is the correlation-length exponent, both known exactly in two dimensions. This analysis, however, leaves open the question of the scaling behavior of a in other regimes, particularly including the biologically relevant near-ordered behavior where γ → ∞.
In this paper we develop novel theoretical insights about the the statistical topology of random streamlines in 2D biased random fluid flow (or magnetic field) for the full range of the control parameter γ. We first consider properties that are independent of the specific ensemble of flow fields, and identify a general upper bound on the trapped area, based on considerations about the expected value and variance of the velocity on open and closed streamlines. These insights are then extended to a more specific result, leading to an exact expression for the trapped area fraction a in terms of the variance in the asymptotic displacement of the flow field. We then derive a formal power series for this quantity in terms of moments of the flow field. We subsequently derive a tighter bound on the trapped area for the narrower class of flows described by a Gaussian ensemble using a different approach specific for this ensemble. Finally, we provide numerical data for the trapped area in the case of a particular Gaussian flow ensemble, and verify that the bounds we established are satisfied. We observe that the second bound accurately captures the asymptotic statistics of the trapped area in the large-γ regime.

Analyzing the trapped area
Consider a zero-mean-flow ensemble from which we draw a divergence-free 2D vector field v 1 , and a constant vector µv 0 , withv 0 taken to be a unit vector in an arbitrary direction, so that µ represents the magnitude of the mean flow. We will add v 0 to our zero-mean field v 1 to produce a biased random flow field v. It will be useful for our purposes to consider the streamfunction in addition to the underlying flow field, which is the function Φ defined by the relationship ∇Φ = R π/2 ∇v, where R π/2 is the π/2-rotation operator in the plane. The orbits of the flow field are associated with the level sets of this function, so the problem of studying the orbits can be reduced to the problem of studying the level sets of Φ. For the biased random flow field described above, the streamfunction is given by Φ = Φ 1 + µR π/2v0 · x, where Φ 1 is the streamfunction associated with the unbiased portion of the flow. We will define ϕ t (x) to be the trajectory of a particle at an initial position x advected by the flow for a time t, i.e. the map defined such ). The central question we will be interested in is the following: how does the percentage of trapped area a (i.e., the area that does not lie on an open, 'escaping' trajectory) change as the magnitude of the mean velocity µ increases? We can leverage a relatively simple insight to obtain an upper bound on the trapped area. Consider a closed orbit ∂B of the velocity, which represents the boundary of a trapped region B. The mean velocity in this area is ∂B is by definition a closed orbit of the flow field, so the streamfunction Φ is level on the boundary. The integral of ∇Φ over B is thus the integral of a total derivative of a function over a region on whose boundary the function is level, which therefore must be zero.
Since the global mean flow is µv 0 , we can derive the mean flow on untrapped areas as well as for trapped areas. Taking a to be the proportion of the plane lying on a closed streamline, we find that the mean flow on open streamlines must be µv 0 /(1 − a). By itself this observation does not allow us to infer anything about a. However, the fact that we have two separate regions in the plane with known mean values for the flow field lets us place a lower bound on the variance of the flow field. This variance is the Bernoulli variance associated with being on either the trapped or untrapped area.
We formalize this below. Let A be the set of points on closed streamlines. We can now decompose the velocity as follows: where I[B] is the indicator function of set B. The first term represents the mean flow on each topologically-defined area, taking the value 0 on trapped areas and µv 0 /(1 − a) on untrapped areas. The second and third term represent the mean-subtracted value of the flow on a specific topological area. Each of these terms represents a random variable with respect to the sampling of x, with the first representing a Bernoulli random variable with the probability a and magnitude µv 0 /(1 − a), and the second two representing zero-mean random variables. Taking the variance of the velocity magnitude over space where σ 2 A and σ 2 ¬A represent the conditional variances of velocity in the trapped and the escaping areas respectively. The last two terms are strictly positive, yielding the inequality σ 2 ≥ µ 2 a 1−a . Rearranging yields the following bound: where γ = µ/σ is the dimensionless parameter that quantifies the ratio of 'order' to 'disorder' of the velocity field.

An exact expression for the trapped area
We can extend the insights above to get an expression for the trapped area. Let The conditional average displacement on the trapped and the escaping areas are 0 and µv 0 t/(1 − a) correspondingly. The variance of displacement d t · d t must grow slower than t 2 . Indeed, this variance should be time reversible (i.e., for large t we should not be able to distinguish If the variance were asymptotically O(t 2 ), a measurable proportion of trajectories would diverge by arbitrarily large distances from their mean, which occurs with probability zero. Thus asymptotically this variance is O(1). This yields the following asymptotically exact formula for the trapped area: where d t is the displacement at time t and µ is the square of the expected velocity. Therefore where κ = lim t→∞ d t · d t /t 2 − µ 2 . Thus, if we can compute κ, we can compute the trapped area.
To get a handle on this expression, we can take the Taylor expansion of the displacement: Subsequently we have where v are of fundamental interest for determining the trapped area. Note that K † v = −K v (the flow operator is unitary as it preserves the Lebesgue measure; therefore, its generator K v must be anti-Hermitian). We have, therefore, Hence, Note that when we take the time-average of d t · d t /t 2 , the flow of time becomes equivalent to a rescaling transformation, since K v is a covariant operator, and the coefficient in front of the term involving K 2n is t 2n . Evaluating the series above for a generic flow ensemble is difficult. One flavor of this difficulty is discussed in Appendix A, where we present an initial sketch of the calculations required for a particular Gaussian ensemble. Evaluating each term often requires a combinatorial expansion that becomes intractably difficult as the order of the terms increases; and, even in cases where simplifications arise, we have no guarantee that the resultant expressions will lend themselves to the kind of asymptotic analysis required to derive the trapped area.

An upper bound on the trapped area for the Gaussian ensemble
While we have an upper bound on the trapped area in the form a ≤ 1/(1 + γ 2 ) from simple considerations about the variance of the velocity, we can obtain an improved bound by using a different approach in the case where the flow field is drawn from an ensemble with known secondary characteristics. Here, we will consider the case of a flow field v wherein the marginal distribution for each component of the velocity, conditioned on a single point in space, is a uniform Gaussian. We will assume that thev 0 -parallel component of the field has the marginal distribution N (µ, σ/ √ 2), and thev 0 -orthogonal component the marginal distribution N (0, σ/ √ 2). This scaling ensures consistency with our previously given definitions of µ and σ and preserves the rotational invariance of the unbiased component. This new bound leverages the simple insight that v(x) x∈A = 0. Since A is a set with zero expected velocity, its area fraction a must be no larger than the area fraction of the maximal set S such that v(x) x∈S = 0. We will define this area fraction as s, and write our bound as s ≥ a.
Note that, for a velocity component v x ∼ p with a positive expected value, the largest subset we can achieve with zero expected velocity is given by selecting all values of v x ≤ k for some k such that The fraction of the total area belonging to this subset is then given by the value of the CDF at k, i.e.  In our case, the component of the velocity aligned withv 0 is distributed according to v ∼ N (µ, σ/ √ 2), so we have the conditions In terms of k we can write s as Solving these equations numerically for s gives us an upper bound on a for the Gaussian ensembles. In Appendix B we show that this bound goes as s ∼ O(γ −1 e −γ 2 ) when γ → ∞. If we find ourselves in possession of a convenient expression for the CDF of another ensemble, we can apply similar reasoning derive an upper bound for the trapped area in that ensemble. This analysis also reveals a procedure for generating progressively tighter bounds on a. A is defined to be precisely the set where the expected displacement is zero for all values of t. This entails that each derivative ∂ t n d t (x) = [v · ∇] n v is zero. If we can identify the marginal distribution for each derivative or, better, the joint distribution between all derivatives, we can follow a similar argument to identify the largest set that has zero expected value for each derivative considered.

Numerical results
In order to investigate the trapped area numerically, we start by generating values of γ uniformly spaced in a logarithmic scale in the range 10 −2 ≤ γ ≤ 4. Subsequently, for each γ, we initialize 1000 random zero-mean streamfunctions, using a procedure that we will describe, compute the value of µ associated with their variance, and generate a velocity field by combining the streamfunction gradient and the mean velocity. For each streamfunction we then integrate the trajectory that the point x 0 = (0, 0) (the statistical ensemble from which the velocity fields are drawn is translationally invariant, so the initial point on the trajectory can always be taken to be the origin without loss of generality) follows under this velocity field over a time range of t = 600, using the function odeint in the Python package scipy. We then determine whether each trajectory is open or closed by examining whether the closest distance it approaches to the origin at large times exceeds a critical threshold, in which case we classify it as an open trajectory, and otherwise consider it closed. The proportion of closed trajectories out of the 1000 samples becomes our estimate for the trapped area for each value of γ. The choice of a very large time range of t = 600 ensures that there is no ambiguity between open and closed trajectories, as it is much larger than the period of all closed curves in our samples.
In order to generate our streamfunctions, we note that, in the infinite-size limit, a stationary Gaussian process can be characterized exactly by its power spectral density. That process can, therefore, be sampled by choosing a random phase for each wavevector. We employ unbiased streamfunction of the form where λ is a correlation-length parameter and L is a parameter that picks the maximum wavelength that enters into the streamfunction. This approach approximates sampling a streamfunction with a Gaussian radial basis function covariance kernel in the largesize limit. Each phase offset φ k is sampled uniformly from the interval [0, 2π]. In this work, we choose L = 5 and λ = 10. This approach to sampling a streamfunction, as opposed to directly sampling a Gaussian process on a lattice, is beneficial because a) it is computationally efficient relative to direct sampling, which for an N × N lattice requires computing the Cholesky decomposition of a covariance matrix of size N 2 × N 2 [an O(N 6 ) operation]; b) it avoids discretization issues that could occur when integrating a trajectory derived from a velocity field sampled on a lattice; and c) it can be naturally extended to an arbitrary distance from the origin, which allows us to ignore boundary issues at the lattice edge. This approach, however, suffers from the potential drawback that it does not fully capture the statistics of the Gaussian process outside of the bandwidth used for sampling, which means that it is inapplicable for very small γ (when the average loop size becomes smaller than the minimum wavelength) and for very large γ (when the average loop size exceeds the finite system size). In practice, see below, we find that a large range of a can be accessed for γ that avoid these limits. We first examine the scaling of the trapped area in the critical (γ → 0) regime in Fig. 2, which is expected to follow the law 1 − a ∝ γ c , with c = 4/7 [18]. In the γ ≤ 0.3 range, we find a best-fit exponent of c = 0.553 (13), compatible with the value c = 4/7 ≈ 0.571 . . . For larger γ, however, the trapped area strongly deviates from this behavior. Indeed, as discussed above and in Appendix B, in the γ → ∞ limit, the trapped area is bounded by a ≤ e −γ 2 /(2γ √ π). Between these two limiting behaviors, we find empirically that there is a very robust exponential dependence of the form a ≈ exp(−dγ), with a best-fit value of d = 2.84(4), which we illustrate in Figure 3 ‡. While there are no evident analytical simplifications that would allow us to predict this behavior directly, we note that intuitively this scaling means that the rate of decrease of a given trapped region as γ increases is roughly proportional to the area contained inside. In short, the trapped area is effectively described by an exponential dependence on γ for a wide range. ‡ We actually consider a sum of two expontials, a = A 1 e −d1γ + A 2 e d2γ . This effective function fits all of our data up to γ ≈ 2. Figure 3 also shows our analytical bounds on the trapped area. The first important thing to notice is that the 'variance bound' a ≤ 1/(1 + γ 2 ) is conservative. In our computations the trapped area decreases at a significantly faster rate. The bound offered by considering the largest set with zero expected velocity (the 'expectation bound') is much tighter.

Discussion
In this paper we have discussed the problem of determining the fraction of the 'trapped area', i.e., the area of topologically closed trajectories, for random two-dimensional flow fields. Our analysis extends beyond the (critical) disordered limit, which can be understood using the percolation-theory universality class, and into the ordered regime which, moreover, is the more relevant for many applications. We introduce several techniques that lead both to a general upper bound for a broad class of flow ensembles and to a specific upper bound for Gaussian flow fields. We also derive a powerseries expression for the trapped area using asymptotic analysis of the displacement variance under the flow map. Finally, we compute the trapped area numerically for a Gaussian flow field and find that beyond the critical regime it is well approximated by an exponential.
In addition to providing theoretical insights for the features of interest in twodimensional flows, this work has a number of broader implications. First, we expect that the techniques introduced in this paper could be extended to allow an estimation or even an exact computation of other topological features of fluid flows, potentially even in higher-dimensional regimes. Second, in some applications of fluid flows and their close cousins, magnetic fields, the topology of orbits plays a crucial role in the overall behavior of the system. The orbit topology in magnetic materials has been related to certain electrical properties and to the Hall effect [19,18]. Orbit topology has implications for the behavior of ciliated flows in biological systems, ranging from mucosal transport in the airway to the feeding of starfish larvae [17,24,25,26]. Being able to compute the proportion of area trapped in closed streamlines will enable analytical investigations into the fundamental principles and tradeoffs underling these systems. In particular, this analysis may have significant implications for understanding the progress of infections in the airway, where areas of trapped flow provide a foothold for viral infections [17,27].
Data availability The code and data used to generate the figures in this paper can be accessed at: https://github.com/Kambm/TrappedAreas.

Appendix A. Analyzing the series
Here we present a sketch of the calculations required to evaluate the series 10 exactly. We do not complete this here but instead provide a partial solution in order to give the reader a sense of the difficulties involved.
We can express the term v, K n v v using Einstein notation: To evaluate these terms we need to sum over all ways to distribute the derivatives across each term. We can illustrate this symbolically as follows: We then sum over each allowed choice of placement for each derivative term. We also note that v = R π/2 ∇Φ (or equivalently v i = (−1) i ∂ 1−i Φ), so we can ultimately express the entire term using derivatives of the n-point correlation function for Φ at the origin. E.g.: x,y,z,w=0 In order to proceed further we will need to get a handle on the correlation function. When µ = 0, we can employ Wick's theorem to express the even-order correlation functions as a sum over products of second-order correlations: When µ = 0, we can can write C n = (Φ 1 (x) + µR π/2v0 · x)(Φ 1 (y) + µR π/2v0 · y) · · · . Distributing terms gives where V = {x 1 , · · · , x 2n } and C n 0 refers to the n-point correlation function for the unbiased streamfunction. To proceed further, we will have to posit a form for the unbiased two-point correlation function. We will choose the form where λ is the characteristic length. This choice ensures that v·v = Tr[∇ x ∇ y C(0, 0)] = σ 2 . What happens when we differentiate this at the origin? The 2kth derivative of the Gaussian exp(−λ 2 x 2 /2) at the origin is (2k)!(−1) k /((2λ 2 ) k k!), while all odd derivatives are zero. Now we consider evaluating a term with a specific arrangement of auxiliary derivatives. We can categorize each such term using a list of tuples (a i , b i ) for i = 1, · · · , 2n + 2, where the tuple (a i , b i ) indicates that the operator is applied to the ith argument. We then pair each argument with another, apply the respective derivatives to the two-point correlation function associated with the paired arguments, take the product of all the results, and sum over all possible pairings. The contribution from the isotropic component of the two-point correlation function, from each pair (i, j), is (−1) (a i −a j )+(b i −b j ) λ 2 σ 2 2 2(a i + a j )! (a i + a j )! 2(b i + b j )! (b i + b j )! (−1) a i +a j +b i +b j (2λ 2 ) a i +a j +b i +b j = λ 2 σ 2 2 2(a i + a j )! (a i + a j )! 2(b i + b j )! (b i + b j )! 1 (2λ 2 ) a i +a j +b i +b j . (A.8) Eq. (A.8) assumes that a i + a j and b i + b j are even; otherwise the result is 0. We see that the parity associated with the overall order of the derivatives and the parity associated with the pairing of arguments for each component cancel out. If the pairing is specifically (0,1), (0,1), the result from the deterministic component is µ 2 (we assume without loss of generality thatv 0 =x), and otherwise the deterministic component contributes nothing.
In order to complete the analysis, one would like to solve the combinatorial problem of summing these coefficients over derivative pairings. This is ultimately a difficult problem, and the authors leave it to the interested reader to pursue it in further depth. s ∼ e −γ 2 2γ √ π . (B.5)