Coupling of Brownian motions in Banach spaces

Consider a separable Banach space $\mathcal{W}$ supporting a non-trivial Gaussian measure $\mu$. The following is an immediate consequence of the theory of Gaussian measures on Banach spaces: there exist (almost surely) successful couplings of two $\mathcal{W}$-valued Brownian motions $\mathbf{B}$ and $\widetilde{\mathbf{B}}$ begun at starting points $\mathbf{B}(0)$ and $\widetilde{\mathbf{B}}(0)$ if and only if the difference $\mathbf{B}(0)-\widetilde{\mathbf{B}}(0)$ of their initial positions belongs to the Cameron-Martin space $\mathcal{H}_{\mu}$ of $\mathcal{W}$ corresponding to $\mu$. For more general starting points, can there be a "coupling at time $\infty$", such that almost surely $\|\mathbf{B}(t)-\widetilde{\mathbf{B}}(t)\|_{\mathcal{W}} \to 0$ as $t\to\infty$? Such couplings exist if there exists a Schauder basis of $\mathcal{W}$ which is also a $\mathcal{H}_{\mu}$-orthonormal basis of $\mathcal{H}_{\mu}$. We propose (and discuss some partial answers to) the question: to what extent can one express the probabilistic Banach space property "Brownian coupling at time $\infty$ is always possible" purely in terms of Banach space geometry?


Introduction
When can there be an (almost surely) successful coupling of two Brownian motions $\mathbf{B}$ and $\widetilde{\mathbf{B}}$ defined on a separable Banach space $\mathcal{W}$? (When can $\mathbf{B}$ and $\widetilde{\mathbf{B}}$ be made to coincide at and after some random time $\tau < \infty$?) Is a weaker kind of success more widely available? The purpose of this paper is to explore this weaker kind of success, and to raise an interesting open question.
Naturally the answer to the first question depends on the initial displacement of $\widetilde{\mathbf{B}}$ relative to $\mathbf{B}$. Expressed more precisely, given a $\mathcal{W}$-valued Brownian motion $\mathbf{B} = \{\mathbf{B}(t) : t \geq 0\}$ started at $0$, for which $x \in \mathcal{W}$ is it possible to construct a second Brownian motion $\widetilde{\mathbf{B}} = \{\widetilde{\mathbf{B}}(t) : t \geq 0\}$ starting at $x$ and such that $\widetilde{\mathbf{B}}(\tau + \cdot) = \mathbf{B}(\tau + \cdot)$ after some random time $\tau$? (The general case $\widetilde{\mathbf{B}}(0) - \mathbf{B}(0) = x$ follows by translation invariance.) We establish criteria for relative displacements $x$ permitting almost surely successful (classical) coupling, and also almost sure coupling at time $\infty$ (almost surely $\|\mathbf{B}(t) - \widetilde{\mathbf{B}}(t)\|_{\mathcal{W}} \to 0$ as $t \to \infty$, where $\|\cdot\|_{\mathcal{W}}$ is the norm of $\mathcal{W}$: see Section 4). In both cases the coupling can be chosen to be an immersion or Markovian coupling: martingales in the natural filtration of $\mathbf{B}$, respectively the natural filtration of $\widetilde{\mathbf{B}}$, remain martingales in the joint natural filtration of $\mathbf{B}$ and $\widetilde{\mathbf{B}}$ (Kendall, 2015).
The question of successful classical coupling is resolved by recalling the notion of the Cameron-Martin space of a Gaussian measure $\mu$ on $\mathcal{W}$ (Cameron and Martin, 1944). Given a Gaussian measure $\mu$, there is a standard construction of a Hilbert space $\mathcal{H}_\mu$ densely embedded in $\mathcal{W}$, such that the translations of $\mu$ by elements of $\mathcal{H}_\mu$ are exactly those translations which produce measures absolutely continuous with respect to $\mu$. This theory, together with the well-known Aldous inequality (Aldous, 1983, Lemma 3.6), immediately yields the following.
Theorem (See Theorem 10 below). Let $\mathcal{W}$ be a separable Banach space with norm $\|\cdot\|_{\mathcal{W}}$. Consider a $\mathcal{W}$-valued Brownian motion $\mathbf{B}$ started at $0$ and an element $x \in \mathcal{W}$. Another Brownian motion $\widetilde{\mathbf{B}}$ can be constructed to start at $x$ and almost surely meet $\mathbf{B}$ within finite time if and only if the relative initial displacement $x$ lies in $\mathcal{H}_\mu$, for $\mu = \mathcal{L}(\mathbf{B}(1))$. In that case the "fastest possible" coupling time is realized as the hitting time $\tau$ of $\langle x, \mathbf{B}(\tau)\rangle_{\mathcal{H}_\mu}$ on $\tfrac{1}{2}\|x\|^2_{\mathcal{H}_\mu}$.
If the initial displacement does not lie in $\mathcal{H}_\mu$ then in many cases a weaker form of coupling is still available, namely "coupling at time $\infty$".
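Theorem (See Theorem 13 below). Let $\mathcal{W}$ be a separable Banach space with norm $\|\cdot\|_{\mathcal{W}}$. Consider a $\mathcal{W}$-valued Brownian motion $\mathbf{B}$ started at $0$, such that the associated $\mathcal{H}_\mu$ contains an orthonormal basis which is also a Schauder basis for $\mathcal{W}$. For any $x \in \mathcal{W}$, one can construct another Brownian motion $\widetilde{\mathbf{B}}$ started at $x$ which almost surely couples with $\mathbf{B}$ at time $\infty$: almost surely $\|\mathbf{B}(t) - \widetilde{\mathbf{B}}(t)\|_{\mathcal{W}} \to 0$ as $t \to \infty$.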
The paper is organized as follows. Section 2 surveys the basic theory of Gaussian measures and Brownian motions on Banach spaces. Section 3 treats the case when the initial displacement $x$ lies in $\mathcal{H}_\mu$, while the case of $x \in \mathcal{W} \setminus \mathcal{H}_\mu$ is discussed in Section 4 (under the assumption that $\mathcal{W}$ admits a Schauder basis which lies in $\mathcal{H}_\mu$ and which also forms an orthonormal basis for $\mathcal{H}_\mu$). Section 5 discusses extensions and future work and raises the Open Question 18.

Gaussian measures and Banach-valued Brownian motion
Recall the following facts about Gaussian measures in Banach space. Proofs can be found for example in Kuo (1975), Hairer (2009), Stroock (2011), Eldredge (2016). A Gaussian probability measure $\mu$ on a separable Banach space $\mathcal{W}$ is a Borel measure on $\mathcal{W}$ such that the push-forward $\ell\#\mu$ by any continuous linear functional $\ell \in \mathcal{W}^*$ is a Gaussian probability measure on $\mathbb{R}$. If $\ell\#\mu$ has mean $0$ for every $\ell$ then $\mu$ is centered. For simplicity, consider only centered Gaussian measures which are non-degenerate: $\ell\#\mu$ is non-degenerate for all non-zero $\ell$.

The Cameron-Martin space
The (non-degenerate) Gaussian probability measure $\mu$ is canonically associated with its Cameron-Martin space $\mathcal{H}_\mu$: a Hilbert space densely and continuously embedded in the Banach space $\mathcal{W}$, so $\mathcal{H}_\mu \hookrightarrow \mathcal{W}$ (Hairer, 2009, Section 3.2; Eldredge, 2016, Section 4.3). In particular, for any $x \in \mathcal{H}_\mu$ there is an element $x^* \in \mathcal{W}^*$ associated to $x$ for which […]
The following result is key for analyzing the possibility of successful Brownian coupling.
Theorem 3 (Feldman-Hajek Theorem, see Kuo, 1975, Theorem 3.1). For $w \in \mathcal{W}$ define the translation map $T_w : \mathcal{W} \to \mathcal{W}$ by $T_w(x) = x + w$. If $w \in \mathcal{H}_\mu$ then the push-forward measure $T_w\#\mu$ is absolutely continuous with respect to $\mu$, while if $w \notin \mathcal{H}_\mu$ then the push-forward measure and $\mu$ are mutually singular. Moreover the join of the probability distributions $\mu$ and $T_w\#\mu$ has total mass given by twice the probability that a standard normal random variable exceeds the value $\tfrac{1}{2}\|w\|_{\mathcal{H}_\mu}$.
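As a numerical sanity check of the last statement of Theorem 3, the following minimal sketch treats the one-dimensional case $\mu = N(0,1)$, for which the Cameron-Martin norm of a shift $w$ is simply $|w|$. It compares twice the standard normal tail probability at $\tfrac{1}{2}|w|$ with a direct numerical integration of the mass shared by $\mu$ and its translate (the integral of the pointwise minimum of the two densities). The function names are illustrative assumptions and are not taken from the cited references.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def shared_mass(cm_norm: float) -> float:
    """Twice the probability that a standard normal exceeds cm_norm / 2."""
    return 2.0 * (1.0 - normal_cdf(0.5 * cm_norm))


def shared_mass_numeric(shift: float, step: float = 1e-3, span: float = 12.0) -> float:
    """One-dimensional check: midpoint-rule integral of min(phi(u), phi(u - shift))."""

    def phi(u: float) -> float:
        return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

    n = int(2.0 * span / step)
    total = 0.0
    for i in range(n):
        u = -span + (i + 0.5) * step
        total += min(phi(u), phi(u - shift)) * step
    return total
```

For a shift of Cameron-Martin norm $1$ the two functions should agree to within the accuracy of the quadrature, and the shared mass decreases towards $0$ as the norm of the shift grows.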

Brownian motion on a Banach space
Stroock (2011, Chapter 8) uses these considerations to define Brownian motion (a process with stationary Gaussian increments independent of the past, and continuous sample paths). Let $K(\mathcal{H}_\mu)$ be the Hilbert space of absolutely continuous functions $h : [0, \infty) \to \mathcal{H}_\mu$ with almost-everywhere defined derivative $\dot{h}$, such that $h(0) = 0$ and $\|h\|^2_{K(\mathcal{H}_\mu)} = \int_0^\infty \|\dot{h}(t)\|^2_{\mathcal{H}_\mu}\,dt < \infty$. The corresponding path space $C_{\mathcal{W}}$ of continuous $\mathcal{W}$-valued functions becomes a Banach space when endowed with a suitable norm. Each centered non-degenerate Gaussian measure then corresponds to a Brownian motion:
Theorem 4 (Stroock, 2011, Theorem 8.6.1). Given $K(\mathcal{H}_\mu)$ and $C_{\mathcal{W}}$ as above, there is a unique measure $\mu_{\mathcal{W}}$ such that $(C_{\mathcal{W}}, K(\mathcal{H}_\mu), \mu_{\mathcal{W}})$ is an abstract Wiener space.

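Theorem 6. Given an orthogonal decomposition (2.7) of $\mathcal{H}_\mu$ into finite-dimensional summands $\mathcal{H}_\mu^{(\ell)}$, the corresponding $\mathcal{W}$-valued Brownian motion $\mathbf{B}$ decomposes as a tuple of independent Brownian motions, one for each summand; the resulting sum converges not only in $L^2(\mathcal{W}, \mathbb{R})$ but also almost surely in $\mathcal{W}$.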
The proof of Theorem 2.3 uses a fundamental result, namely the Banach version of Marcinkiewicz' Theorem (Theorem 7). A proof of Theorem 7 is given in Pisier (2016, Theorems 1.14 and 1.30); a proof of the original (non-Banach) Marcinkiewicz' Theorem is given in Stroock (2011, Chapter 5).
We spell out the proof of Theorem 6 to help establish notation.
Proof of Theorem 6.
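Step 1. Separability means that $\mathcal{W}^*$ generates the Borel $\sigma$-algebra of $\mathcal{W}$. The elements of $\mathcal{W}^*$ are continuous functions on $\mathcal{W}$, hence are measurable with respect to the Borel $\sigma$-algebra $\mathcal{B}_{\mathcal{W}}$ of $\mathcal{W}$.
Being separable, $\mathcal{W}$ has a topology generated by a countable system of open balls; by separability, each of these balls can be expressed as the intersection of countably many half-spaces of $\mathcal{W}$. Hence $\mathcal{B}_{\mathcal{W}}$ is the smallest $\sigma$-algebra making all the maps $\theta \mapsto \langle\theta, \lambda\rangle_{\mathcal{W}^*;\mathcal{W}} = \lambda(\theta)$ measurable, for all $\lambda \in \mathcal{W}^*$.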
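Step 2. There is a natural isometry embedding $\mathcal{H}_\mu$ in $L^2(\mathcal{W}, \mu)$. From (2.3), for every $\lambda \in \mathcal{W}^*$ there is an $h_\lambda \in \mathcal{H}_\mu$ such that $\lambda(h) = \langle\lambda, h\rangle_{\mathcal{W}^*;\mathcal{W}} = \langle h, h_\lambda\rangle_{\mathcal{H}_\mu}$. Therefore there is a natural isometric injection given by $(I h_\lambda)(\theta) = \langle\theta, \lambda\rangle_{\mathcal{W},\mathcal{W}^*}$, defined for any $\lambda \in \mathcal{W}^*$, and this extends to $I : \mathcal{H}_\mu \to L^2(\mathcal{W}, \mu)$. Denoting by $\iota$ the dense embedding $\mathcal{W}^* \to \mathcal{H}_\mu$, we have $\|\iota(\lambda)\|_{\mathcal{H}_\mu} = \|\lambda\|_{L^2(\mathcal{W},\mu)}$. The extension $I$ is the Paley-Wiener map.
Step 3. The union of the Paley-Wiener maps of the summands is dense in the Paley-Wiener image of $\mathcal{H}_\mu$. Given the orthogonal decomposition (2.7), let $h^{\ell}_1, \ldots, h^{\ell}_{k_\ell}$ be an orthonormal basis of $\mathcal{H}_\mu^{(\ell)}$. By Steps 1 and 2, since the Paley-Wiener map is an isometry and each $\mathcal{H}_\mu^{(\ell)}$ is finite dimensional, the $I h^{\ell}_k$ form an orthonormal basis for the image $I\mathcal{H}_\mu^{(\ell)}$. By (2.7) the linear span of all these functions, taken over all $\ell$, is $L^2(\mathcal{W}, \mu)$-dense in $I\mathcal{H}_\mu$.
Step 4. We use the orthogonal decomposition (2.7) to establish a filtration of $\sigma$-algebras $\mathcal{F}_n$ over $\mathcal{W}$, with limiting $\sigma$-algebra $\mathcal{F}$. By Step 3, all the $I h_\lambda$ functions are measurable with respect to the $\mu$-completion of $\mathcal{F}$. Moreover, by Step 2, $I h_\lambda$ maps $\theta$ to $\langle\theta, \lambda\rangle_{\mathcal{W},\mathcal{W}^*}$, thus $I h_\lambda$ is also measurable with respect to $\mathcal{B}_{\mathcal{W}}$. Furthermore, Step 1 implies that $\mathcal{B}_{\mathcal{W}}$ is the smallest $\sigma$-algebra with respect to which all such maps are measurable. This implies that $\mathcal{B}_{\mathcal{W}}$ is contained in the $\mu$-completion of $\mathcal{F}$.
Step 5. Compute the $\mathcal{F}_n$-conditional expectation of a $\mu$-random choice of $\theta \in \mathcal{W}$. On each summand $\mu$ induces a Gaussian measure, so the conditional expectation can be computed summand by summand. The proof of the theorem is completed by combining Theorem 7 (the Banach version of Marcinkiewicz' Theorem) with Steps 1-5: the resulting convergence holds both $\mu$-almost surely and in $L^2(\mathcal{W}, \mu)$.
Remark 8. The summands $\mathcal{H}_\mu^{(\ell)}$ can also be infinite-dimensional, since infinite-dimensional summands can be expressed as orthogonal direct sums of finite-dimensional subspaces.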

Coupling of Brownian motion within finite time
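Recall that $\mathbf{B}(0) = 0$ and $\widetilde{\mathbf{B}}(0) = x$ and denote by $\operatorname{dist}_{TV}(\nu, \nu')$ the total variation distance between the probability measures $\nu$ and $\nu'$. The Aldous inequality implies that, for any coupling of $\mathbf{B}$ and $\widetilde{\mathbf{B}}$ with coupling time $\tau$ and for all $t \geq 0$, $\operatorname{dist}_{TV}(\mathcal{L}(\mathbf{B}(t)), \mathcal{L}(\widetilde{\mathbf{B}}(t))) \leq \mathbb{P}[\tau > t]$. (3.1)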
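To see how the total variation distance appearing in the Aldous inequality behaves over time, the following short sketch evaluates it in closed form. It relies on two facts: by Theorem 3 the mass shared by $\mu$ and its translate by $w$ is twice a normal tail probability, and the law of $\mathbf{B}(t)$ is the dilation of $\mu$ by $\sqrt{t}$, so that the relevant Cameron-Martin norm of the displacement is $\|x\|_{\mathcal{H}_\mu}/\sqrt{t}$. The function name and the convention of encoding $x \notin \mathcal{H}_\mu$ by an infinite norm are assumptions made for illustration.

```python
import math


def tv_distance(cm_norm: float, t: float) -> float:
    """Total variation distance between the laws of B(t) and B~(t).

    cm_norm is the Cameron-Martin norm of the initial displacement x;
    math.inf encodes a displacement lying outside the Cameron-Martin space.
    The value equals 1 - 2 * P[N(0,1) > cm_norm / (2 * sqrt(t))].
    """
    return math.erf(cm_norm / (2.0 * math.sqrt(2.0 * t)))
```

For finite `cm_norm` the lower bound on $\mathbb{P}[\tau > t]$ decays to $0$ as $t \to \infty$, which is compatible with coupling in finite time; for `cm_norm = math.inf` it stays equal to $1$ for every $t$, so no coupling can succeed in finite time.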

Cameron-Martin Reflections in Banach spaces
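For non-zero $x \in \mathcal{H}_\mu$, consider the reflection $R_x : \mathcal{H}_\mu \to \mathcal{H}_\mu$ given by $R_x(h) = h - 2\langle x, h\rangle_{\mathcal{H}_\mu}\, x / \|x\|^2_{\mathcal{H}_\mu}$. Note that $R_x(x) = -x$. By Theorem 6, $R_x$ produces a reflected Brownian motion $\widetilde{\mathbf{B}} = x + R_x(\mathbf{B})$: we avoid having to consider the extension of $R_x$ to all of $\mathcal{W}$ by writing $\mathcal{H}_\mu = \operatorname{Ker}(I - R_x) \oplus \operatorname{Ker}(I + R_x)$, then applying Theorem 6 to decompose $\mathbf{B} = (\mathbf{B}_1, \mathbf{B}_2)$ accordingly (so $\mathbf{B}_1$ is a $\operatorname{Ker}(I - R_x)$-valued Brownian motion and $\mathbf{B}_2$ is a $\operatorname{Ker}(I + R_x)$-valued Brownian motion; in fact $\mathbf{B}_2$ is essentially a one-dimensional Brownian motion $\langle x, \mathbf{B}\rangle_{\mathcal{H}_\mu}\, x / \|x\|^2_{\mathcal{H}_\mu}$). Finally set $\widetilde{\mathbf{B}} = x + (\mathbf{B}_1, -\mathbf{B}_2) = x + \mathbf{B} - 2\langle x, \mathbf{B}\rangle_{\mathcal{H}_\mu}\, x / \|x\|^2_{\mathcal{H}_\mu}$.
Remark 9. Lindvall (1982) introduced coupling by reflection for real Brownian motion. Lindvall and Rogers (1986) adapted it to couple finite-dimensional diffusions. Further generalizations include reflection coupling on Riemannian manifolds and beyond (Kendall, 1986a,b, 1998; Cranston, 1991; Von Renesse, 2004).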
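The algebra of the reflection can be checked in a finite-dimensional stand-in, where the Euclidean inner product of $\mathbb{R}^d$ plays the role of the $\mathcal{H}_\mu$ inner product. The sketch below is illustrative only (the function names are hypothetical): it implements $h \mapsto h - 2\langle x, h\rangle x/\|x\|^2$ together with the splitting of $h$ into a component fixed by the reflection and a component along $x$ whose sign is reversed.

```python
def inner(u, v):
    """Euclidean inner product, standing in for the Cameron-Martin inner product."""
    return sum(a * b for a, b in zip(u, v))


def reflect(x, h):
    """Reflection of h in the hyperplane orthogonal to x."""
    c = 2.0 * inner(x, h) / inner(x, x)
    return [hi - c * xi for hi, xi in zip(h, x)]


def split(x, h):
    """Write h = h1 + h2 with reflect(x, h1) = h1 and reflect(x, h2) = -h2."""
    c = inner(x, h) / inner(x, x)
    h2 = [c * xi for xi in x]            # component along x (one-dimensional)
    h1 = [hi - h2i for hi, h2i in zip(h, h2)]  # component orthogonal to x
    return h1, h2
```

With `x = [1.0, 2.0, 2.0]` and `h = [0.5, -1.0, 3.0]`, `reflect(x, x)` returns the negative of `x`, applying `reflect` twice recovers `h`, and `reflect(x, h)` equals `h1 - h2` for the pair returned by `split(x, h)`.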

Coupling in finite time holds exactly if initial displacement is in H µ
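We now establish the first and simplest coupling result for Brownian motion in Banach space (Theorem 10).
Proof. One direction follows from the Aldous inequality (3.1) and Theorem 3 as noted at the start of Section 3 above: if $x \notin \mathcal{H}_\mu$, then the distributions of $\mathbf{B}(t)$ and $\widetilde{\mathbf{B}}(t)$ are mutually singular for all $t > 0$, so their total variation distance equals $1$ and no coupling can succeed in finite time. On the other hand, consider $x \in \mathcal{H}_\mu$. Given the Brownian motion $\mathbf{B}$, construct $\widetilde{\mathbf{B}}$ from $\mathbf{B}$ using the reflection $R_x$ as described in Section 3.1. Thus, until $\mathbf{B}$ and $\widetilde{\mathbf{B}}$ coincide, let $\widetilde{\mathbf{B}} = x + R_x(\mathbf{B})$. Once $\widetilde{\mathbf{B}} = \mathbf{B}$, the two Brownian motions evolve as a single process. By Equation (3.3), the difference $A = \widetilde{\mathbf{B}} - \mathbf{B}$ is an element of the Cameron-Martin space $\mathcal{H}_\mu$, since it is a scalar multiple of $x \in \mathcal{H}_\mu$. Hence its $\mathcal{H}_\mu$-norm is well defined. Now $A$ is a difference of a Banach-valued Brownian motion and its reflection, stopped once they agree. Therefore its $\mathcal{H}_\mu$-norm is a scalar Brownian motion of rate $4$, begun at $\|x\|_{\mathcal{H}_\mu}$, stopped on reaching $0$. By the Reflection Principle, the probability of coupling by time $t$ is exactly the total mass of the join of probability distributions as given in Theorem 3. Hence this is indeed a maximal coupling.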
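The claim about the coupling probability can be explored by simulation. The sketch below is a rough Monte Carlo illustration under stated assumptions, not a reproduction of the proof: it tracks only the scalar process $w(t) = \langle x, \mathbf{B}(t)\rangle_{\mathcal{H}_\mu}/\|x\|_{\mathcal{H}_\mu}$, a standard real Brownian motion, because the difference of the reflection-coupled processes has $\mathcal{H}_\mu$-norm $\|x\|_{\mathcal{H}_\mu} - 2w(t)$ until coupling. Time is discretized with a hypothetical step `dt`, which slightly underestimates the coupling probability. The function names, the step size and the sample size are all illustrative choices.

```python
import math
import random


def coupling_frequency(x_norm, t_max, dt=0.0025, n_paths=1000, seed=1):
    """Fraction of simulated paths for which reflection coupling succeeds by t_max."""
    rng = random.Random(seed)
    steps = int(round(t_max / dt))
    sd = math.sqrt(dt)
    hits = 0
    for _ in range(n_paths):
        w = 0.0  # w(t) = <x, B(t)> / |x|, a standard scalar Brownian motion
        for _ in range(steps):
            w += rng.gauss(0.0, sd)
            if x_norm - 2.0 * w <= 0.0:  # norm of B~(t) - B(t) has reached 0
                hits += 1
                break
    return hits / n_paths


def coupling_probability(x_norm, t):
    """Twice the probability that N(0,1) exceeds x_norm / (2 sqrt(t))."""
    return math.erfc(x_norm / (2.0 * math.sqrt(2.0 * t)))
```

Calling `coupling_frequency(1.0, 1.0)` should give a value close to `coupling_probability(1.0, 1.0)`, up to Monte Carlo noise and the discretization bias.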
Remark 11. As $\mathcal{H}_\mu$ is dense in $\mathcal{W}$, we can produce $\varepsilon$-approximate couplings of Brownian motions begun at any two different starting points in $\mathcal{W}$ (simply $\mathcal{W}$-approximate $x \in \mathcal{W}$ by $h \in \mathcal{H}_\mu$). Such $\varepsilon$-approximate couplings are reminiscent of the Wasserstein variants of CFTP algorithms introduced by Gibbs (2004) for image restoration, and may be of use in future applications.

Coupling at time ∞ and Schauder basis properties
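The work of Section 3 suggests the following natural question. When will there exist a coupling between $\mathbf{B}$ and $\widetilde{\mathbf{B}}$ that allows the two Brownian motions almost surely to "couple at time $\infty$", whatever the starting points $\mathbf{B}(0), \widetilde{\mathbf{B}}(0) \in \mathcal{W}$? In other words, when can one arrange that, almost surely, $\|\mathbf{B}(t) - \widetilde{\mathbf{B}}(t)\|_{\mathcal{W}} \to 0$ as $t \to \infty$?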

Schauder basis
A Schauder basis for $\mathcal{W}$ is a sequence $(e_k : k = 1, 2, \ldots)$ drawn from a Banach space $\mathcal{W}$, such that each $x \in \mathcal{W}$ admits a unique decomposition as the conditionally convergent sum $x = \sum_{k=1}^{\infty} \alpha_k e_k$ for some coefficients $\alpha_k \in \mathbb{R}$ depending on $x$. In fact Banach (1932, pages 110-112) observed that $\alpha_k$ depends continuously (and linearly) on $x$ (see also McArthur, 1972, page 878). Thus $\alpha_k = \langle e^*_k, x\rangle_{\mathcal{W}^*;\mathcal{W}}$, where each $e^*_k \in \mathcal{W}^*$ depends on the entire Schauder basis, with the following holding as a conditionally $\mathcal{W}$-convergent sum: $x = \sum_{k=1}^{\infty} \langle e^*_k, x\rangle_{\mathcal{W}^*;\mathcal{W}}\, e_k$. (4.1)
Remark 12. The convergence in Equation (4.1) is conditional (it depends on the order of the sequence of basis vectors $e_n$) and must be interpreted using the $\mathcal{W}$-norm topology: $\|x - \sum_{k=1}^{n} \langle e^*_k, x\rangle_{\mathcal{W}^*;\mathcal{W}}\, e_k\|_{\mathcal{W}} \to 0$ as $n \to \infty$. A Banach space with a Schauder basis is necessarily separable, but the converse is not true (see the celebrated counterexample of Enflo, 1973). Separable Banach spaces can admit other types of bases such as the Markushevich basis (typically referred to as M-basis), the Auerbach basis, or simply a finite-dimensional decomposition (Hájek et al., 2007, Chapter 1; Casazza, 2001). Markushevich bases will make an appearance in the concluding Section 5, but we will not discuss the other notions here.
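A classical concrete instance, not taken from the text above, is the Faber-Schauder system on the space of continuous functions on $[0,1]$ vanishing at $0$: the function $t \mapsto t$ together with dyadic "tent" functions. The coefficient functionals are finite combinations of point evaluations, hence continuous and linear, and the partial sums are the piecewise-linear interpolants at dyadic points. (Suitably normalized, these functions are also orthonormal in the Cameron-Martin space of classical Wiener measure, which is the basis of the Lévy-Ciesielski construction of Brownian motion.) The sketch below computes the coefficients and the sup-norm error of the partial sums; all names are illustrative.

```python
import math


def tent(j, k, t):
    """Tent of level j, position k: peak 1 at the midpoint of [k/2^j, (k+1)/2^j]."""
    u = t * 2 ** j - k  # rescale the supporting interval to [0, 1]
    return max(0.0, 1.0 - abs(2.0 * u - 1.0))


def coefficients(f, levels):
    """alpha_0 = f(1); alpha_{j,k} = f(mid) - (f(left) + f(right)) / 2."""
    alpha = {}
    for j in range(levels):
        for k in range(2 ** j):
            left, right = k / 2 ** j, (k + 1) / 2 ** j
            alpha[(j, k)] = f(0.5 * (left + right)) - 0.5 * (f(left) + f(right))
    return f(1.0), alpha


def partial_sum(alpha0, alpha, levels, t):
    """Partial sum of the expansion using the levels 0, ..., levels - 1."""
    total = alpha0 * t
    for j in range(levels):
        k = min(int(t * 2 ** j), 2 ** j - 1)  # at most one tent per level is non-zero at t
        total += alpha[(j, k)] * tent(j, k, t)
    return total


def sup_error(f, levels, grid=1000):
    """Maximum over a uniform grid of |f - partial sum|."""
    alpha0, alpha = coefficients(f, levels)
    return max(
        abs(f(i / grid) - partial_sum(alpha0, alpha, levels, i / grid))
        for i in range(grid + 1)
    )
```

For `f = math.sqrt`, whose derivative is not square-integrable (so that $f$ lies outside the Cameron-Martin space of Wiener measure), `sup_error(math.sqrt, levels)` still decreases as `levels` grows, illustrating norm convergence of the expansion for a general element of the Banach space.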

Brownian motions coupled at time ∞
In the following, $\mathbf{B}$ and $\widetilde{\mathbf{B}}$ denote coupled $\mathcal{W}$-valued Brownian motions started at $0$ and $x \in \mathcal{W}$ respectively. If $x \in \mathcal{W} \setminus \mathcal{H}_\mu$ then reflection coupling is not well defined!
Theorem 13. Let $\mathcal{W}$ be a separable Banach space possessing a Schauder basis $(e_k : k = 1, 2, \ldots)$ which also forms an orthonormal basis for $\mathcal{H}_\mu$. Then it is possible to construct a coupling at time $\infty$ of Brownian motions $\mathbf{B}$ and $\widetilde{\mathbf{B}}$ started at any two given starting points, so that almost surely $\|\mathbf{B}(t) - \widetilde{\mathbf{B}}(t)\|_{\mathcal{W}} \to 0$ as $t \to \infty$.
Remark 14. Theorem 13 holds for all possible $x \in \mathcal{W}$, though construction details depend on the Schauder expansion (4.1) of $x$. If $x \notin \mathcal{H}_\mu$ then the coupling is actually maximal, albeit only in a degenerate sense, since the distributions of $\mathbf{B}(t)$ and $\widetilde{\mathbf{B}}(t)$ in (3.1) are then mutually singular for all time $t > 0$.
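Proof of Theorem 13. Suppose that $x \in \mathcal{W}$ is the initial displacement of $\widetilde{\mathbf{B}}$ relative to $\mathbf{B}$. Because the Schauder basis is orthogonal in $\mathcal{H}_\mu$, there exist $\mathcal{H}_\mu$-orthogonal $x_1, x_2, \ldots$ with $x = \sum_{n} x_n$ converging in $\mathcal{W}$. It suffices to take $x_n = \sum_{k=r_{n-1}}^{r_n - 1} \langle e^*_k, x\rangle_{\mathcal{W}^*;\mathcal{W}}\, e_k \in \mathcal{H}_\mu$ for a sufficiently rapidly increasing sequence $r_0 = 1 < r_1 < r_2 < \ldots$. By the triangle inequality, for $n \geq 1$, the norm $\|x_n\|_{\mathcal{W}}$ is controlled by the tails of the Schauder expansion of $x$ and so can be made to decrease as rapidly as required; this is the bound (4.2). The $\mathcal{H}_\mu$-orthogonal finite-dimensional projections $\Pi_n : z \mapsto \sum_{k=r_{n-1}}^{r_n - 1} \langle e^*_k, z\rangle_{\mathcal{H}_\mu}\, e_k$ decompose $\mathbf{B}$ and $\widetilde{\mathbf{B}}$ following Theorem 6 into components $\mathbf{B}_n$ and $\widetilde{\mathbf{B}}_n$. Thus $\mathbf{B}_n$ is Brownian motion on the finite-dimensional subspace spanned by $e_{r_{n-1}}, \ldots, e_{r_n - 1}$. Up to the time of coupling of $\mathbf{B}_n$ and $\widetilde{\mathbf{B}}_n$, construct $\widetilde{\mathbf{B}}_n = R_{x_n}\mathbf{B}_n + x_n$ to be the reflection of $\mathbf{B}_n$ started at $x_n$ using the reflection map $R_{x_n}$.
Let $T_n$ be the time of coupling of $\mathbf{B}_n$ and $\widetilde{\mathbf{B}}_n$; Theorem 10 (really the elementary theory of finite-dimensional reflection coupling) implies that $T_n$ is distributed as the first time for scalar Brownian motion to hit $0$ when run at rate $1$ and started at $\tfrac{1}{2}\|x_n\|_{\mathcal{H}_\mu}$. Because real Brownian motion is a continuous martingale, a maximal inequality controls the size of each difference $\widetilde{\mathbf{B}}_n - \mathbf{B}_n$ up to time $T_n$. By subadditivity of probability and the bound (4.2), these controls hold simultaneously for all $n$ outside an event of probability at most $1/N$; this is the bound (4.4). Using the triangle inequality, (4.4) implies that, with probability at least $1 - 1/N$, the norm $\|\mathbf{B}(t) - \widetilde{\mathbf{B}}(t)\|_{\mathcal{W}}$ is bounded by a summable series of contributions from those components which have not yet coupled by time $t$. Consequently, using the first Borel-Cantelli lemma, $\|\mathbf{B}(t) - \widetilde{\mathbf{B}}(t)\|_{\mathcal{W}} \to 0$ almost surely as $t \to \infty$.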
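The block construction in this proof can be made concrete in a toy example, chosen here purely for illustration and not taken from the paper: let $\mathcal{H}_\mu = \ell^2$ with orthonormal basis $(e_k)$, let $\|z\|^2_{\mathcal{W}} = \sum_k 4^{-k} z_k^2$, and let $x = \sum_k e_k$, which lies in $\mathcal{W}$ but not in $\mathcal{H}_\mu$. The sketch chooses cut points $r_0 = 1 < r_1 < \ldots$ so that the $\mathcal{W}$-norm of the tail beyond $r_n$ is at most $4^{-n}$ (the rate $4^{-n}$ is a hypothetical target, not the paper's bound), and evaluates the probability that block $n$ has not yet coupled by time $t$, using the hitting-time description of $T_n$.

```python
import math


def tail_norm(r: int) -> float:
    """W-norm of sum_{k >= r} e_k when |z|_W^2 = sum_k 4^{-k} z_k^2 (closed form)."""
    return math.sqrt(4.0 / 3.0) * 2.0 ** (-r)


def choose_blocks(n_blocks: int, rate: float = 4.0) -> list:
    """Pick r_0 = 1 < r_1 < ... with tail_norm(r_n) <= rate^{-n}."""
    r = [1]
    for n in range(1, n_blocks + 1):
        candidate = r[-1] + 1
        while tail_norm(candidate) > rate ** (-n):
            candidate += 1
        r.append(candidate)
    return r


def block_w_norm(lo: int, hi: int) -> float:
    """W-norm of the block x_n = sum_{k=lo}^{hi-1} e_k."""
    return math.sqrt(sum(4.0 ** (-k) for k in range(lo, hi)))


def block_h_norm(lo: int, hi: int) -> float:
    """Cameron-Martin norm of the same block (orthonormal e_k, unit coefficients)."""
    return math.sqrt(hi - lo)


def prob_block_uncoupled(h_norm: float, t: float) -> float:
    """P[T_n > t] for rate-1 scalar Brownian motion started at h_norm / 2, stopped at 0."""
    return math.erf(h_norm / (2.0 * math.sqrt(2.0 * t)))
```

In this example each block has finite Cameron-Martin norm, so each block couples in finite time almost surely, while the triangle inequality bounds the $\mathcal{W}$-norm of block $n$ by `tail_norm(r[n-1]) + tail_norm(r[n])`.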
As a corollary, the case when W is a Hilbert space is now completely settled.
Corollary 15. If $\mathcal{W}$ is a separable Hilbert space, then any $\mathcal{W}$-valued Brownian motion corresponding to a Cameron-Martin space $\mathcal{H}_\mu \hookrightarrow \mathcal{W}$ can be coupled at time $\infty$ from any two starting points in $\mathcal{W}$.
Proof. Recall the continuous injection $A : \mathcal{H}_\mu \hookrightarrow \mathcal{W}$. By Prokhorov's characterization of Gaussian measures on the Hilbert space $\mathcal{W}$ (see Kuo, 1975, Theorem 2.3 and following remark), $A^*A$ is positive definite (n.b. $A$ is injective) and trace-class. Its spectral decomposition therefore can be expressed in terms of finite-dimensional spaces, and for any non-zero eigenvalue $\lambda$, the renormalized operator $A/\lambda$ restricted to $\operatorname{Ker}(A^*A - \lambda^2)$ is an isometry. Let $v_1, v_2, \ldots$ be an orthonormal basis of $\mathcal{H}_\mu$ using eigenvectors of $A^*A$: by an eigenvalue argument, $Av_1, Av_2, \ldots$ are $\mathcal{W}$-orthogonal. By injectivity of $A$ and its dense image, $Av_1, Av_2, \ldots$ form a $\mathcal{W}$-orthogonal basis, hence a Schauder basis.
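The eigenvalue argument in this proof can be illustrated in two dimensions. The sketch below takes a hypothetical injective matrix `A` as a stand-in for the inclusion $A : \mathcal{H}_\mu \hookrightarrow \mathcal{W}$ (both spaces modelled by $\mathbb{R}^2$ with the Euclidean inner product), diagonalizes $A^{\mathsf{T}}A$ by a closed-form rotation, and confirms that the images of the orthonormal eigenvectors are orthogonal, with squared norms equal to the eigenvalues. The matrix entries and function names are assumptions chosen only for illustration.

```python
import math


def eigen_basis(a, b, d):
    """Orthonormal eigenvectors and eigenvalues of the symmetric matrix [[a, b], [b, d]]."""
    theta = 0.5 * math.atan2(2.0 * b, a - d)  # rotation angle that diagonalizes the matrix
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))

    def quad(v):
        # v^T M v, which equals the eigenvalue for a unit eigenvector v
        return a * v[0] * v[0] + 2.0 * b * v[0] * v[1] + d * v[1] * v[1]

    return (v1, quad(v1)), (v2, quad(v2))


def apply(A, v):
    """Matrix-vector product for a 2x2 matrix given as nested tuples."""
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


A = ((2.0, 1.0), (0.0, 0.5))  # hypothetical injective map (determinant 1)
a = A[0][0] ** 2 + A[1][0] ** 2                  # entries of A^T A
b = A[0][0] * A[0][1] + A[1][0] * A[1][1]
d = A[0][1] ** 2 + A[1][1] ** 2
(v1, mu1), (v2, mu2) = eigen_basis(a, b, d)
Av1, Av2 = apply(A, v1), apply(A, v2)
```

Here `dot(Av1, Av2)` vanishes up to rounding, while `dot(Av1, Av1)` and `dot(Av2, Av2)` reproduce `mu1` and `mu2`, matching the statement that $A$ rescaled by the square root of the eigenvalue is an isometry on each eigenspace.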
For more general Banach spaces, the methods of Theorem 13 immediately supply an approach which works at least for some initial displacements $x \in \mathcal{W} \setminus \mathcal{H}_\mu$.
Corollary 16. Given two initial starting points $x \neq y$ in $\mathcal{W}$, suppose it is possible to find orthogonal $z_1, z_2, \ldots$ in $\mathcal{H}_\mu$ such that $x - y = z_1 + z_2 + \ldots$, with the sum converging conditionally in $\mathcal{W}$. Then we can construct a coupled pair of Banach-valued Brownian motions $\mathbf{B}$ and $\widetilde{\mathbf{B}}$ starting from $x$ and $y$ which almost surely couple at time $\infty$.
Remark 17. Kuo (1975, Corollary 4.2, page 66) observes that one can always construct Banach spaces $\mathcal{W}_0$ with $\mathcal{H}_\mu \hookrightarrow \mathcal{W}_0 \hookrightarrow \mathcal{W}$ and $\mathcal{W}_0$ strictly containing $\mathcal{H}_\mu$, such that there is an orthonormal basis of $\mathcal{H}_\mu$ which forms a Schauder basis for $\mathcal{W}_0$.

Conclusion
In conclusion we discuss a simple open question (Subsection 5.1), and indicate some further possible lines of research (Subsection 5.2).

An open question
This paper has shown that Brownian couplings at time $\infty$ are always possible if $\mathcal{W}$ supports a Schauder basis which is also $\mathcal{H}_\mu$-orthogonal (Theorem 13), and in particular that they are always possible in the important special case when $\mathcal{W}$ is a Hilbert space (Corollary 15). It is therefore natural to ask
Question 18. Given an abstract Wiener space $\mathcal{W}^* \hookrightarrow \mathcal{H}_\mu \hookrightarrow \mathcal{W}$, is it always possible to produce a coupling at time $\infty$ for two $\mathcal{W}$-valued Brownian motions started from arbitrary starting points in $\mathcal{W}$? (Or if not, then what Banach-space geometry property for $\mathcal{W}^* \hookrightarrow \mathcal{H}_\mu \hookrightarrow \mathcal{W}$ corresponds to the probabilistic property that Brownian coupling at time $\infty$ is always possible?)
Certainly there are abstract Wiener spaces possessing the property of Brownian coupling at time $\infty$ which do not possess a $\mathcal{H}_\mu$-orthogonal Schauder basis. Recall that $\mathcal{W}$ has the finite dimensional decomposition property (FDD) if it supports finite-dimensional subspaces $\mathcal{W}^{(n)}$ such that $x = \sum_{n=1}^{\infty} x_n$ (conditionally convergent) for unique $x_n \in \mathcal{W}^{(n)}$: to be compatible with the Cameron-Martin geometry, we must further require that the $\mathcal{W}^{(n)}$ form an orthogonal decomposition of $\mathcal{H}_\mu$. Note there exist separable Banach spaces $\mathcal{W}$ with FDD but without a Schauder basis (Casazza, 2001). The proof of Theorem 13 shows that Brownian couplings at time $\infty$ are always possible when $\mathcal{W}$ has an FDD compatible with the Cameron-Martin geometry.
Adaptation of work of Terenzi (1994) permits a modest advance on the results described above. We outline the argument briefly.
Recall that a Markushevich basis is a particular kind of biorthogonal system (a sequence of pairs $(x_n, x^*_n)_n \in \mathcal{W} \times \mathcal{W}^*$ is said to be a bi-orthogonal system whenever $x^*_n(x_m) = \delta_{n,m}$):
Definition 19 (Markushevich basis). A bi-orthogonal system that is fundamental ($\overline{\operatorname{span}}\{x_n\} = \mathcal{W}$) and total ($\overline{\operatorname{span}}^{w^*}\{x^*_n\} = \mathcal{W}^*$) is a Markushevich basis (M-basis). (Here $\overline{\operatorname{span}}^{w^*}$ is closure of linear span under the weak$^*$ topology.)
The modest advance is as follows: Brownian coupling at time $\infty$ is possible if $\mathcal{W}$ supports a norming M-basis $(x_n, x^*_n)_n \in \mathcal{W} \times \mathcal{W}^*$, $n = 1, 2, \ldots$, with $\mathcal{H}_\mu$-orthogonal $x_n$: "norming" means that $|||x||| = \sup\{|x^*(x)| : x^* \in \overline{\operatorname{span}}\{x^*_n\} \cap B(\mathcal{W}^*, \|\cdot\|_{\mathcal{W}^*})\}$ defines a norm on $\mathcal{W}$ such that $\lambda\|x\|_{\mathcal{W}} \leq |||x||| \leq \|x\|_{\mathcal{W}}$ for some $0 < \lambda \leq 1$. Here $B(\mathcal{W}^*, \|\cdot\|_{\mathcal{W}^*})$ denotes the unit ball in the Banach space $\mathcal{W}^*$. Note that the argument used in the proof of Theorem 13 will apply if any $x \in \mathcal{W}$ can be represented as the limit of partial sums of orthogonal elements of $\mathcal{H}_\mu$, so this is the objective. Terenzi (1994) establishes this representation in terms of existence of an M-basis for any separable Banach space, but without imposing any requirement of $\mathcal{H}_\mu$-orthogonality. We now describe the steps required to adapt Terenzi (1994) to show that the argument can be sufficiently related to the underlying Cameron-Martin geometry to capture enough $\mathcal{H}_\mu$-orthogonality to obtain the required representation.
Step 1: Assume existence of a norming M-basis whose elements are orthogonal in $\mathcal{H}_\mu$.
Step 2: Further adjust the norming M-basis to be bounded (so that there is a constant $0 < M < \infty$ such that $\sup_n \{\|x_n\|_{\mathcal{W}} \cdot \|x^*_n\|_{\mathcal{W}^*}\} \leq M$). This uses the approach of Ovsepian and Pelczynski (1975): the M-basis is adjusted using Haar unitary matrix transformations on disjoint finite-dimensional subspaces, hence retaining $\mathcal{H}_\mu$-orthogonality while preserving the disjoint subspace decomposition.
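Step 3: Consider representations of $x \in \mathcal{W}$ as a partial sum of terms $\langle x, x^*_n\rangle x_n$ of the M-basis expansion together with a remainder $v_m$.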
Remark 21. Note that by a result of Fonf (1986), the above property holds if and only if the M-basis is norming; cf. also Terenzi (1994).
If (selecting a subsequence depending on $x$ if necessary) we could contrive that $v_m \to 0$ in $\mathcal{W}$ for the M-basis obtained in Steps 1 and 2, then we would have obtained the required expression of $x$ as the limit of partial sums of orthogonal elements of $\mathcal{H}_\mu$.
Step 4: We seek a block perturbation providing the subsequence required in Step 3. A block perturbation $\{y_n, y^*_n\}_n$ of $\{x_n, x^*_n\}_n$ amounts (for our purposes) to finding an increasing sequence $\{q_m\}_{m \in \mathbb{N}}$ of positive integers such that, for every $m$, the perturbation is confined to the block of basis elements indexed between consecutive terms of the sequence. If it is not possible to find a block perturbation leading to $v_m \to 0$ in $\mathcal{W}$ as described in Step 3, then the Terenzi (1994) argument uses further careful constructions of block perturbations of the M-basis, together with the Dvoretzky (1961) Hilbert space approximation Theorem, to produce a new norming M-basis (depending on $x$) which can be used to generate successive partial sums which approximate $x$ in $\mathcal{W}$-norm. Because block perturbations are used, these successive partial sums can be expressed in terms of a sequence of orthogonal elements of $\mathcal{H}_\mu$ (depending on $x$), and so the objective is attained.
We end this section with a couple of remarks about the last proof.
Remark 22. We have assumed that we can pick an M-basis for $\mathcal{W}$ whose elements are orthogonal in $\mathcal{H}_\mu$. At present we are not aware of useful sufficient conditions to guarantee this. Indeed we do not know if it is possible always to find a $\mathcal{H}_\mu$-orthogonal M-basis for $\mathcal{W}$, though this is implausible.
Remark 23. Note that the orthogonal M-basis needs to be norming. The norming property is guaranteed when $\mathcal{W}$ is quasi-reflexive (i.e., the canonical image of $\mathcal{W}$ in its second dual has finite codimension), following a result of Petunin (1964) (cf. also Singer, 1962). However (Davis and Lindenstrauss, 1972) if $\mathcal{W}$ is not quasi-reflexive then it is always possible to find a total subspace of $\mathcal{W}^*$ that is not norming in $\mathcal{W}$. Hence the norming assumption is necessary when dealing with general separable Banach spaces.

Further work
Apart from addressing the open Question 18, one might also ask whether one could additionally couple functionals of the two Banach-valued Brownian motions, such as for example their Lévy stochastic areas. This can be done in the finite-dimensional case (Ben Arous et al., 1995; Kendall, 2007, 2010). However Lévy stochastic areas are quadratic objects, so it seems likely that only rather limited results can be obtained in infinite-dimensional cases. Banerjee and Kendall (2016) have shown that geometric criteria tightly constrain Markovian maximal couplings for (finite-dimensional) smooth elliptic diffusions: in dimension 2 and above, the existence of a Markovian maximal coupling forces the diffusion to be Brownian motion on a simply-connected space of constant curvature, with drift given by a combination of a Killing vectorfield and (in the Euclidean case) a vectorfield related to scaling symmetry. It is evident from Theorem 10 that the notion of maximality does not extend usefully to the Banach-space case; but is there a weaker form of optimality which also imposes special requirements in some underlying geometry?
Turning to potential applications, it is natural to speculate about applications of these coupling constructions to multi-scale problems. For example multiresolution image analysis models an image as an infinite hierarchy of features of progressively finer resolution: there are interesting phase transition phenomena linked to image analysis issues (Kendall and Wilson, 2003), while a Coupling-from-the-Past algorithm has been developed for a point process example (Ambler and Silverman, 2010). The "coupling at time ∞" constructions of Theorem 13 are suggestive for such problems. Finally, Hairer and others (for example, Hairer, 2002;Hairer et al., 2011) have discussed applications of rather specific couplings to certain SPDE. It would be interesting to relate the abstract considerations of the present paper to this applied context.