Couplings, generalized couplings and uniqueness of invariant measures

We provide sufficient conditions for uniqueness of an invariant probability measure of a Markov kernel in terms of (generalized) couplings. Our main theorem generalizes previous results which require the state space to be Polish. We provide an example showing that uniqueness can fail if the state space is separable and metric (but not Polish) even though a coupling defined via a continuous and positive definite function exists.


Introduction
One important question in the theory of Markov processes is that of existence and uniqueness of invariant probability measures (ipms). In this note we will concentrate on uniqueness. A sufficient condition for uniqueness of an ipm is provided by Doob's theorem, based on appropriate equivalence assumptions on the transition probabilities. In fact, conditions of this kind even imply total variation convergence of all or almost all transition probabilities (see [6], [8]). On the other hand, there are a number of cases in which an ipm is known to be unique and for which it is also known that equivalence of transition probabilities fails, for example certain classes of stochastic functional differential equations, see e.g. [5]. In [5, Theorem 1.1, Corollary 2.2] and later in [7, Theorem 1, Corollary 1], the authors provided uniqueness criteria in terms of generalized couplings. A basic assumption in both papers is that the state space is Polish (i.e. a separable and completely metrizable topological space), a fact which is used in order to apply an ergodic decomposition theorem but also to guarantee inner regularity of finite Borel measures.
In recent years, there seems to be growing interest in invariant measures for Markov processes with non-Polish state space, like spaces of bounded measurable functions (e.g. [2]).
In this note, we generalize previous results to (not necessarily separable) metric spaces and, in the Polish state space case, we allow the distance function appearing in the coupling assumption to be a lower semi-continuous positive definite function rather than a metric. We also provide an example showing that this generalization fails to hold if the state space is separable and metric but not complete.
Let us briefly recall the previous approaches to show uniqueness via generalized couplings in the case of a Polish state space. Assume that a Markov kernel P admits more than one ipm. Then it is known that P admits two distinct ergodic and hence mutually singular ipms µ and ν (see, e.g., [4]). Therefore, there exist disjoint compact sets A and B of µ(resp. ν)-measure almost 1. By ergodicity, starting in A, the Markov chain will almost surely spend a large proportion of time in A and similarly for B. No matter how we couple the chains starting in A and in B: most of the time, the first process is in A and the second one is in B and so their distance is at least equal to the distance of the sets A and B (which is strictly positive), thus contradicting the usual coupling assumption that there exists a coupling for which the processes starting in A and in B are very close for large times. This argument still holds if couplings are replaced by generalized couplings (see the definition below).
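The counting step behind this heuristic can be checked on finite trajectories. The following is a minimal sketch and not taken from [5] or [7]: the function name `time_counts`, the sets A = [0, 1] and B = [3, 4] on the real line, and the two toy paths are hypothetical choices for illustration. It shows that if the first path is in A at a of n times and the second path is in B at b of n times, then both happen simultaneously at least a + b - n times, and at those times the paths are at distance at least d(A, B).

```python
def time_counts(xs, ys, in_a, in_b):
    """Count the times i with xs[i] in A, with ys[i] in B, and with both."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("paths must be non-empty and of equal length")
    count_a = sum(1 for x in xs if in_a(x))
    count_b = sum(1 for y in ys if in_b(y))
    count_both = sum(1 for x, y in zip(xs, ys) if in_a(x) and in_b(y))
    return count_a, count_b, count_both


# Hypothetical toy data: A = [0, 1] and B = [3, 4], so d(A, B) = 2.
in_a = lambda x: 0.0 <= x <= 1.0
in_b = lambda y: 3.0 <= y <= 4.0
xs = [0.2, 0.5, 2.0, 0.9, 0.1, 0.7, 0.3, 0.4, 0.6, 0.8]
ys = [3.5, 3.1, 3.9, 2.5, 3.3, 3.2, 3.8, 3.6, 2.0, 3.4]

n = len(xs)
count_a, count_b, count_both = time_counts(xs, ys, in_a, in_b)

# Inclusion-exclusion lower bound for the number of simultaneous visits.
lower_bound = count_a + count_b - n

# Distances at the times where the first path is in A and the second in B.
distances = [abs(x - y) for x, y in zip(xs, ys) if in_a(x) and in_b(y)]
```

In this toy example the bound is attained; in general it only gives a lower estimate for the proportion of times at which the two paths are far apart.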
The note is organized as follows. In the following section, we provide three elementary propositions, of which the first and the third constitute an elementary substitute for the ergodic decomposition property, which does not seem to be known for a general state space. Then we present and prove the main result along the lines of [5] and [7], using these propositions in place of the ergodicity and inner regularity arguments available in the Polish case.

Preliminaries
Let $P$ be a Markov kernel on the measurable space $(E, \mathcal{E})$. We denote the set of probability measures on $(E, \mathcal{E})$ by $\mathcal{M}_1(E, \mathcal{E})$ or just $\mathcal{M}_1(E)$. If $\mu \in \mathcal{M}_1(E)$, then we write $\mu P$ for the image of $\mu$ under $P$. We are interested in providing criteria for the uniqueness of an invariant probability measure (ipm), i.e. a probability measure $\pi$ on $(E, \mathcal{E})$ satisfying $\pi P = \pi$. We call two probability measures $\mu$ and $\nu$ on the measurable space $(E, \mathcal{E})$ (mutually) singular, denoted $\mu \perp \nu$, if there exists a set $C \in \mathcal{E}$ such that $\mu(C) = 1$ and $\nu(C) = 0$. As usual, $\mu \ll \nu$ means that the measure $\mu$ is absolutely continuous with respect to $\nu$. If $E$ is a topological space, then we denote its Borel σ-field by $\mathcal{B}(E)$.
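For $x \in E$, we denote the law of the chain with kernel $P$ and initial condition $x$ by $\mathbb{P}_x$. Note that $\mathbb{P}_x$ is a probability measure on the space $(E^{\mathbb{N}_0}, \mathcal{E}^{\mathbb{N}_0})$. The set
$$C(\mathbb{P}_x, \mathbb{P}_y) := \{\xi \in \mathcal{M}_1(E^{\mathbb{N}_0} \times E^{\mathbb{N}_0}) : \pi_1(\xi) = \mathbb{P}_x,\ \pi_2(\xi) = \mathbb{P}_y\}$$
is called the set of couplings of $\mathbb{P}_x$ and $\mathbb{P}_y$. Here, $\pi_i(\xi)$ denotes the image of $\xi$ under the projection on the $i$-th coordinate, $i = 1, 2$. The set of generalized couplings $\hat{C}(\mathbb{P}_x, \mathbb{P}_y)$ is defined as
$$\hat{C}(\mathbb{P}_x, \mathbb{P}_y) := \{\xi \in \mathcal{M}_1(E^{\mathbb{N}_0} \times E^{\mathbb{N}_0}) : \pi_1(\xi) \ll \mathbb{P}_x,\ \pi_2(\xi) \ll \mathbb{P}_y\}.$$
The following elementary proposition is a consequence of the ergodic decomposition theorem under the assumption that the space $(E, \mathcal{E})$ is standard Borel, i.e. measurably isomorphic to a Polish space equipped with its Borel σ-field, but we are not aware of a proof in the general case.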
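On a finite set, the difference between couplings and generalized couplings can be checked by hand. The sketch below is only an illustration under simplifying assumptions: it uses two probability vectors `p` and `q` on a three-point set in place of the path laws, and the helper names `marginals`, `is_coupling` and `is_generalized_coupling` are hypothetical. A joint law is a coupling if its marginals equal `p` and `q`; it is a generalized coupling if its marginals are absolutely continuous with respect to `p` and `q`, which on a finite set means that they put no mass where `p` or `q` vanishes.

```python
from fractions import Fraction as F


def marginals(xi):
    """Row sums and column sums of a joint probability matrix xi[i][j]."""
    rows = [sum(row) for row in xi]
    cols = [sum(col) for col in zip(*xi)]
    return rows, cols


def is_coupling(xi, p, q):
    """True if the marginals of xi are exactly p and q."""
    rows, cols = marginals(xi)
    return rows == list(p) and cols == list(q)


def absolutely_continuous(m, ref):
    """On a finite set: m << ref iff m vanishes wherever ref vanishes."""
    return all(mi == 0 for mi, ri in zip(m, ref) if ri == 0)


def is_generalized_coupling(xi, p, q):
    """True if xi is a probability matrix with marginals << p and << q."""
    rows, cols = marginals(xi)
    return (sum(rows) == 1
            and absolutely_continuous(rows, p)
            and absolutely_continuous(cols, q))


# Hypothetical example on the set {0, 1, 2}.
p = [F(1, 2), F(1, 2), F(0)]
q = [F(0), F(1, 2), F(1, 2)]

product = [[pi * qj for qj in q] for pi in p]      # independent coupling
point_mass = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]     # all mass at (0, 1)
bad = [[0, 0, 0], [0, 0, 0], [0, 1, 0]]            # charges state 2, where p = 0
```

Here `product` is a coupling (and hence a generalized coupling), `point_mass` is a generalized coupling that is not a coupling, and `bad` is neither.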
Proposition 2.1. Assume that $P$ admits more than one ipm. Then there exist two mutually singular ipms.
Proof. Let $\mu$ and $\nu$ be two distinct ipms. Assume first that $\mu$ and $\nu$ are mutually equivalent and define $f(x) := \frac{d\mu}{d\nu}(x)$, $x \in E$, and $A := \{x \in E : f(x) > 1\}$. Then $\mu(A), \nu(A) \in (0, 1)$. We have (by invariance of $\nu$ and $\mu$) [...] Since $f(y) > 1$ on $A$ and $f(y) \le 1$ on $A^c$, it follows that all four expressions in the two equations are in fact equal. This implies $P(y, A^c) = 0$ for ($\mu$ or $\nu$)-almost all $y \in A$ and hence $P(y, A) = 0$ for almost all $y \in A^c$. Therefore, the probability measures $\frac{1}{\mu(A)} \mu|_A$ and $\frac{1}{\mu(A^c)} \mu|_{A^c}$ are mutually singular ipms.
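One way to carry out the computation in the equivalent case is sketched below. This is a reconstruction under the assumptions of the proof ($f = d\mu/d\nu$, $A = \{f > 1\}$, both measures invariant) and is not claimed to coincide with the displays of the original argument.

```latex
% Invariance of \mu and \nu, together with d\mu = f\,d\nu, gives
\begin{align*}
\mu(A) - \nu(A)
  &= \int_E P(y, A)\,\bigl(f(y) - 1\bigr)\,d\nu(y) \\
  &= \int_A P(y, A)\,\bigl(f(y) - 1\bigr)\,d\nu(y)
     + \int_{A^c} P(y, A)\,\bigl(f(y) - 1\bigr)\,d\nu(y) \\
  &\le \int_A P(y, A)\,\bigl(f(y) - 1\bigr)\,d\nu(y)
   \le \int_A \bigl(f(y) - 1\bigr)\,d\nu(y)
   = \mu(A) - \nu(A).
\end{align*}
% Hence equality holds throughout. Since f - 1 > 0 on A, this forces
%   P(y, A^c) = 0  for \nu-almost all y \in A.
% Invariance of \nu then yields
\begin{equation*}
\nu(A) = \int_A P(y, A)\,d\nu(y) + \int_{A^c} P(y, A)\,d\nu(y)
       = \nu(A) + \int_{A^c} P(y, A)\,d\nu(y),
\end{equation*}
% so P(y, A) = 0 for \nu-almost all y \in A^c.
```

The first inequality uses $f \le 1$ on $A^c$, the second uses $P(y, A) \le 1$ and $f > 1$ on $A$.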
Let us now assume that $\mu$ and $\nu$ are distinct ipms which are neither equivalent nor singular. Without loss of generality we assume that $\mu$ is not absolutely continuous with respect to $\nu$. Then there exist disjoint sets $A, B, C \in \mathcal{E}$ such that $A \cup B \cup C = E$, $\mu$ and $\nu$ restricted to $B$ are equivalent, $\nu(A) = 0$ and $\mu(C) = 0$. By assumption, $\mu(A) > 0$ and $\mu(B), \nu(B) > 0$. Then $P(x, B) = 1$ for ($\mu$ or $\nu$)-almost all $x \in B$, showing that the normalized restrictions of $\mu$ to $B$ and to $A$ are mutually singular invariant probability measures.
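The construction in the equivalent case of the proof can be followed on a small finite example. The sketch below is an illustration only: the four-state kernel `P`, the measures `mu` and `nu`, and the helper names `push_forward` and `normalized_restriction` are hypothetical choices, and exact rational arithmetic is used so that invariance can be tested by equality.

```python
from fractions import Fraction as F


def push_forward(mu, P):
    """Return mu P, i.e. (mu P)(j) = sum_i mu(i) P(i, j)."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]


def normalized_restriction(mu, subset):
    """Restrict mu to subset and renormalize to a probability vector."""
    mass = sum(mu[i] for i in subset)
    return [mu[i] / mass if i in subset else F(0) for i in range(len(mu))]


h = F(1, 2)
# Hypothetical kernel with two closed classes {0, 1} and {2, 3}.
P = [[h, h, 0, 0],
     [h, h, 0, 0],
     [0, 0, h, h],
     [0, 0, h, h]]

# Two distinct, mutually equivalent invariant probability vectors.
mu = [F(3, 8), F(3, 8), F(1, 8), F(1, 8)]
nu = [F(1, 8), F(1, 8), F(3, 8), F(3, 8)]

# Density f = d(mu)/d(nu) and the set A = {f > 1} from the proof.
f = [m / v for m, v in zip(mu, nu)]
A = {i for i, fi in enumerate(f) if fi > 1}
A_complement = set(range(len(mu))) - A

mu_A = normalized_restriction(mu, A)
mu_Ac = normalized_restriction(mu, A_complement)
```

In this example `A` is the first closed class, and the two normalized restrictions are invariant with disjoint supports, as the proof predicts.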
If $E$ is a non-empty set, $A$ and $B$ are subsets of $E$ and $\rho : E \times E \to [0, \infty)$, then we define
$$\rho(A, B) := \inf\{\rho(x, y) : x \in A,\ y \in B\},$$
where the infimum over the empty set is defined as $+\infty$. If $A = \{x\}$, then we write $\rho(x, B)$ instead of $\rho(\{x\}, B)$. Further, we call such a function $\rho$ positive definite if $\rho(x, y) = 0$ iff $x = y$.
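For finite sets the quantity $\rho(A, B)$ can be computed directly. The following minimal sketch uses the hypothetical function name `set_distance` and the absolute value on the real line as an example of $\rho$; it follows the convention that the infimum over the empty set is $+\infty$.

```python
import math


def set_distance(rho, A, B):
    """rho(A, B) = inf{rho(x, y) : x in A, y in B}, with inf over nothing = +inf.

    A and B are finite iterables here, so the infimum is a minimum.
    """
    return min((rho(x, y) for x in A for y in B), default=math.inf)


rho = lambda x, y: abs(x - y)   # example choice of rho

d_sets = set_distance(rho, [0, 1], [3, 5])    # closest pair is (1, 3)
d_point = set_distance(rho, [4], [3, 5])      # rho(x, B) with x = 4
d_empty = set_distance(rho, [0, 1], [])       # infimum over the empty set
```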
Proposition 2.2. Let $\mu$ and $\nu$ be probability measures on the Borel sets of a metric space $(E, d)$ such that $\mu \perp \nu$. Let $C \in \mathcal{B}(E)$ be such that $\mu(C) = 1$ and $\nu(C) = 0$. Then, for every $\varepsilon > 0$, there exist closed sets $A \subset C$ and $B \subset C^c$ such that $\mu(A) > 1 - \varepsilon$, $\nu(B) > 1 - \varepsilon$ and $d(A, B) > 0$.
If $E$ is Polish and $d$ is a (not necessarily complete) metric which generates the topology of $E$, then, in addition, $A$ and $B$ can be chosen to be compact. In this case, for any lower semi-continuous and positive definite function $\rho : E \times E \to [0, \infty)$, we have $\rho(A, B) > 0$.
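Proof. By [3, Lemma 7.2.4], there exists a closed set $A \subseteq C$ such that $\mu(A) > 1 - \varepsilon$. Similarly, there is a closed set $B_0 \subset C^c$ for which $\nu(B_0) > 1 - \varepsilon/2$. For $n \in \mathbb{N}$, let $B_n := \{y \in E : d(y, A) \ge 1/n\}$. Each $B_n$ is closed and, since $A$ is closed, the sets $B_n$ increase to $A^c \supseteq C^c$, so $\nu(B_n) \to 1$. Choose $n \in \mathbb{N}$ such that $\nu(B_n \cap B_0) > 1 - \varepsilon$. Then $A$ and $B := B_n \cap B_0$ satisfy all properties stated in the proposition (and $d(A, B) \ge 1/n$).
On a Polish space, every finite measure on the Borel sets is regular ([3, Proposition 8.1.12]) and therefore there exist compact sets $A \subset C$ and $B \subset C^c$ such that $\mu(A) > 1 - \varepsilon$ and $\nu(B) > 1 - \varepsilon$. Since $A$ and $B$ are disjoint and compact, we have $d(A, B) > 0$. Moreover, if $\rho : E \times E \to [0, \infty)$ is lower semi-continuous and positive definite, then, automatically, $\rho(A, B) > 0$ by compactness of $A$ and $B$: a lower semi-continuous function attains its infimum on the compact set $A \times B$, and $\rho$ is strictly positive off the diagonal.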
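The choice of $n$ in the first part of the proof can be made concrete on a discrete example. The sketch below is a toy illustration with hypothetical names (`choose_separated_set`, `support`, `weights`): $\mu$ is the point mass at $0$, so $A = \{0\}$, and $\nu$ is uniform on the points $1/k$, $k = 1, \dots, 10$. For simplicity $B_0$ is taken to be the whole support of $\nu$, so that $B = B_n$.

```python
from fractions import Fraction as F


def choose_separated_set(support, weights, A, dist, eps, max_n=10_000):
    """Smallest n with nu(B_n) > 1 - eps, where B_n = {y : dist(y, A) >= 1/n}.

    support: finite list of atoms of nu; weights: dict atom -> nu-mass.
    max_n is a safety cap added for this sketch only.
    """
    for n in range(1, max_n + 1):
        B_n = [y for y in support
               if min(dist(y, a) for a in A) >= F(1, n)]
        if sum(weights[y] for y in B_n) > 1 - eps:
            return n, B_n
    raise ValueError("no suitable n found up to max_n")


support = [F(1, k) for k in range(1, 11)]      # atoms 1, 1/2, ..., 1/10
weights = {y: F(1, 10) for y in support}       # nu is uniform on the atoms
A = [F(0)]                                     # mu is the point mass at 0
dist = lambda x, y: abs(x - y)

n, B = choose_separated_set(support, weights, A, dist, eps=F(1, 4))
gap = min(dist(y, a) for y in B for a in A)    # d(A, B) for the chosen B
```

With $\varepsilon = 1/4$ the procedure keeps the eight atoms farthest from $0$ and discards the two closest ones.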
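Proposition 2.3. Let $\mu$ be an invariant probability measure of the Markov kernel $P$ on the measurable space $(E, \mathcal{E})$ and let $f : E \to \mathbb{R}$ be bounded and measurable. For $\gamma \in \mathbb{R}$ define $\psi_\gamma : E \to [0, 1]$ by
$$\psi_\gamma(x) := \mathbb{P}_x\Big(\liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} f(X_i) \ge \gamma\Big). \tag{1}$$
Then $\psi_\gamma(x) \in \{0, 1\}$ for $\mu$-almost all $x \in E$.
If, moreover, $f(x) \in [0, 1]$ for all $x \in E$, $m := \int f \, d\mu$, and $\gamma \in [0, m]$, then
$$\mu\big(\{x : \psi_\gamma(x) = 1\}\big) \ge 1 - \frac{1 - m}{1 - \gamma}.$$
Proof. Let $X_0, X_1, \dots$ be the Markov chain started with $\mathcal{L}(X_0) = \mu$, defined on a space $(\Omega, \mathcal{F}, \mathbb{P})$. Then $\psi_\gamma(X_n)$, $n \in \mathbb{N}_0$, is a stationary process and a (bounded) martingale with respect to the complete filtration $(\mathcal{F}_n)$ generated by $(X_n)$, so $Z := \lim_{n \to \infty} \psi_\gamma(X_n)$ exists almost surely by the martingale convergence theorem. Stationarity implies that $n \mapsto \psi_\gamma(X_n)$ is almost surely constant. Further, $Z$ is $\mathcal{F}_\infty$-measurable and therefore $Z \in \{0, 1\}$ almost surely. Hence, $\psi_\gamma(x) \in \{0, 1\}$ for $\mu$-almost all $x \in E$.
To establish the final statement, we apply Birkhoff's ergodic theorem to see that the averages $\frac{1}{n} \sum_{i=0}^{n-1} f(X_i)$ converge almost surely to a random variable $Y$ with values in $[0, 1]$ and $\mathbb{E} Y = m$. Therefore, by Markov's inequality,
$$\mu\big(\{x : \psi_\gamma(x) = 1\}\big) = \mathbb{P}(Y \ge \gamma) = 1 - \mathbb{P}(1 - Y > 1 - \gamma) \ge 1 - \frac{1 - m}{1 - \gamma},$$
so the proof is complete.
Remark 2.4. Note that, due to Birkhoff's ergodic theorem, we could replace the lim inf in (1) by lim sup or lim. This changes the value of $\psi_\gamma$ only on a set of $\mu$-measure 0.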
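The last inequality in the proof of Proposition 2.3 is an instance of Markov's inequality. As a sketch, assume that $Y$ denotes the almost sure limit of the time averages of $f$ along the stationary chain (as provided by Birkhoff's ergodic theorem), so that $0 \le Y \le 1$ and $\mathbb{E} Y = m$, and assume in addition $\gamma < 1$ so that the quotient is defined.

```latex
% Markov's inequality applied to the non-negative variable 1 - Y:
\begin{align*}
\mathbb{P}(Y \ge \gamma)
  &= 1 - \mathbb{P}(Y < \gamma)
   = 1 - \mathbb{P}(1 - Y > 1 - \gamma) \\
  &\ge 1 - \frac{\mathbb{E}[1 - Y]}{1 - \gamma}
   = 1 - \frac{1 - m}{1 - \gamma}.
\end{align*}
% The bound is informative only for \gamma < m (it equals 0 at \gamma = m)
% and is close to 1 when 1 - m is small compared with 1 - \gamma.
```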

Main result
Before we state the main result we address a small technical issue. If the metric space $(E, d)$ is not separable, then it may happen that the map $(x, y) \mapsto d(x, y)$ is not $\mathcal{B}(E) \otimes \mathcal{B}(E)$-measurable (the map is of course $\mathcal{B}(E \times E)$-measurable but $\mathcal{B}(E) \otimes \mathcal{B}(E)$ may be strictly contained in $\mathcal{B}(E \times E)$). If $\xi$ is a probability measure on $(E \times E, \mathcal{E} \otimes \mathcal{E})$, then we silently assume that an expression like $\xi(A)$ is interpreted as $\xi^*(A)$ in case $A$ is not measurable, where $\xi^*$ denotes the outer measure associated to $\xi$.
Theorem 3.1. Let $\mu_1$ and $\mu_2$ be invariant probability measures of the Markov kernel $P$ on the metric space $(E, d)$ with Borel σ-field $\mathcal{E} := \mathcal{B}(E)$. Assume that there exists a set $M \in \mathcal{E} \otimes \mathcal{E}$ such that $\mu_1 \otimes \mu_2(M) > 0$ and that for every $(x, y) \in M$ there exists some $\alpha_{x,y} > 0$ such that for every $\varepsilon > 0$ there exists some $\xi^{\varepsilon}_{x,y} \in \hat{C}(\mathbb{P}_x, \mathbb{P}_y)$ such that
[...] (2)
Then $\mu_1$ and $\mu_2$ cannot be mutually singular.
If, moreover, $E$ is Polish and $\rho : E \times E \to [0, \infty)$ is a lower semi-continuous and positive definite function for which (2) holds with $d$ replaced by $\rho$, then, again, $\mu_1$ and $\mu_2$ cannot be mutually singular.
The following corollary is a simple consequence of Theorem 3.1 and Proposition 2.1.
Corollary 3.2. Let $P$ be a Markov kernel on the metric space $(E, d)$ with Borel σ-field $\mathcal{E} := \mathcal{B}(E)$.
Assume that there exists a set $M \in \mathcal{E}$ such that $\mu(M) > 0$ for every invariant probability measure $\mu$ and that for every $x, y \in M$ there exists $\alpha_{x,y} > 0$ such that for every $\varepsilon > 0$ there exists some $\xi^{\varepsilon}_{x,y} \in \hat{C}(\mathbb{P}_x, \mathbb{P}_y)$ such that
[...] (3)
where either $\rho = d$, or $\rho$ is lower semi-continuous and positive definite and $E$ is Polish. Then there exists at most one invariant probability measure.

Remark 3.3. Conditions (2) and (3) are slightly weaker than [7, (2.5)]: condition (2) is of the form $\mathbb{P}\big(\limsup_{n \to \infty} Z_n \ge \alpha\big) > 0$ while (2.5) in [7] is of the form $\limsup_{n \to \infty} \mathbb{E} Z_n \ge \alpha$.
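The comparison in Remark 3.3 can be made explicit under one additional assumption, which is ours and is not stated in the remark: the random variables $Z_n$ take values in $[0, 1]$. Under this assumption, a sketch of why the expectation condition implies the probability condition, and why the converse fails, reads as follows.

```latex
% Assumption (for this sketch only): 0 \le Z_n \le 1 for all n, and \alpha > 0.
% Reverse Fatou lemma (the Z_n are dominated by the constant 1):
\begin{equation*}
\mathbb{E}\Bigl[\limsup_{n \to \infty} Z_n\Bigr]
  \;\ge\; \limsup_{n \to \infty} \mathbb{E}[Z_n] \;\ge\; \alpha .
\end{equation*}
% If \limsup_n Z_n < \alpha held almost surely, the left-hand side would be
% strictly smaller than \alpha. Hence
\begin{equation*}
\mathbb{P}\Bigl(\limsup_{n \to \infty} Z_n \ge \alpha\Bigr) > 0 .
\end{equation*}
% The converse fails: for \alpha \in (0, 1] let Z_n = Z for all n with
% P(Z = 1) = \alpha/2 and P(Z = 0) = 1 - \alpha/2. Then
% P(\limsup_n Z_n \ge \alpha) = \alpha/2 > 0, but E[Z_n] = \alpha/2 < \alpha.
```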
Proof of Theorem 3.1. Assume that $\mu_1$ and $\mu_2$ are mutually singular invariant probability measures of $P$. Let $C \in \mathcal{E}$ be a set such that $\mu_1(C) = 1$ and $\mu_2(C) = 0$. By Proposition 2.2, for given $\kappa > 0$ there exist closed sets $A \subset C$ and $B \subset C^c$ such that $\mu_1(A) > 1 - \kappa$, $\mu_2(B) > 1 - \kappa$, and $\rho(A, B) > 0$, with $\rho := d$ if $E$ is not Polish.
Denoting the chain starting at $X_0 = x \in E$ by $(X_i^x)$, $i \in \mathbb{N}_0$, we have, by Proposition 2.3, [...] where $\gamma \in (0, 1)$.
We now proceed to assign specific values to the variables γ and κ.
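To indicate how Proposition 2.3 can enter at this point, the following sketch applies its final statement to the indicator function $f = \mathbb{1}_A$. It assumes that $\psi_\gamma(x)$ is the probability, for the chain started at $x$, that the lower limit of the time averages of $f$ is at least $\gamma$, and that $\gamma \le \mu_1(A)$ and $\gamma < 1$. It is an illustration of the type of estimate available, not a reproduction of the displayed formula or of the values of $\gamma$ and $\kappa$ chosen in the proof.

```latex
% Proposition 2.3 with f = 1_A and m = \mu_1(A) > 1 - \kappa:
\begin{equation*}
\mu_1\Bigl(\Bigl\{x \in E :
   \mathbb{P}\Bigl(\liminf_{n \to \infty} \frac{1}{n}
     \sum_{i=0}^{n-1} \mathbb{1}_A\bigl(X_i^x\bigr) \ge \gamma\Bigr) = 1
 \Bigr\}\Bigr)
 \;\ge\; 1 - \frac{1 - \mu_1(A)}{1 - \gamma}
 \;>\; 1 - \frac{\kappa}{1 - \gamma}.
\end{equation*}
% The same estimate holds with (\mu_1, A) replaced by (\mu_2, B).
% The right-hand side is close to 1 as soon as \kappa is small
% compared with 1 - \gamma.
```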

A counterexample
The basic set-up of the following example is inspired by [1, Example 1], in which the authors show that the "gluing lemma" need not hold on a separable and metrizable space. Our example shows that uniqueness of an invariant probability measure may fail even if $E$ is separable and metric and there exists a continuous and positive definite function $\rho$ on $E \times E$ such that for every pair $x, y \in E$ there exists a (true) coupling $(X_n, Y_n)$ for which $\rho(X_n, Y_n)$ converges to 0 almost surely.
Note that $E$ is separable (but not Polish since otherwise the following construction could not work). We define $\rho : E \times E \to [0, 1]$ as $\rho\big((x, i), (y, j)\big) = |x - y|$ for $(x, i) \in E_i$, $(y, j) \in E_j$, $i, j \in \{1, 2\}$. Obviously, $\rho$ is continuous. Further, $\rho$ is positive definite since $\rho\big((x, i), (y, j)\big) = 0$ implies that $i = j$ and hence either both $x$ and $y$ are in $I$ or both $x$ and $y$ are in $J$ (since $I$ and $J$ are disjoint). In fact, $\rho$ is a (continuous) metric on $E$ which makes $(E, \rho)$ a Polish space (which is isometric to the interval $[0, 1]$ equipped with the Euclidean metric). Note that the topology generated by $\rho$ is different from the one generated by $d$.
Next, we construct an $E$-valued Markov chain with two different invariant measures $\mu$ and $\nu$ and a coupling $(X_n, Y_n)$ of two copies of the chain starting at $(x, y)$ such that $\lim_{n \to \infty} \rho(X_n, Y_n) = 0$ almost surely.
We define the Markov kernel $P$ on $E$ by $P(x, \cdot) =$