Asymptotic behaviour of random tridiagonal Markov chains in biological applications

Discrete-time, discrete-state random Markov chains with a tridiagonal generator are shown to have a random attractor consisting of singleton subsets, essentially a random path, in the simplex of probability vectors. The proof uses the Hilbert projective metric and the fact that the linear cocycle generated by the Markov chain is a uniformly contractive mapping of the positive cone into itself. The proof does not involve probabilistic properties of the sample path and is thus equally valid in the nonautonomous deterministic context of Markov chains with, say, periodically varying transition probabilities, in which case the attractor is a periodic path.


1. Introduction. Markov chains with a tridiagonal generator are common in biological applications, see, e.g., [1,13,21]. Their asymptotic behaviour is well understood when the transition probabilities are constant, i.e., the Markov chain is homogeneous or, equivalently, autonomous in the language of dynamical systems. In this paper we consider the case where the transition probabilities can vary in time, e.g., periodically or randomly, due to a periodically or randomly changing environment. The Markov chains are then nonautonomous or random dynamical systems [2,8,15], and the concepts of autonomous dynamical systems, such as equilibria, are inadequate. Thus a new concept of nonautonomous or random attractors is needed.
The results in this paper are presented in the context of random Markov chains and random dynamical systems, although the proofs do not depend at all on probabilistic properties of the sample path parameter ω and are thus equally valid in the nonautonomous deterministic context of Markov chains with, say, periodically varying transition probabilities.
Tridiagonal Markov chains, both deterministic and random, are presented in Section 2. The long-term dynamical behaviour in the autonomous deterministic case is then given in Section 3; although it follows as a special case of the main result, it provides useful background information for the random case. The proof uses the fact that random Markov chains generate contractive linear cocycles which map a positive cone into itself. First, in Section 4, a general theorem on the existence of a random attractor with singleton subsets in a metric space is formulated and proved. The assumptions seem rather restrictive at first sight, but are just what is needed later. Then, in Section 5, some preliminaries from the theory of positive linear operators are recalled, the essence of which is that, in quite a general situation, for linear maps positive with respect to the same invariant cone there exists a common metric, the Hilbert projective metric, in which all these linear mappings are uniformly contractive. Finally, in Section 6, it is shown that the linear cocycles generated by the random Markov chains of Section 2 satisfy the conditions of the abstract theorem from Section 4 under uniform upper and lower positivity bounds on the tridiagonal transition probabilities and thus have a random attractor consisting of singleton subsets. The random attractor is essentially a randomly varying path in the simplex of probability vectors which pathwise attracts all other iterates of the Markov chain. In the nonautonomous deterministic setting with periodic transition probabilities it is a periodic path.
There is an extensive literature on products of random Markov chains, see, e.g., [10,11,12,17,18,22]. Although the problem investigated here does not seem to have been addressed directly as such in the literature, the results could probably be obtained by extending the proofs in [11,22] after computations similar to those needed below. The proof given here is preferable since it is direct and is written in the language of random dynamical systems. In particular, the paper demonstrates that the phenomenon of a random attractor with singleton component sets occurs not only for certain classes of monotone random systems, as reported earlier in [9], but also for the class of random Markov chains studied below.
2. Tridiagonal Markov chains in biological models. Markov chains with tridiagonal transition matrices are common in biological models, for example, birth-and-death processes [1], cell-cell communication [13] and cancer dynamics [21], to name just a few.
To fix ideas, consider the distance d(t_n) between two cells at time t_n = n∆, which is supposed to take discrete values in {1, . . . , N}, essentially the distance that they can move in one unit of time, where d(t_n) can stay unchanged or change to d(t_n) ± 1 with certain probabilities. This can be formulated as an N-state discrete-time Markov chain with states {1, . . . , N} corresponding to the value of d(t_n).
Let p(t_n) = (p_1(t_n), . . . , p_N(t_n))^T be the probability vector for the state of the system at time t_n. The dynamics are described by a system of difference equations whose coefficients q_j > 0, j = 1, . . . , 2N − 2, enter through the tridiagonal matrix

Q = \begin{pmatrix}
-q_1 & q_2 & & & \\
q_1 & -q_2 - q_3 & q_4 & & \\
& q_3 & -q_4 - q_5 & \ddots & \\
& & \ddots & \ddots & q_{2N-2} \\
& & & q_{2N-3} & -q_{2N-2}
\end{pmatrix},    (1)

while the p_j satisfy the probability constraints p_j(t_n) ≥ 0 and p_1(t_n) + · · · + p_N(t_n) = 1, i.e., p(t_n) belongs to the simplex Σ_N of probability vectors in R^N. This is a vector-valued difference equation

p(t_{n+1}) = [I_N + ∆Q] p(t_n).    (2)

This is a discrete-time finite-state Markov chain with the transition matrix [I_N + ∆Q]. It is a first order linear difference equation on Σ_N corresponding to the Euler numerical scheme for the algebraic-differential equation dp/dt = Qp, p ∈ Σ_N, with the constant time step ∆ > 0.
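To fix the indexing conventions, here is a minimal numerical sketch of this chain. The rates q_j, the step ∆ = 0.5 and the helper name `tridiagonal_Q` are illustrative choices, not from the paper; the sketch builds a generator Q with zero column sums and iterates the difference equation p(t_{n+1}) = [I_N + ∆Q] p(t_n):

```python
import numpy as np

def tridiagonal_Q(q):
    # Build the tridiagonal generator from rates q = (q_1, ..., q_{2N-2});
    # every column sums to zero, as required of a generator.
    N = len(q) // 2 + 1
    Q = np.zeros((N, N))
    for j in range(N - 1):
        down, up = q[2 * j], q[2 * j + 1]  # q_{2j+1}, q_{2j+2} in 1-based notation
        Q[j, j] -= down
        Q[j + 1, j] += down
        Q[j + 1, j + 1] -= up
        Q[j, j + 1] += up
    return Q

q = np.array([0.4, 0.3, 0.2, 0.5])   # illustrative rates, N = 3 states
Q = tridiagonal_Q(q)
dt = 0.5                             # time step Delta
L = np.eye(3) + dt * Q               # one-step transition matrix I_N + Delta*Q

p = np.array([1.0, 0.0, 0.0])        # start with all mass in state 1
for _ in range(200):
    p = L @ p                        # p(t_{n+1}) = [I_N + Delta*Q] p(t_n)
```

The iterates stay in the simplex because the columns of I_N + ∆Q sum to 1; for ∆ small enough they converge to the stationary vector discussed in Section 3.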
2.1. Random Markov chains. Let (Ω, F, P) be a probability space and suppose now that the coefficients in the matrix Q are random, i.e., the q_j : Ω → R are F-measurable mappings or, equivalently, Q : Ω → R^{N×N} is an F-measurable N × N matrix valued mapping. This corresponds to a random environment, which is supposed to vary or be driven by a stochastic process modelled by a metrical (i.e., measurable) dynamical system Θ = {θ_n, n ∈ Z} on Ω generated by a bi-measurable invertible mapping θ : Ω → Ω. In particular, Θ satisfies θ_0 ω = ω and θ_{m+n} ω ≡ θ_m(θ_n ω) for all m, n ∈ Z, ω ∈ Ω. See Arnold [2] for more information.
Define L_ω := I_N + ∆Q(ω) and assume that ∆ > 0 is sufficiently small, so that for each given ω the eigenvalues of the matrix L_ω lie in the closed unit disc of the complex plane (see the next section).
This gives the random Markov chain

p(t_{n+1}) = L_{θ_n ω} p(t_n),    (3)

which is a random linear difference equation on Σ_N; see [2,8,14] for random difference equations. The iterates of (3) are random probability vectors in Σ_N, i.e., F-measurable mappings p : Ω → Σ_N.
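A sketch of the random chain (the i.i.d. uniform rates and the bounds 0.2 and 0.8 are illustrative assumptions, not from the paper; any stationary ergodic driving would do). Two trajectories driven by the same realization of the environment synchronize, which is the singleton-attractor phenomenon proved below:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_L(N, dt, lo=0.2, hi=0.8):
    # One sample L_omega = I_N + dt*Q(omega) with rates q_j(omega) ~ U[lo, hi].
    q = rng.uniform(lo, hi, size=2 * N - 2)
    Q = np.zeros((N, N))
    for j in range(N - 1):
        Q[j, j] -= q[2 * j]
        Q[j + 1, j] += q[2 * j]
        Q[j + 1, j + 1] -= q[2 * j + 1]
        Q[j, j + 1] += q[2 * j + 1]
    return np.eye(N) + dt * Q

N, dt = 4, 0.3   # dt < 1/(2*0.8), so the entries of L_omega stay nonnegative
p = np.array([1.0, 0.0, 0.0, 0.0])
r = np.array([0.0, 0.0, 0.0, 1.0])
for _ in range(200):
    L = random_L(N, dt)   # the same environment acts on both trajectories
    p, r = L @ p, L @ r
gap = np.abs(p - r).max()
```

Whatever the initial distribution, all trajectories shadow the same random path; `gap` is tiny after a couple of hundred steps.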
3. Dynamical behaviour: deterministic case. It is well known that, under certain nondegeneracy conditions, the "deterministic" Markov chain (2) has a unique equilibrium state which is globally asymptotically stable in Σ_N. This result follows as a special case of the main result of this paper below. A direct proof using elementary methods will now be given, since it also provides useful background information for the general "random" case. Let 1_N be the column vector in R^N with all components equal to 1. Then

1_N^T Q = 0_N^T,    (4)

i.e., each column of Q adds to zero. Moreover, 1_N^T [I_N + ∆Q] = 1_N^T, so 1_N is a left eigenvector corresponding to the eigenvalue λ = 1 of the matrix I_N + ∆Q. Note that the matrix I_N + ∆Q is a (column) stochastic matrix.
The Perron–Frobenius theorem applies to the matrix L_∆ := I_N + ∆Q when ∆ > 0 is chosen sufficiently small. In particular, it has the eigenvalue λ = 1 and there is a positive eigenvector x̄ which can be normalized (in the ‖·‖_1 norm) to give a probability vector p̄, i.e., [I_N + ∆Q] p̄ = p̄, so Q p̄ = 0. In fact, these properties can be shown directly for the given matrix.
One can solve Qx = 0 uniquely in R^N_+ (up to a scalar multiplier) since by assumption the q_j > 0. Specifically, reading off the rows of Qx = 0 successively gives x̄_1 = 1 and

x̄_{j+1} = (q_{2j−1}/q_{2j}) x̄_j, j = 1, . . . , N − 1,

and, hence, the probability vector p̄ = x̄ / (1_N^T x̄). The corresponding Markov chain is, in fact, ergodic since by assumption all the q_j are positive. In particular, when q_{2j−1} = q_{2j} for each j, then p̄ is the uniformly distributed probability vector with identical components p̄_i = 1/N.

Theorem. Let ∆ > 0 be sufficiently small. Then the probability eigenvector p̄ is an asymptotically stable steady state of the difference equation (2) on the simplex Σ_N.
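The recursion for x̄ can be checked numerically. In this sketch the rates are made-up values; the solution of Qx = 0 is normalized and verified against the generator directly:

```python
import numpy as np

def stationary(q):
    # Solve Q x = 0 by the row-by-row recursion x_{j+1} = (q_{2j-1}/q_{2j}) x_j
    # (1-based indices as in the text), then normalize to the simplex.
    N = len(q) // 2 + 1
    x = np.ones(N)
    for j in range(1, N):
        x[j] = x[j - 1] * q[2 * j - 2] / q[2 * j - 1]
    return x / x.sum()

q = np.array([0.4, 0.3, 0.2, 0.5])   # illustrative rates, N = 3
pbar = stationary(q)

# Cross-check against the tridiagonal generator itself.
Q = np.array([[-q[0], q[1], 0.0],
              [q[0], -(q[1] + q[2]), q[3]],
              [0.0, q[2], -q[3]]])
residual = np.abs(Q @ pbar).max()
```

With q_{2j−1} = q_{2j} for each j the recursion returns the uniform vector, as stated in the text.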
Proof. First note that, by Gershgorin's theorem applied to columns, the eigenvalues of the matrix Q lie in the union of the closed discs centered at −q_1 + 0i, . . . , −q_{2j−2} − q_{2j−1} + 0i, . . . , −q_{2N−2} + 0i in the complex plane with respective radii q_1, . . . , q_{2j−2} + q_{2j−1}, . . . , q_{2N−2}. These discs all contain the origin 0 + 0i on their boundary, but otherwise lie in the open left half of the complex plane. It is already known that 0 is an eigenvalue, so all other eigenvalues have strictly negative real parts. Moreover, 0 is a simple eigenvalue with the positive eigenvector p̄.
It is easy to show that no generalized eigenvectors exist: such a vector x would satisfy the equation Qx = 0x + p̄, i.e., Qx = p̄, which is impossible since the sum of the components on the left-hand side equals 0, while the sum on the right-hand side equals 1.
It follows that p̄ is an eigenvector of the matrix L = I_N + ∆Q corresponding to the simple eigenvalue λ = 1. Moreover, if ∆ is small enough, then all other eigenvalues of the matrix L lie inside the unit disc in the complex plane, i.e., satisfy |λ| < 1. It then follows that all solutions p^{(n)} of the difference equation (2) converge to p̄ as n → ∞. This can be shown by adapting the proof of Theorem 10.9 in [19]. Consider the Jordan canonical decomposition LS = SJ, where S is the matrix of eigenvectors and generalized eigenvectors of the matrix L with p̄ as its first column, corresponding to a 1 × 1 Jordan block [1], and the other Jordan blocks corresponding to the other eigenvalues with |λ| < 1. Then J^k converges as k → ∞ to an N × N matrix Z with z_{1,1} = 1 and all other components z_{i,j} = 0. This implies that L^k = S J^k S^{−1} → S Z S^{−1} = p̄ 1_N^T, so p^{(k)} = L^k p^{(0)} → p̄ (1_N^T p^{(0)}) = p̄ for every p^{(0)} ∈ Σ_N.

4. Random attractors of uniformly contracting cocycles. Let M be a complete metric space equipped with the metric ρ.
A mapping F : Z_+ × Ω × M → M is called a (discrete-time) cocycle on M with respect to the driving system Θ if it satisfies the initial condition F(0, ω, x) = x and the cocycle property

F(n + m, ω, x) = F(n, θ_m ω, F(m, ω, x)), m, n ∈ Z_+, ω ∈ Ω, x ∈ M.    (5)

The pair (Θ, F) is called a (discrete-time) random dynamical system in [2].
Define f(ω, x) := F(1, ω, x). Then, clearly, due to the cocycle property (5), for any n ∈ Z_+ the map F(n, ω, x) can be expressed as a superposition of the maps f(ω, x) for different ω and x:

F(n, ω, x) = f(θ_{n−1}ω, f(θ_{n−2}ω, . . . , f(ω, x) . . .)).    (6)

The map f(ω, x) is called the generator of the cocycle F(n, ω, x).
In what follows it will be supposed that the cocycle F(n, ω, x) is continuous in x for every n ∈ Z_+ and ω ∈ Ω. By (6), the cocycle F(n, ω, x) is continuous in x whenever its generator f(ω, x) is continuous in x for every ω ∈ Ω.
An F-measurable family A = {A_ω, ω ∈ Ω} of nonempty compact subsets of M is the family of image sets of an F-measurable set-valued mapping ω → A_ω, i.e., one for which the real-valued mapping ω → dist_ρ(x, A_ω) is F-measurable for each x ∈ M; see [4, Theorem 8.1.4]. It is called f-invariant if f(ω, A_ω) = A_{θω} for all ω ∈ Ω, and hence F(n, ω, A_ω) = A_{θ_n ω} for all ω ∈ Ω, n ∈ Z_+.
Recall that the Hausdorff separation, or semi-metric, H*_ρ(X, Y), of the nonempty compact subsets X and Y of M is defined by

H*_ρ(X, Y) := max_{x∈X} dist_ρ(x, Y),

where dist_ρ(x, Y) := min_{y∈Y} ρ(x, y), and the Hausdorff metric H_ρ(X, Y) for the nonempty compact subsets X and Y of M is

H_ρ(X, Y) := max{H*_ρ(X, Y), H*_ρ(Y, X)}.

Finally, the diameter of a subset X of M is defined by diam(X) := sup_{x,y∈X} ρ(x, y).
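For finite subsets of R^n these definitions translate directly into code (an illustrative sketch; the paper works in an abstract metric space and the function names are not from the paper):

```python
import numpy as np

def sep(X, Y):
    # Hausdorff separation H*(X, Y) = max_{x in X} min_{y in Y} rho(x, y),
    # here with rho the Euclidean distance on finite point sets.
    return max(min(float(np.linalg.norm(x - y)) for y in Y) for x in X)

def hausdorff(X, Y):
    # The Hausdorff metric symmetrizes the (asymmetric) separation.
    return max(sep(X, Y), sep(Y, X))

X = [np.array([0.0, 0.0]), np.array([1.0, 0.0])]
Y = [np.array([0.0, 0.0])]
```

Here sep(Y, X) = 0 while sep(X, Y) = 1, which is why the separation alone is only a semi-metric and the symmetrized version is used for the metric.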
We can now formulate an abstract theorem on the existence of a random attractor. Call the cocycle F uniformly dissipative if there exist N_d ≥ 1 and a closed bounded set M_0 ⊂ M such that

F(N_d, ω, M_0) ⊆ M_0, ω ∈ Ω,    (7)

and uniformly contractive if there exist N_c ≥ 1 and λ < 1 such that

ρ(F(N_c, ω, x), F(N_c, ω, y)) ≤ λ ρ(x, y), x, y ∈ M, ω ∈ Ω.    (8)

Theorem 4.5. Let the continuous cocycle F be uniformly dissipative and uniformly contractive. Then F has a random attractor A = {A_ω, ω ∈ Ω} each of whose component sets A_ω consists of a single point.

Proof. Fix numbers N_d, N_c ≥ 1 and λ < 1 and a closed bounded set M_0 for which (7) and (8) hold. First, it will be shown that for any bounded set D ⊆ M and any ω ∈ Ω the sets F(n, θ_{−n}ω, D) converge to some single-point set A_ω as n → ∞. To do this, in fact, an even stronger statement will be proved: for any ω ∈ Ω the sets M_n(ω) := F(n, θ_{−n}ω, M_0) converge to some single-point set A_ω as n → ∞.
Note that for each ω the sets M_n(ω) are closed as images of the closed set M_0 under the continuous maps F(n, θ_{−n}ω, ·). Moreover, for any ω ∈ Ω the sequence of sets M_n(ω) is nested under inclusion:

M_{n+1}(ω) ⊆ M_n(ω), ω ∈ Ω, n ∈ Z_+.    (9)

Define d_n := sup_{ω∈Ω} diam(M_n(ω)). It will be shown that

d_{n+1} ≤ d_n < ∞, n ∈ Z_+.    (10)

The inequality d_{n+1} ≤ d_n follows from (9) provided that both numbers d_{n+1} and d_n are finite. Thus (10) follows if it can be shown that d_n < ∞ for every n ∈ Z_+. This last inequality readily follows from the inclusion M_n(ω) ⊆ M_0, ω ∈ Ω, n ∈ Z_+, which is a direct corollary of (7). The inequalities (10) are thus proved, but now they will be strengthened to

d_{n+N_c} ≤ λ d_n, n ∈ Z_+.    (11)

To prove this inequality note that by (5)

M_{n+N_c}(ω) = F(N_c, θ_{−N_c}ω, M_n(θ_{−N_c}ω)),

and, by the uniform contractivity of the cocycle F,

diam F(N_c, θ_{−N_c}ω, M_n(θ_{−N_c}ω)) ≤ λ diam M_n(θ_{−N_c}ω) ≤ λ d_n.

Taking the supremum over all ω ∈ Ω in the above inequality then gives (11).
To finalize the proof of the theorem it remains to note that, for any given ω ∈ Ω, the sequence of closed sets {M_n(ω)} is nested under inclusion and, by (10) and (11), the diameters of the sets M_n(ω) tend to zero as n → ∞. Then, by the Cantor Intersection Theorem (or Property), see, e.g., [20, Th. 13.65], the intersection A_ω := ∩_{n≥0} M_n(ω) is nonempty and consists of exactly one point.

Remark 2. To prove Theorem 4.5 it would suffice to require that (8) holds only for x, y ∈ M_0, provided that the cocycle F is uniformly dissipative.
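The pullback construction in the proof can be visualized with a toy scalar cocycle (everything here, including the affine generator and the seeding trick standing in for ω, is an illustrative assumption): starting further and further in the past with a fixed end time 0, the images collapse onto the single point A_ω.

```python
import numpy as np

def coeffs(t):
    # Environment at integer time t: a contraction rate a_t in (0.2, 0.6) and
    # a shift b_t, generated reproducibly from t as a stand-in for omega.
    rng = np.random.default_rng(2 * abs(t) + (1 if t < 0 else 0))
    return 0.2 + 0.4 * rng.random(), rng.standard_normal()

def pullback(n, x):
    # F(n, theta_{-n} omega, x): start at time -n in state x and iterate the
    # affine generator f(omega, x) = a*x + b up to the fixed end time 0.
    for t in range(-n, 0):
        a, b = coeffs(t)
        x = a * x + b
    return x

u = pullback(60, -100.0)   # pullback limits from wildly different states...
v = pullback(60, 100.0)
w = pullback(80, 0.0)      # ...and from even earlier starting times agree
```

The three values coincide to high accuracy: the fibre A_0 of the random attractor is a single point, exactly as the nested-intersection argument predicts.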
Remark 3. It would be preferable to formulate the properties of dissipativity and contractivity for a cocycle not in terms of the maps F(n, ω, x), but in terms of its generator f(ω, x). As will be seen below, this is in general not possible: in some interesting and natural applications the arising cocycle is uniformly dissipative and contractive, whereas neither dissipativity with N_d = 1 nor contractivity with N_c = 1 holds.
Remark 4. Theorem 4.5 has been formulated under rather severe assumptions. These can be substantially weakened, but as stated they serve perfectly well for the purposes of this paper.
5. Positive linear operators and the Hilbert projective metric. Denote by K^N_+ the cone of elements x = (x_1, x_2, . . . , x_N)^T ∈ R^N with nonnegative components and by •K^N_+ the interior of K^N_+, which is clearly nonempty. Then the quantity ϑ(x, y) := inf{t > 0 : tx − y ∈ K^N_+} is finite valued for any x, y ∈ •K^N_+.

Definition 5.1. The quantity ρ_H(x, y) := ln(ϑ(x, y) ϑ(y, x)), x, y ∈ •K^N_+, is called the Hilbert projective metric (or, sometimes, the Birkhoff metric [16]).
Remark 5. Definition 5.1 is applicable to cones in a general Banach space. For the cone K^N_+ in the finite-dimensional space R^N, it can be shown to be equal to

ρ_H(x, y) = ln( max_{1≤i≤N} (x_i/y_i) · max_{1≤j≤N} (y_j/x_j) )    (12)

or, equivalently,

ρ_H(x, y) = ln( max_{1≤i≤N} (x_i/y_i) / min_{1≤j≤N} (x_j/y_j) )

for vectors x = (x_1, x_2, . . . , x_N)^T and y = (y_1, y_2, . . . , y_N)^T in •K^N_+. Observe that ρ_H(x, y) satisfies the triangle inequality, whereas the relation ρ_H(x, y) = 0 with x, y ∈ •K^N_+ does not imply the equality x = y, but only the equality x = ty for some t > 0. Moreover, ρ_H(sx, ty) = ρ_H(x, y) for all s, t > 0. Thus, strictly speaking, ρ_H(x, y) is not a metric on •K^N_+, but only a semi-metric. It becomes a metric, however, on a projective space. An important way to make it a proper metric is covered by the following theorem.
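The finite-dimensional ratio formula for ρ_H and the semi-metric properties just noted can be checked numerically (an illustrative sketch with arbitrary positive vectors):

```python
import numpy as np

def rho_H(x, y):
    # Hilbert projective metric on the interior of the positive cone:
    # rho_H(x, y) = ln( max_i x_i/y_i * max_j y_j/x_j ).
    r = x / y
    return float(np.log(r.max() / r.min()))

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 2.0, 1.0])
z = np.array([1.0, 1.0, 1.0])
```

One can verify that rho_H(5 * x, y) equals rho_H(x, y) (scale invariance) and that rho_H(x, 2 * x) = 0 although x ≠ 2x, so ρ_H is only a semi-metric on the open cone; restricted to a set meeting each ray once, such as the simplex, it becomes a genuine metric.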
Theorem 5.2. Let X ⊆ R^N be a closed, bounded set such that 0 ∉ X and any ray {tx : t > 0} with x ∈ •K^N_+ intersects X in at most one point. Then ρ_H is a metric on X ∩ •K^N_+. Moreover, if any ray {tx : t > 0} with x ∈ •K^N_+ intersects X in exactly one point, then the metric space (X ∩ •K^N_+, ρ_H) is complete.

Given a positive linear operator (matrix) L : R^N → R^N, denote by L̃(·) the (nonlinear) operator defined by

L̃x := P(Lx),

where P is the projection operator onto the simplex Σ_N defined by Px := x / (1_N^T x) for x ∈ K^N_+ with 1_N^T x > 0.

Theorem 5.5 (Birkhoff). Let the positive linear operator L : R^N → R^N map K^N_+ \ {0} into •K^N_+ and have finite projective diameter δ(L) := sup{ρ_H(Lx, Ly) : x, y ∈ •K^N_+} < ∞. Then L is uniformly contractive in the Hilbert projective metric with contraction constant κ(L) ≤ tanh(¼ δ(L)) < 1.

6. Attractors of linear cocycles with a tridiagonal generator. The results of Sections 4 and 5 will be applied here to the linear system generated by the tridiagonal Markov chains that were introduced in Section 2. Let L be a set of linear operators L_ω : R^N → R^N parametrized by the parameter ω taking values in some set Ω and let {θ_n, n ∈ Z} be a (discrete-time) group of maps of Ω onto itself. The maps L_ω x serve as the generator of the linear cocycle

F_L(n, ω)x = L_{θ_{n−1}ω} · · · L_{θ_1 ω} L_{θ_0 ω} x.
In the sequel the following basic assumption will be used.

Assumption 1. There exist constants α and β such that the uniform estimates

0 < α ≤ q_j(ω) ≤ β < ∞, j = 1, . . . , 2N − 2, ω ∈ Ω,    (14)

hold.
Lemma 6.2. Let ∆ < 1/(2β) and γ := min{∆α, 1 − 2∆β}. Then

F_L(N − 1, ω) Σ_N ⊆ Σ_N(γ) := {p ∈ Σ_N : p_i ≥ γ^{N−1}, i = 1, . . . , N}, ω ∈ Ω,    (15)

and, hence, F_L(N − 1, ω) maps Σ_N into the interior of the cone K^N_+.

Proof. Fix an ω ∈ Ω. By induction, it can be shown for each n = 1, 2, . . . , N − 1 that the matrix F_L(n, ω) = L_{θ_{n−1}ω} · · · L_{θ_1 ω} L_{θ_0 ω} is (2n + 1)-diagonal and that its components belonging to the main diagonal and to the first n sub- and super-diagonals are greater than or equal to γ^n, while the others vanish. Thus

(F_L(N − 1, ω))_{ij} ≥ γ^{N−1}, i, j = 1, . . . , N.    (16)

By (4) and (13), 1_N^T L_ω = 1_N^T for all ω ∈ Ω, and together with (16) this implies that each column of F_L(N − 1, ω) is a probability vector with components at least γ^{N−1}. The inclusion (15) is thus established.

Theorem 6.3. Let F_L(n, ω)x be the linear cocycle with matrices L_ω := I_N + ∆Q(ω), where the tridiagonal matrices Q(ω) are of the form (1) with the entries q_i = q_i(ω) satisfying the uniform estimates (14) in Assumption 1. In addition, suppose that ∆ < 1/(2β). Then the set Σ_N is invariant under F_L(N − 1, ω), i.e.,

F_L(N − 1, ω) Σ_N ⊆ Σ_N, ω ∈ Ω.

Moreover, the restriction of F_L(n, ω)x to the set Σ_N is a uniformly dissipative and uniformly contractive cocycle (with respect to the Hilbert metric), which has a random attractor A = {A_ω, ω ∈ Ω} such that each set A_ω, ω ∈ Ω, consists of a single point.
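The positivity mechanism behind Lemma 6.2 can be probed numerically (the bounds α, β, the step ∆ and the random rates below are illustrative assumptions, not from the paper): after N − 1 steps every entry of the cocycle matrix is at least γ^{N−1}.

```python
import numpy as np

rng = np.random.default_rng(1)
N, dt, alpha, beta = 5, 0.4, 0.2, 0.9   # dt < 1/(2*beta)

def L_omega():
    # One tridiagonal step matrix I_N + dt*Q(omega) with rates in [alpha, beta].
    q = rng.uniform(alpha, beta, size=2 * N - 2)
    Q = np.zeros((N, N))
    for j in range(N - 1):
        Q[j, j] -= q[2 * j]
        Q[j + 1, j] += q[2 * j]
        Q[j + 1, j + 1] -= q[2 * j + 1]
        Q[j, j + 1] += q[2 * j + 1]
    return np.eye(N) + dt * Q

gamma = min(dt * alpha, 1 - 2 * dt * beta)
F = np.eye(N)
for _ in range(N - 1):
    F = L_omega() @ F   # cocycle F_L(N-1, omega) = L_{theta_{N-2} omega} ... L_omega
```

Each factor is column-stochastic and tridiagonal; the product widens the band by one diagonal per step, so after N − 1 steps the matrix is entrywise positive with the stated lower bound.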
Proof. Under the assumptions, γ := min{∆α, 1 − 2∆β} > 0. Define

δ := sup{ρ_H(x, y) : x, y ∈ Σ_N(γ)}.

It follows from formula (12) for the Hilbert metric in K^N_+ that δ < ∞, so κ := tanh(¼ δ) < 1. Hence, by Lemmata 6.1 and 6.2, for each ω ∈ Ω the matrix F_L(N − 1, ω) satisfies the conditions of Theorem 5.5 and is thus uniformly contractive in the metric space (Σ_N(γ), ρ_H). Henceforth write A_ω = {a_ω} for the singleton component subsets of the random attractor A. Then the random attractor is an entire random sequence {a_{θ_n ω}, n ∈ Z} in Σ_N(γ) ⊂ •Σ_N, which attracts the other iterates of the random Markov chain in the pullback sense. Pullback convergence involves starting at earlier initial times with a fixed end time; see [7,15]. It is, in general, not the same as forward convergence in the sense usually understood in dynamical systems, but here the two coincide due to the uniform boundedness of the contraction rate with respect to ω: by Theorem 5.5,

κ(F_L(N − 1, ω)) ≤ ν := tanh(¼ δ) < 1, ω ∈ Ω.
Hence, ρ_H(p^{(n)}(ω), a_{θ_n ω}) ≤ ν^{⌊n/(N−1)⌋} ρ_H(p^{(0)}, a_ω) for all n ≥ 0 and every ω ∈ Ω, from which follows the pathwise forward convergence with respect to the Hilbert projective metric.
The random attractor is, in fact, asymptotically Lyapunov stable in the conventional forward sense.
6.1. Deterministic nonautonomous Markov chains. The above proofs make no use of probabilistic properties of the sample path parameter ω (apart from F-measurability considerations, which are not an essential part of the proof). They apply immediately to deterministic nonautonomous Markov chains in which the transition probabilities vary, say, periodically in time.
As described in [15], this time variation can be modelled by letting ω be a bi-infinite sequence ω = (ω_n)_{n∈Z} ∈ Λ^Z, i.e., with ω_n ∈ Λ, n ∈ Z, for some compact metric space (Λ, ρ_Λ). Then Ω = Λ^Z is a compact metric space with the metric

ρ_Ω(ω, ω̃) = Σ_{n∈Z} 2^{−|n|} ρ_Λ(ω_n, ω̃_n),

and the shift operator θ(ω_n)_{n∈Z} := (ω_{n+1})_{n∈Z} is continuous in the metric ρ_Ω. It turns out then that the mapping ω → a_ω is continuous here (in general, the set-valued mapping ω → A_ω is only upper semi-continuous). These topological properties of the driving system replace the measurability properties of the random dynamical systems setting.
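In the simplest periodic case the attracting path can be computed numerically. In this sketch two illustrative rate regimes (made-up values) alternate with period 2, and the attracting path is itself 2-periodic:

```python
import numpy as np

def L_of(q, dt=0.4):
    # Transition matrix I_N + dt*Q for one set of rates q (N = len(q)//2 + 1).
    N = len(q) // 2 + 1
    Q = np.zeros((N, N))
    for j in range(N - 1):
        Q[j, j] -= q[2 * j]
        Q[j + 1, j] += q[2 * j]
        Q[j + 1, j + 1] -= q[2 * j + 1]
        Q[j, j + 1] += q[2 * j + 1]
    return np.eye(N) + dt * Q

# Two rate regimes alternating in time with period 2 (illustrative values).
L_even = L_of(np.array([0.3, 0.6, 0.5, 0.2]))
L_odd = L_of(np.array([0.7, 0.2, 0.3, 0.6]))

p = np.array([1.0, 0.0, 0.0])
for n in range(400):
    p = (L_even if n % 2 == 0 else L_odd) @ p
a0 = p                # limit state seen at even times
a1 = L_even @ a0      # limit state seen at odd times
period_error = np.abs(L_odd @ a1 - a0).max()
```

Applying one full period to a0 returns a0, so the attractor is the 2-periodic path (a0, a1), with a0 ≠ a1 since the two regimes have different stationary vectors.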