On Stein’s method for stochastically monotone single-birth chains

We discuss Stein’s method for approximation by the stationary distribution of a single-birth Markov chain, in conjunction with stochastic monotonicity and similar assumptions. We use bounds on the increments of the solution of Poisson’s equation for such a process. Our first applications are to rates of convergence to stationarity. In our second set of applications we bound the total variation distance between the stationary distributions of two Markov chains, including quantifying the effect of truncation of the state space.


Introduction
Let $\{Z_t : t = 0, 1, \ldots\}$ be a positive recurrent single-birth Markov chain in discrete time on the non-negative integers, with transition matrix $P$ whose $(i, j)$th entry we denote by $P_{i,j}$. Throughout we assume that $P_{i,i+1} > 0$ for each $i \in \mathbb{Z}_+ = \{0, 1, 2, \ldots\}$, and that $P_{i,j} = 0$ for $j > i + 1$. See Corollary 3.4 of [8] for conditions under which such a chain is positive recurrent. We let this chain have stationary distribution $(\pi_0, \pi_1, \pi_2, \ldots)$, and let $\pi$ be a random variable with this stationary distribution.
For a given function $h : \mathbb{Z}_+ \to \mathbb{R}$, we let $f : \mathbb{Z}_+ \to \mathbb{R}$ denote the solution to Poisson's equation
$$f(i) - \sum_{j=0}^{i+1} P_{i,j} f(j) = h(i) - \mathbb{E}h(\pi), \qquad i \in \mathbb{Z}_+, \tag{1}$$
with $f(0) = 0$. In this note we will exploit this equation in conjunction with Stein's method to find explicit bounds in approximation by the stationary distribution of our single-birth chain. We will introduce the elements of Stein's method that we need in our work, but refer the reader to [12] and references therein for an introduction to this technique.
In this section we will derive Theorem 1 below, which gives an explicit bound in approximation by our stationary distribution, before exploring applications of this bound in Sections 2 and 3. In these applications it will be convenient to exploit stochastic monotonicity of the transition matrix $P$ or other similar monotonicity properties. Recall that $P$ is defined to be stochastically monotone if the distributions in successive rows of $P$ are stochastically non-decreasing; in this case stochastic ordering is preserved under transitions taken according to $P$. See [3] for further details.
One central aim of our work is to demonstrate how assumptions of stochastic monotonicity can be used in deriving bounds in approximations by the distribution of $\pi$ using Stein's method, in a similar spirit to [4]. This is somewhat different to earlier work on Stein's method for stationary distributions of Markov processes, including that of Brown and Xia [2], who considered approximation by the stationary distribution of a continuous-time birth-death process without any assumptions of monotonicity, and whose applications are of a quite different flavour to ours. In the discrete-time setting, [1,11] have recently used Poisson's equation as a starting point for Stein's method, but in the case of a finite state space and again without monotonicity assumptions. The focus of our work is thus somewhat different to these other papers. While there are many other tools available in the literature for tackling examples and applications such as those we consider here (see the discussion and references in the examples below), our main purpose here is to show how Stein's method can be added to such a toolkit.
Letting $\bar{h}(j) = h(j) - \mathbb{E}h(\pi)$, Jiang et al. [7] have shown that Poisson's equation (1) is solved by a function $f$ satisfying $f(j+1) - f(j) = -m_j(\bar{h})$ for $j \ge 0$, where $m_0(\bar{h}) = \bar{h}(0)/P_{0,1}$ and
$$m_j(\bar{h}) = \frac{1}{P_{j,j+1}} \left[ \bar{h}(j) + \sum_{k=0}^{j-1} \Big( \sum_{l=0}^{k} P_{j,l} \Big) m_k(\bar{h}) \right] \tag{2}$$
for $j \ge 1$; see their Theorem 2.1. Our approximation results will make use of bounds on $\sup_{j \in \mathbb{Z}_+} |m_j(\bar{h})|$ for $h \in \mathcal{H} = \{h : \mathbb{Z}_+ \to \mathbb{R} : |h(j)| \le 1 \text{ for all } j\}$. This is the relevant class of functions for us since we derive bounds on the total variation distance between $\pi$ and a non-negative, integer-valued random variable $X$. This total variation distance is defined by
Following Stein's method, this will be bounded by rewriting the right-hand side using Poisson's equation (1):
where we recall that $f(0) = 0$ and write $\Delta f(j) = f(j+1) - f(j)$. This immediately yields the following result.
Theorem 1. Let $\{Z_t : t = 0, 1, \ldots\}$ be a positive recurrent single-birth Markov chain on $\mathbb{Z}_+$ with transition matrix $P$ and stationary distribution $(\pi_0, \pi_1, \ldots)$. Let $X$ be a random variable supported on $\mathbb{Z}_+$. Then
Note that this result makes use of bounds on the increments of the solution $f$ of Poisson's equation, not bounds on $f$ itself. In the remainder of this section we note some cases in which a bound on the increments of $f$ may be easily found. We will use these as running examples to illustrate applications of Theorem 1, in conjunction with assumptions of stochastic monotonicity and domination, throughout the remainder of this note. In Section 2 we use Theorem 1 to establish rates of convergence to stationarity for stochastically monotone single-birth processes. Then, in Section 3 we consider the approximation by $\pi$ of the stationary distribution of a Markov chain whose transition matrix either dominates, or is dominated by, $P$. These applications will be illustrated using the examples we introduce here.
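As an illustrative sketch only, the increments $m_j(\bar{h})$ can be computed recursively and checked against Poisson's equation. The sign convention $f(i) - \sum_j P_{i,j} f(j) = \bar{h}(i)$ and the recursion used below are assumptions, inferred from the relations $f(0) = 0$, $f(j+1) - f(j) = -m_j(\bar{h})$ and $m_0(\bar{h}) = \bar{h}(0)/P_{0,1}$ stated above. The function names, the choice of the reflected simple random walk and the test function $h = 1_{\{0\}}$ are hypothetical choices made for the illustration.

```python
from fractions import Fraction


def reflected_walk(p):
    """Transition probabilities of a simple random walk reflected at 0
    (down-step or hold at 0 with probability p, up-step with 1 - p)."""
    def P(i, j):
        if j == i + 1:
            return 1 - p
        if j == max(i - 1, 0):
            return p
        return Fraction(0)
    return P


def increments(P, hbar, n):
    """Assumed recursion: m_j = [hbar(j) + sum_{l<j} (sum_{k<=l} P(j,k)) m_l] / P(j,j+1)."""
    m = []
    for j in range(n):
        acc = hbar(j)
        for l in range(j):
            mass_up_to_l = sum(P(j, k) for k in range(l + 1))
            acc += mass_up_to_l * m[l]
        m.append(acc / P(j, j + 1))
    return m


def poisson_residuals(P, hbar, n):
    """Build f from f(0) = 0 and f(j+1) - f(j) = -m_j, then return
    f(i) - sum_k P(i,k) f(k) - hbar(i) for i < n (zero if f solves the equation)."""
    m = increments(P, hbar, n)
    f = [Fraction(0)]
    for mj in m:
        f.append(f[-1] - mj)
    return [f[i] - sum(P(i, k) * f[k] for k in range(i + 2)) - hbar(i)
            for i in range(n)]
```

Exact rational arithmetic is used because the recursion amplifies floating-point error when the down-step probability exceeds the up-step probability.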

Example: Birth-death chain
Suppose that $\{Z_t : t = 0, 1, \ldots\}$ is a birth-death chain with $P_{i,i+1} = b_i > 0$ for each $i = 0, 1, \ldots$ and $P_{i,i-j} = 0$ for each $j > 1$ and all $i$. In this case, Jiang et al. [7] have shown that
see their equation (15), and note that the final equality follows from $\mathbb{E}\bar{h}(\pi) = 0$. This immediately gives the bound
For a straightforward illustrative example, consider a simple random walk on $\mathbb{Z}_+$ with reflection at the origin, where $P_{0,0} = P_{i,i-1} = p$ for all $i \ge 1$ and $P_{i,i+1} = 1 - p$ for all $i \ge 0$, for some $p > 1/2$ to ensure positive recurrence. We note that in this case $P$ is stochastically monotone, and that $\pi \sim \mathrm{Geom}(\alpha)$ is geometrically distributed with parameter $\alpha = (2p-1)/p$ and mass function $\pi_i = \alpha(1-\alpha)^i$ for $i = 0, 1, \ldots$.
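A quick numerical sanity check of the geometric stationary distribution of the reflected random walk is sketched below. It evaluates the balance equations $\pi_j = \sum_i \pi_i P_{i,j}$ for the first few states, using $\pi_i = \alpha(1-\alpha)^i$ with $\alpha = (2p-1)/p$. The function names and the number of states checked are hypothetical choices for the illustration.

```python
def geometric_pmf(alpha, i):
    """Mass function alpha * (1 - alpha)**i of Geom(alpha) on {0, 1, ...}."""
    return alpha * (1 - alpha) ** i


def stationarity_residual(p, n_states=50):
    """Largest violation of pi_j = sum_i pi_i P_{i,j} over j = 0, ..., n_states
    for the simple random walk with reflection at the origin."""
    alpha = (2 * p - 1) / p
    pi = [geometric_pmf(alpha, i) for i in range(n_states + 2)]
    worst = 0.0
    for j in range(n_states + 1):
        if j == 0:
            # State 0 is entered by holding at 0 or by a down-step from 1.
            inflow = pi[0] * p + pi[1] * p
        else:
            # State j >= 1 is entered by an up-step from j-1 or a down-step from j+1.
            inflow = pi[j - 1] * (1 - p) + pi[j + 1] * p
        worst = max(worst, abs(inflow - pi[j]))
    return worst
```

The residual should vanish up to rounding error for any $p \in (1/2, 1)$.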

Example: M/M/1 queue
For our next example, we look beyond the class of birth-death processes into Markov chains of the type associated with GI/M/1 queues. For simplicity we will restrict our attention to the M/M/1 queue here. Although this may be formulated as a birth-death process, we instead use an alternative representation which makes it clear how this example can be extended to more general GI/M/1-type chains. Consider the M/M/1 queue in which customer interarrival times are exponentially distributed with mean $1/\lambda$ and service times are exponentially distributed with mean $1/\mu$, where we assume $\rho = \lambda/\mu < 1$. Let $\{Z_t : t = 0, 1, \ldots\}$ be the Markov chain embedded at customer arrival times, which may be constructed as a stochastically monotone single-birth chain with stationary distribution $\pi \sim \mathrm{Geom}(1 - \rho)$; that is, $\pi_i = (1 - \rho)\rho^i$ for $i = 0, 1, \ldots$. We define $a_k = \rho(1+\rho)^{-(k+1)}$ for $k = 0, 1, \ldots$, the probability that exactly $k$ customers are served in the time between two consecutive arrivals. We then define our transition matrix $P$ by writing $P_{i,j} = a_{i-j+1}$ for $j = 1, 2, \ldots, i + 1$ and each $i$. The remaining non-zero entries of $P$ are those in the left-hand column, which are given by $P_{i,0} = \sum_{j>i} a_j = (1 + \rho)^{-(i+1)}$ for $i \ge 0$. Note that, as constructed, this is not a birth-death chain, and so we cannot use the bound (3) here. We will need to calculate separately a bound on the increments $m_i(\bar{h})$ of the solution of (1). By (2), these are given by $m_0(\bar{h}) = (1 + \rho)\rho^{-1}\bar{h}(0)$ and
for $i = 1, 2, \ldots$.
Solving this system of equations gives
where the final equality uses the fact that $\mathbb{E}\bar{h}(\pi) = 0$. For $h \in \mathcal{H}$ we therefore have

Convergence to stationarity
In the setting of Theorem 1, we may choose $X$ to have the same distribution as $Z_t$ for some fixed $t \in \mathbb{Z}_+$. This lets us bound the total variation distance of $Z_t$ from stationarity. We set $Z_0 = 0$, and note that
Under the assumption that $P$ is stochastically monotone, we can couple our Markov chain in such a way that $\mathbb{P}(Z_{t+1} > j) \ge \mathbb{P}(Z_t > j)$ for all $j \in \mathbb{Z}_+$. Hence,
and Theorem 1 gives the following, in which we may choose a coupling of $Z_t$ and $Z_{t+1}$ to bound the expectation.
Corollary 2. Let $\{Z_t : t = 0, 1, \ldots\}$ be a positive recurrent and stochastically monotone single-birth Markov chain on $\mathbb{Z}_+$, as defined above, with transition matrix $P$ and stationary distribution $(\pi_0, \pi_1, \ldots)$. Then
We use the remainder of this section to illustrate the bound of Corollary 2 using the applications we introduced in Section 1.

Example: Simple random walk with reflection
Consider the simple random walk with reflection at the origin that was introduced in Section 1.1, for which we know that $|m_j(\bar{h})| \le 1/(2p-1)$. To apply Corollary 2 it remains only to couple $Z_{t+1}$ and $Z_t$. To do this, we introduce a copy $\{Z'_t : t = 0, 1, \ldots\}$ of $\{Z_t : t = 0, 1, \ldots\}$, with these two processes coupled as follows: with $Z_0 = Z'_0 = 0$, we let $Z'_1$ be 0 or 1, with probability $p$ and $1 - p$ respectively, independently of all else. The processes $\{Z_t : t = 1, 2, \ldots\}$ and $\{Z'_{t+1} : t = 1, 2, \ldots\}$ then evolve using the same underlying sequence of independent Bernoulli trials so that, roughly speaking, one process moves in the positive direction at a given time if and only if the other process does also. The same is true of steps in the negative direction, except that we need to account for the reflection at the origin, where a 'step in the negative direction' corresponds to remaining at the origin. This continues until the first time $t$ at which $Z_t = Z'_{t+1}$, following which the two processes move together. This happens at a time which is almost surely no greater than $T = \min\{t \ge 1 : Z'_t = 0\}$, the first return time to the origin, and we note that
for any $1 \le r \le [4p(1 - p)]^{-1/2}$; see Section XIV.4 of [5] for the final equality. Corollary 2 then gives us that, for any 1
yielding the expected geometric rate of convergence to stationarity (see also Example 7.1 of [10]) and an explicit bound on the corresponding total variation distance.
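The geometric decay described above can be observed numerically. The sketch below computes the exact law of $Z_t$ started from $Z_0 = 0$ and its total variation distance from $\mathrm{Geom}(\alpha)$, taken here to be half the $\ell_1$ distance between the mass functions (the standard definition, which is an assumption about the convention in use). The function name and parameter values are hypothetical.

```python
def tv_to_stationarity(p, t_max):
    """Total variation distance between Z_t (started at 0) and Geom(alpha),
    for t = 0, ..., t_max, for the simple random walk reflected at the origin."""
    alpha = (2 * p - 1) / p
    n = t_max + 2  # started at 0, the walk cannot exceed t after t steps
    dist = [0.0] * n
    dist[0] = 1.0
    distances = []
    for _ in range(t_max + 1):
        # Stationary mass on states n, n+1, ... which Z_t cannot reach.
        tail = (1 - alpha) ** n
        l1 = sum(abs(dist[i] - alpha * (1 - alpha) ** i) for i in range(n)) + tail
        distances.append(0.5 * l1)
        # One step of the chain: down (or hold at 0) w.p. p, up w.p. 1 - p.
        new = [0.0] * n
        for i, mass in enumerate(dist):
            if mass == 0.0:
                continue
            new[max(i - 1, 0)] += p * mass
            if i + 1 < n:
                new[i + 1] += (1 - p) * mass
        dist = new
    return distances
```

Plotting the logarithm of the returned distances against $t$ should give an approximately straight line, consistent with a geometric rate.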

Example: M/M/1 queue
Now consider the M/M/1 queue of Section 1.2, for which the bound (4) holds. To apply Corollary 2 we again only need to couple $Z_{t+1}$ and $Z_t$. We proceed similarly to above, introducing a coupled copy $\{Z'_t : t = 0, 1, \ldots\}$ of $\{Z_t : t = 0, 1, \ldots\}$. With $Z_0 = Z'_0 = 0$, let $Z'_1$ be 0 or 1, with probability $(1 + \rho)^{-1}$ and $\rho(1 + \rho)^{-1}$ respectively, independently of all else. We let subsequent arrivals occur at the same times in both processes, and hence $Z_t \le Z'_{t+1}$ almost surely for all $t$. As before, $\mathbb{E}Z_{t+1} - \mathbb{E}Z_t \le \mathbb{P}(T > t + 1)$, where $T$ is as in Section 2.1, since the numbers of customers in the two systems differ by at most one. We may write $T$ as $1 + IN$, where $N$ is the number of customers served in a busy period of an M/M/1 queue initiated by the arrival of a single customer to an empty system, and $I$ is a Bernoulli random variable with mean $\rho(1 + \rho)^{-1}$ independent of all else. We then have
for all $1 \le r \le (1+\rho)^2/(4\rho)$; the expression for $\mathbb{E}r^N$ is well known. Corollary 2 thus gives us, for any

Comparison of stationary distributions
In this section we let $X$ have the stationary distribution of a positive recurrent Markov chain on $\mathbb{Z}_+$ with transition matrix $Q$ whose $(i, j)$th entry we denote by $Q_{i,j}$. We bound $d_{TV}(X, \pi)$ using Theorem 1. In this setting we have
As we illustrate in the examples below, results simplify further in the case where either $P$ dominates, or is dominated by, $Q$. That is, where either
for all $i, m \in \mathbb{Z}_+$, or the reverse inequality holds for all $i, m \in \mathbb{Z}_+$. In either of these cases we have
We thus obtain the following from Theorem 1.
Corollary 3. Let $P$ be the transition matrix of a positive recurrent single-birth Markov chain on $\mathbb{Z}_+$ with stationary distribution $(\pi_0, \pi_1, \ldots)$. Let $X$ have the stationary distribution of another positive recurrent Markov chain on $\mathbb{Z}_+$ with transition matrix $Q$. Then
If, in addition, either (5) or the reverse inequality holds for all $i, m \in \mathbb{Z}_+$ then

Example: Birth-death chains
In the setting where both $P$ and $Q$ are transition matrices of birth-death chains, combining (3) and (6) gives the following bound between the corresponding stationary distributions:
A similar bound applies in other settings, for example in approximating the stationary distribution of a single-birth chain by that of a birth-death chain.

Example: Geometric approximation
We give two applications of (7) to approximation by a geometric distribution for the stationary distribution associated with our transition matrix Q, using transition matrices P which have a geometric stationary distribution.

Simple random walk with reflection
Let $P$ be the transition matrix of the simple random walk with reflection, as in Section 1.1. For $i \ge 0$,
If we assume that either (5) or the reverse inequality holds for all $i, m \in \mathbb{Z}_+$, we may apply (7) to give an easily computed bound in the approximation of the stationary distribution associated with $Q$ by $\pi \sim \mathrm{Geom}(\alpha)$, where $\alpha = (2p - 1)/p$. Recalling that $|m_j(\bar{h})| \le 1/(2p - 1)$ in this setting, a simple calculation shows that (7) gives
which we note is typically easy to evaluate and gives zero in the case where $X \sim \mathrm{Geom}(\alpha)$.

M/M/1 queue
With $P$ as in Section 1.2 we may use the bound (4) on $m_l(\bar{h})$. We further note that $\sum_{k>m} P_{i,k} = 1 - (1 + \rho)^{m-i-1}$ for $i \ge m \ge 0$, and $\sum_{k>m} P_{i,k} = 0$ for $i < m$. We may therefore apply (7) for a Markov chain with transition matrix $Q$ satisfying that the total mass in the first $m$ elements of row $i$ decreases geometrically in $i$, and increases at most geometrically in $m$, Corollary 3 gives us a bound in the approximation of the corresponding stationary distribution by a geometric distribution with an appropriately chosen parameter $1 - \rho$. Specifically, (7) yields
Note that this upper bound is zero if $X \sim \mathrm{Geom}(1 - \rho)$, as expected.
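The tail sums quoted above, and the geometric stationary distribution of the embedded M/M/1 chain, can be checked numerically with the sketch below. It assumes $a_k = \rho(1+\rho)^{-(k+1)}$, which is inferred from the stated identity $\sum_{j>i} a_j = (1+\rho)^{-(i+1)}$. The function names and the truncation level `i_max` of the infinite sum are hypothetical.

```python
def a(k, rho):
    """Assumed probability of exactly k services between consecutive arrivals."""
    return rho / (1 + rho) ** (k + 1)


def transition(i, j, rho):
    """Entry P_{i,j} of the embedded M/M/1 chain as constructed in Section 1.2."""
    if j == 0:
        return (1 + rho) ** (-(i + 1))
    if 1 <= j <= i + 1:
        return a(i - j + 1, rho)
    return 0.0


def tail_sum(i, m, rho):
    """Sum of P_{i,k} over k > m (rows vanish beyond column i + 1)."""
    return sum(transition(i, k, rho) for k in range(m + 1, i + 2))


def mm1_stationarity_residual(rho, n_check=20, i_max=400):
    """Largest violation of pi_j = sum_i pi_i P_{i,j} for j < n_check,
    with pi_i = (1 - rho) * rho**i and the sum over i truncated at i_max."""
    pi = [(1 - rho) * rho ** i for i in range(i_max + 1)]
    worst = 0.0
    for j in range(n_check):
        inflow = sum(pi[i] * transition(i, j, rho)
                     for i in range(max(j - 1, 0), i_max + 1))
        worst = max(worst, abs(inflow - pi[j]))
    return worst
```

The truncation at `i_max` is harmless here because the summands decay geometrically in $i$.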

Example: Truncation
We conclude with a final application, to the truncation of the state space of our single-birth Markov chain with transition matrix $P$. Let $Q = Q^{(n)}$ denote the $(n + 1) \times (n + 1)$ northwest truncation of $P$, augmented to be a valid transition matrix by replacing $P_{n,j}$ in the final row of $P$ by $Q_{n,j} = P_{n,j} + \nu_j P_{n,n+1}$ for each $j \ge 0$, where $(\nu_0, \nu_1, \ldots, \nu_n)$ is a probability distribution, and we denote by $\nu$ a random variable with this distribution. Let $X$ have the stationary distribution associated with the transition matrix $Q$; this is a special case of the truncation problem for discrete-time Markov chains studied by many authors. Tweedie [13] shows that the corresponding stationary probabilities converge to $\pi_j$ in the case of a geometrically ergodic chain, a stochastically monotone chain or one dominated by a stochastically monotone chain, when the augmentation is in the first or last column only. These results have since been generalised (see, for example, [6,8,9] and references therein) and given improved error bounds. Our purpose here is not to compete with these general bounds, but to illustrate the straightforward application of Corollary 3 in this setting and to note the explicit bound it yields.
Since $\nu_j \ge 0$ for each $j \le n$, it is clear that $\sum_{k>m} Q_{i,k} \le \sum_{k>m} P_{i,k}$ for each $i$ and $m$. Since
As a simple illustrative example, suppose that $P$ is the transition matrix of a stochastically monotone birth-death chain with stationary distribution given by $\pi_k = \alpha(1 - \alpha)^k$ for some $\alpha \in (0, 1)$ and all $k = 0, 1, \ldots$. The simple random walk with reflection of Section 1.1 is an example of such a chain. Stochastic monotonicity of $P$ gives us that $\mathbb{P}(X = n) \le \mathbb{P}(\pi \ge n) = (1 - \alpha)^n$, and (8) becomes
$$d_{TV}(X, \pi) \le \frac{(n + 1 - \mathbb{E}\nu) P_{n,n+1}}{\alpha \inf_{j \in \mathbb{Z}_+} \{P_{j,j+1}\}} (1 - \alpha)^{n+1},$$
giving the same rate as the lower bound $d_{TV}(X, \pi) \ge \mathbb{P}(\pi > n) = (1 - \alpha)^{n+1}$ if $P_{j,j+1}$ is bounded away from zero.
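For the simple random walk with reflection, the truncation bound can be compared with the exact distance numerically. The sketch below builds the augmented truncation $Q^{(n)}$, approximates its stationary distribution by power iteration (a stand-in technique chosen for simplicity, not taken from the text), and evaluates the upper bound above, in which $P_{n,n+1}$ and $\inf_j P_{j,j+1}$ cancel for this walk. Total variation distance is taken as half the $\ell_1$ distance, and all function names and the iteration count are hypothetical.

```python
def truncated_stationary(p, n, nu, iters=5000):
    """Stationary distribution of the (n+1)x(n+1) truncation of the reflected
    walk, with the lost mass P_{n,n+1} = 1 - p redistributed according to nu."""
    size = n + 1
    Q = [[0.0] * size for _ in range(size)]
    for i in range(size):
        Q[i][max(i - 1, 0)] += p
        if i + 1 < size:
            Q[i][i + 1] += 1 - p
        else:
            for j in range(size):
                Q[i][j] += nu[j] * (1 - p)
    x = [1.0 / size] * size
    for _ in range(iters):  # power iteration x <- x Q
        x = [sum(x[i] * Q[i][j] for i in range(size)) for j in range(size)]
    return x


def tv_truncation(p, n, nu):
    """Half the l1 distance between the truncated and untruncated stationary laws."""
    alpha = (2 * p - 1) / p
    x = truncated_stationary(p, n, nu)
    tail = (1 - alpha) ** (n + 1)  # P(pi > n), carried by no state of Q
    return 0.5 * (sum(abs(x[k] - alpha * (1 - alpha) ** k) for k in range(n + 1)) + tail)


def truncation_bound(p, n, nu):
    """Upper bound (n + 1 - E nu) (1 - alpha)^(n+1) / alpha for this walk."""
    alpha = (2 * p - 1) / p
    mean_nu = sum(j * w for j, w in enumerate(nu))
    return (n + 1 - mean_nu) * (1 - alpha) ** (n + 1) / alpha
```

With last-column augmentation ($\nu_n = 1$) the truncated chain is itself a birth-death chain, and the computed distance coincides with the lower bound $(1-\alpha)^{n+1}$.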