1 Introduction

1.1 Discussion

In this article, we consider a continuous-time perpetuity given by the random variable

$$ X_{0} :=\int_{0}^{\infty}D_{t} f(Z_{t}) \,\mathrm {d}t. $$
(1.1)

Above, \(Z = (Z_{t})_{t \in \mathbb {R}_{+}}\) represents the value of an economic factor that determines a cash flow rate \((f(Z_{t}))_{t \in \mathbb {R}_{+}}\). Cash flows are discounted according to \(D = (D_{t})_{t \in \mathbb {R}_{+}}\); therefore, \(X_{0}\) represents the value of the whole payment stream in units of account at time zero. Our main concern is the identification of an efficient way to obtain the joint distribution of \((Z_{0},X_{0})\), as naive estimation of the distribution by simulating sample paths of \(Z\) and approximating \(X_{0}\) through numerical integration may be prohibitively slow. As \(Z_{0}\) is typically observable, the joint distribution of \((Z_{0},X_{0})\) also allows us to obtain the conditional distribution of \(X_{0}\) given \(Z_{0}\).

In order to make the problem tractable, we work in a diffusive, Markovian environment where \(Z\) and \(D\) are solutions to the respective stochastic differential equations (written in integrated form)Footnote 1

$$\begin{aligned} Z &= Z_{0} + \int_{0}^{\cdot}m(Z_{t}) \,\mathrm {d}t + \int_{0}^{\cdot}\sigma(Z_{t}) \,\mathrm {d}W_{t}, \end{aligned}$$
(1.2)
$$\begin{aligned} D &= 1 - \int_{0}^{\cdot}D_{t} \big(a(Z_{t})\,\mathrm {d}t + \theta(Z_{t})' \sigma (Z_{t})\,\mathrm {d}W_{t} + \eta(Z_{t})' \,\mathrm {d}B_{t} \big). \end{aligned}$$
(1.3)

In the above equations, \(W\) and \(B\) are independent Brownian motions of dimension \(d\) and \(k\), respectively, while \(m\), \(\sigma\), \(a\), \(\theta\) and \(\eta\) are given functions. (Precise assumptions on all the model coefficients are given in Sect. 2.) We assume \(Z\) is stationary and ergodic with invariant density \(p\). Equation (1.3) includes in particular the case when \(D\) is smooth, in other words, \(D = \exp(-\int_{0}^{\cdot}a (Z_{t}) \,\mathrm {d}t)\), where \(a\) represents a short-rate function. However, the more general form of (1.3) is considered to accommodate a broader range of situations; for example,

  • when payment streams are denominated in different units of account (for example, another currency, or financial assets), in which case discounting has to take into account the “exchange rate”;

  • when for pricing purposes, the payment stream, though denominated in domestic currency, must incorporate both traditional discounting and the density of the pricing kernel.

The two main results of the paper—Theorems 3.1 and 3.4—identify the distribution of \((Z_{0}, X_{0})\) in different ways. First, in the case where \(\eta\) in (1.3) is non-degenerate and \(f\) in (1.1) is sufficiently regular, the conditional cumulative distribution function of \(X_{0}\) given \(Z_{0}\) is shown to coincide with the explosion probability of an associated locally elliptic diffusion and hence, through the Feynman–Kac formula, satisfies a partial differential equation (PDE); see Theorem 3.1. Second, for general \(\eta\) and \(f\), using methods of diffusion time reversal, we identify an “ergodic” process \((\zeta ,\chi )\) whose invariant distribution coincides with the joint distribution of \((Z_{0},X_{0})\). In particular, for any fixed starting point \(x>0\) of \(\chi\), the (random) empirical time-average law of \((\zeta ,\chi)\) on \([0,T]\) almost surely converges to the joint distribution of \((Z_{0},X_{0})\) in the weak topology; see Theorem 3.4. The time-reversal result has the advantage of leading to an efficient method for obtaining the distribution via simulation, as the ergodic theorem enables estimation of the entire distribution based upon a single realization of \((\zeta ,\chi)\); a numerical example in Sect. 4 dramatically reinforces this point. However, it must be noted that the invariant distribution \(p\) for \(Z\) appears in the reversed dynamics, and hence must be known to perform simulation. When \(Z\) is one-dimensional, or more generally reversing, in the sense that the second order linear differential operator associated to its generator is symmetric on a certain Hilbert space (see [27, Sect. 4.10]), \(p\) is given in explicit form with respect to the model parameters. In the general multidimensional setup, lack of knowledge of \(p\) could pose an issue; however, we provide a potential way to amend the situation in the discussion after Theorem 3.4. 
Note also that in the PDE result in Theorem 3.1, explicit knowledge of \(p\) is not necessary.

1.2 Existing literature and connections

Obtaining the distribution of the perpetuity \(X_{0}\) is of great importance in the areas of finance and actuarial science; for this reason, perpetuities with a form similar to \(X_{0}\) have been extensively studied. For example, [11] deals with the case where

$$ X_{0} = \int_{0}^{\infty}e^{-\sigma B_{t} - \nu t} \,\mathrm {d}t, $$

establishing that \(X_{0}\) has an inverse gamma distribution. This fits into the setup of (1.2), (1.3) by taking \(a=\nu-\sigma^{2}/2\), \(f=1\), \(\theta=0\) and \(\eta= \sigma\). Note that here \(Z\) plays no role. In a similar manner, [31, Chap. 5] and [9, 10] consider the case
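To give a concrete feel for this classical example, the following sketch approximates \(X_{0}\) by a truncated Riemann sum over simulated Brownian paths. The parameters and truncation horizon below are illustrative choices, not taken from [11]; since \(\mathbb {E}[e^{-\sigma B_{t} - \nu t}] = e^{(\sigma^{2}/2 - \nu)t}\), Fubini's theorem gives \(\mathbb {E}[X_{0}] = 1/(\nu - \sigma^{2}/2)\) whenever \(\nu > \sigma^{2}/2\), which the simulation should reproduce.

```python
import numpy as np

def simulate_dufresne_perpetuity(sigma, nu, T=40.0, dt=5e-3, n_paths=4000, seed=0):
    """Approximate X_0 = int_0^infty exp(-sigma B_t - nu t) dt by a left-point
    Riemann sum, truncating at horizon T (the neglected tail has expectation
    exp(-(nu - sigma**2/2) T) / (nu - sigma**2/2), negligible here)."""
    rng = np.random.default_rng(seed)
    n_steps = int(T / dt)
    x = np.zeros(n_paths)
    log_disc = np.zeros(n_paths)          # running value of -sigma B_t - nu t
    for _ in range(n_steps):
        x += np.exp(log_disc) * dt        # accumulate the discounted cash flow
        log_disc += -nu * dt - sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return x

# Illustrative parameters: for sigma = 1, nu = 1.5 one expects
# E[X_0] = 1 / (nu - sigma**2/2) = 1.
samples = simulate_dufresne_perpetuity(sigma=1.0, nu=1.5)
print(samples.mean())
```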

$$ X_{0} = \int_{0}^{\infty}e^{-\int_{0}^{t} Z_{u} \,\mathrm {d}u} \,\mathrm {d}t, \qquad \mathrm {d}Z_{t} = \kappa(\theta-Z_{t}) \mathrm {d}t + \xi\sqrt{Z_{t}} \mathrm {d}W_{t}, $$

and obtain the first moment, along with bounds for other moments, of \(X_{0}\). In [16], the perpetuity takes the form

$$ X_{0} = \int_{0}^{\infty}e^{-Q_{t}} \,\mathrm {d}P_{t}, \quad \text{with } P \text{ and } Q \text{ being independent L\'{e}vy processes}. $$
(1.4)

Under certain conditions on \(P\) and \(Q\), the distribution of \(X_{0}\) is implicitly calculated by identifying the characteristic function and/or Laplace transform for \(X_{0}\). In fact, the results of [16] are pre-dated (for highly particular \(P\) and \(Q\)) in [24, 21]. The Laplace transform method is also used in [26, 25] to treat (1.4) when \(P_{t} = t\) and \(Q\) is a diffusion. In addition to identifying a degenerate elliptic partial differential equation for the Laplace transform, they propose a candidate recurrent Markov chain whose invariant distribution has the law of \(X_{0}\). Lastly, the setup of [16] is significantly extended in [6] where under minimal assumptions on \(P\) and \(Q\), the distribution of \(X_{0}\) is shown to coincide with the unique invariant measure for a certain generalized Ornstein–Uhlenbeck process, a relationship that is confirmed in our current setting in Proposition 9.2.

The use of time reversal to identify the distribution of a discrete-time perpetuity is well known, dating at least back to [12], where \(X_{0}\) takes the form

$$ X_{0} = \sum_{n=1}^{\infty}\bigg(\prod_{i=1}^{n} D_{i}\bigg) f_{n}, $$

where the discount factors \((D_{n})_{n \in \mathbb {N}}\) and cash flows \((f_{n})_{n \in \mathbb {N}}\) are two independent sequences of independent, identically distributed (iid) random variables. To provide insight, the time-reversal argument in [12] is briefly presented here. With

$$X_{0}^{(N)} :=\sum_{n=1}^{N} \bigg(\prod_{i=1}^{n} D_{i}\bigg) f_{n}, $$

it is clear by the iid property that \(X_{0}^{(N)}\) has the same distribution as

$$\widetilde{X}_{N} :=D_{N} f_{N} + D_{N} D_{N-1} f_{N-1} + \cdots+ \bigg(\prod_{j=1}^{N} D_{j}\bigg)f_{1}. $$

Straightforward calculations show that the reversed process \((\widetilde {X}_{n})_{n \in \mathbb {N}}\) satisfies the recursive equation \(\widetilde{X}_{n} = D_{n} (\widetilde{X}_{n-1} + f_{n} )\). Thus, assuming that \((\widetilde {X}_{n})_{n \in \mathbb {N}}\) converges in distribution to a random variable \(\widetilde{X}\), \(\widetilde{X}\) must solve the distributional equation \(\widetilde{X} = D (\widetilde{X}+f)\), where \(D\), \(f\) and \(\widetilde {X}\) are independent, \(D\) has the same law as \(D_{1}\) and \(f\) has the same law as \(f_{1}\). In [30], solutions to that distributional equation are obtained based upon the expectation of \(\log|D|\) and \(\log ^{+} |Df|\). The tails of \(\widetilde{X}\), as well as convergence of iterative schemes, are studied in [14]; furthermore, [17] gives “almost” if and only if conditions for the convergence of iterative schemes.
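The recursion above is straightforward to iterate numerically. In the following sketch, the uniform and exponential choices for \(D\) and \(f\) are arbitrary illustrations; taking expectations in \(\widetilde{X} = D(\widetilde{X}+f)\) yields \(\mathbb {E}[\widetilde{X}] = \mathbb {E}[D]\,\mathbb {E}[f]/(1-\mathbb {E}[D])\), which the simulation should reproduce.

```python
import numpy as np

def perpetuity_fixed_point(n_paths=20000, n_iter=200, seed=0):
    """Iterate the distributional recursion  X_n = D_n (X_{n-1} + f_n)
    across many independent paths; X_n converges in law to the perpetuity."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_paths)
    for _ in range(n_iter):
        d = rng.uniform(0.80, 0.95, n_paths)   # iid discount factors, E[D] = 0.875
        f = rng.exponential(1.0, n_paths)      # iid cash flows, E[f] = 1
        x = d * (x + f)
    return x

# Taking expectations in X = D (X + f), with D, f, X independent, gives
# E[X] = E[D] E[f] / (1 - E[D]) = 0.875 / 0.125 = 7.
samples = perpetuity_fixed_point()
print(samples.mean())
```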

In a continuous-time setting, we employ an argument similar in spirit to the one in [12], but rather different in execution. Specifically, we extend \(X_{0}\) to a whole “forward” process \(X :=(1/D) \int_{\cdot}^{\infty}D_{t} f(Z_{t}) \,\mathrm {d}t\), and then for each \(T>0\) define the reversed process \((\zeta ^{T},\chi^{T})\) on \([0,T]\) by \(\zeta ^{T}_{t} :=Z_{T-t}\), \(\chi^{T}_{t} :=X_{T-t}\); see (3.6), (3.7). Using results on time reversal of diffusions from [19] (alternatively, see [23, 3, 7, 13]) as well as additional elementary calculations, we obtain the dynamics for \((\zeta ^{T},\chi^{T})\). In fact, Proposition 8.5 shows that the generator of \((\zeta ^{T},\chi^{T})\) does not depend upon \(T\) and ergodicity can be studied for the process \((\zeta ,\chi)\) with the given generator. When \(|\eta| >0\) and \(f\) is sufficiently regular, this generator is locally elliptic and the associated process \((\zeta ,\chi)\) is ergodic with invariant distribution equal to that of \((Z_{0},X_{0})\); see Proposition 9.2. In the general case, a slightly weaker (but still sufficient) form of ergodicity still holds: starting \(\zeta \) from its invariant distribution \(p\) and \(\chi\) from any starting point \(x>0\), the (random) empirical time-average laws of \((\zeta ,\chi)\) converge almost surely in the weak topology to the distribution of \((Z_{0},X_{0})\).

1.3 Structure

This paper is organized as follows. In Sect. 2, we precisely state the assumptions on the processes \(Z\) and \(D\), as well as the function \(f\), paying particular attention to deriving sharp conditions under which \(X_{0}\) is almost surely finite or infinite. The main results are then presented in Sect. 3. First, when \(|\eta| > 0\) and \(f\) is sufficiently regular, the conditional cumulative distribution function of \(X_{0}\) given \(Z_{0} =z\) is shown to satisfy a certain partial differential equation. Then, using the method of time reversal, we construct a probability space and diffusion \((\zeta ,\chi)\) such that with probability one, its empirical time-average laws weakly converge to the joint distribution of \((Z_{0},X_{0})\) for all starting points of \(\chi\). Section 3 concludes with a brief discussion of how the distribution may be estimated via simulation, in particular proposing a method for obtaining the desired distribution when the invariant density \(p\) for \(Z\) is not explicitly known. Section 4 provides a numerical example in a specific case where the joint distribution of \((Z_{0},X_{0})\) is explicitly identifiable. Here, we compare the performance of the reversal method versus the direct method for obtaining the distribution of \(X_{0}\). In particular, we show that for a given desired level of accuracy (see Sect. 4 for a more precise definition), the method of time reversal is approximately 175 to 300 times faster than the direct method. The remaining sections contain the proofs. Section 6 proves the statements regarding the finiteness of \(X_{0}\); Sect. 7 proves the partial differential equation result; Sect. 8 obtains the dynamics for the time-reversed process \((\zeta ,\chi)\); Sect. 9 proves the (weak) ergodicity with the correct invariant distribution. Finally, a number of technical supporting results are included in the Appendix.

2 Problem setup

2.1 Well-posedness and ergodicity

The first order of business is to specify precise coefficient assumptions so that \(Z\) in (1.2) and \(D\) in (1.3) are well defined. As for \(Z\), we work in the standard locally elliptic setup for diffusions; for more information, see [27, Chaps. 3.7, 4.1]. Let \(E\subseteq \mathbb {R}^{d}\) be an open, connected region. We assume the existence of \(\gamma\in(0, 1]\) such that

(A1) there exists a sequence of regions \((E_{n})_{n \in \mathbb {N}}\) such that \(E = \bigcup_{n=1}^{\infty} E_{n}\), with each \(E_{n}\) being open, connected, bounded, with \(\partial E_{n}\) being \(C^{2,\gamma}\) and satisfying \(\bar {E}_{n}\subset E_{n+1}\) for all \(n \in \mathbb {N}\);

(A2) \(m\in C^{1,\gamma}(E;\mathbb {R}^{d})\) and \(c\in C^{2,\gamma}(E;\mathbb {S}_{++}^{d})\), where \(\mathbb {S}_{++}^{d}\) is the space of symmetric and strictly positive definite \((d \times d)\)-dimensional matrices.

With the provisos in (A1) and (A2), define \(L^{Z}\) as the generator associated to \((m,c)\), i.e.,Footnote 2

$$ L^{Z} :=\frac{1}{2}\sum_{i,j=1}^{d} c^{ij}\partial^{2}_{ij} + \sum_{i=1}^{d} m^{i}\partial_{i}. $$

Under (A1) and (A2), one can infer the existence of a solution to the martingale problem for \(L^{Z}\) on \(E\), with the possibility of explosion to the boundary of \(E\); see [27, Chap. 1.13]. We wish for something stronger, namely, to construct a filtered probability space \((\varOmega , \, \mathbf {F}, \, \mathbb {P})\) on which there is a strong, stationary, ergodic solution to the SDE in (1.2) with invariant density \(p\). In (1.2), \(W\) is a \(d\)-dimensional Brownian motion and \(\sigma= \sqrt {c}\), the unique positive definite symmetric matrix such that \(\sigma^{2} = c\). In order to achieve this, we ask that

(A3) the martingale problem for \(L^{Z}\) on \(E\) is well posed and the corresponding solution is recurrent. Furthermore, there exists a strictly positive \(p\in C^{2,\gamma}(E,\mathbb {R})\) with \(\int_{E} p(z) \,\mathrm {d}z = 1\) and satisfying \(\tilde{L}^{Z} p = 0\), where \(\tilde{L}^{Z}\) is the formal adjoint of \(L^{Z}\) given by

$$ \tilde{L}^{Z} :=\frac{1}{2}c^{ij}\partial^{2}_{ij} -(m^{i}-\partial _{j}c^{ij})\partial_{i} -\bigg(\partial_{i} m^{i} -\frac{1}{2}\partial^{2}_{ij} c^{ij}\bigg). $$
(2.1)

We summarize the situation in the following result; the extra Brownian motion \(B\) in its statement will be used to define the process \(D\) via (1.3) later on.

Theorem 2.1

Under Assumptions (A1)–(A3), there exists a filtered probability space \((\varOmega , \, \mathbf {F}, \, \mathbb {P})\) satisfying the usual conditions and supporting two independent Brownian motions \(W\) and \(B\), \(d\)-dimensional and \(k\)-dimensional, respectively, such that \(Z\) satisfies (1.2) and is stationary and ergodic with invariant density \(p\).

Remark 2.2

According to [27, Corollary 5.1.11], in the one-dimensional case where \(E=(\alpha,\beta)\) for \(-\infty\leq\alpha< \beta\leq\infty \), the above Assumption (A3) is true if and only if for some \(z_{0}\in E\),

$$\begin{aligned} \int_{\alpha}^{z_{0}} \exp \left (-2\int_{z_{0}}^{z}\frac{m(s) }{c(s)} \,\mathrm {d}s\right ) \,\mathrm {d}z &= \infty, \\ \int_{z_{0}}^{\beta} \exp \left (-2\int_{z_{0}}^{z}\frac{m(s)}{c(s)} \,\mathrm {d}s\right )\,\mathrm {d}z &= \infty, \\ \int_{\alpha}^{\beta}\frac{1}{c(z)} \exp \left (2\int_{z_{0}}^{z}\frac {m(s)}{c(s)} \,\mathrm {d}s\right )\,\mathrm {d}z &< \infty. \end{aligned}$$

In this case, it holds that

$$p(z) = Kc^{-1}(z)\exp \left (2\int_{z_{0}}^{z} \frac{m(s)}{c(s)} \,\mathrm {d}s\right ), \quad z \in(\alpha, \beta), $$

where \(K > 0\) is a normalizing constant.
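As a quick numerical illustration of the displayed formula (with a hypothetical Ornstein–Uhlenbeck specification, not part of the present model): taking \(m(z) = \kappa(\mu - z)\) and \(c \equiv \xi^{2}\) on \(E = \mathbb {R}\), the formula should recover the Gaussian density with mean \(\mu\) and variance \(\xi^{2}/(2\kappa)\). The sketch below evaluates \(p\) on a grid, normalizing \(K\) by a Riemann sum.

```python
import math

def invariant_density_1d(m, c, z0, zs, h=1e-2):
    """Evaluate p(z) = K c(z)^{-1} exp(2 int_{z0}^{z} m/c ds) on the grid zs,
    normalising K by a Riemann sum over the same (uniform) grid."""
    def speed(z):
        # trapezoidal approximation of the inner integral int_{z0}^{z} m/c
        n = max(1, int(abs(z - z0) / h))
        step = (z - z0) / n
        pts = [z0 + k * step for k in range(n + 1)]
        integral = step * (sum(m(s) / c(s) for s in pts)
                           - 0.5 * (m(z0) / c(z0) + m(z) / c(z)))
        return math.exp(2.0 * integral) / c(z)
    raw = [speed(z) for z in zs]
    dz = zs[1] - zs[0]
    K = sum(raw) * dz  # crude normalisation over the grid
    return [r / K for r in raw]

# Illustrative OU factor: m(z) = kappa(mu - z), c = xi^2; the formula should
# recover the Gaussian density with mean mu and variance xi^2 / (2 kappa).
kappa, mu, xi = 1.0, 0.0, 1.0
zs = [-6.0 + 0.01 * k for k in range(1201)]
p = invariant_density_1d(lambda z: kappa * (mu - z), lambda z: xi ** 2, 0.0, zs)
var = xi ** 2 / (2 * kappa)
gauss = [math.exp(-z * z / (2 * var)) / math.sqrt(2 * math.pi * var) for z in zs]
err = max(abs(a - b) for a, b in zip(p, gauss))
print(err)
```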

In the multidimensional case, suppose that there exists a function \(H: E \to \mathbb {R}\) with the property that \(c^{-1}(2m - \operatorname {div}c) = \nabla H\), where \(\operatorname {div}c\) is the (matrix) divergence defined byFootnote 3 \((\operatorname {div}c)^{i} = \partial _{j}c^{ij}, i = 1,\dots,d\). Then \(Z\) from Theorem 2.1 is a reversing Markov process in the sense that the time-reversed process on any interval \([0,T]\) has the same dynamics as \(Z\); see [19]. Furthermore, Assumption (A3) follows if it can be shown that \(Z\) does not explode to the boundary of \(E\) and \(K :=\int_{E} \exp (H(z)) \,\mathrm {d}z < \infty\). Indeed, by construction, \(p = e^{H}/K\) satisfies \(\tilde{L}^{Z}p = 0\) and \(\int_{E} p(z)\,\mathrm {d}z = 1\). Thus if \(Z\) does not explode, it follows from [27, Theorem 2.8.1, Corollary 4.9.4] that \(Z\) is recurrent. In fact, \(Z\) is ergodic, as shown in [27, Theorems 4.3.3, 4.9.5]. If we are not in the reversing case, there are many known techniques for checking ergodicity; see [5, 27]. For example, if there exist a smooth function \(u: E \to \mathbb {R}\), an integer \(N\) and constants \(\varepsilon > 0\) and \(C > 0\) such that \(L^{Z}u\leq-\varepsilon \) and \(u\geq-C\) on \(E\setminus E_{N}\), then (A3) holds.
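As a minimal illustration of the last criterion (again for a hypothetical one-dimensional Ornstein–Uhlenbeck factor, so that \(L^{Z}u = \tfrac{1}{2}c\,u'' + m\,u'\)), the quadratic \(u(z) = z^{2}\) serves as a Lyapunov function: \(L^{Z}u(z) = \xi^{2} + 2\kappa(\mu - z)z \to -\infty\) as \(|z| \to \infty\), so \(L^{Z}u \leq -\varepsilon \) outside a bounded set while \(u \geq 0\) everywhere.

```python
def LZ(u_d1, u_d2, m, c, z):
    """One-dimensional generator applied to u: L^Z u = (1/2) c u'' + m u'."""
    return 0.5 * c(z) * u_d2(z) + m(z) * u_d1(z)

kappa, mu, xi = 1.0, 0.0, 0.5        # illustrative OU: dZ = kappa(mu - Z) dt + xi dW
m = lambda z: kappa * (mu - z)
c = lambda z: xi ** 2
du = lambda z: 2 * z                 # u(z) = z^2
d2u = lambda z: 2.0

# L^Z u(z) = xi^2 + 2 kappa (mu - z) z, which is strongly negative for large |z|.
vals = [LZ(du, d2u, m, c, z) for z in (-5.0, -2.0, 2.0, 5.0)]
print(vals)
```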

In order to ensure that \(D\) in (1.3) is well defined, we assume that

(A4) \(a\in C^{1,\gamma}(E;\mathbb {R}_{+})\), \(\eta\in C^{2,\gamma}(E; \mathbb {R}^{k})\) and \(\theta\in C^{2,\gamma}(E; \mathbb {R}^{d})\).

Given (A4) and all previous assumptions, it follows that (1.3) possesses a strong solution on \((\varOmega , \, \mathbf {F}, \, \mathbb {P})\) from Theorem 2.1; in fact, defining \(R :=- \log D\), it holds that

$$ R = \int_{0}^{\cdot} \left (a + \frac{1}{2}(\theta'c\theta+ |\eta |^{2} )\right ) (Z_{t}) \,\mathrm {d}t + \int_{0}^{\cdot}\theta(Z_{t})' \sigma(Z_{t})\,\mathrm {d}W_{t} + \int_{0}^{\cdot}\eta (Z_{t})'\,\mathrm {d}B_{t}. $$
(2.2)

2.2 Finiteness of \(X_{0}\)

Having the setup for the existence of \(Z\) and \(D\), we proceed to \(X_{0}\). For the time being, we just assumeFootnote 4 that the function \(f : E \to \mathbb {R}_{+}\) is in \(\mathbb {L}^{1}(E,p)\). For the PDE results of Theorem 3.1 below, we require a slightly stronger regularity assumption on \(f\), although the time-reversal results of Theorem 3.4 make no additional assumptions. Now, for \(f\) not necessarily in \(\mathbb {L}^{1}(E,p)\), it is entirely possible that \(X_{0}\) takes infinite values with positive probability. In this section, conditions are given under which \(\mathbb {P}[X_{0} < \infty] = 1\) or, conversely, when \(\mathbb {P}[X_{0} < \infty] = 0\).

Lemma 2.3

Let (A1)–(A4) hold. For the invariant density \(p\) of \(Z\), assume there exists \(\varepsilon > 0\) such that

$$ \begin{aligned} &\left(a+\frac{1-\varepsilon }{2}(\theta'c\theta+ \eta'\eta)\right)^{-} \in \mathbb {L}^{1}(E, p),\\ &\int_{E}\left(a+\frac{1-\varepsilon }{2}(\theta'c\theta+\eta'\eta)\right)(z) \, p(z) \,\mathrm {d}z > 0. \end{aligned} $$
(2.3)

Then the following hold:

  1. (i)

    There exists \(\kappa> 0\) such that for all \(z\in E\), \(\mathbb {P}[\lim _{t\to\infty} e^{\kappa t} D_{t} = 0\mid Z_{0} =z]=1\). In particular, \(\lim _{t\to\infty} e^{\kappa t}D_{t} = 0\) ℙ-a.s.

  2. (ii)

    For any \(f\in \mathbb {L}^{1} (E, p)\), it holds that \(\mathbb {P}[X_{0} < \infty]= 1\).

Remark 2.4

Note that (2.3) holds if \(a>0\) on \(E\). The more complicated form in (2.3) allows \(a\) to take (unbounded) negative values. Furthermore, in the case where \((\theta'c\theta+ \eta'\eta) \in \mathbb {L}^{1}(E, p)\), (2.3) is equivalent to

$$ \left(a+\frac{1}{2}(\theta'c\theta+ \eta'\eta)\right)^{-}\in \mathbb {L}^{1}(E,p),\quad \int_{E}\left(a+\frac{1}{2}(\theta'c\theta +\eta'\eta)\right)(z) \, p(z) \,\mathrm {d}z > 0. $$

As a partial converse to Lemma 2.3, we have

Lemma 2.5

Let (A1)–(A4) hold. For the invariant density \(p\) of \(Z\), assume there exists \(\varepsilon > 0\) such that

$$\begin{aligned} &\bigg(a+\frac{1 + \varepsilon }{2}(\theta'c\theta+ \eta'\eta)\bigg)^{+} \in \mathbb {L}^{1}(E, p),\\ &\int_{E}\bigg(a+\frac{1 + \varepsilon }{2}(\theta'c\theta+\eta'\eta)\bigg)(z) \, p(z) \,\mathrm {d}z \leq0. \end{aligned}$$

(If \(\theta'c\theta+ \eta'\eta\equiv0\), then assume that \(a^{+}\in \mathbb {L}^{1}(E,p)\) and \(\int_{E} a(z) p(z) \,\mathrm {d}z < 0\).) If \(f\) is such that \(\int_{E} f(z) p(z) \,\mathrm {d}z > 0\), then \(\mathbb {P}[X_{0} < \infty] = 0\).

Remark 2.6

Let (A1)–(A4) hold and assume that \(a\) is nonnegative. A combination of Lemmas 2.3 and 2.5 yields sharp conditions for the finiteness of \(X_{0}\) that do not require knowledge of \(p\), at least for bounded \(f\):

  • If \(a + (1/2)(\theta'c\theta+ \eta'\eta) \not\equiv0\), then \(\mathbb {P}[X_{0} < \infty] = 1\) holds if \(f\in \mathbb {L}^{1}(E,p)\).

  • If \(a + (1/2)(\theta'c\theta+ \eta'\eta) \equiv0 \), then \(\mathbb {P}[X_{0} < \infty] = 0\) holds if \(\int_{E} f(z) p(z) \,\mathrm {d}z > 0\).

In view of Lemma 2.3, we ask that

  1. (A5)

    \(f \in \mathbb {L}_{+}^{1}(E,p)\), \(\int_{E} f(z) p(z) \,\mathrm {d}z > 0\) and there exists \(\varepsilon > 0\) such that

    $$\begin{aligned} &\bigg(a+\frac{1-\varepsilon }{2}(\theta'c\theta+ \eta'\eta)\bigg)^{-} \in \mathbb {L}^{1}(E, p),\\ &\int_{E}\bigg(a+\frac{1-\varepsilon }{2}(\theta'c\theta+\eta'\eta)\bigg)(z) \, p(z) \,\mathrm {d}z > 0. \end{aligned}$$

To recapitulate, for the remainder of the article, the following is assumed:

Assumption 2.7

We enforce throughout all the above Assumptions (A1)–(A5).

3 Main results

3.1 The distribution of \(X_{0}\) via a partial differential equation

Define the cumulative distribution function \(g\) of \(X_{0}\) given \(Z_{0}\) by

$$ g(z,x) :=\mathbb {P}\left [X_{0} \leq x \mid Z_{0} = z\right ], \quad(z, x) \in F:=E \times(0, \infty). $$
(3.1)

Next, recall that Assumption 2.7 implies that \(Z_{0}\) has a density \(p\), and define the joint distribution \(\pi\) of \((Z_{0},X_{0})\) by

$$ \pi(A) :=\iint_{A} p(z)g(z,\mathrm {d}x)\,\mathrm {d}z,\qquad A\in\mathcal{B}(F). $$
(3.2)

Under Assumption 2.7, as well as an additional smoothness requirement on \(f\) and non-degeneracy requirement on \(\eta\), the first main result (Theorem 3.1 below) shows that \(g\) solves a certain PDE on the state space \(F\). This will imply that the joint distribution of \((Z_{0}, X_{0})\) has a density (still labeled \(\pi \)) and the law of \(X_{0}\) charges all of \((0,\infty)\).

To motivate the result as well as to fix notation, for each \(x \in(0, \infty)\), consider the process

$$ Y^{x} :=\frac{1}{D }\left(x - \int_{0}^{\cdot}D_{t} f(Z_{t})\,\mathrm {d}t\right). $$

Since Assumption 2.7 implies that \(\mathbb {P}[\lim_{t\to \infty}D_{t} = 0\mid Z_{0} =z] =1\) for all \(z\in E\), it is clear that given \(Z_{0} = z\), the process \(Y^{x}\) tends to \(\infty\) on \(\left \{X_{0} < x\right \}\). Conversely, on \(\left \{X_{0} > x\right \}\), \(Y^{x}\) will hit 0 at some finite time. What happens on \(\left \{X_{0} = x\right \}\) is not immediately clear, but it will be shown under the given assumptions that there is probability zero of this occurring. For fixed \((z,x) \in F\), it follows that \(1-g(z,x)\) equals the probability that \(Y^{x}\) hits zero, given \(Z_{0} = z\). According to the Feynman–Kac formula, such probabilities “should” solve a PDE. To identify the PDE, note that the joint equations governing \(Z\) and \(Y^{x}\) are

$$\begin{aligned} Z &= Z_{0} + \int_{0}^{\cdot}m(Z_{t})\,\mathrm {d}t + \int_{0}^{\cdot}\sigma(Z_{t})\,\mathrm {d}W_{t}, \\ Y^{x} &= x + \int_{0}^{\cdot}\Big(-f(Z_{t}) + Y^{x}_{t}\big(a(Z_{t}) + \theta'c\theta(Z_{t}) + \eta'\eta(Z_{t})\big)\Big)\,\mathrm {d}t \\ &\quad{} + \int_{0}^{\cdot}Y^{x}_{t}\left(\theta'\sigma(Z_{t})\,\mathrm {d}W_{t} + \eta(Z_{t})'\,\mathrm {d}B_{t}\right). \end{aligned}$$

Define \(b:F\to \mathbb {R}^{d+1}\) and \(A: F\to \mathbb {S}_{++}^{d+1}\) by

$$ \begin{aligned} b(z,x)&:=\bigg( \textstyle\begin{array}{c} m(z) \\ -f(z) + x(a+\theta'c\theta+\eta'\eta)(z) \end{array}\displaystyle \bigg),\\ A(z,x)&:=\bigg( \textstyle\begin{array}{c@{\quad}c} c(z) & x c\theta(z) \\ x\theta'c(z) & x^{2}(\theta'c\theta+ \eta'\eta)(z) \end{array}\displaystyle \bigg), \end{aligned} $$
(3.3)

for all \((z,x) \in F\). Note that if in addition to Assumption 2.7, \(|\eta|(z) > 0\) for \(z\in E\), then \(A\) is locally elliptic. Let \(L\) be the second order differential operator associated to \((A,b)\), i.e.,

$$ L:=\frac{1}{2}A^{ij}\partial^{2}_{ij} + b^{i} \partial_{i}. $$
(3.4)

Note that \(L\phi= L^{Z}\phi\) for functions \(\phi\) of \(z\in E\) alone. With the previous notation, the first main result now follows.

Theorem 3.1

Let Assumption  2.7 hold, and suppose further that

(a) \(f\in C^{1,\gamma}(E;\mathbb {R}_{+})\),

(b) \(|\eta(z)| > 0\) for all \(z\in E\).

Then \(g\) is in \(C^{2,\gamma}(F)\) and satisfies \(Lg = 0\) with the “locally uniform” boundary conditions

$$ \lim_{n\to\infty} \sup_{x\leq n^{-1} ,z\in E_{k}} g(z,x) = 0,\qquad \lim_{n\to\infty} \inf_{x\geq n ,z\in E_{k}} g(z,x) = 1, \quad\forall k \in \mathbb {N}. $$
(3.5)

Furthermore, \(g\) is the unique function satisfying \(L g = 0\) with the above boundary conditions, in the set of functions \(\{ \tilde{g} \in C^{2}(F) : 0 \leq\tilde{g} \leq1 \}\).

Remark 3.2

The non-degeneracy assumption on \(\eta\) is essential for the existence of a density; if \(\eta\equiv0\), it may be that the distribution of \(X_{0}\) has an atom. Indeed, take \(f \equiv1\), \(a \equiv 1\), \(\eta\equiv0\), \(\theta\equiv0\). Then \(X_{0} = \int_{0}^{\infty}e^{-t}\,\mathrm {d}t = 1\) with probability one.

Remark 3.3

Theorem 3.1 implies that the law of \(X_{0}\) charges all of \((0,\infty)\), even for those functions \(f\) which are bounded from above. Theorem 3.1 also implies that \(X_{0}\) has a density without imposing Hörmander’s condition [22, Chap. 2] on the coefficients in (3.3). Rather, the infinite horizon combined with the presence of the independent Brownian motion \(B\) “smooth out” the distribution of \(X_{0}\).

Theorem 3.1 is certainly important from a theoretical viewpoint. However, it appears to be of limited practical use. Even under the extra non-degeneracy condition \(|\eta| > 0\), it is unclear how to numerically solve the PDE \(Lg = 0\) with the given boundary conditions (3.5), as there are no natural auxiliary boundary conditions in the spatial domain of \(z \in E\). In Sect. 3.2 that follows, we provide an alternative, more useful method for estimating numerically the law of \((Z_{0}, X_{0})\).
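In degenerate special cases, however, the PDE can be checked directly. In the setting of [11] recalled in Sect. 1.2 (that is, \(f \equiv 1\), \(a = \nu - \sigma^{2}/2\), \(\theta \equiv 0\), \(\eta = \sigma\), with \(Z\) playing no role), \(Lg = 0\) collapses to the ODE \(\frac{\sigma^{2}}{2}x^{2}g'' + (-1 + x(\nu + \sigma^{2}/2))g' = 0\) in \(x\), whose increasing solution with the boundary conditions (3.5) has \(g'\) proportional to the inverse-gamma density with shape \(2\nu/\sigma^{2}\) and scale \(2/\sigma^{2}\), consistent with [11]. The following sketch (parameters are illustrative) verifies the first-order reduction by finite differences.

```python
import math

sigma, nu = 1.0, 1.5                      # illustrative parameters
alpha, beta = 2 * nu / sigma ** 2, 2 / sigma ** 2

def u(x):
    # candidate for g'(x): the inverse-gamma kernel x^{-(alpha+1)} exp(-beta/x)
    return x ** (-(alpha + 1)) * math.exp(-beta / x)

def residual(x, h=1e-5):
    # first-order reduction of Lg = 0:
    #   (sigma^2/2) x^2 u'(x) + (x (nu + sigma^2/2) - 1) u(x)  should vanish
    du = (u(x + h) - u(x - h)) / (2 * h)  # central finite difference
    return 0.5 * sigma ** 2 * x ** 2 * du + (x * (nu + 0.5 * sigma ** 2) - 1.0) * u(x)

rel_res = [abs(residual(x)) / u(x) for x in (0.5, 1.0, 2.0, 5.0)]
print(rel_res)
```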

3.2 The distribution of \((Z_{0}, X_{0})\) via diffusion time reversal

The goal here is to show that the distribution of \((Z_{0},X_{0})\) coincides with the invariant distribution of a positive recurrent process \((\zeta ,\chi)\). In order to see the connection, extend \(X_{0}\) to a process \((X_{t})_{t \in \mathbb {R}_{+}}\) defined via

$$ X :=\frac{1}{D} \int_{\cdot}^{\infty}D_{t} f(Z_{t}) \,\mathrm {d}t, $$
(3.6)

and note that \((Z_{t}, X_{t})_{t \in \mathbb {R}_{+}}\) is a stationary process under ℙ. Fix \(T>0\) and define the process \((\zeta ^{T}_{t}, \chi^{T}_{t})_{t \in[0, T]}\) via time reversal, i.e.,

$$ \zeta ^{T}_{t} :=Z_{T - t},\qquad\chi^{T}_{t} :=X_{T -t},\qquad t \in [0, T]. $$
(3.7)

It still follows that \((\zeta ^{T}, \chi^{T})\) is stationary under ℙ, with the same one-dimensional marginal distribution as \((Z_{0}, X_{0})\). Furthermore, stationarity of \((Z, X)\) clearly implies that the law of the process \((\zeta ^{T}, \chi^{T})\) does not depend on \(T\) (except for its time-domain of definition). Therefore, one may create a new process \((\zeta_{t},\chi_{t})_{t \in \mathbb {R}_{+}}\), on a potentially different probability space (e.g. the space of continuous functions), such that the law of \((\zeta^{T}, \chi^{T})\) is the same as the law of \((\zeta_{t}, \chi _{t})_{t \in[0, T]}\) for all \(T \in(0, \infty)\). If one can establish that \((\zeta ,\chi)\) is ergodic, then the distribution of \((Z_{0},X_{0})\) may be efficiently estimated via the ergodic theorem.

Towards this end, one needs to understand the behavior of \((\zeta ,\chi)\). Standard results (e.g. [19]) in the theory of time reversal imply that \(\zeta \) is a diffusion in its own filtration, and identify the corresponding coefficients. In order to deal with \(\chi\), we return to the definition of \(\chi^{T}\) and define yet one more process \((\varDelta ^{T}_{t})_{t \in[0, T]}\) via

$$ \varDelta^{T}_{t} = \frac{D_{T}}{D_{T-t}}, \quad t \in[0, T]. $$
(3.8)

Using all previous definitions, we obtain that

$$\begin{aligned} \chi^{T}_{t} = X_{T - t} &= \frac{1}{D_{T - t}} \int_{T - t}^{\infty}D_{u} f(Z_{u}) \,\mathrm {d}u \\ &= \frac{D_{T}}{D_{T - t}} \left (X_{T} + \int_{T - t}^{T} \frac {D_{u}}{D_{T}} f(Z_{u}) \,\mathrm {d}u\right ) \\ &= \varDelta_{t}^{T} \left (\chi^{T}_{0} + \int_{0}^{t} \frac{1}{\varDelta^{T}_{u}} f(\zeta ^{T}_{u}) \,\mathrm {d}u\right ), \quad t \in[0, T]. \end{aligned}$$
(3.9)

As it turns out, one can describe the joint dynamics of \((\zeta^{T}, \varDelta^{T})\) in appropriate filtrations (and these dynamics do not depend on \(T\), as expected). To ease the presentation, recall from Sect. 2 that for any \(\mathbb {S}_{++}^{d}\)-valued smooth function \(A\) on \(E\), the (matrix) divergence is defined by \((\operatorname {div}A)^{i} = \partial_{j} A^{ij}\) for \(i=1,\dots,d\). It is then shown in Sect. 8 that \((\zeta ^{T}, \varDelta^{T})\) is such that

$$\begin{aligned} \zeta ^{T} & = \zeta ^{T}_{0} + \int_{0}^{\cdot}\left(c\frac{\nabla p}{p} + \operatorname {div}c - m\right)(\zeta ^{T}_{t})\,\mathrm {d}t + \int_{0}^{\cdot}\sigma(\zeta ^{T}_{t})\,\mathrm {d}W^{T}_{t}, \\ \varDelta^{T} &= 1 + \int_{0}^{\cdot}\varDelta^{T}_{t}\left (\theta' c \frac{\nabla p}{p} + \nabla\cdot(c\theta) - a\right )(\zeta ^{T}_{t}) \,\mathrm {d}t\\ &\quad{}+ \int_{0}^{\cdot}\varDelta^{T}_{t} \big(\eta(\zeta ^{T}_{t})'\,\mathrm {d}B^{T}_{t} + \theta'\sigma(\zeta ^{T}_{t})\,\mathrm {d}W^{T}_{t}\big) \\ &= 1 + \int_{0}^{\cdot}\varDelta^{T}_{t}\left (\theta' (m - \operatorname {div}c) + \nabla\cdot (c\theta) - a\right )(\zeta ^{T}_{t}) \,\mathrm {d}t\\ &\quad{} + \int_{0}^{\cdot}\varDelta^{T}_{t} \big(\eta(\zeta ^{T}_{t})'\,\mathrm {d}B^{T}_{t} + \theta(\zeta ^{T}_{t})' \,\mathrm {d}\zeta^{T}_{t}\big) \end{aligned}$$

for independent Brownian motions \((W^{T}, B^{T})\) in an appropriate filtration.

From the joint dynamics of \((\zeta^{T}, \varDelta^{T})\), one obtains the joint dynamics of \((\zeta^{T}, \chi^{T})\), which again do not depend on \(T\). In particular, since \(\varDelta^{T}\) is a semimartingale, (3.9) yields that

$$ \begin{aligned} \zeta^{T} &= \zeta^{T}_{0} + \int_{0}^{\cdot}\bigg(c\frac{\nabla p}{p} + \operatorname {div}c - m\bigg)(\zeta ^{T}_{t})\,\mathrm {d}t + \int_{0}^{\cdot}\sigma(\zeta ^{T}_{t})\,\mathrm {d}W^{T}_{t},\\ \chi^{T} &= \chi^{T}_{0} + \int_{0}^{\cdot}\bigg(f(\zeta ^{T}_{t}) - \chi^{T}_{t}\Big(a-\theta'c\frac{\nabla p}{p}-\nabla\cdot(c\theta)\Big)(\zeta ^{T}_{t})\bigg)\,\mathrm {d}t\\ &\quad{}+ \int_{0}^{\cdot}\chi^{T}_{t}\big(\eta(\zeta ^{T}_{t})'\,\mathrm {d}B^{T}_{t} + \theta'\sigma(\zeta ^{T}_{t})\,\mathrm {d}W^{T}_{t}\big). \end{aligned} $$

For a generic version \((\zeta ,\chi)\) with the same generator (which does not depend upon time) as \((\zeta ^{T},\chi^{T})\) above, ergodicity of \(Z\) implies ergodicity of \(\zeta \) (see Proposition 8.1 below). Furthermore, \(\chi\) is “mean reverting”, as is easily seen when \(\theta\equiv0\) and \(a> 0\), and this continues to hold in the general case. Thus, one expects the empirical laws of \((\zeta ,\chi)\) to satisfy a certain strong law of large numbers, an intuition that is made precise in the following result.

Theorem 3.4

If Assumption  2.7 holds, there exists a filtered probability space \((\varOmega, \mathbf {F}, \mathbb {Q})\) supporting independent \(d\)- and \(k\)-dimensional Brownian motions \(W\) and \(B\) as well as a process \(\zeta \) satisfying

$$\zeta = \zeta _{0} + \int_{0}^{\cdot}\left(c\frac{\nabla p}{p} + \operatorname {div}c - m\right )(\zeta _{t})\,\mathrm {d}t + \int_{0}^{\cdot}\sigma(\zeta _{t})\,\mathrm {d}W_{t}, $$

where \(\zeta_{0}\) is an \(\mathcal {F}_{0}\)-measurable random variable with density \(p\).

Define the process \(\varDelta\) as the solution to the linear differential equation

$$\begin{aligned} \varDelta&= 1 + \int_{0}^{\cdot}\varDelta_{t}\left (\theta' (m - \operatorname {div}c) + \nabla \cdot(c\theta) - a\right )(\zeta _{t}) \,\mathrm {d}t \\ &\quad{}+ \int_{0}^{\cdot}\varDelta_{t} \left (\eta(\zeta _{t})'\,\mathrm {d}B_{t} + \theta(\zeta _{t})' \,\mathrm {d}\zeta_{t}\right ), \end{aligned}$$
(3.10)

and then for any \(x \in(0, \infty)\), define \(\chi^{x}\) as the solution to the linear differential equation

$$ \chi^{x} = x + \int_{0}^{\cdot}\chi^{x}_{t} \, \frac{\,\mathrm {d}\varDelta_{t}}{\varDelta_{t}} + \int_{0}^{\cdot}f(\zeta _{t}) \,\mathrm {d}t. $$
(3.11)

Finally, let \(x \in(0, \infty)\), \(T \in(0, \infty)\) and set \(F=E\times (0,\infty)\) as in (3.1). Define the (random) empirical measure \(\widehat {\pi }^{x}_{T}\) on \(\mathcal {B}(F)\) by

$$ \widehat {\pi }^{x}_{T} \left [A\right ] :=\frac{1}{T} \int_{0}^{T} \mathbb {I}_{A} (\zeta_{t}, \chi^{x}_{t}) \,\mathrm {d}t, \quad A \in \mathcal {B}(F). $$
(3.12)

With the above notation, there exists a set \(\varOmega_{0} \in \mathcal {F}_{\infty}\) with \(\mathbb {Q}[\varOmega_{0}]= 1\) such that

$$ \lim_{T \to\infty} \widehat {\pi }^{x}_{T} (\omega) = \pi\textit{ weakly, }\quad \textit{for all } x \in(0, \infty) \textit{ and } \omega\in\varOmega_{0}, $$
(3.13)

where \(\pi\) is the joint distribution of \((Z_{0},X_{0})\) under ℙ given in (3.2).

Remark 3.5

In the context of Theorem 3.4, note that the processes \(\varDelta\) and \(\chi^{x}\) can be given in closed form in terms of \(\zeta\); indeed,

$$\begin{aligned} \varDelta&= \exp \left ( \int_{0}^{\cdot} \left (\theta' (m - \operatorname {div}c) + \nabla \cdot(c\theta) - a\right )(\zeta _{t}) \,\mathrm {d}t\right )\\ &\quad{}\times\mathcal{E}\left(\int_{0}^{\cdot} \left (\eta(\zeta _{t})'\,\mathrm {d}B_{t} + \theta(\zeta _{t})' \,\mathrm {d}\zeta_{t}\right )\right)_{\cdot}, \\ \chi^{x} &= \varDelta \left (x + \int_{0}^{\cdot}\frac{1}{\varDelta_{t}} f(\zeta _{t}) \,\mathrm {d}t\right ), \quad x \in(0, \infty). \end{aligned}$$
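As a small illustration, the closed-form expressions above can be evaluated along a discretized path of \(\zeta\). The sketch below treats the special case \(\theta\equiv0\), \(\eta\equiv0\), where \(\varDelta_{t} = \exp(-\int_{0}^{t} a(\zeta_{u})\,\mathrm{d}u)\), with hypothetical coefficients \(a\equiv1\), \(f(z)=z^{2}\) and a crude Euler path for \(\zeta\); it is a discretization sketch, not the authors' implementation.

```python
import numpy as np

def delta_chi_from_zeta(zeta, dt, a, f, x):
    """Delta and chi^x along a discretized path of zeta via Remark 3.5,
    in the special case theta = eta = 0."""
    # Delta_t = exp(-int_0^t a(zeta_u) du), left-endpoint rule
    A = np.concatenate(([0.0], np.cumsum(a(zeta[:-1]) * dt)))
    Delta = np.exp(-A)
    # chi^x = Delta * (x + int_0^. f(zeta_t) / Delta_t dt)
    I = np.concatenate(([0.0], np.cumsum(f(zeta[:-1]) / Delta[:-1] * dt)))
    return Delta, Delta * (x + I)

rng = np.random.default_rng(0)
dt, n = 0.01, 10_000
zeta = np.empty(n)
zeta[0] = 0.0
for k in range(n - 1):  # Euler path of an illustrative OU factor
    zeta[k + 1] = zeta[k] - zeta[k] * dt + np.sqrt(dt) * rng.standard_normal()

Delta, chi = delta_chi_from_zeta(
    zeta, dt, a=lambda z: np.ones_like(z), f=lambda z: z * z, x=1.0)
```

One can check that the discretized \(\chi^{x}\) satisfies the linear equation (3.11) to first order in the mesh size.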

Theorem 3.4 provides a way to estimate the joint distribution of \((Z_{0},X_{0})\) efficiently through Monte Carlo simulation: one need only obtain a single path of the reversed process \((\zeta ,\chi^{x})\) to recover the distribution \(\pi\). However, the applicability of the result depends heavily on whether the density \(p\) of \(Z_{0}\) is known, as \(p\) (together with its gradient) appears in the dynamics of \(\zeta\). In the case where \(Z\) is one-dimensional, or more generally reversing, \(p\) can be expressed in closed form in terms of the model coefficients \(m\) and \(c\) in the dynamics of \(Z\). Furthermore, there are certain cases of non-reversing, multidimensional diffusions where \(p\) can be (semi-)explicitly computed, as the next example shows.
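In the one-dimensional case, the classical closed form is \(p(z) \propto \exp(\int^{z} (2m/c)(u)\,\mathrm{d}u)/c(z)\) with \(c = \sigma^{2}\). The following numerical sketch evaluates this formula for the Ornstein–Uhlenbeck factor \(m(z) = -\gamma z\), \(c\equiv1\), where \(p\) should reduce to the \(N(0, 1/(2\gamma))\) density; the grid and quadrature rule are illustrative choices.

```python
import numpy as np

gamma = 2.0
z = np.linspace(-4.0, 4.0, 80_001)
dz = z[1] - z[0]

# antiderivative of 2 m/c = -2 gamma z by a left-endpoint rule
exponent = np.concatenate(([0.0], np.cumsum(-2.0 * gamma * z[:-1] * dz)))
p_hat = np.exp(exponent)       # times 1/c(z), with c = 1 here
p_hat /= p_hat.sum() * dz      # normalize on the (truncated) grid

# target: the N(0, 1/(2 gamma)) density
p_true = np.sqrt(gamma / np.pi) * np.exp(-gamma * z**2)
```

On this grid, `p_hat` agrees with `p_true` up to the quadrature and truncation error.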

Example 3.6

Assume that \(Z\) is a multidimensional Ornstein–Uhlenbeck process with dynamics

$$ \mathrm {d}Z_{t} = -\gamma(Z_{t}-\varTheta)\,\mathrm {d}t + \sigma\,\mathrm {d}W_{t}, \quad t \in \mathbb {R}_{+}, $$

where \(\gamma\in \mathbb {R}^{d\times d}\), \(\varTheta\in \mathbb {R}^{d}\) and \(\sigma\in \mathbb {R}^{d\times d}\). Here, \(E = \mathbb {R}^{d}\) and (A1) clearly holds. Furthermore, (A2) is satisfied when \(c = \sigma\sigma'\) is (strictly) positive definite; in fact, we take \(\sigma\) as the unique positive definite square root of \(c\). The process \(Z\) need not be reversing, as can clearly be seen when \(\sigma\) is the identity matrix, \(\varTheta= 0\) and \(\gamma\) is not symmetric. However, as will be argued below, the ergodicity assumption (A3) holds when all eigenvalues of \(\gamma\) have strictly positive real parts, and one may identify the invariant density “almost” explicitly. To see this, a direct calculation shows that if a symmetric matrix \(J\) satisfies the Riccati equation

$$ JJ = \sigma\gamma'\sigma^{-1} J + J\sigma^{-1}\gamma\sigma, $$
(3.14)

then the function

$$ p(z) = \exp \left (-\frac{1}{2}(z-\varTheta)'\sigma^{-1} J \sigma ^{-1}(z-\varTheta)\right ), \quad z \in \mathbb {R}^{d}, $$

satisfies \(\tilde{L}^{Z} p = 0\) where \(\tilde{L}^{Z}\) is as in (2.1). If \(J\) is additionally positive definite, then up to a normalizing constant, \(p\) is the density for a normal random variable with mean \(\varTheta\) and covariance matrix \(\varSigma= \sigma J^{-1}\sigma \). Thus \(p\) is integrable on \(\mathbb {R}^{d}\) and (A3) follows from [27, Corollary 4.9.4], which proves recurrence for \(Z\).

It thus remains to construct a symmetric, positive definite solution to (3.14). From [1, Lemma 2.4.1, Theorem 2.4.25], such a solution (called the “stabilizing solution” therein) exists if (a) the pair \((\sigma^{-1}\gamma\sigma, 1_{d})\) is stabilizable, in the sense that there exists a matrix \(F\) such that \(\sigma^{-1}\gamma\sigma- F\) has eigenvalues with strictly negative real parts, and (b) the eigenvalues of \(\sigma ^{-1}\gamma\sigma\) have strictly positive real parts. In the present case, each of these statements readily follows. Indeed, for the first statement, one can take \(F = \sigma^{-1}\gamma\sigma+ 1_{d}\); for the second statement, note that the eigenvalues of \(\sigma ^{-1}\gamma\sigma\) coincide with those of \(\gamma\), which by assumption have strictly positive real parts. Therefore, even in this non-reversing case, one may still identify \(p\).
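A quick numerical sanity check of this construction is possible, using the standard fact that the invariant covariance \(\varSigma\) of the Ornstein–Uhlenbeck process solves the Lyapunov equation \(\gamma\varSigma + \varSigma\gamma' = c\). The sketch below uses an illustrative non-symmetric \(\gamma\) with \(\sigma = 1_{d}\) (so \(\sigma^{-1}\gamma\sigma = \gamma\)), solves the Lyapunov equation by Kronecker vectorization, and verifies that \(J = \sigma\varSigma^{-1}\sigma\) is a symmetric, positive definite solution of (3.14).

```python
import numpy as np

gamma = np.array([[2.0, 1.0],
                  [0.0, 3.0]])   # non-symmetric, eigenvalues 2 and 3
d = gamma.shape[0]
c = np.eye(d)                    # c = sigma sigma' with sigma = 1_d
I = np.eye(d)

# Lyapunov equation gamma Sigma + Sigma gamma' = c, vectorized via the
# column-stacking identity vec(A X B) = (B' kron A) vec(X)
K = np.kron(I, gamma) + np.kron(gamma, I)
Sigma = np.linalg.solve(K, c.flatten(order="F")).reshape(d, d, order="F")

# With sigma = 1_d, J = Sigma^{-1} should solve (3.14): J J = gamma' J + J gamma
J = np.linalg.inv(Sigma)
```

Indeed, multiplying the Lyapunov equation by \(J = \varSigma^{-1}\) on both sides recovers exactly the Riccati equation (3.14) in this case.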

The interesting Example 3.6 notwithstanding, for non-reversing, multidimensional diffusions, even after verifying the ergodicity of \(Z\) (and hence the existence of \(p\)), one typically does not know \(p\) explicitly. In such cases, we propose the following simulation method. Fix a sufficiently large \(T\) and first simulate \((Z_{t})_{t \in [0, 2T]}\) via (1.2), starting from an arbitrary point \(Z_{0}\) (since the invariant density is unknown). For \(T\) large enough, the process \((Z_{t})_{t \in[T,2T]}\) behaves like the stationary version in (1.2), since \(Z_{T}\) has approximately the density \(p\). Defining \((\zeta _{t})_{t\in[0, T]}\) via \(\zeta _{t} = Z_{2 T - t}\) for \(t \in[0, T]\), the process \(\zeta \) then approximately follows the dynamics (8.6), with \(\zeta _{0}\) having (approximately) the density \(p\). Given \(\zeta \), the process \(\chi^{x}\) may be defined via the formulas of Remark 3.5; therefore, for large enough \(T\), the empirical measure \(\widehat {\pi }^{x}_{T}\) should approximate the joint law \(\pi\) in the weak sense.
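A minimal sketch of this recipe, for a hypothetical one-dimensional factor \(\mathrm{d}Z = -Z\,\mathrm{d}t + \mathrm{d}W\) with constant short rate \(a\), cash flow \(f(z) = z^{2}\) and \(\theta = \eta \equiv 0\) (so that Remark 3.5 gives \(\varDelta_{t} = e^{-at}\)): simulate \(Z\) on \([0,2T]\) from an arbitrary starting point, reverse the second half, and build \(\chi^{x}\) along the reversed path. Horizon and mesh are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
dt, T = 0.01, 200.0
n = int(2 * T / dt)            # simulate Z on [0, 2T] ...
Z = np.empty(n + 1)
Z[0] = 5.0                     # ... from an arbitrary, non-stationary start
for k in range(n):
    Z[k + 1] = Z[k] - Z[k] * dt + np.sqrt(dt) * rng.standard_normal()

zeta = Z[::-1][: n // 2 + 1]   # zeta_t = Z_{2T - t} for t in [0, T]

# chi^x via Remark 3.5 with theta = eta = 0, constant a: Delta_t = e^{-a t}
a, x = 1.0, 1.0
t = dt * np.arange(len(zeta))
Delta = np.exp(-a * t)
I = np.concatenate(([0.0], np.cumsum(zeta[:-1] ** 2 / Delta[:-1] * dt)))
chi = Delta * (x + I)

# time-average along the single reversed path ~ mean of the second
# marginal of pi; for this toy model E[X_0] = E[Z^2]/a = 1/2
est_mean_chi = chi.mean()
```

Only the segment \((Z_{t})_{t\in[T,2T]}\) is used for \(\zeta\), so the transient from the arbitrary starting point \(Z_{0}=5\) is discarded automatically.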

Note finally that when \(p\) is known and \(|\eta| > 0\), and under certain mixing conditions (see [29, 28]), one can also obtain uniform estimates for the speed at which the above convergence takes place.

Remark 3.7

In the case where \(\theta= \eta\equiv0\) and \(f\in C^{1,\gamma }(E;\mathbb {R}_{+})\), one can explicitly identify the support of \(\pi\). Such an identification follows from more general ergodic results on “stochastic differential systems” obtained in [4, Sects. IIIA, IIIB]. To identify the support, note that when \(\theta=\eta \equiv0\), it follows that \(\varDelta_{t} = \exp(-\int_{0}^{t} a(\zeta _{u})\,\mathrm {d}u)\). A direct calculation using Remark 3.5 shows that \(\chi^{x}\) has dynamics

$$ \mathrm {d}\chi^{x}_{t} = \left (f(\zeta _{t})-\chi^{x}_{t} a(\zeta _{t})\right ) \,\mathrm {d}t. $$
(3.15)

Hence, the paths of \(\chi^{x}\) are of bounded variation. Now define

$$\begin{aligned} \hat{u} &:=\inf \left \{x : \sup_{z\in E}\big(f(z)-x a(z)\big)\leq0\right \},\\ \hat{\ell} &:=\sup \left \{x: \inf_{z\in E}\big(f(z)-xa(z)\big)\geq0\right \}. \end{aligned}$$

Assumption 2.7 implies \(a(z_{0})>0\) for some \(z_{0}\in E\), and thus \(0\leq\hat{\ell}\leq\hat{u}\leq\infty\), with \(\hat{\ell }=\hat{u}\) if and only if \(f(z) = c a(z)\) for all \(z\in E\) and some constant \(c\); in this case, \(X_{0}=c\) holds \(\mathbb {P}^{z}\)-almost surely for all \(z\in E\). With this notation, [4] proves the following result.

Proposition 3.8

([4, Sects. IIIA, IIIB])

Let Assumption  2.7 hold. Assume also that \(f\in C^{1,\gamma }(E;\mathbb {R}_{+})\) and \(\eta, \theta\equiv0\). Then the support of \(\pi\) is \(\bar{E}\times [\hat{\ell},\hat{u}]\) (respectively \(\bar{E}\times[\hat{\ell},\infty)\) if \(\hat{u}=\infty\)).
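As a concrete illustration of these support bounds: when \(a > 0\) everywhere on \(E\), the definitions reduce to \(\hat{\ell} = \inf_{z} f(z)/a(z)\) and \(\hat{u} = \sup_{z} f(z)/a(z)\). The sketch below evaluates them on a finite grid for hypothetical coefficients \(f(z)=1+z^{2}\), \(a(z)=2+z^{2}\) on \(E=\mathbb{R}\) (the grid truncation can only shrink the estimated interval).

```python
import numpy as np

z = np.linspace(-5.0, 5.0, 100_001)   # truncated grid for E = R
f = 1.0 + z**2                        # hypothetical cash flow rate
a = 2.0 + z**2                        # hypothetical discount rate, a > 0 on E

# with a > 0 on E:  l_hat = inf_z f/a,  u_hat = sup_z f/a,
# so the support of pi is (approximately) E-bar x [l_hat, u_hat]
ratio = f / a
l_hat, u_hat = ratio.min(), ratio.max()
```

Here the true \(\hat{u}\) equals \(1\) (approached as \(|z|\to\infty\)), so the grid value slightly underestimates it.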

4 A numerical example

We now provide an example which highlights the superiority (in terms of computational efficiency) of the time-reversal method over the naive method for obtaining the distribution of \(X_{0}\). Consider the case \(E=\mathbb{R}\) and

$$ \mathrm {d}Z_{t} = -\gamma Z_{t}\,\mathrm {d}t + \,\mathrm {d}W_{t},\qquad X_{0} = \int_{0}^{\infty}Z_{t} e^{-at}\,\mathrm {d}t,\qquad\gamma,a > 0. $$
(4.1)

Note that the function \(\mathbb {R}\ni z \mapsto f(z) = z\) fails to be nonnegative. However, as argued below, the results of Theorem 3.4 still hold. As \(Z\) is a mean-reverting Ornstein–Uhlenbeck process, it is straightforward to verify Assumption (A3) with \(p(z) = \sqrt{\gamma/\pi}\, e^{-\gamma z^{2}}\), so that \(Z_{0}\sim N(0,1/(2\gamma ))\). We claim that \((Z_{0},X_{0})\) is normally distributed with mean vector \((0,0)\) and covariance matrix

$$ \varSigma= \begin{pmatrix} \frac{1}{2\gamma} & \frac{1}{2\gamma(a+\gamma)} \\ \frac{1}{2\gamma(a+\gamma)} & \frac{1}{2\gamma a(a+\gamma)} \end{pmatrix}. $$

Indeed, integration by parts shows that for \(T>0\),

$$ \int_{0}^{T} e^{-at}Z_{t} \,\mathrm {d}t = \frac{Z_{0}}{a+\gamma} + \frac{1}{a+\gamma }\int_{0}^{T} e^{-at} \,\mathrm {d}W_{t} - \frac{1}{a+\gamma}e^{-aT}{Z_{T}}. $$

The ergodicity of \(Z\) implies that almost surely, \(\lim_{T \to\infty} (Z_{T}/T) = -\gamma\lim_{T\to\infty}\frac{1}{T}\int_{0}^{T} Z_{t}\,\mathrm {d}t = -\gamma\int_{\mathbb{R}}zp(z) \,\mathrm {d}z = 0\), where the first equality follows upon dividing the integrated dynamics \(Z_{T} = Z_{0} - \gamma\int_{0}^{T} Z_{t}\,\mathrm {d}t + W_{T}\) by \(T\) and using \(\lim_{T\to\infty}(W_{T}/T) = 0\); therefore, \(\lim_{T \to\infty} e^{-aT}Z_{T} = 0\) holds almost surely. Next, note that \(Y_{T}:=\int_{0}^{T} e^{-at} \,\mathrm {d}W_{t}\) is independent of \(Z_{0}\) and normally distributed with mean 0 and variance \((1-e^{-2aT})/(2a)\). Finally, as a process, \(Y = (Y_{T})_{T\geq0}\) is an \(L^{2}\)-bounded martingale and hence \(Y_{\infty} :=\lim_{T \to\infty} Y_{T}\) almost surely exists, where \(Y_{\infty}\) is independent of \(Z_{0}\) and normally distributed with mean 0 and variance \(1/(2a)\). Thus \(X_{0} = \lim_{T\rightarrow\infty} \int_{0}^{T} e^{-at}Z_{t} \,\mathrm {d}t\) exists almost surely and

$$ X_{0} = \frac{Z_{0}}{a+\gamma} + \frac{Y_{\infty}}{a+\gamma},\quad Z_{0}\perp \!\!\!\perp Y_{\infty},\quad Z_{0}\sim N\left(0,\frac{1}{2\gamma}\right), \quad Y_{\infty}\sim N\left(0,\frac{1}{2a}\right), $$

from which the joint distribution follows. Now, even though \(f(z)=z\) can take negative values, the time-reversal dynamics in (3.15) still hold, taking the form

$$ \mathrm {d}\zeta_{t} = -\gamma\zeta_{t} \,\mathrm {d}t + \mathrm {d}W_{t},\qquad \mathrm {d}\chi_{t} = \left (\zeta_{t} - a\chi_{t}\right) \,\mathrm {d}t. $$

Lastly, even though Theorem 3.4 no longer directly applies, it is shown in [4, Theorem 3.3, Sect. 3.D, Proposition 3.15] that \((\zeta,\chi)\) is still ergodic,Footnote 5 in that (3.13) holds.
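To make this concrete, the following sketch runs a single Euler path of the reversed pair for (4.1): by (3.15) with \(f(z)=z\) and constant \(a\), the pair evolves as \(\mathrm{d}\zeta = -\gamma\zeta\,\mathrm{d}t + \mathrm{d}W\), \(\mathrm{d}\chi = (\zeta - a\chi)\,\mathrm{d}t\). Time-averages along the path estimate the marginal law of \(X_{0}\), here \(N(0, 1/(2a\gamma(a+\gamma)))\); the horizon and mesh below are illustrative and much smaller than in the experiments of this section.

```python
import numpy as np

rng = np.random.default_rng(2)
gamma, a = 2.0, 1.0
dt, T = 0.005, 500.0
n = int(T / dt)
zeta = rng.normal(0.0, np.sqrt(1.0 / (2.0 * gamma)))  # zeta_0 ~ p
chi = 1.0                                             # chi_0 = x = 1
chis = np.empty(n)
for k in range(n):
    chis[k] = chi
    dW = np.sqrt(dt) * rng.standard_normal()
    # Euler step for d zeta = -gamma zeta dt + dW, d chi = (zeta - a chi) dt
    zeta, chi = zeta - gamma * zeta * dt + dW, chi + (zeta - a * chi) * dt

mean_hat, var_hat = chis.mean(), chis.var()
true_var = 1.0 / (2.0 * a * gamma * (a + gamma))      # = 1/12
```

With \(\gamma=2\) and \(a=1\), the empirical mean and variance of \(\chi\) along the single path should approximate \(0\) and \(1/12\), respectively.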

For these dynamics, we performed the following test. For a fixed terminal time \(T\) and mesh size \(\delta\), we estimated the distribution of \(X_{0}\) in two ways: first (“Method A”), by sampling \(\zeta_{0}\sim p\) and setting \(\chi_{0} = 1\), and second (“Method B”), by running the forward process \(Z\) until \(2T\) and then setting \(\zeta_{t} = Z_{2T-t}\) and \(\chi_{0}=1\). For each simulation, we computed the empirical distribution along a single path and then estimated the Kolmogorov–Smirnov distance (\(d_{KS}(F,G) = \sup_{x}|F(x)-G(x)|\) for distribution functions \(F,G\)) between the empirical and the true distribution of \(X_{0}\). The parameter values were \(\gamma= 2\), \(a=1\), \(T=\text{10'000}\) and \(\delta= 1/24\).

Figure 1 shows the resulting Kolmogorov–Smirnov distances for 500 sample paths. The plot gives a (smoothed) histogram comparing the distances using the two methods described above. As can be seen, the two methods give comparable results; this is not surprising given the rapid convergence of the distribution of \(\zeta \) to its invariant distribution [8]. Table 1 provides summary statistics regarding the median distances and simulation times, as well as the standard deviation and tail data.

Fig. 1

Kolmogorov–Smirnov distances between the empirical and the true distribution for \(X_{0}\). The solid line is for the time-reversal method starting with \(\zeta _{0}\sim p\), and the dashed line for the time-reversal method running \(Z\) until \(2T\) and setting \(\zeta _{t} = Z_{2T-t}\). Here, \(T=\text{10'000}\), \(\delta= 1/24\), \(\gamma= 2\) and \(a=1\). Computations were performed using Mathematica and the code can be found on the second author’s website www.math.cmu.edu/users/scottrob/research

Table 1 Statistics on Kolmogorov–Smirnov distances between the empirical and the true distribution for \(X_{0}\) using Methods A and B

Having obtained Kolmogorov–Smirnov distances using the time-reversal methods, we next compared our results to a naive simulation of \(X_{0}\), obtained by sampling \(Z_{0}\sim p\) and computing \(X_{0}\) directly via (4.1). Here, for the median distance \(d\) of Method A from Table 1, we repeatedly sampled \(X_{0}\), stopping at the first instance \(N\) at which the Kolmogorov–Smirnov distance between the empirical and the true distribution of \(X_{0}\) fell below \(d\). As can be seen from Fig. 2 and the summary statistics in Table 2, the naive simulation performs significantly worse; at the median, it took 7'002 paths and a simulation time of 8.66 minutes to achieve the same level of accuracy as a single path (2.94 seconds) of the time-reversed process. Furthermore, the histogram shows a sizeable number of trials in which far more than the median number of paths was needed to achieve the given accuracy.

Fig. 2

Histogram of the number \(N\) of paths needed so that, using the naive simulation for \(X_{0}\), the Kolmogorov–Smirnov distance between the empirical and the true distribution of \(X_{0}\) fell below the median distance \(d\) of Method A from Table 1. The integral was computed using \(T=100\) with mesh size \(\delta=1/24\); furthermore, the values \(\gamma=2\) and \(a=1\) were used. Computations were performed using Mathematica and the code can be found on the second author’s website www.math.cmu.edu/users/scottrob/research

Table 2 Summary statistics using the naive forward simulation method

5 Conclusion

In this work, an efficient way of simulating the joint distribution of \((Z_{0},X_{0})\) for perpetuities of the form (1.1) has been obtained via the method of time reversal. The joint distribution may be recovered from a single path of the reversed process, as opposed to sampling numerous paths of \(X_{0}\) using the naive method. However, the effectiveness of the proposed method depends on being able to obtain analytic representations for the density \(p\) of \(Z_{0}\), an undertaking that, though always possible in the one-dimensional case, is often not possible for non-reversing multidimensional diffusions. Furthermore, results are presented for perpetuities with nonnegative underlying cash flow rates. More research is needed to identify an effective time-reversal method for perpetuities of the form

$$ X_{0} = \int_{0}^{\infty}D_{t} \,\mathrm {d}F_{t} $$

for general Markovian processes \(F\) (i.e., not just \(\mathrm {d}F_{t} = f(Z_{t})\,\mathrm {d}t\)) containing both jumps and diffusive terms. Additionally, the performance of the method in which \(Z\) is run until a large time \(2T\) and \(\zeta _{t} = Z_{2T-t}\) is then set must be analyzed thoroughly; in particular, how fast does the distribution of \(Z_{2T}\) converge to \(p\) for a fixed starting point? To answer these questions, one must first analyze the resulting backward dynamics and associated PDEs for the invariant density, obtaining rates of convergence.

6 Proofs for Sect. 2.2

We present here the proofs of Lemmas 2.3 and 2.5.

Proof of Lemma 2.3

Let \(\varepsilon > 0\) be as in (2.3). We first treat the case \(\theta'c\theta+ \eta'\eta\equiv0\). Then \(R = \int_{0}^{\cdot}a(Z_{t})\,\mathrm {d}t\), and (2.3) specializes to \(a^{-} \in \mathbb {L}^{1}(E,p)\) and \(\int_{E} a(z) p(z) \,\mathrm {d}z > 0\). Set \(\kappa :=(1/4) \int_{E} a(z) p(z) \,\mathrm {d}z > 0\). Fix \(z\in E\) and denote by \(\mathbb {P}^{z}\) the probability obtained by conditioning upon \(Z_{0} = z\). The positive recurrence of \(Z\) implies ([27, Theorem 4.9.5]) that there exists a \(\mathbb {P}^{z}\)-a.s. finite random variable \(T(z)\) such that \(t \geq T(z)\) implies \(R_{t} = \int_{0}^{t} a(Z_{u})\,\mathrm {d}u \geq2 \kappa t \); hence the first conclusion of Lemma 2.3 holds. Furthermore, since \(Z\) is stationary and ergodic under ℙ, the ergodic theorem implies there is a ℙ-a.s. finite random variable \(T\) such that \(t\geq T\) implies \(R_{t} \geq 2\kappa t\). Now, let \(n\in\mathbb{N}\) be such that \(n > 1/(2\kappa)\). We have

$$ \sup_{t\geq0} \, (t/n - R_{t}) \leq\sup_{t\leq T} \,(t/n-R_{t}) < \infty, $$

where the last inequality follows from the regularity of \(a\) and the non-explosiveness of \(Z\). Thus

$$ X_{0} = \int_{0}^{\infty}e^{-R_{t}}f(Z_{t})\,\mathrm {d}t \leq e^{\sup_{t\leq T}(t/n-R_{t})}\int_{0}^{\infty}e^{-t/n}f(Z_{t})\,\mathrm {d}t. $$

By the stationarity of \(Z\),

$$ \mathbb {E}\left [\int_{0}^{\infty}e^{-t/n}f(Z_{t})\,\mathrm {d}t\right ] = \int_{0}^{\infty}e^{-t/n}\mathbb {E}[f(Z_{t})]\,\mathrm {d}t = n \int_{E} f(z)p(z)\,\mathrm {d}z < \infty, $$

hence \(\mathbb {P}[\int_{0}^{\infty}e^{-t/n}f(Z_{t})\,\mathrm {d}t < \infty] = 1\), which in turn implies that \(\mathbb {P}[X_{0} < \infty] = 1\).

Assume now that \(\theta'c\theta+ \eta'\eta\not\equiv0\), which by continuity of all involved functions implies that \(\int_{E} (\theta 'c\theta+ \eta'\eta) (z) p(z) \,\mathrm {d}z > 0\). Fix \(z\in E\). Positive recurrence of \(Z\) gives that \(\lim_{t\to\infty} \int_{0}^{t}(\theta'c\theta +\eta'\eta)(Z_{u})\,\mathrm {d}u = \infty\) with \(\mathbb {P}^{z}\)-probability one. On the event \(\{ \int_{0}^{t}(\theta'c\theta+ \eta'\eta)(Z_{u})\,\mathrm {d}u > 0 \}\), note that

$$\begin{aligned} -R_{t} =& -\int_{0}^{t} a(Z_{u})\,\mathrm {d}u \\ &{}+ \int_{0}^{t}(\theta'c\theta+ \eta'\eta)(Z_{u})\,\mathrm {d}u\bigg(-\frac{1}{2} -\frac{\int_{0}^{t}\big(\theta(Z_{u})'\sigma (Z_{u})\,\mathrm {d}W_{u} + \eta(Z_{u})'\,\mathrm {d}B_{u}\big)}{\int_{0}^{t}(\theta'c\theta+\eta'\eta)(Z_{u})\,\mathrm {d}u}\bigg). \end{aligned}$$

By the Dambis/Dubins/Schwarz theorem and the strong law of large numbers for Brownian motion, it follows that there exists a \(\mathbb {P}^{z}\)-a.s. finite random variable \(T(z)\) such that

$$ t \geq T(z) \quad\Longrightarrow\quad-\frac{\int_{0}^{t}\big(\theta(Z_{u})'\sigma (Z_{u})\,\mathrm {d}W_{u} + \eta(Z_{u})'\,\mathrm {d}B_{u}\big)}{\int_{0}^{t}(\theta'c\theta+\eta'\eta)(Z_{u})\,\mathrm {d}u} \leq \frac{\varepsilon }{2}; $$

therefore,

$$ t \geq T(z) \quad\Longrightarrow\quad -R_{t} \leq-\int_{0}^{t}\left(a + \frac{1-\varepsilon }{2}(\theta'c\theta+ \eta'\eta)\right)(Z_{u})\,\mathrm {d}u. $$

With \(\kappa :=(1/4)\int_{E} ( a + (1-\varepsilon )(\theta'c\theta+\eta'\eta) /2)(z) p(z) \,\mathrm {d}z > 0\), and increasing \(T(z)\) if necessary (still keeping it \(\mathbb {P}^{z}\)-a.s. finite), it follows that \(t\geq T(z)\) implies \(-R_{t} \leq - 2 \kappa t\). Hence the first part of Lemma 2.3 holds true again. Additionally, the ergodic theorem applied with ℙ gives a ℙ-a.s. finite random variable \(T\) such that \(t\geq T\) implies \(-R_{t} \leq -2\kappa t\). Again, for \(n\in\mathbb{N}\) such that \(n > 1/(2\kappa)\), we have

$$ X_{0} = \int_{0}^{\infty}e^{-R_{t}}f(Z_{t})\,\mathrm {d}t \leq e^{\sup_{t\leq T}(t/n-R_{t})}\int_{0}^{\infty}e^{-t/n}f(Z_{t})\,\mathrm {d}t, $$

from which \(\mathbb {P}[X_{0} < \infty] = 1\) follows by the same line of reasoning as above. □

Proof of Lemma 2.5

The proof is nearly identical to that of Lemma 2.3. Namely, in each of the cases \(\theta'c\theta+ \eta'\eta\equiv0\) and \(\theta'c\theta+ \eta'\eta\not\equiv0\), under the given hypotheses there exist a constant \(\kappa\geq0\) and a ℙ-a.s. finite random variable \(T\) such that \(-R_{t} \geq\kappa t\) holds for \(t\geq T\). This gives that

$$ X_{0} \geq\int_{T}^{\infty}e^{\kappa t}f(Z_{t})\,\mathrm {d}t \geq\int_{T}^{\infty}e^{\kappa t}(f\wedge N)(Z_{t})\,\mathrm {d}t, $$

where \(N\) is large enough so that \(\int_{E} (f(z) \wedge N) p(z) \,\mathrm {d}z > 0\). We thus have

$$ X_{0} \geq\int_{0}^{\infty}e^{\kappa t}(f\wedge N)(Z_{t})\,\mathrm {d}t - \frac {N}{\kappa}(e^{\kappa T}-1). $$

Ergodicity of \(Z\) implies that ℙ-almost surely,

$$ \lim_{u\to\infty}\frac{1}{u}\int_{0}^{u}(f\wedge N)(Z_{t})\,\mathrm {d}t = \int_{E} \big(f(z)\wedge N\big) p(z) \,\mathrm {d}z > 0, $$

so that \(\lim_{u\to\infty} \int_{0}^{u}e^{\kappa t} (f\wedge N)(Z_{t})\,\mathrm {d}t = \infty\), proving the result. □

7 Proof of Theorem 3.1

Under the given assumptions, there exists a unique solution \((\mathbb {P}^{z,x})_{(z,x)\in F}\) to the generalized martingale problem for \(L\) on \(F\), where \(L\) is from (3.4). Here, the measure space is \((\widetilde {\varOmega },\widetilde {\mathbf {F}})\), where \(\widetilde {\varOmega }= C([0,\infty); \hat{F})\), with \(\hat{F}\) being the one-point compactification of \(F\). The filtration \(\widetilde {\mathbf {F}}\) is the right-continuous enlargement of the filtration generated by the coordinate process \((\widetilde{Z}, \widetilde{Y})\) on \(\widetilde {\varOmega }\).

Let \((F_{n})_{n \in \mathbb {N}}\) be an increasing sequence of smooth, bounded, open, connected domains of \(F\) such that \(F = \bigcup_{n} F_{n}\). Note that \(F_{n}\) can be obtained by smoothing out the boundary of \(E_{n}\times (1/n,n)\). By uniqueness of solutions to the generalized martingale problem, for each \(n\), the law of \((\widetilde{Z},\widetilde{Y})\) is the same as the law of \((Z,Y^{x})\) under \(\mathbb {P}[\cdot \mid Z_{0} =z]\) (where the latter will always denote a version of the conditional probability) up until the first exit time of \(F_{n}\). Furthermore, since the process \(Z\) is recurrent, with \((\mathbb {P}^{z})_{z\in E}\) being the restriction of \((\mathbb {P}^{z,x})_{(z,x)\in F}\) to the first \(d\) coordinates, for \(z \in E\), the law of \(\widetilde{Z}\) under \(\mathbb {P}^{z}\) is the same as the law of \(Z\) under \(\mathbb {P}[\, \cdot \mid Z_{0} = z]\). For these reasons, and in order to ease the reading, we abuse notation and still use \((Z, Y)\) instead of \((\widetilde{Z}, \widetilde{Y})\) for the coordinate process on \(\widetilde {\varOmega }\). The underlying space we are working on will be clear from the context.

Denote by \(\tau_{n}\) the first exit time of \((Z,Y)\) from \(F_{n}\). Assumption 2.7 implies that \(Z\) does not explode under \(\mathbb {P}^{z,x}\), and \(Y\) cannot explode to infinity since \(D\) is strictly positive almost surely under \(\mathbb {P}[\, \cdot \mid Z_{0} = z]\) for all \(z \in E\). Therefore, the explosion time \(\tau :=\lim_{n\to\infty} \tau _{n}\) for \((Z,Y)\) is the first hitting time of \(Y\) to 0, and the law of \(\tau\) under \(\mathbb {P}^{z,x}\) is the same as the law of the first hitting of \(Y^{x}\) to 0 under \(\mathbb {P}[\, \cdot \mid Z_{0} = z]\).

Note that \(Y^{x}_{t} = D_{t}^{-1}(x-X_{0} + \int_{t}^{\infty}D_{u} f(Z_{u})\,\mathrm {d}u)\). Assumption 2.7 impliesFootnote 6

$$ \mathbb {P}\bigg[\int_{t}^{\infty}D_{u}f(Z_{u})\,\mathrm {d}u > 0 \ \bigg| \ Z_{0} = z\bigg] = 1, \qquad z\in E, t\geq 0. $$
(7.1)

Therefore,

$$ g(z,x) = \mathbb {P}\left [X_{0}\leq x \mid Z_{0} = z\right ] = \mathbb {P}^{z, x} [Y^{x}_{t} > 0, \ \forall t\geq0] = \mathbb {P}^{z,x}\left [\tau= \infty\right ]. $$

Define

$$ h(z,x) :=\mathbb {P}^{z,x}\left [\lim_{t\to\infty} Y_{t} = \infty\right ], \quad(z,x) \in F. $$
(7.2)

Fix \((z,x)\in F\) and let \(0 < \varepsilon < x\). Note that \(Y^{x}_{t} = Y^{x-\varepsilon }_{t} + \varepsilon / D_{t}\). Since \(\lim_{t \to\infty} D_{t} = 0\) holds \(\mathbb {P}[\cdot \mid Z_{0} = z]\)-a.s., it follows that

$$\begin{aligned} \mathbb {P}^{z,x-\varepsilon }\left [\tau= \infty\right ] &=\mathbb {P}[Y^{x-\varepsilon }_{t} > 0,\ \forall t\geq0 \mid Z_{0} = z] \\ &\leq \mathbb {P}[Y^{x}_{t} \geq \varepsilon / D_{t}, \ \forall t\geq0 \mid Z_{0} = z] \\ &\leq \mathbb {P}\Big[\lim_{t \to\infty}Y^{x}_{t} = \infty\ \Big| \ Z_{0} = z\Big] \\ &= \mathbb {P}^{z,x}\left [\lim_{t\to\infty}Y_{t} = \infty\right ] \leq \mathbb {P}^{z,x}\left [\tau= \infty\right ]. \end{aligned}$$
(7.3)

Therefore, \(g(z,x-\varepsilon )\leq h(z,x) \leq g(z,x)\). By definition, \(g(z,x)\) is right-continuous in \(x\) for a fixed \(z\), and so

$$ g(z,x) \leq\liminf_{\varepsilon \to0} h(z,x+\varepsilon ) \leq \limsup_{\varepsilon \to0}h(z,x+\varepsilon ) \leq\limsup_{\varepsilon \to0} g(z,x+\varepsilon ) = g(z,x). $$

Therefore, if \(h(z,x)\) is continuous, it follows that \(h(z,x) = g(z,x)\). We now show that in fact \(h\) is in \(C^{2,\gamma}(F)\) and satisfies \(Lh = 0\). This gives the desired result for \(g\) since \(g = h\).

Let \(\psi:(0,\infty)\to(0,1)\) be a smooth function satisfying \(\lim _{x\to0}\psi(x) = 0\) and \(\lim_{x\to\infty}\psi(x) = 1\). By the classical Feynman–Kac formula,

$$ u^{n}(z,x):=\mathbb {E}^{\mathbb {P}^{z,x}}[\psi(Y_{\tau_{n}})] $$

satisfies \(Lu^{n} = 0\) in \(F_{n}\) with \(u^{n}(z,x) = \psi(x)\) on \(\partial F_{n}\). As \(\mathbb {P}[X_{0} < \infty \mid Z_{0} = z] = 1\), there exists a pair \((z_{0},x_{0}) \in F\) with \(\mathbb {P}[X_{0} < x_{0} \mid Z_{0} = z_{0}] > 0\). Using (7.3), this gives

$$ h(z_{0},x_{0}) \geq \mathbb {P}\left [X_{0} < x_{0}\mid Z_{0}=z_{0}\right ] > 0. $$
(7.4)

Therefore, \((\mathbb {P}^{z,x})_{(z,x)\in F}\) is transient [27, Chap. 2], and since \((\mathbb {P}^{z})_{z\in E}\) is positive recurrent, this implies that for all \((z,x)\), with \(\mathbb {P}^{z,x}\)-probability one, either \(\lim_{t\to\tau}Y_{t} = 0\) or \(\lim_{t\to\tau} Y_{t} = \infty\), where in the latter case, \(\tau= \infty\) since \(Y\) cannot explode to \(\infty\). This in turn yields that \(Y_{\tau_{n}}\rightarrow0\) or \(Y_{\tau_{n}}\rightarrow\infty\) with \(\mathbb {P}^{z,x}\)-probability one and hence by the dominated convergence theorem,

$$ \lim_{n\to\infty} u^{n}(z,x) = \mathbb {P}^{z,x}\left [\lim_{t\to\tau} Y_{t} = \infty\right ] = \mathbb {P}^{z,x}\left [\lim_{t\to\infty}Y_{t} =\infty\right ] = h(z,x). $$
(7.5)

For \((z_{0},x_{0})\) from (7.4), \(g(z_{0},x_{0})\geq h(z_{0},x_{0}) > 0\) and hence \(g(z,x) > 0\) for all \((z,x)\in F\) [27, Theorem 1.15.1]. But this implies \(h(z,x) \geq g(z,x/2) > 0\), and so from (7.5), the \(u^{n}\) are converging pointwise to a strictly positive function. Thus, by the interior Schauder estimates and Harnack’s inequality, it follows by “the standard compactness argument” (see [27, p. 147]) that there exists a strictly positive function \(u\) in \(C^{2,\gamma}(F)\) such that \(u^{n}\) converges to \(u\) in the \(C^{2,\gamma}(D)\)-Hölder space for all compact \(D\subset F\). Clearly, this function \(u\) satisfies \(Lu = 0\) in \(F\). In fact, since \(u^{n}\) converges to \(h\) pointwise, \(h=u\) and hence \(Lh = 0\).

We now consider the boundary conditions for \(g\). Let the integer \(k\) be given. It suffices to show that for each \(\varepsilon > 0\), there is some \(n(\varepsilon )\) such that

$$ \sup_{x\leq n(\varepsilon )^{-1},z\in E_{k}}g(z,x)\leq \varepsilon ,\qquad\inf_{x\geq n(\varepsilon ),z\in E_{k}}g(z,x)\geq1-\varepsilon . $$

The condition near \(x = 0\) is handled first. By way of contradiction, assume there exists some \(\varepsilon > 0\) such that for all integers \(n\), there exist \(z_{n}\in E_{k}\), \(x_{n} \leq1/n\) such that \(g(z_{n},x_{n}) > \varepsilon \). Since the \(z_{n}\) are all contained within \(E_{k}\), there is a subsequence (still labeled \(n\)) such that \(z_{n}\rightarrow z\) for \(z\in\bar{E}_{k}\). Let \(\delta> 0\) and choose \(N_{\delta}\) such that \(n\geq N_{\delta}\) implies \(n^{-1}\leq\delta\). Since \(g\) is increasing in \(x\), \(\varepsilon < g(z_{n},\delta)\). Since \(g\) is continuous, \(\varepsilon \leq g(z,\delta)\). Since this is true for all \(\delta> 0\), \(\lim_{x\to0} g(z,x)\geq \varepsilon \). But this is a contradiction because \(\lim_{x\to0}g(z,x) = 0\) for each \(z\in E\). To see this, let \(\delta> 0\) and choose \(\beta> 0\) such that \(\mathbb {P}[X_{0} \geq\beta \mid Z_{0} = z] \geq1-\delta\). This is possible in view of (7.1). Thus, for \(x < \beta\), \(g(z,x) \leq \mathbb {P}[X_{0} < \beta \mid Z_{0} = z] \leq\delta\), and hence \(\limsup_{x\to 0} g(z,x) \leq\delta\). Taking \(\delta\to0\) gives the result.

The proof for \(x\to\infty\) is very similar. Assume by contradiction that there is some \(\varepsilon > 0\) such that for all integers \(n\), there exist \(z_{n}\in E_{k}\), \(x_{n} \geq n\) such that \(g(z_{n},x_{n}) < 1-\varepsilon \). Again, by taking subsequences, we can assume \(z_{n}\rightarrow z\in\bar{E}_{k}\). Fix \(M > 0\). For \(n\geq M\), since \(g\) is increasing in \(x\), \(g(z_{n},M) < 1-\varepsilon \). Since \(g\) is continuous, \(g(z,M)\leq1-\varepsilon \). Since this holds for all \(M\), \(\lim_{x\to\infty}g(z,x) \leq1-\varepsilon \). But this violates the condition that under \(\mathbb {P}[\cdot \mid Z_{0} = z]\), \(X_{0}<\infty\) almost surely.

The uniqueness claim is now proved. Let \(\tilde{g}\) be a \(C^{2}(F)\) solution of \(L\tilde{g} = 0\) such that \(0\leq\tilde{g}\leq1\) and such that (3.5) holds. Define the stopping times

$$ \sigma_{k} :=\inf \left \{t\geq0 : Z_{t}\notin E_{k}\right \},\qquad\rho_{k} :=\inf \left \{t\geq0 : Y_{t} = k\right \}. $$

By Itô’s formula, for any \(k,n,m\),

$$\begin{aligned} \tilde{g}(z,x) &= \mathbb {E}^{\mathbb {P}^{z,x}}\big[\tilde{g}\big(Z_{\sigma_{k}\wedge\rho_{1/n}\wedge \rho_{m}}, Y_{\sigma_{k}\wedge\rho_{1/n}\wedge\rho_{m}}\big)\\ &\phantom{=\mathbb {E}^{\mathbb {P}^{z,x}}\big[}{} \times\big(1_{\{\rho_{1/n} < \sigma_{k}\wedge\rho_{m}\}} + 1_{\{\rho_{1/n}\geq \sigma_{k}\wedge\rho_{m}\}}(1_{\{\tau< \infty\}} + 1_{\{\tau=\infty\} })\big)\big]. \end{aligned}$$

Since \(\lim_{m\to\infty} \rho_{m} = \infty\) \(\mathbb {P}^{z,x}\)-almost surely, taking \(m\to\infty\) yields

$$ \begin{aligned} \tilde{g}(z,x) &= \mathbb {E}^{\mathbb {P}^{z,x}}\big[\tilde{g}\big(Z_{\sigma _{k}\wedge\rho_{1/n}}, Y_{\sigma_{k}\wedge\rho_{1/n}}\big)\big(1_{\{\rho_{1/n}< \sigma_{k}\}} + 1_{\{\rho_{1/n}\geq\sigma_{k}\}}(1_{\{\tau< \infty\}} + 1_{\{\tau=\infty\}})\big)\big]. \end{aligned} $$

On \(\{\rho_{1/n} < \sigma_{k}\}\), we have \(Z_{\rho_{1/n}}\in E_{k}\), \(Y_{\rho_{1/n}}\leq1/n\) and hence by \(0\leq \tilde{g}\leq1\) and (3.5), for any \(\varepsilon > 0\), there is an \(n(\varepsilon )\) such that for \(n\geq n(\varepsilon )\),

$$ \begin{aligned} \tilde{g}(z,x) &\leq \varepsilon + \mathbb {P}^{z,x}[\rho_{1/n}\geq\sigma _{k},\tau< \infty] + \mathbb {P}^{z,x}[\rho_{1/n}\geq\sigma_{k}, \tau =\infty]. \end{aligned} $$

Taking \(n\to\infty\) thus gives

$$ \begin{aligned} \tilde{g}(z,x) &\leq \varepsilon + \mathbb {P}^{z,x}\left [\tau\geq\sigma_{k},\tau < \infty\right ] + \mathbb {P}^{z,x}\left [\tau=\infty\right ]. \end{aligned} $$

Taking \(k\to\infty\) gives

$$ \begin{aligned} \tilde{g}(z,x) &\leq \varepsilon + \mathbb {P}^{z,x}\left [\tau=\infty\right ], \end{aligned} $$

and hence taking \(\varepsilon \to0\) gives \(\tilde{g}(z,x) \leq \mathbb {P}^{z,x}[\tau= \infty] = g(z,x)\). Similarly, for \(k,n,m\),

$$ \begin{aligned} \tilde{g}(z,x) &= \mathbb {E}^{\mathbb {P}^{z,x}}\big[\tilde{g}\big(Z_{\sigma_{k}\wedge \rho_{1/n}\wedge\rho_{m}}, Y_{\sigma_{k}\wedge\rho_{1/n}\wedge\rho_{m}}\big)(1_{\{\rho_{m}< \sigma_{k}\wedge\rho_{1/n}\}} + 1_{\{\rho_{m}\geq \sigma_{k}\wedge\rho_{1/n}\}})\big]\\ &\geq(1-\varepsilon )\mathbb {P}^{z,x}\left [\rho_{m}< \sigma_{k}\wedge\rho_{1/n},\lim_{t\to\infty}Y_{t} = \infty\right ], \end{aligned} $$

for all \(\varepsilon > 0\) and \(m\geq m(\varepsilon )\) for some \(m(\varepsilon )\). Note that the event \(\{\rho_{m} < \sigma_{k}\wedge\rho_{1/n}\}\) has additionally been intersected with \(\{\lim_{t\to\infty} Y_{t} = \infty\}\), but this is harmless since lower bounds are being considered. Now, on the event \(\{\lim_{t\to\infty} Y_{t} = \infty\}\), it holds that \(\rho_{1/n}\rightarrow\infty\). Thus, taking \(n\to\infty\),

$$ \tilde{g}(z,x)\geq(1-\varepsilon )\mathbb {P}^{z,x}\left [\rho_{m} < \sigma_{k}, \lim_{t\to \infty}Y_{t} = \infty\right ]. $$

Taking \(k\to\infty\) gives

$$ \tilde{g}(z,x)\geq(1-\varepsilon )\mathbb {P}^{z,x}\left [\rho_{m} < \infty, \lim_{t\to \infty}Y_{t} = \infty\right ]. $$

Taking \(m\to\infty\) and noting that for \(m\) large enough, \(\rho_{m} < \infty\) on \(\{\lim_{t\to\infty}Y_{t} = \infty\}\), it holds that

$$ \tilde{g}(z,x) \geq(1-\varepsilon )\mathbb {P}^{z,x}\left [\lim_{t\to\infty}Y_{t}=\infty \right ] = (1-\varepsilon ) h(z,x), $$

where the last equality follows by the definition of \(h\) in (7.2). Now, in proving \(Lg = 0\) it was shown that \(g = h\) and hence \(\tilde{g}(z,x)\geq(1-\varepsilon )g(z,x)\). Taking \(\varepsilon \to0\) gives that \(\tilde{g}(z,x) \geq g(z,x)\), finishing the proof. □

8 Dynamics for the time-reversed process

The goal of the next two sections is to prove Theorem 3.4. We keep all notation from Sect. 3.2. We first identify the dynamics for \(\zeta ^{T}\).

Proposition 8.1

Suppose that Assumption  2.7 holds. Then for each \(T > 0\), the law of \(\zeta ^{T}\) under ℙ solves the martingale problem on \(E\) (for \(t\leq T\)) for the operator \(L^{\zeta }:=(1/2)c^{ij}\partial^{2}_{ij} + \mu^{i}\partial_{i}\), where

$$ \mu :=c\frac{\nabla p}{p} + \operatorname {div}c - m. $$
(8.1)

The operator \(L^{\zeta }\) does not depend upon \(T\). Thus, if \((\mathbb {Q}^{z})_{z\in E}\) denotes the solution of the generalized martingale problem for \(L^{\zeta }\) on \(E\), then in fact \((\mathbb {Q}^{z})_{z\in E}\) solves the martingale problem for \(L^{\zeta }\) on \(E\) and is positive recurrent.

Remark 8.2

If \(Z\) is reversing, then \(p\) satisfies \(m = (1/2)(c\nabla{p}/p + \operatorname {div}c)\). Thus, in this instance, \(\mu= m\) and as the name suggests, \(\zeta ^{T}\) has the same dynamics as \(Z\).
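In detail, substituting the reversibility identity \(m = (1/2)(c\nabla p/p + \operatorname{div} c)\) into (8.1) gives the one-line check

```latex
\mu = c\frac{\nabla p}{p} + \operatorname{div} c - m
    = 2\cdot\frac{1}{2}\Big(c\frac{\nabla p}{p} + \operatorname{div} c\Big) - m
    = 2m - m = m .
```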

Proof of Proposition 8.1

The first statement regarding the martingale problem is based on the argument in [19]. Since \(Z\) is positive recurrent with invariant density \(p\) and \(Z_{0}\) has initial distribution \(p\) under ℙ, \(Z\) is stationary with distribution \(p\). Since \(\tilde{L}^{Z}p = 0\), Eq. \((2.5)\) in [19] holds, noting that \(p\) does not depend upon \(t\).

For \(0\leq s\leq t\) and \(g\in C^{\infty}_{c}(E)\), define the function \(v(s,z) :=\mathbb {E}[g(Z_{t}) \mid Z_{s} = z]\). The Feynman–Kac formula implies that \(v\) satisfies \(v_{s} + L^{Z}v = 0\) for \(0 < s < t\), \(z \in E\), with \(v(t,z) = g(z)\); see [20, 18] for an extension of the classical Feynman–Kac formula to the current setup. Therefore, the condition in Eq. \((2.7)\) of [19] holds as well. Thus, the formal argument on p. 1191 of [19] is rigorous, and the law of \(\zeta ^{T}\) under ℙ solves the martingale problem for \(L^{\zeta }\).

Turning to the statement regarding \((\mathbb {Q}^{z})_{z\in E}\), set \(\tilde {L}^{\zeta }\) as the formal adjoint to \(L^{\zeta }\). \(\tilde{L}^{\zeta }\) is given by (2.1) with \(\mu\) replacing \(m\) therein. Using the formula for \(\mu\) in (8.1) and for \(\tilde{L}^{Z}\) in (2.1), calculation shows that

$$ \tilde{L}^{\zeta }f = \tilde{L}^{Z} f - 2\nabla\cdot\bigg(\frac{f}{p}\Big(\frac{1}{2}\left(c\nabla p + p\operatorname {div}c\right)-pm\Big)\bigg). $$

Since

$$ 0 = \tilde{L}^{Z}p = \nabla\cdot\bigg(\frac{1}{2}\left(c\nabla p + p\operatorname {div}c\right) - pm\bigg), $$
(8.2)

it follows by considering \(f = p\) above that \(\tilde{L}^{\zeta }p = 0\). Therefore, \(p\) is an invariant density for \(L^{\zeta }\) if and only if the diffusion corresponding to the operator \(\tilde{L}^{\zeta ,p}\) does not explode, where \(\tilde{L}^{\zeta ,p}\) is the \(h\)-transform of \(\tilde{L}^{\zeta }\) [27, Theorem 4.8.5]. But by the definition of the \(h\)-transform [27, Sect. 4.1] and (2.1) with \(\mu\) replacing \(m\),

$$ \begin{aligned} \tilde{L}^{\zeta ,p}f :=\frac{1}{p}\tilde{L}^{\zeta }(fp) &= \frac {1}{2}c^{ij}\partial^{2}_{ij}f -\left(\mu^{i} - \operatorname {div}c^{i}-\Big(c\frac {\nabla p}{p}\Big)^{i}\right)\partial_{i} f + \frac{f}{p}\tilde{L}^{\zeta }p\\ &= \frac{1}{2}c^{ij}\partial^{2}_{ij}f + m^{i}\partial_{i} f = L^{Z}f, \end{aligned} $$

where the second line follows from (8.1) together with \(\tilde{L}^{\zeta }p = 0\). Thus Assumption 2.7 (specifically the fact that \(Z\) is ergodic and \(\int_{E} p(z)\,\mathrm {d}z = 1\)) implies that the diffusion for \(\tilde{L}^{\zeta ,p}\) not only does not explode, but is also positive recurrent, finishing the proof. □
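As a concrete sanity check of (8.1) (a hypothetical one-dimensional example, not part of the text): for an Ornstein–Uhlenbeck factor with \(m(z) = -\kappa z\) and constant \(\sigma\), the invariant density is Gaussian and the process is reversible, so the reversed drift \(\mu\) should coincide with \(m\), as Remark 8.2 predicts. A minimal symbolic verification:

```python
import sympy as sp

z, kappa, s = sp.symbols('z kappa sigma', positive=True)
# hypothetical 1-d Ornstein-Uhlenbeck factor: m(z) = -kappa*z, constant sigma
m = -kappa * z
c = s**2                              # c = sigma * sigma'
p = sp.exp(-kappa * z**2 / s**2)      # invariant density, up to normalisation

# time-reversed drift from (8.1): mu = c * (p'/p) + div c - m
mu = c * sp.diff(p, z) / p + sp.diff(c, z) - m

# the OU process is reversible, so Remark 8.2 predicts mu = m
assert sp.simplify(mu - m) == 0
```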

In preparation for the proof of the main result of this section, which is Proposition 8.5, we first need to define a certain “backward” filtration \(\mathbf {G}^{T}\) and present two lemmas. Fix \(T \in(0, \infty)\), \(t \in[0, T]\) and let \(\widetilde{\mathcal {G}}^{T}_{t}\) be the \(\sigma\)-field generated by \(X_{T}\), \((Z_{T-u})_{u\in[0,t]}\), \((W_{T} - W_{T - u})_{u \in[0, t]}\) and \((B_{T} - B_{T - u})_{u \in[0, t]}\). Then denote by \(\mathbf {G}^{T} :=(\mathcal {G}^{T}_{t})_{t \in[0, T]}\) the usual augmentation of \((\widetilde{\mathcal {G}}^{T}_{t})_{t \in[0, T]}\). It is easy to check that \((\chi^{T}, \zeta ^{T})\) is \(\mathbf {G}^{T}\)-adapted for all \(T \in \mathbb {R}_{+}\), as well as that the process \(B^{T}\) defined via \(B^{T}_{t} :=B_{T - t} - B_{T}\) is a \(k\)-dimensional Brownian motion on \((\varOmega, \mathbf {G}^{T}, \mathbb {P})\), independent of \((\chi^{T}_{0}, \zeta ^{T}_{0})= (X_{T}, Z_{T})\). However, the \(\mathbf {G}^{T}\)-adapted process \((W_{T - t} - W_{T})_{t \in[0, T]}\) is not necessarily a Brownian motion on \((\varOmega, \mathbf {G}^{T}, \mathbb {P})\).

With this notation, the following two lemmas are essential for proving Proposition 8.5.

Lemma 8.3

If Assumption  2.7 holds, then for any locally bounded Borel function \(\eta: E \to \mathbb {R}^{k}\) and \(0 \leq s \leq t \leq T\), it holds that

$$ - \int_{T - t}^{T - s} \eta(Z_{u})' \,\mathrm {d}B_{u} = \int_{s}^{t} \eta(\zeta ^{T}_{u})' \,\mathrm {d}B^{T}_{u}. $$
(8.3)

Furthermore, if \(\theta: E \to \mathbb {R}^{d}\) is continuously differentiable, then

$$ - \int_{T - t}^{T - s} \theta'(Z_{u})\,\mathrm {d}Z_{u} = \int_{s}^{t} \theta'(\zeta ^{T}_{u})\,\mathrm {d}\zeta^{T}_{u} + \int_{s}^{t}\left(\nabla\cdot(c\theta)-\theta'\operatorname {div}c\right)(\zeta^{T}_{u}) \,\mathrm {d}u. $$
(8.4)

Proof

Fix \(0 \leq s \leq t \leq T\). For each \(n \in \mathbb {N}\) and \(i \in \left \{0 ,\ldots , n\right \}\), let

$$ u^{n}_{i} :=T - t + i(t-s)/n. $$

First, assume that \(\eta\) is twice continuously differentiable. The standard convergence theorem for stochastic integrals implies that (the following limit is to be understood in ℙ-measure)

$$ \begin{aligned} &\int_{s}^{t} \eta(\zeta^{T}_{u})' \,\mathrm {d}B^{T}_{u} + \int_{T - t}^{T - s} \eta (Z_{u})' \,\mathrm {d}B_{u}\\ &\quad = - \lim _{n \to \infty }\bigg( \sum_{i=1}^{n} \big( \eta(Z_{u^{n}_{i}}) - \eta (Z_{u^{n}_{i-1}}) \big)' (B_{u^{n}_{i}} - B_{u^{n}_{i-1}}) \bigg). \end{aligned} $$

Since \(B\) and \(Z\) are independent, Itô’s formula implies that the last quadratic covariation is zero. Therefore, (8.3) holds for twice continuously differentiable \(\eta\). The fact that (8.3) holds whenever \(\eta\) is locally bounded follows from a monotone class argument.

In a similar manner, assume that \(\theta\) is twice continuously differentiable. The standard convergence theorem for stochastic integrals implies that

$$ \begin{aligned} &\int_{s}^{t} \theta'(\zeta^{T}_{u})\,\mathrm {d}\zeta^{T}_{u} + \int_{T - t}^{T- s} \theta'(Z_{u})\,\mathrm {d}Z_{u}\\ &\quad = - \lim _{n \to \infty }\bigg( \sum_{i=1}^{n} \big(\theta(Z_{u^{n}_{i}}) - \theta(Z_{u^{n}_{i-1}}) \big)'(Z_{u^{n}_{i}} - Z_{u^{n}_{i-1}}) \bigg). \end{aligned} $$

The last quadratic covariation process (without the minus sign) is equal to

$$\int_{T-t}^{T - s} \tilde{F}(c,\theta)(Z_{u}) \,\mathrm {d}u = \int_{s}^{t} \tilde {F}(c,\theta)(\zeta^{T}_{u}) \,\mathrm {d}u, $$

where \(\tilde{F}(c,\theta): E \to \mathbb {R}\) is given by

$$ \tilde{F}(c,\theta) = \sum_{i,j=1}^{d}c^{ij}\partial_{z_{i}}\theta^{j} = \sum_{i,j=1}^{d}\Big(\partial_{z_{i}}(c^{ij}\theta^{j})-\theta^{j}\partial _{z_{i}}\big((c')^{ji}\big)\Big) = \nabla\cdot(c\theta) - \theta'\operatorname {div}c, $$

since \(c'=c\). Thus, (8.4) is established in the case where \(\theta\) is twice continuously differentiable. The fact that (8.4) holds whenever \(\theta\) is continuously differentiable follows from a density argument, noting that there exists a sequence \((\theta_{n})_{n \in \mathbb {N}}\) of polynomials such that \(\lim _{n \to \infty }\theta_{n} = \theta\) and \(\lim _{n \to \infty }\nabla\theta_{n} = \nabla\theta\) both hold, where the convergence is uniform on compact subsets of \(E\). □
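The identity \(\tilde{F}(c,\theta) = \nabla\cdot(c\theta) - \theta'\operatorname{div} c\) is pure calculus and can be checked symbolically; a minimal sketch in two dimensions, with the convention \((\operatorname{div} c)^{j} = \sum_{i}\partial_{z_{i}} c^{ij}\) and \(c\) symmetric:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
X = [x1, x2]

# arbitrary smooth symmetric matrix c and vector field theta in 2-d
c11, c12, c22 = [sp.Function(n)(x1, x2) for n in ('c11', 'c12', 'c22')]
c = sp.Matrix([[c11, c12], [c12, c22]])
theta = sp.Matrix([sp.Function('t1')(x1, x2), sp.Function('t2')(x1, x2)])

# left-hand side: sum_{ij} c^{ij} d_i theta^j
lhs = sum(c[i, j] * sp.diff(theta[j], X[i]) for i in range(2) for j in range(2))

# right-hand side: div(c theta) - theta' div c
div_ctheta = sum(sp.diff((c * theta)[i], X[i]) for i in range(2))
div_c = sp.Matrix([sum(sp.diff(c[i, j], X[i]) for i in range(2)) for j in range(2)])
rhs = div_ctheta - (theta.T * div_c)[0]

assert sp.simplify(lhs - rhs) == 0
```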

Lemma 8.4

Let Assumption  2.7 hold. For each \(T \in \mathbb {R}_{+}\), define the \(\mathbf {G}^{T}\)-adapted continuous-path process \(\varDelta^{T}\) as in (3.8). Then \(\varDelta^{T}\) is a semimartingale on \((\varOmega, \mathbf {G}^{T}, \mathbb {P})\). More precisely, for \(t\in[0,T]\),

$$\begin{aligned} \varDelta^{T}_{t} &= 1 + \int_{0}^{t}\varDelta^{T}_{u}\left (\theta'c\frac{\nabla p}{p} + \nabla\cdot(c\theta) - a\right )(\zeta ^{T}_{u}) \,\mathrm {d}u \\ &\quad{}+ \int_{0}^{t} \varDelta^{T}_{u} \big(\eta(\zeta ^{T}_{u})'\,\mathrm {d}B^{T}_{u} + \theta'\sigma(\zeta ^{T}_{u})\,\mathrm {d}W^{T}_{u}\big). \end{aligned}$$
(8.5)

Proof

Define \((\rho^{T}_{t})_{t \in[0, T]}\) by \(\rho^{T}_{t} :=R_{T} - R_{T - t}\) for \(t \in[0, T]\). In view of (1.2), (2.2), (8.1) and Lemma 8.3,

$$\begin{aligned} \rho^{T} &= \int_{T - \cdot}^{T} \left (a + \frac{1}{2}(\theta'c\theta+ \eta '\eta) \right )(Z_{t}) \,\mathrm {d}t + \int_{T - \cdot}^{T} \left (\eta(Z_{t})'\,\mathrm {d}B_{t} + \theta'\sigma(Z_{t})\,\mathrm {d}W_{t}\right ) \\ &= \int_{T - \cdot}^{T} \left (a-\theta'm + \frac{1}{2}(\theta'c\theta+ \eta'\eta) \right )(Z_{t}) \,\mathrm {d}t + \int_{T - \cdot}^{T} \left (\eta(Z_{t})'\,\mathrm {d}B_{t} + \theta'(Z_{t})\,\mathrm {d}Z_{t}\right ) \\ &= \int_{0}^{\cdot} \left (a-\theta'm +\theta'\operatorname {div}c-\nabla\cdot(c\theta) + \frac{1}{2}(\theta'c\theta+ \eta'\eta) \right )(\zeta ^{T}_{t})\,\mathrm {d}t\\ &\quad{}- \int_{0}^{\cdot} \big(\eta(\zeta ^{T}_{t})' \,\mathrm {d}B^{T}_{t} + \theta'(\zeta ^{T}_{t}) \,\mathrm {d}\zeta ^{T}_{t}\big),\\ &= \int_{0}^{\cdot} \left (a-\theta'c\frac{\nabla p}{p} - \nabla\cdot (c\theta) + \frac{1}{2}(\theta'c\theta+ \eta'\eta) \right )(\zeta ^{T}_{t}) \,\mathrm {d}t\\ &\quad{}- \int_{0}^{\cdot} \big(\eta(\zeta ^{T}_{t})'\,\mathrm {d}B^{T}_{t} + \theta'\sigma(\zeta ^{T}_{t}) \,\mathrm {d}W^{T}_{t}\big). \end{aligned}$$

The fact that \(D = \exp(- R)\) gives \(\varDelta^{T} = \exp(- \rho^{T})\). Then the dynamics for \(\varDelta^{T}\) follow from the dynamics of \(\rho^{T}\). □
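Spelled out, the last step is Itô's formula for \(\varDelta^{T} = \exp(-\rho^{T})\): writing \(\mathrm{d}\rho^{T}_{t} = A(\zeta^{T}_{t})\,\mathrm{d}t + \mathrm{d}M_{t}\) with \(A = a - \theta'c\frac{\nabla p}{p} - \nabla\cdot(c\theta) + \frac{1}{2}(\theta'c\theta + \eta'\eta)\) and \(\mathrm{d}M_{t} = -\big(\eta(\zeta^{T}_{t})'\,\mathrm{d}B^{T}_{t} + \theta'\sigma(\zeta^{T}_{t})\,\mathrm{d}W^{T}_{t}\big)\),

```latex
\frac{\mathrm{d}\varDelta^{T}_{t}}{\varDelta^{T}_{t}}
  = -\mathrm{d}\rho^{T}_{t} + \frac{1}{2}\,\mathrm{d}[\rho^{T}]_{t}
  = \Big(\theta'c\frac{\nabla p}{p} + \nabla\cdot(c\theta) - a\Big)(\zeta^{T}_{t})\,\mathrm{d}t
    + \eta(\zeta^{T}_{t})'\,\mathrm{d}B^{T}_{t}
    + \theta'\sigma(\zeta^{T}_{t})\,\mathrm{d}W^{T}_{t},
```

since \(\mathrm{d}[\rho^{T}]_{t} = (\theta'c\theta + \eta'\eta)(\zeta^{T}_{t})\,\mathrm{d}t\); this is exactly (8.5).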

Proposition 8.5

Let Assumption  2.7 hold. Then for each \(T>0\), there exist a filtration \(\mathbf {G}^{T}\) satisfying the usual conditions and independent \(d\)- and \(k\)-dimensional \((\mathbb {P},\mathbf {G}^{T})\)-Brownian motions \(W^{T},B^{T}\) on \([0,T]\) such that the pair \((\zeta ^{T},\chi^{T})\) has dynamics

$$ \begin{aligned} \zeta ^{T}_{t} & = \zeta ^{T}_{0} + \int_{0}^{t}\left(c\frac{\nabla p}{p} + \operatorname {div}c - m\right)(\zeta ^{T}_{u})\,\mathrm {d}u + \int_{0}^{t}\sigma(\zeta ^{T}_{u})\,\mathrm {d}W^{T}_{u},\\ \chi^{T}_{t}& = \chi^{T}_{0} + \int_{0}^{t} \left (f(\zeta ^{T}_{u}) -\chi^{T}_{u}\Big(a-\theta'c\frac{\nabla p}{p} - \nabla\cdot(c\theta )\Big)(\zeta ^{T}_{u})\right )\,\mathrm {d}u\\ &\quad{} + \int_{0}^{t} \chi^{T}_{u}\big(\theta'\sigma(\zeta ^{T}_{u})\,\mathrm {d}W^{T}_{u}+ \eta(\zeta ^{T}_{u})'\,\mathrm {d}B^{T}_{u}\big). \end{aligned} $$
(8.6)

Proof

Proposition 8.1 immediately implies that under ℙ, \(\zeta ^{T}\) has the dynamics

$$ \begin{aligned} \zeta ^{T}_{t} &=\zeta ^{T}_{0} + \int_{0}^{t} \left(c\frac{\nabla p}{p} +\operatorname {div}c - m\right)(\zeta ^{T}_{u})\,\mathrm {d}u + \int_{0}^{t} \sigma(\zeta ^{T}_{u})\,\mathrm {d}W^{T}_{u},\qquad t\in[0,T], \end{aligned} $$

where \((W^{T}_{t})_{t \in[0, T]}\) is a Brownian motion on \((\varOmega, \mathbf {G}^{T}, \mathbb {P})\). In order to specify the dynamics for \(\chi^{T}\), recall the definition of \(\varDelta^{T}\) from (3.8). Observe that

$$X_{T - t} = \frac{1}{D_{T - t}} \int_{T - t}^{\infty}D_{u} f(Z_{u}) \,\mathrm {d}u = \frac{D_{T}}{D_{T - t}} \left (X_{T} + \int_{T - t}^{T} \frac{D_{u}}{D_{T}} f(Z_{u}) \,\mathrm {d}u\right ), \qquad t \in[0, T]. $$

Then, using the definitions of \(\chi^{T}\), \(\zeta ^{T}\) and \(\varDelta^{T}\), the above is rewritten as

$$ \chi^{T}_{t} = \varDelta_{t}^{T} \left (\chi^{T}_{0} + \int_{0}^{t} \frac{1}{\varDelta^{T}_{u}} f(\zeta ^{T}_{u}) \,\mathrm {d}u\right ), \qquad t \in[0, T]. $$
(8.7)

Lemma 8.4 implies that \(\varDelta^{T}\) is a semimartingale, and hence (8.7) yields

$$\chi^{T}_{t} = \chi^{T}_{0} + \int_{0}^{t} \chi^{T}_{u} \, \frac{\,\mathrm {d}\varDelta_{u}^{T}}{\varDelta_{u}^{T}} + \int_{0}^{t} f(\zeta ^{T}_{u}) \,\mathrm {d}u, \qquad t \in[0, T]. $$

The result now follows by plugging in for \(\mathrm {d}\varDelta^{T}_{u} / \varDelta^{T}_{u}\) from (8.5). □
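The representation (8.7) also suggests how one might simulate \((\zeta^{T},\chi^{T})\) in practice: evolve \(\zeta^{T}\) by Euler–Maruyama, accumulate \(\varDelta^{T}\) from (8.5), and recover \(\chi^{T}\) without discretising its SDE directly. A minimal sketch under purely illustrative model choices (a hypothetical one-dimensional model with reversed drift \(\mu(z) = -z\), \(\sigma = 1\), \(\theta = 0\), constant \(\eta\), and \(a(z) = 0.05 + z^{2}\); none of these are from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical 1-d model (illustration only): reversed drift mu(z) = -z,
# sigma = 1, short rate a(z) = 0.05 + z^2, theta = 0, eta = 0.2, f(z) = 1 + z^2
dt, n = 1e-3, 20_000
mu = lambda z: -z
a = lambda z: 0.05 + z**2
f = lambda z: 1.0 + z**2
eta = 0.2

zeta, Delta, chi0 = 0.0, 1.0, 2.0
integral = 0.0                      # accumulates int_0^t f(zeta_u)/Delta_u du
for _ in range(n):
    dW, dB = rng.normal(scale=np.sqrt(dt), size=2)
    integral += f(zeta) / Delta * dt
    # (8.5) with theta = 0: dDelta = Delta * (-a dt + eta dB)
    Delta += Delta * (-a(zeta) * dt + eta * dB)
    zeta += mu(zeta) * dt + dW
chi = Delta * (chi0 + integral)     # representation (8.7)
```

The final `chi` is one Euler-discretised sample of \(\chi^{T}_{T}\) for \(T = 20\) under these toy coefficients; it stays strictly positive, as (8.7) requires.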

9 Proof of Theorem 3.4

9.1 Preliminaries

We first prove two technical results. The first (Lemma 9.1) asserts the existence of a probability space and stationary processes \((\zeta ,\chi)\) consistent with \((\zeta ,\chi^{x})\) from Theorem 3.4 in the sense that given \(\chi_{0} = x\), it holds that \(\chi_{t} = \chi^{x}_{t}\) for \(t\geq0\). The second (Proposition 9.2) shows that under the non-degeneracy assumption \(|\eta|(z) > 0, z\in E\), and the regularity assumption \(f\in C^{2}(E;\mathbb {R}_{+})\), the process \((\zeta ,\chi)\) is ergodic.

Lemma 9.1

If Assumption  2.7 holds, there is a filtered probability space \((\varOmega, \mathbf {F},\mathbb {Q})\) supporting independent \(d\)- and \(k\)-dimensional Brownian motions \(W\) and \(B\), \(\mathcal {F}_{0}\)-measurable random variables \(\zeta _{0},\chi_{0}\) with joint distribution \(\pi\), as well as a stationary process \(\zeta \) with dynamics

$$ \zeta = \zeta _{0} + \int_{0}^{\cdot}\left(c\frac{\nabla p}{p} + \operatorname {div}c - m\right )(\zeta _{t})\,\mathrm {d}t + \int_{0}^{\cdot}\sigma(\zeta _{t})\,\mathrm {d}W_{t}. $$
(9.1)

Furthermore, with \(\varDelta,\chi^{x}\) defined as in (3.10), (3.11), if the process \(\chi\) is defined by \(\chi_{t} :=\chi^{\chi_{0}}_{t}\) (see Remark  3.5), then \((\zeta ,\chi)\) is stationary with invariant measure \(\pi\) and joint dynamics

$$ \begin{aligned} \,\mathrm {d}\zeta _{t} & = \left(c\frac{\nabla p}{p} + \operatorname {div}c - m \right)(\zeta _{t})\,\mathrm {d}t + \sigma(\zeta _{t})\,\mathrm {d}W_{t}, \quad t \in \mathbb {R}_{+}, \\ \,\mathrm {d}\chi_{t}& = \left (f(\zeta _{t}) -\chi_{t}\Big(a-\theta'c\frac{\nabla p}{p} - \nabla\cdot(c\theta)\Big)(\zeta _{t})\right )\,\mathrm {d}t\\ &\quad{}+ \chi_{t}\left(\theta'\sigma(\zeta _{t})\,\mathrm {d}W_{t} + \eta(\zeta _{t})'\,\mathrm {d}B_{t}\right), \quad t \in \mathbb {R}_{+}. \end{aligned} $$
(9.2)

Proof

This follows from Proposition 8.1. Indeed, one can start with a probability space \((\varOmega, \mathbf {F},\mathbb {Q})\) supporting independent \(d\)- and \(k\)-dimensional Brownian motions \(W\) and \(B\), respectively, as well as an \(\mathcal {F}_{0}\)-measurable random variable \((\zeta _{0},\chi_{0})\sim\pi\) (hence independent of \(W\) and \(B\)). Under the given regularity assumptions, Proposition 8.1 yields a strong, stationary solution \(\zeta \) satisfying (9.1). Then, defining \(\varDelta\) as in (3.8) and, for \(x>0\), \(\chi^{x}\) as in (3.11), it follows that \((\zeta ,\chi^{x})\) and hence \((\zeta ,\chi)\) satisfy the SDE in (9.2). Under the given regularity assumptions, the law under ℙ of \((\zeta ^{T},\chi^{T})\) given \(\zeta ^{T}_{0} = z, \chi^{T}_{0} = x\) coincides with the law under ℚ of \((\zeta ,\chi^{x})\) given that \(\zeta _{0} = z\). Since by construction, \(\pi\) is an invariant measure for \((\zeta ^{T},\chi^{T})\), it follows from the Markov property that \(\pi\) is invariant for \((\zeta ,\chi )\) under ℚ and hence \((\zeta ,\chi)\) is stationary with invariant measure \(\pi\). □

Define the measures \(\mathbb {Q}^{z,x}\) for \((z,x)\in F\) via

$$ \mathbb {Q}^{z,x}\left [A\right ] = \mathbb {Q}[A\mid \zeta _{0} = z,\chi_{0} = x],\qquad A\in \mathcal {F}_{\infty}. $$

We now consider the case where \(|\eta| > 0\) on \(E\) and \(f\in C^{2}(E;\mathbb {R}_{+})\). According to Theorem 3.1, \(g\in C^{2,\gamma}(F)\) and hence \(\pi\) possesses a density satisfying

$$ \pi(z,x) = p(z)\partial_{x} g(z,x),\qquad(z,x)\in F. $$
(9.3)

Additionally, we have the following result.

Proposition 9.2

Let Assumption  2.7 hold, and additionally suppose that \(|\eta|(z) > 0\) for \(z\in E\) and that \(f\in C^{2}(E;\mathbb {R}_{+})\). Then the process \((\zeta ,\chi)\) from Lemma  9.1 is ergodic. Thus, for all bounded measurable functions \(h\) on \(F\) and all \((z,x)\in F\),

$$ \lim_{T\to\infty}\frac{1}{T}\int_{0}^{T} h(\zeta _{t},\chi_{t})\,\mathrm {d}t = \int_{F} h\,\mathrm {d}\pi\quad \mathbb {Q}^{z,x} \textrm{-a.s.} $$
(9.4)

Proof

Recall \(A\) from (3.3) and define \(b^{R}: F\to \mathbb {R}^{d+1}\) by

$$ \begin{aligned} b^{R}(z,x) :=\left( \textstyle\begin{array}{c} \left (c (\nabla p / p) + \operatorname {div}c - m\right ) (z) \\ f(z) - x(a - \theta'c\left (\nabla p/p\right ) - \nabla\cdot(c\theta))(z) \end{array}\displaystyle \right). \end{aligned} $$
(9.5)

From (9.2), it is clear that the generator for \((\zeta ,\chi)\) is \(L^{R} :=(1/2)A^{ij}\partial^{2}_{ij} + (b^{R})^{i} \partial _{i}\). In an abuse of notation, let \((\mathbb {Q}^{z,x})_{(z,x)\in F}\) also denote the solution to the generalized martingale problem for \(L^{R}\) on \(F\). Using Theorem 3.1 and the fact that under the given coefficient regularity assumptions, \(g\in C^{3}(F)\) (see [15, Theorem 6.17]), a lengthy calculation performed in Lemma A.1 below shows that the density \(\pi\) from (9.3) solves \(\tilde{L}^{R} \pi= 0\), where \(\tilde{L}^{R}\) is the formal adjoint to \(L^{R}\). Since by construction, \(\iint_{F}\pi (z,x)\,\mathrm {d}z \,\mathrm {d}x = 1\), positive recurrence will follow once it is shown that \((\mathbb {Q}^{z,x})_{(z,x)\in F}\) is recurrent. By Proposition 8.1, the restriction of \(\mathbb {Q}^{z,x}\) to the first \(d\) coordinates (i.e., the part for \(\zeta \)) is positive recurrent. Since by (3.11) it is evident that \(\chi\) does not hit 0 in finite time, it follows that \(\chi\) does not explode under \(\mathbb {Q}^{z,x}\). Thus, [27, Corollary 4.9.4] shows that \((\zeta ,\chi)\) is recurrent. Now (9.4) follows from [27, Theorem 4.9.5]. □
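Limits of the form (9.4) are what makes the time-reversal approach computationally useful: a single long path of \((\zeta,\chi)\) estimates integrals against \(\pi\). A minimal sketch under purely illustrative choices (hypothetical: reversed drift \(\mu(z) = -z\), \(\sigma = 1\), \(\theta = 0\), constant short rate \(a\) and constant \(\eta\), \(f(z) = 1 + z^{2}\)); in this special case \(D\) is independent of \(Z\), so \(\mathbb{E}^{\pi}[\chi] = \mathbb{E}^{p}[f]/a\) is available in closed form and serves as a check:

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical 1-d model: mu(z) = -z (OU, invariant N(0, 1/2)), sigma = 1,
# constant a, theta = 0, constant eta, f(z) = 1 + z^2; here D is independent
# of Z, so E^pi[chi] = E^p[f] / a = (1 + 1/2) / 0.5 = 3
a, eta = 0.5, 0.3
f = lambda z: 1.0 + z**2
dt, n = 1e-2, 200_000            # horizon T = 2000

zeta, chi = 0.0, 3.0
acc = 0.0
for _ in range(n):
    dW, dB = rng.normal(scale=np.sqrt(dt), size=2)
    acc += chi * dt
    # (9.2) with theta = 0: d chi = (f(zeta) - a*chi) dt + chi*eta dB
    chi += (f(zeta) - a * chi) * dt + chi * eta * dB
    zeta += -zeta * dt + dW
est = acc / (n * dt)             # ergodic estimate of E^pi[chi], cf. (9.4)
```

With these toy parameters the long-run average `est` should settle near 3.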

9.2 Proof of Theorem 3.4

The proof of Theorem 3.4 uses a number of approximation arguments. To make these precise, we first enlarge the original probability space \((\varOmega , \, \mathbf {F}, \, \mathbb {P})\) so that it contains a one-dimensional Brownian motion \(\hat{B}\) which is independent of \(Z_{0},W\) and \(B\). Let \(D\) be as in (1.3), and for \(\varepsilon > 0\), define \(D^{\varepsilon }:=D \mathcal{E}(\sqrt{ \varepsilon }\hat{B})\). Similarly to (1.1), define

$$ X_{0}^{\varepsilon }:=\int_{0}^{\infty}D^{\varepsilon }_{t} f(Z_{t})\,\mathrm {d}t. $$
(9.6)

Note that \(D^{\varepsilon }\) takes the form (1.3) for \(\eta^{\varepsilon }(z) = (\eta(z),\sqrt{ \varepsilon })\) and when the Brownian motion \(B\) therein is the \((k+1)\)-dimensional Brownian motion \((B,\hat{B})\). Note that \(|\eta^{\varepsilon }|^{2} = |\eta|^{2} + \varepsilon > 0\). Denote by \(\pi^{\varepsilon }\) the joint distribution of \((Z_{0},X_{0}^{\varepsilon })\) under ℙ and by \(g^{\varepsilon }\) the conditional cumulative distribution function of \(X_{0}^{\varepsilon }\) given \(Z_{0} = z\). By Theorem 3.1, it follows that \(g^{\varepsilon }\in C^{2,\gamma}(F)\) and hence \(\pi ^{\varepsilon }\) admits a density.

In a similar manner, by enlarging the probability space \((\varOmega, \mathbf {F},\mathbb {Q})\) of Lemma 9.1 to include a Brownian motion (still labeled \(\hat{B}\)) which is independent of \(\zeta _{0}\), \(\chi _{0}\), \(W\) and \(B\), and defining the family of processes \((\varDelta^{\varepsilon })_{\varepsilon > 0}\) and \((\chi^{\varepsilon ,x})_{\varepsilon > 0}\) for \(x>0\) according to

$$ \begin{aligned} \varDelta^{\varepsilon }_{t} &:=\varDelta_{t}\mathcal{E}(\sqrt{ \varepsilon }\hat{B})_{t},\qquad t\geq0,\\ \chi^{\varepsilon ,x}_{t} &:=\varDelta^{\varepsilon }_{t} \left (x + \int_{0}^{t} \frac{1}{\varDelta ^{\varepsilon }_{u}} f(\zeta _{u}) \,\mathrm {d}u\right ), \qquad t\geq0, \end{aligned} $$
(9.7)

it follows that \((\zeta ,\chi^{\varepsilon ,x})\) solves the SDE

$$\begin{aligned} \,\mathrm {d}\zeta _{t} & = \left(c\frac{\nabla p}{p} + \operatorname {div}c - m\right)(\zeta _{t})\,\mathrm {d}t + \sigma(\zeta _{t})\,\mathrm {d}W_{t}, \\ \,\mathrm {d}\chi^{\varepsilon ,x}_{t}& = \left (f(\zeta _{t}) -\chi^{\varepsilon ,x}_{t}\Big(a-\theta 'c\frac{\nabla p}{p} - \nabla\cdot(c\theta)\Big)(\zeta _{t})\right )\,\mathrm {d}t \\ &\quad{}+ \chi^{\varepsilon ,x}_{t}\big(\theta'\sigma(\zeta _{t})\,\mathrm {d}W_{t} + \eta ^{\varepsilon }(\zeta _{t})'(\mathrm {d}B_{t},\mathrm {d}\hat{B}_{t})\big). \end{aligned}$$
(9.8)

Since \(|\eta^{\varepsilon }|\geq\sqrt{ \varepsilon } >0\), Proposition 9.2 shows that for \(f\in C^{2}(E;\mathbb {R}_{+})\), the diffusion with generator \(L^{\varepsilon ,R}\) associated to (9.8) is positive recurrent with invariant density \(\pi^{\varepsilon }\), and thus for all \((z,x)\in F\) and all bounded measurable functions \(h\) on \(F\) (note that conditioned upon \(\chi_{0} = x\), we have \(\chi^{\varepsilon ,x}_{0}=\chi^{x}_{0} = x = \chi_{0}\)),

$$ \lim_{T\to\infty}\frac{1}{T}\int_{0}^{T} h(\zeta _{t},\chi^{\varepsilon ,x}_{t})\,\mathrm {d}t = \int_{F} h\,\mathrm {d}\pi^{\varepsilon },\qquad \mathbb {Q}^{z,x} \text{-a.s.} $$
(9.9)

With all the notation in place, Theorem 3.4 is the culmination of a number of lemmas, which are now presented. The first lemma implies that \(\pi^{\varepsilon }\) converges weakly to \(\pi\) as \(\varepsilon \downarrow0\).

Lemma 9.3

Let Assumption  2.7 hold. Define \(X_{0}^{\varepsilon }\) as in (9.6). Then \(X_{0}^{\varepsilon }\) converges to \(X_{0}\) in ℙ-measure as \(\varepsilon \to0\).

Proof

Denote by \(\mathcal {G}\) the sigma-field generated by \(Z_{0}\), \(W\) and \(B\) and define the process \(\delta^{\varepsilon }\) by \(\delta^{\varepsilon }_{t} :=D^{\varepsilon }_{t}/D_{t} = \mathcal{E}(\sqrt{ \varepsilon }\hat{B}_{t})\). By the independence of \(\delta^{\varepsilon }\) and \(\mathcal {G}\),

$$\mathbb {E}\left [|X_{0}^{\varepsilon } - X_{0}| \ \big| \ \mathcal {G}\right ] \leq\int_{0}^{\infty} \mathbb {E}\left [|\delta^{\varepsilon }_{t} - 1| \ \big| \ \mathcal {G}\right ]D_{t} f(Z_{t}) \,\mathrm {d}t = \int_{0}^{\infty} \mathbb {E}[|\delta^{\varepsilon }_{t} - 1|] D_{t} f(Z_{t}) \,\mathrm {d}t. $$

Now set \(h^{\varepsilon }_{t}:=\sqrt{e^{\varepsilon t} -1}\). Note that \(h^{\varepsilon }\) is increasing in \(\varepsilon \) with \(\lim_{\varepsilon \to0} h^{\varepsilon } = 0\). Furthermore,

$$\mathbb {E}[|\delta^{\varepsilon }_{t} - 1|] \leq\big(\mathbb {E}\big[|\delta ^{\varepsilon }_{t} - 1|^{2}\big]\big)^{1/2} = \sqrt{\exp(\varepsilon t) - 1} = h^{\varepsilon }_{t}. $$

By assumption, \(\mathbb {P}[X_{0}<\infty] = 1\). Since for any \(\varepsilon > 0\), \(\sup _{t\geq0} \delta^{\varepsilon }_{t} < \infty\) ℙ-a.s., it thus follows that \(\mathbb {P}[X_{0}^{\varepsilon }< \infty] = 1\). The dominated convergence theorem applied pathwise (recall that there exists a \(\kappa> 0\) so that \(e^{\kappa t} D_{t} \rightarrow0\) ℙ-a.s.) then gives that \(\lim _{\varepsilon \to0} \mathbb {E}[|X_{0}^{\varepsilon } - X_{0}| \mid \mathcal {G}] = 0\), which shows that the pair \((Z_{0}, X_{0}^{\varepsilon })\) converges in probability to \((Z_{0}, X_{0})\), finishing the proof. □
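The two moment facts used above are easy to check numerically: \(\delta^{\varepsilon}_{t}\) is lognormal with unit mean, \(\mathbb{E}[(\delta^{\varepsilon}_{t} - 1)^{2}] = e^{\varepsilon t} - 1\), and hence \(\mathbb{E}[|\delta^{\varepsilon}_{t} - 1|] \leq h^{\varepsilon}_{t}\). A minimal Monte Carlo sketch (illustrative parameter values only):

```python
import numpy as np

rng = np.random.default_rng(7)

# delta^eps_t = E(sqrt(eps) Bhat)_t is lognormal with unit mean:
# delta = exp(sqrt(eps) * Bhat_t - eps * t / 2), with Bhat_t ~ N(0, t)
eps, t, n = 0.1, 1.0, 1_000_000
delta = np.exp(np.sqrt(eps) * rng.normal(scale=np.sqrt(t), size=n)
               - eps * t / 2)

second = np.mean((delta - 1.0) ** 2)   # should be close to e^{eps t} - 1
first = np.mean(np.abs(delta - 1.0))   # should lie below h^eps_t
h = np.sqrt(np.expm1(eps * t))         # h^eps_t = sqrt(e^{eps t} - 1)
```

Here `second` matches \(e^{\varepsilon t} - 1 \approx 0.105\) up to Monte Carlo error, and `first` sits strictly below `h`, consistent with the Cauchy–Schwarz step in the proof.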

Next, define \(\mathcal{C}\) as the class of (Borel-measurable) functions \(h\) which are bounded and Lipschitz in \(x\), uniformly in \(z\); in other words,

$$ \begin{aligned} \mathcal{C} &:=\left \{h\in \mathcal {B}(F;\mathbb {R}) : \sup_{z\in E} |h(z,x_{1}) - h(z,x_{2})| \leq K(h) \left(1\wedge|x_{1} - x_{2}|\right)\right \} \end{aligned} $$
(9.10)

for some \(K(h)>0\) (which may depend upon \(h\)) and all \(x_{1},x_{2} > 0\). The next lemma gives a weak form of the convergence in Theorem 3.4 for regular \(f\). Note that the notation \(\mathbb {Q}\text{-}\lim _{T\to\infty}\) stands for the limit in ℚ-probability as \(T\to \infty\).

Lemma 9.4

Let Assumption  2.7 hold. Assume additionally that \(f\in C^{2}(E;\mathbb {R}_{+})\). Then for all \(x > 0\) and all \(h\in\mathcal{C}\),

$$ \mathbb {Q}\text{-}\lim_{T\to\infty} \frac{1}{T}\int_{0}^{T} h(\zeta _{t},\chi ^{x}_{t})\,\mathrm {d}t = \int_{F}h\,\mathrm {d}\pi. $$
(9.11)

Proof

For ease of presentation, we adopt the following notational conventions. First, for any bounded measurable function \(h\) and probability measure \(\nu\) on \(F\), set

$$ \langle h,\nu\rangle :=\int_{F} h\,\mathrm {d}\nu. $$
(9.12)

Next, similarly to \(\hat{\pi}^{x}_{T}\) in (3.12), we define \(\hat{\pi}^{\varepsilon ,x}_{T}\) to be the empirical measure of \((\zeta ,\chi^{\varepsilon ,x})\) on \([0,T]\) for \(\chi^{\varepsilon ,x}\) as in (9.7). Thus, we write

$$ \frac{1}{T}\int_{0}^{T} h(\zeta _{t},\chi^{x}_{t})\,\mathrm {d}t = \langle h,\hat{\pi }^{x}_{T}\rangle,\qquad\frac{1}{T}\int_{0}^{T} h(\zeta _{t},\chi^{\varepsilon ,x}_{t})\,\mathrm {d}t = \langle h,\hat{\pi}^{\varepsilon ,x}_{T}\rangle. $$
(9.13)

Proposition 9.2 implies for all \(x>0\) and \(\varepsilon > 0\) that

$$\mathbb {Q}\text{-}\lim_{T\to\infty} \langle h , \hat{\pi}^{\varepsilon ,x}_{T} \rangle = \langle h , \pi^{\varepsilon }\rangle . $$

Indeed, (9.9) gives for all \((z,x)\in F\) that

$$ \lim_{T\to\infty}\langle h, \hat{\pi}^{\varepsilon ,x}_{T}\rangle= \langle h , \pi^{\varepsilon } \rangle \qquad \mathbb {Q}^{z,x} \text{-a.s.} $$

Thus, the above limit holds ℚ-almost surely and hence in probability.

To prove (9.11), we need to show that for any increasing \(\mathbb {R}_{+}\)-valued sequence \((T_{n})_{n \in \mathbb {N}}\) such that \(\lim _{n \to \infty }T_{n} = \infty \), there is a subsequence \((T_{n_{k}})_{k \in \mathbb {N}}\) such that

$$\mathbb {Q}\text{-} \lim_{k\to\infty} \langle h , \hat{\pi}^{x}_{T_{n_{k}}} \rangle = \langle h , \pi \rangle , $$

as this implies (9.11) by considering double subsequences. To this end, let \((\varepsilon _{k})_{k \in \mathbb {N}}\) be any strictly positive sequence that converges to zero, and assume that \(\varepsilon _{1} < \kappa\), where \(\kappa> 0\) is from Assumption (A5). Next, pick \(T_{n_{k}}\) large enough so that \(k/T_{n_{k}} \rightarrow0\) and such that

$$\mathbb {Q}\left [\vert \langle h , \hat{\pi}^{\varepsilon _{k},x}_{T_{n_{k}}} \rangle - \langle h , \pi^{\varepsilon _{k}} \rangle \vert > \frac{1}{k}\right ] \leq\frac{1}{k}. $$

As argued above, this is possible since \(\langle h , \hat{\pi}^{\varepsilon _{k},x}_{T} \rangle \) converges to \(\langle h , \pi^{\varepsilon _{k}} \rangle \) in ℚ-probability. Since Lemma 9.3 implies \(\lim_{k\to \infty} \langle h , \pi^{\varepsilon _{k}} \rangle = \langle h , \pi \rangle \), it follows that

$$\mathbb {Q}\text{-}\lim_{k\to\infty} \langle h , \hat{\pi}^{\varepsilon _{k},x}_{T_{n_{k}}} \rangle = \langle h , \pi \rangle . $$

Since

$$ \begin{aligned} | \langle h , \hat{\pi}^{x}_{T_{n_{k}}} \rangle - \langle h , \pi \rangle | &\leq| \langle h , \hat{\pi}^{x}_{T_{n_{k}}} \rangle - \langle h , \hat{\pi}^{\varepsilon _{k},x}_{T_{n_{k}}} \rangle | + | \langle h , \hat{\pi}^{\varepsilon _{k},x}_{T_{n_{k}}} \rangle - \langle h , \pi \rangle |, \end{aligned} $$

it suffices to show

$$\mathbb {Q}\text{-}\lim_{k\to\infty}| \langle h , \hat{\pi}^{\varepsilon _{k},x}_{T_{n_{k}}} \rangle - \langle h , \hat{\pi}^{x}_{T_{n_{k}}} \rangle | = 0. $$

In fact, the claim is that

$$\lim_{k \to\infty} \mathbb {E}^{\mathbb {Q}} \big[ \vert \langle h , \hat {\pi}^{\varepsilon _{k},x}_{T_{n_{k}}} \rangle - \langle h , \hat{\pi}^{x}_{T_{n_{k}}} \rangle \vert \big] = 0, $$

or the even stronger (recall (9.13)) result

$$ \lim_{k \to\infty} \left ( \frac{1}{T_{n_{k}}} \int_{0}^{T_{n_{k}}} \mathbb {E}^{\mathbb {Q}} \left [ |h(\zeta_{t}, \chi^{\varepsilon _{k},x}_{t}) - h(\zeta_{t}, \chi^{x}_{t})| \right ] \,\mathrm {d}t \right ) = 0. $$
(9.14)

From (9.10),

$$ \frac{1}{T_{n_{k}}} \int_{0}^{T_{n_{k}}} \mathbb {E}^{\mathbb {Q}} \left [ |h(\zeta_{t}, \chi^{\varepsilon _{k},x}_{t}) - h(\zeta_{t}, \chi_{t}^{x})| \right ] \,\mathrm {d}t \leq\frac {K}{T_{n_{k}}}\int_{0}^{T_{n_{k}}} \mathbb {E}^{\mathbb {Q}}\left [1\wedge|\chi^{\varepsilon _{k},x}_{t} - \chi^{x}_{t}|\right ]\,\mathrm {d}t. $$
(9.15)

Furthermore, recall that

$$\chi^{x}_{t} = \varDelta_{t}\left(x + \int_{0}^{t} \frac{1}{\varDelta_{u}}f(\zeta _{u})\,\mathrm {d}u\right), \quad\chi^{\varepsilon _{k},x}_{t} = \varDelta^{\varepsilon _{k}}_{t}\left(x + \int _{0}^{t} \frac{1}{\varDelta^{\varepsilon _{k}}_{u}}f(\zeta _{u})\,\mathrm {d}u\right), $$

where \(\varDelta^{\varepsilon _{k}}\) is from (9.7). With \(\delta^{\varepsilon _{k}}:=\mathcal{E}(\sqrt{ \varepsilon _{k}}\hat{B})\), it follows that under ℚ,

$$\begin{aligned} |\chi^{\varepsilon _{k},x}_{t} - \chi^{x}_{t}| &\leq x |\varDelta^{\varepsilon _{k}}_{t} - \varDelta_{t}| + \int_{0}^{t} \left \vert \frac{\varDelta^{\varepsilon _{k}}_{t}}{\varDelta^{\varepsilon _{k}}_{u}} - \frac{\varDelta _{t}}{\varDelta_{u}}\right \vert f(\zeta_{u}) \,\mathrm {d}u \\ & = x\varDelta_{t} |\delta^{\varepsilon _{k}}_{t} - 1| + \int_{0}^{t} \frac{\varDelta_{t}}{\varDelta _{u}} \left \vert \frac{\delta^{\varepsilon _{k}}_{t}}{\delta^{\varepsilon _{k}}_{u}} - 1\right \vert f(\zeta_{u}) \,\mathrm {d}u. \end{aligned}$$

With \(\mathcal {G}\) now denoting the \(\sigma\)-field generated by \(\zeta _{0}\), \(W\) and \(B\), the independence of \(\hat{B}\) and \(\mathcal {G}\) implies that

$$ \mathbb {E}^{\mathbb {Q}} \left [|\chi^{\varepsilon _{k},x}_{t} - \chi^{x}_{t}| \ \big| \ \mathcal {G}\right ] \leq x \varDelta_{t} h^{\varepsilon _{k}}_{t} + \int_{0}^{t} \frac{\varDelta_{t}}{\varDelta_{u}} h^{\varepsilon _{k}}_{t -u} f(\zeta_{u}) \,\mathrm {d}u, $$
(9.16)

where for any \(\varepsilon > 0\), \(h^{\varepsilon }\) is from Lemma 9.3. Since \(\zeta \) is stationary under ℚ, it holds for all \(t>0\) that the distribution of \(\varDelta_{t}\) under ℚ coincides with the distribution of \(D_{t}\) under ℙ, and the distribution of \(\int_{0}^{t} (\varDelta_{t} / \varDelta_{u}) h^{\varepsilon _{k}}_{t -u} f(\zeta_{u}) \,\mathrm {d}u\) under ℚ is the same as the distribution of \(\int_{0}^{t} D_{u} h^{\varepsilon _{k}}_{u} f(Z_{u}) \,\mathrm {d}u\) under ℙ.

We next claim that there exists a sequence \(\delta_{k} \to0\) such that

$$ \sup_{t \in[k, \infty)} \mathbb {P}\left [1\wedge\left(x D_{t} h^{\varepsilon _{k}}_{t} + \int_{0}^{t} D_{u} h^{\varepsilon _{k}}_{u} f(Z_{u}) \,\mathrm {d}u\right) > \delta_{k} \right ] \leq\delta _{k}, \quad\forall k \in \mathbb {N}. $$
(9.17)

This is shown at the end of the proof. Admitting this, from

$$\mathbb {E}^{\mathbb {Q}}\big[1 \wedge|\chi^{\varepsilon _{k},x}_{t} - \chi^{x}_{t}| \ \big| \ \mathcal {G}\big] \leq1 \wedge \mathbb {E}^{\mathbb {Q}} \big[|\chi^{x,\varepsilon _{k}}_{t} - \chi ^{x}_{t}| \ \big| \ \mathcal {G}\big], $$

it follows that

$$ \begin{aligned} &\lim_{k \to\infty} \bigg( \sup_{t \in[k, \infty)} \mathbb {E}^{\mathbb {Q}} \left [ 1 \wedge|\chi^{\varepsilon _{k},x}_{t} - \chi^{x}_{t}| \right ]\bigg)\\ & \quad= \lim_{k \to\infty} \bigg( \sup_{t \in[k, \infty)} \mathbb {E}^{\mathbb {Q}} \left [\mathbb {E}^{\mathbb {Q}} \left [1 \wedge|\chi^{\varepsilon _{k},x}_{t} - \chi ^{x}_{t}| \ \big| \ \mathcal {G}\right ] \right ] \bigg)\\ &\quad\leq\lim_{k\to\infty} \bigg(\sup_{t\in[k,\infty)} \mathbb {E}\left [1\wedge\Big(x D_{t} h^{\varepsilon _{k}}_{t} + \int_{0}^{t} D_{u} h^{\varepsilon _{k}}_{u} f(Z_{u}) \,\mathrm {d}u\Big)\right ]\bigg)\\ &\quad\leq\lim_{k\to\infty}2\delta_{k} = 0. \end{aligned} $$

Above, the first inequality holds because of (9.16) and the second by (9.17) and the fact that for any random variable \(Y\), \(\mathbb {E}[1\wedge Y] \leq\delta+ \mathbb {P}[1\wedge Y > \delta]\). The last equality follows by the construction of \(\delta_{k}\). Recall that \(T_{n_{k}}\) was chosen so that \(\lim_{k \to\infty} (k / T_{n_{k}}) = 0\); so it follows that

$$\begin{aligned} &\limsup_{k \to\infty} \left ( \frac{1}{T_{n_{k}}} \int_{0}^{T_{n_{k}}} \mathbb {E}^{\mathbb {Q}} \left [ 1 \wedge| \chi^{\varepsilon _{k},x}_{t} - \chi^{x}_{t}| \right ] \,\mathrm {d}t \right )\\ &\quad\leq\limsup_{k\to\infty}\bigg(\frac{k}{T_{n_{k}}} + \frac {T_{n_{k}}-k}{T_{n_{k}}}\sup_{t \in[k, \infty)} \mathbb {E}^{\mathbb {Q}} \left [ 1 \wedge|\chi^{\varepsilon _{k},x}_{t} - \chi^{x}_{t}| \right ]\bigg) \\ &\quad= 0, \end{aligned}$$

which in view of (9.15) implies (9.14), finishing the proof.
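The elementary bound \(\mathbb {E}[1\wedge Y] \leq\delta+ \mathbb {P}[1\wedge Y > \delta]\) invoked above admits a one-line verification, which we record for completeness: splitting the expectation over the event \(\{1\wedge Y\leq\delta\}\) and its complement,

$$ \mathbb {E}[1\wedge Y] = \mathbb {E}\big[(1\wedge Y)1_{\{1\wedge Y \leq\delta\}}\big] + \mathbb {E}\big[(1\wedge Y)1_{\{1\wedge Y > \delta\}}\big] \leq\delta+ \mathbb {P}[1\wedge Y > \delta], $$

since \(1\wedge Y\leq\delta\) on the first event and \(1\wedge Y\leq1\) on the second.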

It remains to show (9.17). Since \(1\wedge(a+b)\leq(1\wedge a) + (1\wedge b)\) for any \(a,b>0\), the two terms appearing inside the probability in (9.17) can be treated separately. Let \(\delta_{k} > 0\). First, we have

$$ \mathbb {P}[1\wedge x D_{t} h^{\varepsilon _{k}}_{t} > \delta_{k}] \leq \mathbb {P}[x D_{t} h^{\varepsilon _{k}}_{t} > \delta_{k}] = \mathbb {P}[x D_{t} e^{\kappa t} > \delta_{k} e^{\kappa t}/h^{\varepsilon _{k}}_{t}]. $$

Now, \(h^{\varepsilon _{k}}_{t} \leq e^{\varepsilon _{k} t/2}\), so that for \(t\geq k\), \(e^{\kappa t}/h^{\varepsilon _{k}}_{t} \geq e^{(\kappa-\varepsilon _{k}/2)t} \geq e^{(\kappa -\varepsilon _{k}/2)k}\) since \(\varepsilon _{k} /2 < \kappa\). So, for any \(\delta_{k} > e^{-(\kappa-\varepsilon _{k}/2)(k/2)}\), it follows that

$$ \mathbb {P}[x D_{t} h^{\varepsilon _{k}}_{t} > \delta_{k}] \leq \mathbb {P}\big[ x D_{t} e^{\kappa t} \geq e^{(\kappa-\varepsilon _{k}/2)(k/2)}\big]. $$

Set \(\tilde{\delta}_{k} :=\sup_{t\geq k} \mathbb {P}[x D_{t} e^{\kappa t} \geq e^{(\kappa-\varepsilon _{k}/2)(k/2)}]\). Since \(D_{t} e^{\kappa t}\) goes to 0 in ℙ-probability as \(t\to\infty\), it follows that \(\tilde{\delta}_{k} \rightarrow0\). Thus, taking \(\delta_{k}\) to be the maximum of \(\tilde{\delta}_{k}\) and \(e^{-(\kappa-\varepsilon _{k}/2)(k/2)}\), it follows that

$$ \mathbb {P}[1\wedge x D_{t} h^{\varepsilon _{k}}_{t} > \delta_{k}] \leq\delta_{k}. $$
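For completeness, the subadditivity \(1\wedge(a+b)\leq 1\wedge a + 1\wedge b\) used at the start of this argument can be verified directly:

$$ 1\wedge(a+b) = \begin{cases} a+b = (1\wedge a) + (1\wedge b), & a+b\leq1,\\ 1 \leq(1\wedge a) + (1\wedge b), & a+b>1, \end{cases} $$

where in the second case, if \(a\geq1\) or \(b\geq1\), the right-hand side is at least 1, and otherwise \((1\wedge a)+(1\wedge b) = a+b>1\).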

Turning to the second term in (9.17), it is clear that

$$ 1\wedge\int_{0}^{t} D_{u} h^{\varepsilon _{k}}_{u} f(Z_{u})\,\mathrm {d}u \leq1 \wedge\int _{0}^{\infty}D_{u} h^{\varepsilon _{k}}_{u} f(Z_{u})\,\mathrm {d}u. $$

As shown in the proof of Lemma 9.3, \(\int_{0}^{\infty}D_{u} h^{\varepsilon _{k}}_{u}f(Z_{u})\,\mathrm {d}u\) goes to 0 almost surely as \(k\to\infty\). Thus by the bounded convergence theorem, \(\mathbb {E}[1\wedge\int_{0}^{\infty}D_{u} h^{\varepsilon _{k}}_{u} f(Z_{u})\,\mathrm {d}u] \rightarrow0\) as \(k\to\infty\). Since

$$ \mathbb {P}\left [1\wedge\int_{0}^{\infty}D_{u} h^{\varepsilon _{k}}_{u} f(Z_{u})\,\mathrm {d}u > \delta _{k}\right ] \leq\frac{1}{\delta_{k}}\mathbb {E}\left [1\wedge\int_{0}^{\infty}D_{u} h^{\varepsilon _{k}}_{u} f(Z_{u})\,\mathrm {d}u\right ], $$

upon defining \(\delta_{k} :=\sqrt{\mathbb {E}[1\wedge\int_{0}^{\infty}D_{u} h^{\varepsilon _{k}}_{u} f(Z_{u})\,\mathrm {d}u]}\), it follows that

$$ \mathbb {P}\left [1\wedge\int_{0}^{\infty}D_{u} h^{\varepsilon _{k}}_{u} f(Z_{u})\,\mathrm {d}u > \delta _{k}\right ] \leq\delta_{k}, $$

and \(\delta_{k} \rightarrow0\). This establishes (9.17) and concludes the proof: to combine the two terms, take \(\delta_{k}\) to be twice the maximum of the two sequences constructed above for the individual terms. □

The next lemma extends the convergence in Lemma 9.4 from \(f\in C^{2}(E;\mathbb {R}_{+})\) to all \(f\in \mathbb {L}^{1}(E,p)\).

Lemma 9.5

Let Assumption  2.7 hold. Then for all \(x > 0\) and all \(h\in\mathcal{C}\),

$$ \mathbb {Q}\text{-}\lim_{T\to\infty} \frac{1}{T}\int_{0}^{T} h(\zeta _{t},\chi ^{x}_{t})\,\mathrm {d}t = \int_{F}h\,\mathrm {d}\pi. $$
(9.18)

Proof

Since \(p\) is tight in \(E\), by mollifying \(f\) we obtain a sequence of functions \(f^{n}\in C^{2}(E)\cap \mathbb {L}^{1}(E,p)\) with \(f^{n}\geq0\) such that

$$ \int_{E} |f^{n}(z)-f(z)|p(z)\,\mathrm {d}z \leq n^{-2}2^{-n}. $$

Note that

$$ \begin{aligned} \mathbb {E}\left [\int_{0}^{\infty}ne^{-t/n}|f^{n}(Z_{t})-f(Z_{t})|\,\mathrm {d}t\right ] &= \int _{0}^{\infty}ne^{-t/n}\mathbb {E}[|f^{n}(Z_{t})-f(Z_{t})|]\,\mathrm {d}t\\ &= \int_{0}^{\infty}ne^{-t/n}\left(\int_{E} |f^{n}(z)-f(z)|p(z)\,\mathrm {d}z\right)\,\mathrm {d}t\\ &\leq\int_{0}^{\infty}n^{-1}e^{-t/n}2^{-n}\,\mathrm {d}t\\ &= 2^{-n}. \end{aligned} $$

Thus, by the Borel–Cantelli lemma, it follows that ℙ-almost surely

$$ \lim_{n\rightarrow\infty} \int_{0}^{\infty}ne^{-t/n}|f^{n}(Z_{t})-f(Z_{t})|\,\mathrm {d}t = 0. $$
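In more detail (a routine step we spell out), write \(I_{n} := \int_{0}^{\infty}ne^{-t/n}|f^{n}(Z_{t})-f(Z_{t})|\,\mathrm {d}t\). Markov's inequality and the bound \(\mathbb {E}[I_{n}]\leq2^{-n}\) give

$$ \sum_{n=1}^{\infty}\mathbb {P}[I_{n} > n^{-1}] \leq\sum_{n=1}^{\infty}n\,\mathbb {E}[I_{n}] \leq\sum_{n=1}^{\infty}n2^{-n} < \infty, $$

so by the Borel–Cantelli lemma, ℙ-almost surely \(I_{n}\leq n^{-1}\) for all but finitely many \(n\), which yields the stated limit.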

For \(n>1/\kappa\), with \(\kappa\) from Assumption 2.7, let \(A_{n} :=n^{-1}\sup_{t \in \mathbb {R}_{+}}(e^{t/n}D_{t})\). Note that we have \(\lim _{n\rightarrow\infty} A_{n} = 0\) almost surely since for each \(\delta>0\), we can find a ℙ-almost surely finite random variable \(T = T(\delta)\) so that \(D_{t} \leq\delta e^{-\kappa t}\) for \(t\geq T\), and hence, as \(1/n<\kappa\),

$$ A_{n} = \frac{1}{n}\sup_{t\in \mathbb {R}_{+}}(e^{t/n}D_{t}) \leq\frac{1}{n}e^{T/n}\sup _{t\leq T} D_{t} + \frac{\delta}{n}. $$

Since

$$ \int_{0}^{\infty}D_{t}|f^{n}(Z_{t})-f(Z_{t})|\,\mathrm {d}t \leq A_{n}\int_{0}^{\infty}ne^{-t/n}|f^{n}(Z_{t})-f(Z_{t})|\,\mathrm {d}t, $$

we see that

$$ \lim_{n\rightarrow\infty} \int_{0}^{\infty}D_{t}|f^{n}(Z_{t})-f(Z_{t})|\,\mathrm {d}t = 0\qquad \mathbb {P}\textrm{-a.s.} $$
(9.19)

Thus, with \(X^{n}_{0}:=\int_{0}^{\infty}D_{t} f^{n}(Z_{t})\,\mathrm {d}t\), we get \(\lim _{n\rightarrow\infty} X^{n}_{0} = X_{0}\) almost surely, and hence if \(\pi^{n}\) is the joint distribution of \((Z_{0},X^{n}_{0})\), then \(\pi^{n}\) converges to \(\pi\) weakly as \(n\rightarrow\infty\). Now, on the same probability space as in Lemma 9.1, define

$$ \chi^{x,n}_{t}:=\varDelta_{t}\bigg(x + \int_{0}^{t} \varDelta_{u}^{-1}f^{n}(\zeta _{u})\,\mathrm {d}u\bigg),\quad t\geq0. $$

Note that

$$ |\chi^{x,n}_{t}-\chi^{x}_{t}| \leq\varDelta_{t}\int_{0}^{t} \varDelta_{u}^{-1}|f^{n}(\zeta _{u})-f(\zeta _{u})|\,\mathrm {d}u, \qquad\forall\ t\geq0, $$

and by construction the law of the process on the right-hand side above under ℚ is the same as the law of \(\int_{0}^{\cdot}D_{u} \left |f^{n}(Z_{u})-f(Z_{u})\right|\,\mathrm {d}u\) under ℙ. It thus follows that for \(\delta> 0\),

$$ \sup_{t \in \mathbb {R}_{+}}\mathbb {Q}[|\chi^{x,n}_{t}-\chi^{x}_{t}| > \delta] \leq \mathbb {P}\left [\int_{0}^{\infty}D_{u}|f^{n}(Z_{u})-f(Z_{u})|\,\mathrm {d}u > \delta\right ] =: \phi^{n}(\delta). $$

By (9.19), we can find a nonnegative sequence \(\delta _{n}\rightarrow0\) with \(\lim_{n\rightarrow\infty}\phi^{n}(\delta_{n}) = 0\); for instance, pick integers \(n_{1}<n_{2}<\cdots\) with \(\phi^{n}(1/j)\leq1/j\) for all \(n\geq n_{j}\), and set \(\delta_{n} := 1/j\) for \(n_{j}\leq n< n_{j+1}\). Now, for \(h\in\mathcal{C}\), we have almost surely for \(t\geq0\) that

$$ |h(\zeta _{t},\chi^{x,n}_{t}) - h(\zeta _{t},\chi^{x}_{t})| \leq K (1\wedge|\chi ^{x,n}_{t} - \chi^{x}_{t}|). $$

Therefore, with \(\hat{\pi}^{x,n}_{T}\) denoting the empirical law of \((\zeta ,\chi^{n,x})\), we have

$$ \mathbb {E}^{\mathbb {Q}}\big[| \langle h , \hat{\pi}^{x,n}_{T} \rangle - \langle h , \hat{\pi}^{x}_{T} \rangle |\big] \leq\frac{K}{T}\int_{0}^{T}\mathbb {E}^{\mathbb {Q}}\big[1\wedge|\chi^{x,n}_{t} - \chi^{x}_{t}|\big]\,\mathrm {d}t. $$

Since for any \(0<\delta< 1\) and random variable \(Y\), we have \(\mathbb {E}[1\wedge|Y|] \leq\delta+ \mathbb {P}[|Y|>\delta]\), it follows that for any \(n\),

$$ \sup_{T\in \mathbb {R}_{+}}\mathbb {E}^{\mathbb {Q}}\big[| \langle h , \hat{\pi }^{x,n}_{T} \rangle - \langle h , \hat{\pi}^{x}_{T} \rangle |\big] \leq K\big(\phi^{n}(\delta) + \delta\big), $$

and hence for the given sequence \((\delta_{n})\) that

$$ \limsup_{n\rightarrow\infty} \sup_{T\in \mathbb {R}_{+}}\mathbb {E}^{\mathbb {Q}}\big[| \langle h , \hat{\pi}^{x,n}_{T} \rangle - \langle h , \hat{\pi}^{x}_{T} \rangle |\big] \leq\limsup_{n\rightarrow\infty} K\big(\phi^{n}(\delta_{n}) + \delta_{n}\big) = 0. $$
(9.20)

Now fix a sequence \((T_{k})\) such that \(\lim_{k\rightarrow\infty} T_{k} = \infty\). Since Lemma 9.4 implies for each \(n\) that \(\mathbb {Q}\text{-}\lim_{T\rightarrow\infty} | \langle h , \hat{\pi }^{x,n}_{T} \rangle - \langle h , \pi^{n} \rangle | = 0\), we can find a \(T_{k_{n}}\) so that

$$ \mathbb {Q}\left [| \langle h , \hat{\pi}^{x,n}_{T_{k_{n}}} \rangle - \langle h , \pi ^{n} \rangle | > \frac{1}{n}\right ] < \frac{1}{n}. $$

It thus follows that

$$ \mathbb {Q}\text{-$\lim_{n\rightarrow\infty}$} | \langle h , \hat{\pi }^{x,n}_{T_{k_{n}}} \rangle - \langle h , \pi^{n} \rangle | = 0. $$

Since \(\lim_{n\rightarrow\infty} | \langle h , \pi^{n} \rangle - \langle h , \pi \rangle | = 0\), it follows by (9.20) for each \(\gamma> 0\) that

$$ \begin{aligned} &\mathbb {Q}\big[| \langle h , \hat{\pi}^{x}_{T_{k_{n}}} \rangle - \langle h , \pi \rangle | > \gamma\big]\\ &\quad \leq \mathbb {Q}\bigg[| \langle h , \hat{\pi}^{x}_{T_{k_{n}}} \rangle - \langle h , \hat{\pi}^{x,n}_{T_{k_{n}}} \rangle | > \frac{\gamma}{3}\bigg] + \mathbb {Q}\bigg[| \langle h , \hat{\pi}^{x,n}_{T_{k_{n}}} \rangle - \langle h , \pi^{n} \rangle | > \frac {\gamma}{3}\bigg]\\ &\qquad{}+ 1_{\{| \langle h , \pi^{n} \rangle - \langle h , \pi \rangle | > \frac {\gamma}{3}\}}\\ &\quad \leq\frac{3}{\gamma}\sup_{T\in \mathbb {R}_{+}}\mathbb {E}^{\mathbb {Q}}\big[|\big\langle h , \hat{\pi}^{x}_{T} \big\rangle - \langle h , \hat{\pi}^{x,n}_{T} \rangle |\big] + \mathbb {Q}\bigg[| \langle h , \hat{\pi}^{x,n}_{T_{k_{n}}} \rangle - \langle h , \pi^{n} \rangle | > \frac {\gamma}{3}\bigg]\\ &\qquad{} + 1_{\{| \langle h , \pi^{n} \rangle - \langle h , \pi \rangle | > \frac {\gamma}{3}\}}\\ &\quad \longrightarrow 0\quad \textrm{ as } n\rightarrow\infty. \end{aligned} $$

We have just shown that every sequence \(( \langle h , \hat{\pi }^{x}_{T_{k}} \rangle )\) admits a subsequence \(( \langle h , \hat{\pi }^{x}_{T_{k_{n}}} \rangle )\) converging in ℚ-probability to \(\langle h , \pi \rangle \); by the subsequence criterion for convergence in probability, \(( \langle h , \hat{\pi}^{x}_{T} \rangle )\) itself converges in ℚ-probability to \(\langle h , \pi \rangle \) as \(T\to\infty\), proving (9.18). □

The next lemma strengthens the convergence in Lemma 9.5 to almost sure convergence under ℚ, albeit only for \(\pi\)-almost every \(x>0\) and for \(h\in\mathcal{C}\) from (9.10).

Lemma 9.6

Let Assumption  2.7 hold. Then for all \(h\in\mathcal {C}\) and \(\pi\)-almost every \(x>0\),

$$ \lim_{T\to\infty}\frac{1}{T}\int_{0}^{T} h(\zeta _{t},\chi^{x}_{t})\,\mathrm {d}t = \int_{F} h \,\mathrm {d}\pi,\qquad \mathbb {Q}\textrm{-a.s.} $$
(9.21)

Proof

We again use the notation in (9.12). Recall \(\chi\) from Lemma 9.1 and define \(\hat{\pi}_{T}\) as the empirical law of \((\zeta ,\chi)\) on \([0,T]\). Given that \((\zeta ,\chi )\) is stationary under ℚ, the ergodic theorem implies for all bounded measurable functions \(h\) on \(F\) that there is a random variable \(Y\) such that

$$ \lim_{T\to\infty} \langle h , \hat{\pi}_{T} \rangle = Y\qquad \mathbb {Q}\textrm{-a.s.} $$
(9.22)

By Lemma 9.5, it holds that for \(h\in \mathcal{C}\), \(Y= \langle h , \pi \rangle \) with ℚ-probability one. Indeed, let \(\delta> 0\) and note that

$$ \begin{aligned} \mathbb {Q}[|Y - \langle h , \pi \rangle |\geq\delta] &\leq \mathbb {Q}[|Y- \langle h , \hat{\pi}_{T} \rangle | + | \langle h , \hat{\pi}_{T} \rangle - \langle h , \pi \rangle | \geq\delta]\\ &\leq \mathbb {Q}\left [|Y- \langle h , \hat{\pi}_{T} \rangle | \geq\frac{\delta}{2}\right ] + \mathbb {Q}\left [| \langle h , \hat{\pi}_{T} \rangle - \langle h , \pi \rangle | \geq\frac {\delta}{2}\right ]. \end{aligned} $$

The first of these two terms goes to 0 by (9.22). As for the second, denote by \(\pi|_{x}\) the marginal of \(\pi\) in its second coordinate, i.e., the law of \(\chi_{0}\) under ℚ. Then

$$ \mathbb {Q}\left [| \langle h , \hat{\pi}_{T} \rangle - \langle h , \pi \rangle | \geq\frac {\delta}{2}\right ] = \int_{0}^{\infty}\pi|_{x}(\mathrm {d}x)\mathbb {Q}\left [| \langle h , \hat{\pi}^{x}_{T} \rangle - \langle h , \pi \rangle | \geq\frac{\delta}{2}\right ]. $$

By Lemma 9.5, the integrand goes to 0 as \(T\to \infty\) for all \(x > 0\), and thus the result follows by the bounded convergence theorem. Next, we have

$$ 1 = \mathbb {Q}\Big[\lim_{T\to\infty} \langle h , \hat{\pi}_{T} \rangle = \langle h , \pi \rangle \Big] = \int_{0}^{\infty}\pi|_{x}(\mathrm {d}x)\mathbb {Q}\Big[\lim _{T\to\infty} \langle h , \hat{\pi}^{x}_{T} \rangle = \langle h , \pi \rangle \Big], $$

and thus (9.21) holds for \(\pi\)-a.e. \(x>0\), finishing the proof. □

The last preparatory lemma strengthens Lemma 9.6 to show almost sure convergence for all starting points \(x>0\), not just \(\pi\)-almost every \(x>0\).

Lemma 9.7

Let Assumption  2.7 hold. Then for all \(h\in\mathcal{C}\) and all \(x>0\),

$$ \lim_{T\to\infty}\frac{1}{T}\int_{0}^{T} h(\zeta _{t},\chi^{x}_{t})\,\mathrm {d}t = \int_{F} h \,\mathrm {d}\pi\qquad \mathbb {Q}\textrm{-a.s.} $$
(9.23)

Proof

Recall from Remark 3.5 that \(\chi^{x}\) takes the form

$$ \chi^{x}_{t} = \varDelta_{t}\left(x + \int_{0}^{t} \frac{1}{\varDelta_{u}}f(\zeta _{u})\,\mathrm {d}u\right),\qquad t\geq0. $$
(9.24)

Let \(h\in\mathcal{C}\). By Lemma 9.6, there is some \(x_{0} > 0\) such that (9.23) holds with \(x=x_{0}\). Using the notation in (9.12) and (9.24), it easily follows for any \(x>0\) that

$$ \begin{aligned} | \langle h , \hat{\pi}^{x}_{T} \rangle - \langle h , \hat{\pi}^{x_{0}}_{T} \rangle | &\leq \frac{1}{T}\int_{0}^{T} |h(\zeta _{t},\chi^{x}_{t}) - h(\zeta _{t},\chi^{x_{0}}_{t})|\,\mathrm {d}t \\ &\leq\frac{K}{T}\int_{0}^{T}(1\wedge|\chi^{x}_{t}-\chi^{x_{0}}_{t}|)\,\mathrm {d}t\\ & = \frac{K}{T}\int_{0}^{T} (1\wedge\varDelta_{t}|x-x_{0}|)\,\mathrm {d}t \leq\frac {K|x-x_{0}|}{T}\int_{0}^{\infty}\varDelta_{t} \,\mathrm {d}t. \end{aligned} $$

We show below that \(\mathbb {Q}[\int_{0}^{\infty}\varDelta_{t}\,\mathrm {d}t < \infty] = 1\). Admitting this, it holds that ℚ-almost surely, \(\lim_{T\to\infty } | \langle h , \hat{\pi}^{x}_{T} \rangle - \langle h , \hat{\pi}^{x_{0}}_{T} \rangle | = 0\), and hence the result follows since (9.23) holds for \(x_{0}\).

It remains to prove that \(\mathbb {Q}[\int_{0}^{\infty}\varDelta_{t} \,\mathrm {d}t < \infty] = 1\). By way of contradiction, assume there is some \(0<\delta\leq1\) so that \(\mathbb {Q}[\int_{0}^{\infty}\varDelta_{t} \,\mathrm {d}t = \infty] = \delta\). Then, for each \(N\), it holds that \(\mathbb {Q}[\int_{0}^{\infty}\varDelta_{t} \,\mathrm {d}t > N]\geq\delta\), which in turn implies \(\lim_{T\to\infty} \mathbb {Q}[\int _{0}^{T} \varDelta_{t} \,\mathrm {d}t > N] \geq\delta\). By construction, for any fixed \(T>0\), the law of \(\varDelta\) on \([0,T]\) under ℚ coincides with the law of \(D\) on \([0,T]\) under ℙ. It thus holds that \(\lim_{T\to \infty} \mathbb {P}[\int_{0}^{T} D_{t} \,\mathrm {d}t > N] \geq\delta\). But this gives \(\mathbb {P}[\int_{0}^{\infty}D_{t} \,\mathrm {d}t > N] \geq\delta\) for all \(N\) and hence \(\mathbb {P}[\int_{0}^{\infty}D_{t} \,\mathrm {d}t = \infty] > 0\). This violates Assumption 2.7: since \(\lim_{t\to\infty} e^{\kappa t}D_{t} = 0\) ℙ-almost surely for some \(\kappa>0\), we have \(\int_{0}^{\infty}D_{t}\,\mathrm {d}t < \infty\) ℙ-almost surely. Thus \(\mathbb {Q}[\int _{0}^{\infty}\varDelta_{t} \,\mathrm {d}t < \infty] = 1\), finishing the proof. □

With all the above lemmas, the proof of Theorem 3.4 is now given.

Proof of Theorem 3.4

We again adopt the notation in (9.12). In view of Lemma 9.1, the remaining statement in Theorem 3.4 which must be proved is that there is a set \(\varOmega_{0}\in \mathcal {F}_{\infty}\) with \(\mathbb {Q}[\varOmega_{0}] = 1\) such that (3.13) holds, i.e.,

$$ \omega\in\varOmega_{0}\qquad \Longrightarrow \qquad \lim_{T\to\infty} \langle h , \hat{\pi}^{x}_{T} \rangle (\omega) = \langle h , \pi \rangle ,\quad \textrm {for all}\ x>0, h\in C_{b}(F;\mathbb {R}). $$

Recall the definition of \(\mathcal{C}\) from (9.10) and let \(h\in C_{b}(F;\mathbb {R})\cap\mathcal{C}\). In view of Lemma 9.7, there is a set \(\varOmega_{h}\in \mathcal {F}_{\infty}\) such that \(\mathbb {Q}[\varOmega_{h}] = 1\) and

$$ \omega\in\varOmega_{h}\qquad \Longrightarrow \qquad \lim_{T\to\infty} \langle h , \hat{\pi}^{x}_{T} \rangle (\omega) = \langle h , \pi \rangle ,\quad \textrm {for all}\ x>0. $$

Let the (countable) subset \(\tilde{\mathcal{C}}\subset\mathcal{C}\) be as in the technical Lemma A.2 below and set \(\varOmega _{0} = \bigcap_{h\in\tilde{\mathcal{C}}} \varOmega_{h}\). Clearly, \(\mathbb {Q}[\varOmega_{0}] = 1\). Let \(\omega\in\varOmega_{0}\) and \(h\in C_{b}(F;\mathbb {R})\) with \(C=\sup_{y\in F}|h(y)|\). Let \(\varepsilon > 0\), and for \(n\geq5\) take \({}^{\uparrow}\phi^{n}_{m,k}\), \({}^{\downarrow}\phi^{n}_{m,k}\) and \(\theta^{n}\) as in Lemma A.2 such that (A.7) holds. In what follows, the argument \(\omega\) is suppressed, but all evaluations are understood to hold for this \(\omega\).

Let \(x>0\). With \(\nu\) from (A.7) equal to \(\hat{\pi }^{x}_{T}\), it follows that

$$\begin{aligned} \big\langle {}^{\uparrow}\phi^{n}_{m,k} , \hat{\pi}^{x}_{T} \big\rangle - 2C \langle 1-\theta ^{n-4} , \hat{\pi}^{x}_{T} \rangle - 2\varepsilon &\leq \langle h , \hat{\pi}^{x}_{T} \rangle \\ &\leq \big\langle {}^{\downarrow}\phi^{n}_{m,k} , \hat{\pi}^{x}_{T} \big\rangle + 2C \langle 1-\theta^{n-4} , \hat{\pi}^{x}_{T} \rangle + 2\varepsilon . \end{aligned}$$

With \(\nu\) from (A.7) equal to \(\pi\), one obtains

$$ \big\langle {}^{\uparrow}\phi^{n}_{m,k} , \pi \big\rangle - 2C \langle 1-\theta^{n-4} , \pi \rangle - 2\varepsilon \leq \langle h , \pi \rangle \leq \big\langle {}^{\downarrow}\phi ^{n}_{m,k} , \pi \big\rangle + 2C \langle 1-\theta^{n-4} , \pi \rangle + 2\varepsilon . $$

Putting these two together yields

$$ \begin{aligned} \langle h , \hat{\pi}^{x}_{T} \rangle - \langle h , \pi \rangle &\geq \big\langle {}^{\uparrow}\phi^{n}_{m,k} , \hat{\pi}^{x}_{T} \big\rangle - 2C \langle 1-\theta ^{n-4} , \hat{\pi}^{x}_{T} \rangle - 2\varepsilon \\ &\quad{} - \big(\big\langle {}^{\downarrow}\phi^{n}_{m,k} , \pi \big\rangle + 2C \langle 1-\theta^{n-4} , \pi \rangle + 2\varepsilon \big)\\ &= \big\langle {}^{\uparrow}\phi^{n}_{m,k} , \hat{\pi}^{x}_{T} \big\rangle - \big\langle {}^{\downarrow}\phi^{n}_{m,k} , \pi \big\rangle \\ &\quad{} - 2C\big( \langle 1-\theta^{n-4} , \hat{\pi}^{x}_{T} \rangle + \langle 1-\theta^{n-4} , \pi \rangle \big) - 4\varepsilon . \end{aligned} $$

Since \(\theta^{n-4},{}^{\uparrow}\phi^{n}_{m,k}, {}^{\downarrow}\phi^{n}_{m,k} \in\tilde{\mathcal{C}}\subset\mathcal{C}\), taking \(T\to\infty\) gives

$$ \liminf_{T\to\infty} \langle h , \hat{\pi}^{x}_{T} \rangle - \langle h , \pi \rangle \geq \big\langle {}^{\uparrow}\phi^{n}_{m,k} , \pi \big\rangle - \big\langle {}^{\downarrow}\phi ^{n}_{m,k} , \pi \big\rangle - 4C \langle 1-\theta^{n-4} , \pi \rangle - 4\varepsilon . $$

Now by Lemma A.2, for fixed \(m,n\), the functions \({}^{\uparrow}\phi^{n}_{m,k}\) and \({}^{\downarrow}\phi^{n}_{m,k}\) are increasing and decreasing, respectively, in \(k\), and are such that both a) \(\lim_{k\to\infty} \big({}^{\downarrow}\phi^{n}_{m,k}(y) - {}^{\uparrow}\phi ^{n}_{m,k}(y)\big) = 0\) for \(y\in\bar{F}_{n-2}\), and b) \(|{}^{\downarrow}\phi ^{n}_{m,k}(y)-{}^{\uparrow}\phi^{n}_{m,k}(y)| \leq2C + 2\varepsilon \) for all \(y\in F\) and \(n,m,k\). Therefore, taking \(k\to\infty\) in the above and using the monotone convergence theorem, we obtain

$$ \liminf_{T\to\infty} \langle h , \hat{\pi}^{x}_{T} \rangle - \langle h , \pi \rangle \geq-2(C+\varepsilon )\pi [\bar{F}_{n-2}^{c}] - 4C \langle 1-\theta ^{n-4} , \pi \rangle - 4\varepsilon . $$

From Lemma A.2, we know that \(0\leq\theta ^{n}(y)\leq1\) and \(\lim_{n\to\infty}\theta^{n}(y) = 1\) for all \(y\in F\). Thus, by the bounded convergence theorem and the fact that \(\pi\) is tight in \(F\), it follows by taking \(n\uparrow\infty\) that

$$ \liminf_{T\to\infty} \langle h , \hat{\pi}^{x}_{T} \rangle - \langle h , \pi \rangle \geq- 4\varepsilon . $$

Taking \(\varepsilon \downarrow0\) gives that \(\liminf_{T\to\infty} \langle h , \hat{\pi}^{x}_{T} \rangle - \langle h , \pi \rangle \geq0\). Thus, we have just shown for \(\omega\in\varOmega_{0}\), \(x>0\) and \(h\in C_{b}(F;\mathbb {R})\) that

$$ \liminf_{T\to\infty} \langle h , \hat{\pi}^{x}_{T} \rangle (\omega) - \langle h , \pi \rangle \geq0. $$

By applying the above to \(\hat{h} = -h \in C_{b}(F;\mathbb {R})\), we see that

$$ \limsup_{T\to\infty} \langle h , \hat{\pi}^{x}_{T} \rangle (\omega) - \langle h , \pi \rangle \leq0, $$

which finishes the proof. □
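The ergodic identification just proved lends itself to a simple numerical check. The following minimal sketch is not from the paper; all model choices are illustrative assumptions: a one-dimensional Ornstein–Uhlenbeck factor \(\mathrm{d}Z = -Z\,\mathrm{d}t + \mathrm{d}W\), a smooth discount \(D_{t} = e^{-rt}\) with \(r = 1/2\) (so Assumption 2.7 holds for any \(\kappa < r\)), cash flow \(f(z) = 1 + z^{2}\), and the bounded test function \(h(z,x) = x/(1+x)\). It compares the time average \(\frac{1}{T}\int_{0}^{T} h(\zeta_{t},\chi^{x}_{t})\,\mathrm{d}t\), simulated via the forward dynamics \(\mathrm{d}\chi = (-r\chi + f(\zeta))\,\mathrm{d}t\) implied by (9.24) when \(\varDelta_{t} = e^{-rt}\), with a direct Monte Carlo estimate of \(\langle h,\pi\rangle\):

```python
import numpy as np

rng = np.random.default_rng(0)

r, dt = 0.5, 0.01
sq = np.sqrt(dt)
f = lambda z: 1.0 + z * z           # illustrative cash-flow rate f
h = lambda z, x: x / (1.0 + x)      # a bounded test function h in C_b(F;R)

# (a) Time average of h(zeta, chi) along one long stationary path.
# chi solves d(chi) = (-r*chi + f(zeta)) dt, the differential form of
# (9.24) for Delta_t = exp(-r t).
T = 2000.0
z = rng.normal(0.0, np.sqrt(0.5))   # stationary N(0, 1/2) start for the OU factor
x = f(0.0) / r                      # any x > 0 works; the limit is start-independent
acc = 0.0
for _ in range(int(T / dt)):
    acc += h(z, x) * dt
    x += (-r * x + f(z)) * dt
    z += -z * dt + sq * rng.normal()
time_avg = acc / T

# (b) Direct Monte Carlo of <h, pi>, where pi is the law of (Z_0, X_0)
# and the perpetuity integral is truncated at T0 (e^{-r*T0} is negligible).
N, T0 = 2000, 30.0
z0 = rng.normal(0.0, np.sqrt(0.5), size=N)
zz = z0.copy()
x0 = np.zeros(N)
for i in range(int(T0 / dt)):
    x0 += np.exp(-r * i * dt) * f(zz) * dt
    zz += -zz * dt + sq * rng.normal(size=N)
mc_avg = float(np.mean(h(z0, x0)))

print(time_avg, mc_avg)             # the two estimates should nearly agree
```

With the step size and horizons above, the two estimates typically agree to within a few percent; the residual gap is Monte Carlo noise plus the \(O(\mathrm{d}t)\) Euler discretization bias, not a failure of the ergodic limit.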