On Existence and Uniqueness of Stationary Distributions for Stochastic Delay Differential Equations with Positivity Constraints ∗

Deterministic dynamic models with delayed feedback and state constraints arise in a variety of applications in science and engineering. There is interest in understanding what effect noise has on the behavior of such models. Here we consider a multidimensional stochastic delay differential equation with normal reflection as a noisy analogue of a deterministic system with delayed feedback and positivity constraints. We obtain sufﬁcient conditions for existence and uniqueness of stationary distributions for such equations. The results are applied to an example from Internet rate control and a simple biochemical reaction system.


Overview
Dynamical system models with delay are used in a variety of applications in science and engineering where the dynamics are subject to propagation delay. Examples of such application domains include packet level models of Internet rate control where the finiteness of transmission times leads to delay in receipt of congestion signals or prices [25; 37], neuronal models where the spatial distribution of neurons can result in delayed dynamics, epidemiological models where incubation periods result in delayed transmission of disease [5], and biochemical reactions in gene regulation where lengthy transcription and translation operations have been modeled with delayed dynamics [1; 4; 21]. There is an extensive literature, both theoretical and applied on ordinary delay differential equations. The book [13] by Hale and Lunel provides an introduction to this vast subject.
In some applications, the quantities of interest are naturally positive. For instance, rates and prices in Internet models are positive, concentrations of ions or chemical species and proportions of a population that are infected are all naturally positive quantities. In deterministic differential equation models for the delayed dynamics of such quantities, the dynamics may naturally keep the quantities positive or they may need to be adapted to be so, sometimes leading to piecewise continuous delay differential dynamics, see e.g., [25; 26; 27; 28; 29]. There is some literature, especially applied, on the latter, although less than for unconstrained delay systems or naturally constrained ones.
Frequently in applications, noise is present in a system and it is desirable to understand its effect on the dynamics. For unconstrained systems, one can consider ordinary delay differential equations with an addition to the dynamics in the form of white noise or even a state dependent noise. There is a sizeable literature on such stochastic delay differential equations (SDDE) [2; 7; 11; 15; 19; 20; 22; 23; 30; 34; 35; 36]. To obtain the analogue of such SDDE models with positivity constraints, in general, it is not simply a matter of adding a noise term to the ordinary differential equation dynamics, as this will frequently not lead to a solution respecting the state constraint, especially if the dispersion coefficient depends on a delayed state.
As described above, there is natural motivation for considering stochastic differential equations where all three features, delay, positivity constraints and noise, are present. However, there has been little work on systematically studying such equations. One exception is the work of Kushner (see e.g., [17]), although this focuses on numerical methods for stochastic delay differential equations (including those with state constraints), especially those with bounded state space. We note that the behavior of constrained systems can be quite different from that of unconstrained analogues, e.g., in the deterministic delay equation case, the addition of a positivity constraint can turn an equation with unbounded oscillatory solutions into one with bounded periodic solutions, and in the stochastic delay equation case, transient behavior can be transformed into positive recurrence.
Here we seek conditions for existence and uniqueness of stationary distributions for stochastic delay differential equations with positivity constraints of the form: where X (t) takes values in the closed positive orthant of some Euclidean space, τ ∈ [0, ∞) is the length of the delay period, X s = {X (s + u) : −τ ≤ u ≤ 0} tracks the history of the process over the delay period, W is a standard (multi-dimensional) Brownian motion noise source and the stochastic integral with respect to W is an Itô integral, and Y is a vector-valued non-decreasing process which ensures that the positivity constraints on X are enforced. In particular, the i th component of Y can increase only when the i th component of X is zero. We refer to equations of the form (1) as stochastic delay differential equations with reflection, where the action of Y is termed reflection (at the boundary of the orthant).
This paper is organized as follows. Our assumptions on the coefficients b and σ for well-posedness of (1), the rigorous definition of a solution of (1), and some properties of solutions are given in Section 2.1. Our main result giving sufficient conditions for existence and uniqueness of stationary distributions for (1) is stated in Section 2.2, and some examples of applications of the result are given in Section 2.3. In preparation for Section 3, a useful a priori moment bound on solutions to (1) is given in Section 2.4. Section 3 focuses on establishing sufficient conditions for existence of stationary distributions. A general condition guaranteeing existence is described in Section 3.1. This condition is in terms of uniform moment bounds, and it is fairly standard. Such bounds for second moments are shown to hold in Sections 3.4 and 3.5, under certain conditions on b and σ. The results of Sections 3.1, 3.4 and 3.5 are combined to give sufficient conditions for existence of a stationary distribution in Section 3.6. Our proofs of the moment bounds use stochastic Lyapunov/Razumikhintype arguments applied to suitable functions of an overshoot process which is introduced in Section 3.2. For these arguments, the positive oscillation of a path, which is introduced in Section 3.3, proves to be a useful refinement of the usual notion of oscillation of a path. While our main results in Section 3 are new, we do use some results and adapt some techniques developed by Itô and Nisio [15] and Mao [20] for stochastic delay differential equations without reflection. Conditions for uniqueness of a stationary distribution are given in Section 4. Our proofs in that section are an adaptation of methods developed recently by Hairer, Mattingly and Scheutzow [12] for proving uniqueness of stationary distributions for stochastic delay differential equations without reflection. An important new aspect of the results in [12] is that they enable one to obtain uniqueness of stationary distributions for stochastic delay differential equations when the dispersion coefficient depends on the history of the process over the delay period, in contrast to prior results on uniqueness of stationary distributions for stochastic delay differential equations which often restricted to cases where the dispersion coefficient depended only on the current state X (t) of the process [7; 17; 34; 36], with notable exceptions being [15; 30]. The important feature that distinguishes the results of [12] from those of [15; 30] is that the authors of [12] obtain uniqueness of the stationary distribution without requiring the existence of a unique random fixed point; see Section 4 for further discussion of this point. Appendix A states some well-known facts about reflection, and Appendix B covers some inequalities that appear frequently throughout this work.

Notation and Terminology
We shall use the following notation and terminology throughout this work.
For a real number a, we shall say that a is positive if a ≥ 0 and we shall say that a is strictly positive if a > 0. For each strictly positive integer d, let d denote d-dimensional Euclidean space, and let d + = {v ∈ d : v i ≥ 0 for i = 1, . . . , d} denote the closed positive orthant in d . When d = 1, we suppress the d and write for (−∞, ∞) and + for [0, ∞). For each i = 1, . . . , d, the i th component of a column vector v ∈ d will be denoted by v i . For two vectors u, v ∈ d , the statement u ≥ v will mean that u i ≥ v i for each i = 1, . . . , d. For each r ∈ , define r + = max{r, 0} and r − = max{−r, 0}.
For any real numbers r, s, δ r,s denotes the Kronecker delta, i.e., it is one if r = s and zero otherwise.
Unless specified otherwise, we treat vectors v ∈ d as column vectors, i.e., v = (v 1 , . . . , v d ) . For Euclidean norm of v. Let d×m denote the set of d × m matrices with real entries. For a given matrix A ∈ d×m , A i j denotes the entry of the i th row and the j th column, A i denotes the i th row, and A j denotes the j th column. The notation I d will denote the (d × d)-identity matrix. Given a d × m denotes the Frobenius norm of A.
For any metric space with metric ρ, we use B(x, r) (where x ∈ and r > 0) to denote the open ball { y ∈ : ρ(x, y) < r} of radius r around x, and we use ( ) to denote the associated collection of Borel sets of . The set of bounded, continuous real-valued functions on will be denoted by For any two metric spaces 1 , 2 , let C( 1 , 2 ) denote the set of continuous functions from 1 into 2 . Here, 1 will often be a closed interval F ⊂ (−∞, ∞), and 2 will often be d or d + for various dimensions d.
For any integer d and closed interval I in (−∞, ∞), we endow C(I, d ) and C(I, d + ) with the topologies of uniform convergence on compact intervals in I. These are Polish spaces. In the case of C(I, d + ), we use I to denote the associated Borel σ-algebra. We shall also use the abbreviations I = C(I, + ) and d I = C(I, d + ). For a closed interval I in (−∞, ∞), a 1 ≤ a 2 in I and a path x = (x 1 , . . . , x d ) ∈ C(I, d ) we define the oscillation of x over [a 1 , a 2 ] by the modulus of continuity of x over I by and the supremum norm of x over I by Throughout this work, we fix τ ∈ (0, ∞), which will be referred to as the delay. Define = [−τ, 0] and = [−τ, ∞). As a subset of the vector space C( , d ), d has norm that induces its topology of uniform convergence on compact intervals. The associated Borel σalgebra is . For x ∈ d and t ≥ 0, define x t ∈ d by x t (s) = x(t + s) for all s ∈ . It should be emphasized that x(t) ∈ d + is a point, while x t ∈ d is a continuous function on taking values in d + . For each t ∈ + , we define the projection p t : d → d by p t (x) := x t for each x ∈ d .
By a filtered probability space, we mean a quadruple (Ω, , { t , t ≥ 0}, P), where is a σ-algebra on the outcome space Ω, P is a probability measure on the measurable space (Ω, ), and { t , t ≥ 0} is a filtration of sub-σ-algebras of where the usual conditions are satisfied, i.e., (Ω, , P) is a complete probability space, and for each t ≥ 0, t contains all P-null sets of and t+ := ∩ s>t s = t . Given two σ-finite measures µ, ν on a measurable space (Ω, ), the notation µ ∼ ν will mean that µ and ν are mutually absolutely continuous, i.e., for any Λ ∈ , µ(Λ) = 0 if and only if ν(Λ) = 0. By a continuous process, we mean a process with all paths continuous.

Stochastic Delay Differential Equations with Reflection
In this section, we define our assumptions and the notion of a solution to equation (1) precisely. We state our main result and give some examples of its application. We also derive some useful properties of solutions to (1).

Definition of a Solution
Recall from Section 1.2 that we are fixing a τ ∈ (0, ∞), which will be referred to as the delay, and we define = [−τ, 0], = [−τ, ∞), d = C( , d + ) and d = C( , d + ). Furthermore, we fix positive integers d and m, and functions b : d → d and σ : d → d×m that satisfy the following uniform Lipschitz assumption. Although we do not need as strong an assumption as this for well-posedness of (1), we will use this condition in proving uniqueness of stationary distributions. Accordingly, we shall assume the following condition holds throughout this work.
Remark. A simple consequence of the Lipschitz condition (4) is that there exist strictly positive constants C 1 , C 2 , C 3 and C 4 such that for each x ∈ d , The natural initial condition is an initial segment X 0 = x ∈ d , or more generally, an initial distribution µ on ( d , ), i.e., P(X 0 ∈ Λ) = µ(Λ) for each Λ ∈ .
Remark. As a consequence of condition (i) and the continuity of the paths of X , {X t , t ≥ 0} is adapted to { t , t ≥ 0}, and t → X t (ω) is continuous from + into d for each ω ∈ Ω. It follows that the mapping F : , is progressively measurable, being continuous in t and adapted (see Lemma II.73.10 of [32]). Continuity of σ now implies that t 0 σ(X s )dW (s), t , t ≥ 0 is a continuous d-dimensional local martingale; also, condition (ii) and is a continuous adapted process that is locally of bounded variation. Therefore, {X (t), t ≥ 0} is a continuous semimartingale with respect to { t , t ≥ 0}.
For notational convenience, given a continuous adapted stochastic process {ξ(t), t ≥ −τ} taking values in d + and an m-dimensional Brownian motion W , all defined on some filtered probability space (Ω, , { t }, P), we define For a solution X of the SDDER, X (t) = (X )(t) + Y (t), t ≥ 0, where the regulator term, Y , has the following explicit formula in terms of (X ): for each i = 1, . . . , d, In the notation of Appendix A, X = φ( (X )) and Y = ψ( (X )), because of the uniqueness of solutions to the Skorokhod problem; thus, Y is a function of X (cf. (108)). Then as a consequence of Proposition A.0.1(i), we have the following.
Strong existence and uniqueness of a solution to (1) is a consequence of Assumption 2.1. We state this as a proposition. The proof is fairly standard and so we just sketch it.
solves the Skorokhod problem for (X ) and (X † | + , Y † ) solves the Skorokhod problem for (X † ), then for each T > 0 there exist constants K T , K T ≥ 0 such that for all stopping times η, By truncating the initial condition, we can reduce the proof of existence to the case where E[ ξ 2 ] < ∞. Existence in this case then follows by a standard Picard iteration using (9) (see, e.g., Chapter 10 of [6]). Gronwall's inequality is used to prove uniqueness. Feller continuity follows from the standard argument that given a sequence of deterministic initial conditions {x n } ∞ n=1 such that lim is tight, and any weak limit point is the distribution of the solution X x . The Markov property for the transition functions then follows from the uniqueness of solutions of (1).
Remark. It should be noted that global Lipschitz continuity is more than necessary to have a welldefined and Feller continuous family of Markovian transition functions associated with (1). One can obtain this same conclusion if the coefficients b and σ are continuous and obey (5) and (6), and weak existence and uniqueness in law of solutions to (1) holds.

Main Result
We begin by defining a stationary distribution.
It is well-known that (non-delayed) Ornstein-Uhlenbeck processes have (unique) stationary distributions, and it is not hard to show that the reflected analogues also have stationary distributions. The following condition for delayed systems is motivated by these facts.
, we have Remark. Note that parts (i) and (ii) restrict b i and σ i only on {x ∈ d : x i (0) ≥ M }, and the control on b i is only one-sided. However, b and σ will always be required to satisfy the Lipschitz condition (4), which implies the linear growth bounds (5) and (6). These restrict the growth of b and σ for this growth control on b and |σ| is weaker than the at-most-integral-linear growth imposed by parts (i) and (ii) of the above assumption.
It is well-known that reflected Brownian motion on the half-line with strictly negative drift has a (unique) stationary distribution. The following assumption (which is distinct from Assumption 2.2) is sufficient for a stationary distribution for (1) to exist and the form of this condition is motivated by the aforementioned fact.

Assumption 2.3.
There exist positive constants K u , M , strictly positive constants K d , K σ , and a measurable function : d → d + , such that for each x ∈ d and i = 1, . . . , d, i (x) ∈ x i ( ), and whenever x i (0) ≥ M , we have: Remark. Assumption 2.3 requires b i and |σ i | to be bounded above on the set {x ∈ d : x i (0) ≥ M }, but this does not necessarily imply that they are bounded above on d . Also, note that unlike (iii) of Assumption 2.2, Assumption 2.3 has no restrictions on the size of the constants M , K u , K d , K σ (beyond strict positivity of K d and K σ ).
The following assumption is using in proving uniqueness of a stationary distribution.
Assumption 2.4. The diffusion matrix σσ is uniformly elliptic, i.e., there is a constant a > 0 such that v σ(x)(σ(x)) v ≥ a|v| 2 for all x ∈ d and v ∈ d .
We shall use the following consequence of Assumption 2.4 in our proofs.

Proposition 2.2.1. Under the Lipschitz condition (4) and Assumption 2.4, the dispersion coefficient σ has a continuous bounded right inverse, i.e., there is a constant C σ > 0 and a continuous function
Proof. Under the assumptions of the proposition, since σσ is continuous and uniformly strictly positive definite, it follows by standard arguments that (σσ ) −1 is a well-defined, continuous and bounded function on d . Then σ † := σ (σσ ) −1 is continuous and is a right inverse for σ. The (uniform) boundedness of σ † follows from the fact that where C σ is a bound on the norm of (σσ ) −1 .
Our main result is the following theorem.

Theorem 2.2.1. Under Assumption 2.1, if Assumption 2.4 and either Assumption 2.2 or 2.3 hold, then there exists a unique stationary distribution for the SDDER (1).
Proof. The result follows from Theorems 3.6.1 and 4.3.1 below.

Examples
and where for γ > 1 sufficiently small, by (111) and the Cauchy-Schwarz inequality, we may take and Assumption 2.4 holds because σ is uniformly positive definite when a 0 > 0.
The SDDER associated with this pair (b, σ) is a noisy version of a simple model used in the study of biochemical reaction systems [21]. In this model, a lengthy transcription/translation procedure leads to delayed negative feedback in the deterministic dynamics.
It is straightforward to verify that b, σ satisfy the uniform Lipschitz Assumption 2.
Also, σ is uniformly positive definite and so Assumption 2.4 holds. Therefore by Theorem 2.2.1, the SDDER associated with this (b, σ) has a unique stationary distribution. [37]. In this context, the finiteness of transmission times results in delayed congestion signals and leads to differential dynamics with delayed negative feedback. There is a considerable body of work on obtaining sufficient conditions for stability of equilibrium points (see e.g., [8; 26; 27; 28; 29; 38; 39; 41]) for such models. It is natural to ask about the properties of noisy versions of these deterministic models and in particular to inquire about the existence and uniqueness of stationary distributions for such models. Here, as an illustration, we consider a noisy version of a model studied by Paganini and Wang [26], Peet and Lall [29], Papachristadolou [27] and Papachristadolou, Doyle and Low [28]. This model has d links and d sources. In the deterministic model the dynamics are given by

Example 2.3.3. Deterministic delay differential equations have been used in modeling the dynamics of data transmission rates and prices in Internet congestion control
where the i th component of X (t) represents the price at time t that link i charges for the transmission of a packet through it. The discontinuous driftb is given for each i = 1, . . . , d and The matrix A, which is related to routing in the network model, is assumed to have full row rank and to be such that for each i, A i j > 0 for some j, indicating that each source must use at least one link. The solutions of (15) remain in the positive orthant by the form ofb (for the meaning of a solution with such a discontinuous right hand side, see, e.g., [10]). Various authors (see e.g., [26; 27; 28; 29]) have given sufficient conditions, principally in terms of smallness of the components of the gain parameter B, for there to be a unique globally asymptotically stable equilibrium point for (16). We now consider a noisy version of (16) and ask when it has a unique stationary distribution.
By uniqueness of solutions, the solutions of the SDDER associated with σ ≡ 0 coincide with the solutions of (15) when the drift b in (1) Allowing σ to be non-zero yields a noisy version of (16). For this noisy version, we assume that m ≥ d and that σ : d → d×m is uniformly Lipschitz continuous and satisfies for some 0 < a 1 < a 2 < ∞. It is easily verified that b is uniformly Lipschitz continuous and for each i = 1, . . . , d, Then, by Theorem 2.2.1, the SDDER with these coefficients (b, σ) has a unique stationary distribution. Thus, this noisy version of (16) has a unique stationary distribution without imposing additional restrictions on the parameters. This is in contrast to the known conditions for stability of the equilibrium solution in the deterministic equation (16).
Our results can be similarly applied to a slightly modified and noisy version of the Internet rate control model studied by Vinnecombe [38; 39] and Srikant et al. [8;41] to yield existence and uniqueness of a stationary distribution for such a model without the strong conditions on the parameters used to obtain stability of the deterministic model. In particular, consider the deterministic delay model in equations where g(s) = s for s ≤ K and g(s) = K for s ≥ K where K is a sufficiently large positive constant, and truncate the feedback functions f l with an upper bound once the argument of these functions gets to a high level. Then with a dispersion coefficient of the same form as in (17)

Moment Bounds over Compact Time Intervals
Under Assumption 2.1, any solution X of the SDDER (1) satisfies the following supremum bound.
In fact, where the functions k p andk p are non-decreasing on (0, ∞) and they depend only on p, the dimensions d, m, and the linear growth constants C 1 , C 2 , C 3 , C 4 from (5) and (6).
Sketch of proof. Inequality (110) and Proposition 2.1.1 can be used to obtain for any T > 0, The remainder of the proof follows from a standard argument (cf., Theorem 2.3 in Chapter 3 of [19]) using the linear growth conditions (5) and (6), the Burkholder-Davis-Gundy inequalities and a standard stopping argument allowing us to use Gronwall's inequality.

Existence of a Stationary Distribution
In this section, we prove that either Assumption 2.2 or 2.3 (in addition to Assumption 2.1) is sufficient to imply the existence of a stationary distribution for the SDDER (1). Throughout this section, we assume that X is a solution of the SDDER (1) with a possibly random initial condition X 0 . When the initial condition for X is deterministic, we will sometimes use the notation X x o for the unique solution with the initial condition x o . We begin in Section 3.1 by describing a general sufficient condition for existence of a stationary distribution in terms of uniform (in t) moment bounds for X t . We then use stochastic Lyapunov/Razumikhin-type arguments to verify that such bounds hold for second moments under either Assumption 2.2 or Assumption 2.3. Lyapunov-type functions are applied to an auxiliary process which we call the overshoot process and which we introduce in Section 3.2. In Section 3.3 we develop some preliminary results on the "positive oscillation" of a path. Sections 3.4 and 3.5 contain the key technical arguments for establishing the moment bounds under Assumption 2.2 and Assumption 2.3, respectively. Loosely speaking, each of these assumptions implies that, for each i, the i th component of b has a term providing a push in the negative direction (towards zero) on the set {x ∈ d : |x i (0)| ≥ M } for some M > 0. The two assumptions are distinguished by differences in the size of this "restoring force" and on the additional terms composing b and the assumptions on σ. Assumption 2.2 allows the additional terms in b to grow (in a sufficiently controlled manner) but requires the negative push in b i (x) to be at least proportional to a value lying in the range of x i . For Assumption 2.3, |σ| and the components of b are bounded above and the negative push is strictly negative and bounded away from zero. In Section 3.6 we combine the results of the preceding subsections to obtain the desired existence result.
Remark. Scrutiny of our proofs reveals that the results of this section still hold if Assumption 2.1 is replaced by the weaker assumptions that weak existence and uniqueness in law holds for (1), and that the coefficients b and σ are continuous and satisfy the linear growth conditions (5) and (6). As noted in the Remark following Proposition 2.1.2, under the latter conditions, the solutions of (1) define a Feller continuous Markov process. As explained in Section 2, we have assumed the stronger Assumption 2.1 throughout this paper because this assumption will be used critically in our uniqueness proof.

Sufficient Conditions for Existence of a Stationary Distribution
A common method for showing the existence of a stationary distribution for a Markov process is to exhibit a limit point of a sequence of Krylov-Bogulyubov measures [2; 7; 15; 30]. In light of that, given x o ∈ d and T > 0, we define the probability measure Q Remark. The function u → P u (x o , Λ) is measurable as a consequence of the stochastic continuity of the family {P t (·, ·), t ≥ 0}, which follows from the continuity of the paths of X x o .
The following theorem gives sufficient conditions for the existence of a stationary distribution for the SDDER (1). Although we only use this result with p = 2 in this work, we give the result for general p > 0 as the proof is similar for all p. Proof. By Markov's inequality, for any T > 0 and a > 0, The last term tends to zero as a → ∞, independently of T .
Fix , λ > 0 and u ≥ τ, and recall the notation for the modulus of continuity from (3). The linear growth condition (5) and Proposition 2.1.1 imply that for any δ > 0: By Markov's inequality, which approaches zero as a → ∞. This implies that there is δ ,λ ].
For any T ≥ 2τ ∨ τ and 0 < δ < δ ,λ , on combining the above we have Tightness of {Q

Overshoot Process
LetM ≥ 0. For each i = 1, . . . , d, define the overshoot, Z i , of X i by Part (iii) of Definition 2.1.1 implies that (1) and Tanaka's formula for continuous semimartingales (see, e.g., Theorem 1.2 of Chapter VI in [31]), we have that P-a.s., for all t ≥ 0, where L i is a constant multiple of the local time of X i atM , which can increase only when X i is at M and hence only when Z i is at zero.
The following consequence of Itô's formula will be useful in Sections 3.4 and 3.5. For each t ≥ 0, where we have used the facts that Z i (t) = 0 when X i (t) ≤M and L i can increase only when Z i is at zero. Thus we have

Positive Oscillation
We now introduce the notion of the positive oscillation of a real-valued path over a given time interval. This refinement of the oscillation of a path (2) is well suited to our problem, and it still obeys an inequality analogous to (107). Osc Note that there is no absolute value in the definition of Osc + , so that we have the following obvious inequality: We also have the following inequalities: for all x ∈ d and i = 1, . . . , d, We have the following property of Osc + when it is applied to a reflected path.
(ii) y(t 1 ) ≥ 0, y(·) is non-decreasing, and (iii) y(·) can only increase when z is at zero, i.e., Then, Proof. By continuity of z and compactness of the triangle {(s, t) : then the inequality (31) is clear. So we suppose that s < t. Then there are two cases to consider.

Case 1: Assume that y(s) = y(t).
Then Thus, y cannot increase on (u , t] by (iii), and so by continuity, y(u ) = y(t). Then we have that where we have used the facts that z(s) ≥ 0, z(u ) = 0 and y(t) − y(u ) = 0.
We will also use the following technical lemma.

Lemma 3.3.2.
For each i = 1, . . . , d andM ≥ 0, for any 0 ≤ t 1 < t 2 < ∞, P-a.s., The inequality (35) can be readily verified by considering s ≤ t in [t 1 , t 2 ] such that the left hand side above is equal to X i (t) − X i (s) and then considering the three cases: (a) X i (t) <M , (b) X i (t) ≥M and X i (s) ≥M , and (c) X i (t) ≥M and X i (s) <M . Thus, it suffices to estimate Osc + (Z i , [t 1 , t 2 ]).
Since P-a.s., (26) holds and L i can increase only when Z i is zero, we may apply Lemma 3.3.1 to obtain that P-a.s.: where, for each t ≥ 0, and Osc Thus, (34) holds.

Bounded Second Moments when b and σ Satisfy an Integral Growth Condition
Throughout this subsection, we assume that the coefficients b, σ satisfy Assumption 2.2 (in addition to Assumption 2.1) and we setM = M + 1 in the definition of the overshoot process Z in (25). The simple inequalities X i (·) ≤ Z i (·) +M , for each i, reduce the problem of bounding the second moment of X t to that of bounding the second moment of Z t .

Uniform Bound on
Our proof of this theorem uses stochastic Lyapunov/Razumikhin-type arguments similar to those found in a theorem of Mao (Theorem 2.1 of [20]). We first prove two technical lemmas and then the proof of the theorem is given. To simplify notation, in the following we let for i = 1, . . . , d and t ≥ −τ.

Lemma 3.4.1.
Suppose that E X 0 2 < ∞. There exists a constant M 1 > 0 such that whenever t ≥ τ is such that Remark. We will refer to the second inequality in (40) as the Razumikhin assumption.
Proof. Suppose that t ≥ τ is such that (40) holds. For each x ∈ d , there is an r x ∈ d such that for each i = 1, . . . , d, We note that for each u ≥ 0 such that Z i (u) > 0, we have X i (u) >M > M and so the inequalities (10) and (11) hold with x = X u . Then, Here, Assumption 2.2(i) and the positivity of the coordinates of Z were used for the first inequality, and the fact that X (s) ≥ Z(s) for all s ≥ −τ, Lemma 3.3.2 withM = M and the Cauchy-Schwarz inequality were used for the second inequality. Combining the above with parts (i) and (ii) of Assumption 2.2, on taking expectations and setting We now separately develop estimates for the second and third to the last lines in (42). For each i, is a square-integrable martingale. Then, Doob's submartingale inequality, the L 2 isometry for stochastic integrals, the independence of the coordinates of W and (11) imply that Then, using the Cauchy-Schwarz inequality and inequality (109), we have For the second last line in (42), using the Cauchy-Schwarz inequality we obtain that Inequality (111) implies that for any γ > 1 and s ≥ −τ there is a constant K γ ≥ 0, which depends on d andM in addition to γ, such that Then Jensen's inequality, the fact that each µ i 1 is a probability measure, Fubini's theorem, inequality (109) and the Razumikhin assumption in (40) can be used to obtain, for each γ > 1 and each i, where Continuing on from (42), by the Cauchy-Schwarz inequality, (29), (45), (49) and (109) we have that We now examine the fifth last, third last and last lines in (50) more closely. By Hölder's inequality, Fubini's theorem, (47), the Razumikhin assumption and Jensen's inequality, we have for each γ > 1, Therefore, by the Cauchy-Schwarz inequality and (109), we have Similarly, Hölder's inequality, Jensen's inequality, (47), the Razumikhin assumption and (109) also imply that for each γ > 1: Continuing on from line (50), using inequalities (52), (53) and (54) we have where K 1 , K 2 are real-valued functions on (1, ∞), and By Assumption 2.2(iii), we can fix a γ > 1 sufficiently small such that B 1 +B 1 > K 3 (γ)δ q 1 ,1 +(K 4 (γ)+ K 5 (γ))δ q 2 ,2 . For such a γ, All of the exponents for r are at most one. By the choice of γ, the constant in front of the highest degree term is strictly negative and this implies that lim Proof. Let η n = t ∨ inf{s ≥ −τ : |X (s)| ≥ n} for each integer n > 0. We have from equality (28) that for each n > 0 and h ≥ 0, By the definition of η n , the stochastic integral with respect to W in the above has zero mean since it defines a square integrable martingale as a function of h. Since E sup s∈ [−τ,t+h] |Z(s)| 2 < ∞ by Lemma 2.4.1 with p = 2, and b and σ satisfy the linear growth bounds (5) and (6), we can take expectations in (57) and apply the dominated convergence theorem to conclude that Define the continuous function f : Then by (58), (59), dominated convergence and Lebesgue's differentiation theorem we have Here we used the fact that the integrand in the second last line is a continuous function of s, and the fact that f (X i (t)) ≤ 1 (M ,∞) (X i (t)) = i (t). According to Lemma 3.4.1, the last line above is strictly negative under the assumption (40).
If there is no h * > 0 such that (t + s) < (t) for each s ∈ (0, h * ], then we can construct a sequence {h n } ∞ n=1 of positive numbers decreasing to zero such that (t + h n ) ≥ (t) for all n. Then lim h→0+ (t+h)− (t) h ≥ 0, which is a contradiction to (60). Therefore there is an h * > 0 such that (56) holds. Proof. Recall that we are assuming that Assumption 2.2 holds and that the overshoot process Z is defined by (25) Using (110) and the Cauchy-Schwarz inequality, we have for each i = 1, . . . , d, and by a similar argument to that used for (44), by (11) we have for each i: where we have used the facts that |Z(s)| ≤ |X (s)| for all s ≥ −τ, 0 < q 1 ≤ 1 and 0 < q 2 ≤ 2. The last line above is independent of t ≥ τ and so we have Combining this with the hypothesis of the theorem and the fact that yields the desired result.

Bounded Second Moments when b and σ Satisfy a Boundedness Assumption
Throughout this subsection, we assume that the coefficients b and σ satisfy Assumption 2.3 (in addition to Assumption 2.1) and we setM = M in the definition of the overshoot process Z. In Theorem 3.5.1 below, for each i = 1, . . . , d, we show that E[exp(αX i (t))] is bounded uniformly in t ≥ 0, for some α > 0, provided that suitable initial bounds hold. In turn, this will be used to bound E[ X t 2 ] uniformly for all t ≥ 0.

Uniform Bound on an Exponential Moment of X i (t)
The following theorem depends on some technical lemmas that are deferred until after the proof of the theorem. Then the fact that Y i can increase only when X i is at zero, the form of f , and Lemma 3.5.1 below imply that for any γ > 0, the differential of f (X i (t)) satisfies Then where C > 0 is an appropriately chosen constant (depending on α). Therefore, Since Lemma 3.5.3 implies that for each t ≥ 0, we have E t 0 e βs f X i (s) σ i (X s )dW (s) = 0, which in turn implies that on taking expectations in (66) we have where K(·) is defined in Lemma 3.5.4 below. Gronwall's inequality now implies that and thus for all t ≥ 0, The form of f then implies that By considering the Taylor expansion of exp(αr), we can see that for each r ∈ + and positive integer n, r n ≤ n! α n exp(αr), and thus it follows from (67), the hypothesis of the theorem and Hölder's inequality that for each p > 0 and i = 1, . . . , d,

Supporting Lemmas
We now prove the additional lemmas used in the proof of Theorem 3.5.1. We will again use the notation from (39).
by the definition of Osc + . Therefore, From Assumption 2.3(i), it follows that for each γ > 0, Then, there exists a function K : (0, ∞) × + → + independent of t and i, which can be chosen to be non-decreasing in each coordinate, such that for each p > 0 and T ≥ 0, Proof. By Hölder's inequality, it suffices to prove this result for p > 1. Fix i, t and consider q ∈ satisfying |q| > 1. Since (t−τ) + +s (t−τ) + i (u)|σ i (X u )| 2 du ≤ K σ s for all s ≥ 0, by Novikov's condition for exponential martingales, the process is a martingale. Thus, for any stopping time η, for each s ≥ 0, Now ξ t,i and (ξ t,i ) −1 are local submartingales and so there is a sequence of stopping times {η n } tending to infinity as n → ∞ and such that for each n, the stopped processes, ξ t,i (· ∧ η n ) and (ξ t,i (· ∧ η n )) −1 are submartingales. Setting q = p and q = −p for p > 1, using Doob's inequality we obtain for each T ≥ 0, for a constant C p depending only on p and which can be chosen to be increasing with p. Letting n → ∞ and using monotone convergence completes the proof.  Proof. Fix κ > 0. For each positive integer n, define the stopping time η n := inf{t ≥ 0 : X [−τ,t] ≥ n}, with the convention that inf = ∞. Convexity of the exponential function implies that for each Since X i (t) ≤ M + Z i (t) for each t ≥ −τ and i = 1, . . . , d, we have Since (26) holds withM = M , as in the proof of Lemma 3.3.2, we can use Lemma 3.3.1 to conclude that where i (t) is given by (39). The linear growth condition (5) and Jensen's inequality imply that for any β > 0 and each T > 0, Then (73)-(74) (with β = κd/λ) together with inequality (112) (with n = 4 and a 1 = κd imply that for any λ ∈ (0, 1), T > 0 and i = 1, . . . , d: Lemma 3.5.2 (with t = 0) along with (71), (72) and (75) now imply that for each T > 0 and λ ∈ (0, 1), If T ∈ 0, 1 2d C 2 , we can set λ = T d C 2 ∈ 0, 1 2 and then we obtain for each T ∈ 0, 1 where Inequality (77) is obvious for T = 0. Our assumptions imply that K 0 (κ) < ∞ and K 1 (κ) > 0. Therefore, since so that the expectation on the left of (77) is finite, Gronwall's inequality implies that The monotone convergence theorem can then be applied to obtain for each κ > 0, The above procedure can be iterated to obtain a finite bound on E exp(κX (·)) [−τ,T ] for any . Fix k ≥ 1 and assume that We can show that this holds with k +1 in place of k by applying the above procedure with 0 replaced by T (k) , η n replaced with η (k) By induction, (81) holds for all k. Since T (k) → ∞ as k → ∞, the proof is complete.

Uniform Bound on E[ X t 2 ]
Lemma 3.5.5. Assume sup Proof. After replacing B 0 by K u , setting B 2,i = C 2,i = 0 for each i, and using the hypothesis of the lemma in place of Theorem 3.4.1, the proof is identical to the proof of Theorem 3.4.2.
Combining Theorem 3.5.1 with p = 2 and Lemma 3.5.5 yields the following. Recall for this that we are assuming that Assumption 2.3 holds.

Existence Theorem
The following is obtained by combining the results from Section 3.1 and either Section 3.4 or Section 3.5.

Uniqueness of Stationary Distributions
Throughout this section, we assume that Assumption 2.4 holds (in addition to Assumption 2.1). We will prove uniqueness of any stationary distribution for the SDDER. For this proof, we adapt to equations with reflection a clever asymptotic coupling argument recently introduced by Hairer, Mattingly and Scheutzow [12] for stochastic delay differential equations without reflection. Of particular note is the fact that this argument applies to equations with a dispersion coefficient that depends on the history of the process over the delay period. Most previous work on proving uniqueness of stationary distributions relied on showing the mutual equivalence of distributions of X t at some time t > 0 for all starting states, and then applying either Doob's theorem (see Theorem 4.2.1 in [7]) as in [30], or the techniques of Döblin (see, e.g., [24]) as in [17; 34]. However, these arguments cannot be easily extended to situations where σ depends on past states, because of the potential for reconstruction of the initial condition from the quadratic variation process (see [30; 36]). In [12], this potential difficulty is avoided by use of a different ergodic argument (see Theorem 1.1 of [12]). The main idea of this argument is to introduce a change of probability measure under which, with strictly positive probability, two solutions of the SDDER starting from different initial conditions are driven towards one another as time goes to infinity.
Although our general line of argument is very similar to that in [12], there are some differences due to the presence of reflection in the dynamics and we also provide more details for some steps. We begin in Section 4.1 by stating an abstract uniqueness result proved in [12]. The key technical section is Section 4.2 where the novel asymptotic coupling introduced in [12] is adapted to our setting. The uniqueness result is then stated and proved in Section 4.3.

An Abstract Uniqueness Result
A key element for our proof is the following proposition, which is adapted to our situation from Corollary 2.2 of [12]. Before stating it, we introduce some notation. Denote the space of sequences {x n } ∞ n=0 in d by ( d ) ∞ , and endow this with the product topology and associated Borel σ-algebra. For x ∈ d , let P x ∞ denote the probability measure on ( d ) ∞ that is the distribution of the sequence {X nτ } ∞ n=0 when X is a solution of (1) started from x. Recall that the symbol ∼ between two probability measures means that they are mutually absolutely continuous. The following proposition follows immediately from Corollary 2.2 of [12] by setting A = d there.
(ii) for each x, y ∈ d , Then there exists at most one stationary distribution for the SDDER (1).

Asymptotic Coupling of a Pair of Processes
We assume that an m-dimensional Brownian motion martingale {W (t), t ≥ 0} is given on a filtered probability space (Ω, , { t , t ≥ 0}, P). For each λ > 0, consider the system of SDDERs where P-a.s., (X (t),X λ (t)) ∈ 2d + for all t ≥ −τ, and where (Y,Ỹ λ ) is a continuous adapted process such that P-a.s., Y (0) =Ỹ λ (0) = 0 and Y i (resp.Ỹ λ,i ) can increase only when X i (resp.X λ,i ) is zero. We allow random 0 -measurable initial conditions for X 0 andX λ 0 . This is a 2d-dimensional system with globally Lipschitz coefficients, and thus Proposition 2.1.2 implies that there exists a unique strong solution for any pair of initial conditions. Consider such a solution pair (X ,X λ ), and let U λ (t) := X (t) −X λ (t) for t ≥ −τ. Then, for t ≥ 0, The following lemma is a modified version of Lemma 3.5 of [12], where here we have equations with reflection and we give the result for all q > 4 rather than just for q = 8. Inequality (97) is the reason that this lemma remains true in the reflected case. Our proof of this lemma is very similar to that in [12], but for completeness, we provide the details. We have also extracted Lemma 4.
Before proving this lemma, we give two propositions and a preliminary lemma. The first proposition is a simple stochastic variation of constants formula.

Proposition 4.2.1.
Assume that on some filtered probability space (Ω, , { t }, P), {ξ(t), t ≥ 0} is a continuous adapted process satisfying the following stochastic differential equation: for some γ ∈ and some continuous semimartingale {χ(t), t ≥ 0}. Then and thus for each 0 ≤ s ≤ t, Proof. It is straight-forward to verify that the right member of (89) satisfies (88). By the uniqueness of solutions for this equation given the initial state ξ(0), the result follows.
The next proposition is a slight generalization of Lemma 3.4 in [12] to the case where W is mdimensional, and specializes to the case where h is continuous. The proof of the proposition is nearly identical to that in [12], and so we omit it. In brief, this proof uses the representation V β (t) = e −β t t 0 e βs h(s)dW (s), the Burkholder-Davis-Gundy inequality, an integration by parts and estimates on V β on the segments kT N , where β = −κ + λK 1 andk λ (t) = λK 1 ζ λ (t) + k λ (t) for all t ≥ 0. It follows from Proposition 4.2.1 (with γ = −β) that for any 0 ≤ s ≤ t, where V β (t) = t 0 e −β(t−r) e κr λ (r)dW (r) for all t ≥ 0, and we have used the fact that v λ is nonincreasing together with (92) for the first inequality above.
, W (·) and η, respectively, and then using (93), we obtain the following: (We note that by the definition of η λ m , this holds even if Γ(λ, n, m) is infinite, since this will only occur when the right hand side above is also infinite.) Now, on letting m → ∞, we obtain by monotone convergence that It follows that for each integer n ≥ 0, Then, The constraints on where Y andỸ λ can increase and the positivity of X andX λ imply that for each The Lipschitz continuity condition (4) on b and σ implies that for any x, y ∈ d , Thus, on setting ζ λ (t) = |U λ (t)| 2 for t ∈ [−τ, ∞), we see that the hypotheses of Lemma 4.2.2 are satisfied with and K 1 = 2, K 2 = 1 + κ L and  Then by the bound on σ † and the fact that U λ t 2 ≤ e −t Υ for each t ≥ 0, we have Therefore, which increases to one as n → ∞ since Υ < ∞, P-a.s. Given the results of the previous section, our proof of the uniqueness theorem has the same general structure as the proof of Theorem 3.1 in [12], although we include a few more details, especially with regard to certain measurability properties.

Uniqueness Theorem
It follows that for each n = 1, 2, . . ., Then Feller continuity of the Markovian transition functions implies that the map (x, y) → P(η x, y,n = ∞) = P((X x ,X x, y ) ∈ Γ n ) is measurable for each n, and so the measurability of N (·, ·) follows. Henceforth, we abbreviate η x, y := η x, y,N (x, y) .  (v x, y (s)) dW (s) − 1 2 t 0 |v x, y (s)| 2 ds , t ≥ 0, defines a uniformly integrable martingale. Let ρ x, y (∞) denote the P-a.s. strictly positive limit of ρ x, y (t) as t → ∞. It then follows from Girsanov's theorem (see, e.g., Section 1 of Chapter VIII of [31]) that the probability measure Q x, y , defined by Q x, y (A) = E P [ρ x, y (∞)1 A ] for all A ∈ , is equivalent to P, and under Q x, y ,W x, y is a Brownian motion { t }-martingale. LetX x, y be the unique solution under Q x, y to the SDDER with initial conditionX 0 = y. Then, P-a.s., dX x, y (t) = b(X x, y t )d t + 1 {t≤η x, y } λ X x (t)−X x, y (t) d t + σ(X x, y t )dW (t) + dȲ (t), where W is a Brownian motion under P. For (101), we used the facts that σσ † = I d and P σ † (X x, y t ) = σ † (X x, y t ) for all t ∈ [0, η x, y ] ∩ [0, ∞) ≥ P X x, y (t) =X x, y (t) for all t ∈ [−τ, η x, y ] ∩ [−τ, ∞) = 1. (102) The equality above follows by a very similar proof to that for the strong uniqueness of solutions for the SDDER with Lipschitz coefficients using Gronwall's inequality.
Since uniqueness in law holds for solutions of (85), the distribution ofX x, y under Q x, y is the same as that of the solution X y to (85) under P with initial condition X 0 = y. Therefore, so thatP x, y also satisfies condition (ii) of Proposition 4.1.1.
All that remains to be shown is the measurability property (iii) in Proposition 4.1.1. This will follow from the measurability of (x, y) →P x, y (B) for each Borel measurable set B ⊂ C( + , d ) × C( + , d ), whereP x, y is the law of (X x · ,X x, y x, y where we have used the convention that 0 i=1 a i = k i=k+1 a i = 1 for any real numbers a i .
For eachx,x ∈ d , let Px ,x denote the law induced on C( + , 2d ) by the pair of strong solutions to (85) with the two initial conditionsx,x and the same driving Brownian motion W . Then, by the strong uniqueness for this pair, P-a.s., on {η x, y < ∞}, the law of (X x η x, y +· ,X x, y η x, y +· ) under P conditioned on η x, y is given by P X x η x, y ,X and h j (x,x, t) = 0 for t ≥ t j+1 , where (w,ŵ) denotes a generic point in C( + , 2d ). From the above and using the fact (see (102)) that P-a.s.,X x, y (·) andX x, y (·) agree on [−τ, η x, y ] ∩ [−τ, ∞), we see that the right hand side of (105) is equal to where for each x, y ∈ d ,P x, y is the law of (X x (·),X x, y (·)) under P, η(w,w) := inf t ≥ 0 : t 0 λ σ † (w s )(w(s) −w(s)) 2 ds ≥ N (w 0 ,w 0 ) , and B j is defined in the same manner as A j , but with η in place of η x, y . The desired measurability in (x, y) of the expression in (106) then follows directly from the Feller continuity of the transition functions associated with {(X x · ,X x, y · ) : x, y ∈ d }, since the fact that N (·, ·) is measurable can be used to show that η is a stopping time with respect to ˜ t := σ (w s ,w s ) : 0 ≤ s ≤ t , t ≥ 0 .

A Reflection
To ensure that a solution of (1) remains positive, we have employed Skorokhod's well-known mapping for constraining a continuous real-valued function to be positive by means of reflection at the origin. This mapping was applied to each component. The path z is called the reflection of x, and the path y is called the regulator of x.
We summarize some basic facts about the Skorokhod problem in the next proposition. (107) (ii) There exists a constant K > 0 such that for any x, y ∈ C + ( + , d ), we have for each t ≥ 0, Proof. These properties follow from the well-known construction of y: For more details, see [9; 14; 40]. We note that K ≤ 2, but we keep the notation K for convenience.

B Useful Inequalities
For referencing purposes, we state here several inequalities that are used in this paper.
For any a 1 , a 2 ≥ 0, we have the inequality which is obvious if a 1 = a 2 or if either is 0, and if a 1 > a 2 > 0 then (a 1 + a 2 ) q − a q 1 ≤ qa q−1 1 a 2 < a q 2 . The following is a well-known fact that follows from the convexity of power functions. For any p > 1, a 1 , . . . , a n ∈ , we have |a 1 + · · · + a n | p ≤ n p−1 (|a 1 | p + · · · + |a n | p ). (110) Sometimes n p−1 is too big for our needs, and we will use the following alternative, which can be proved with standard optimization techniques. For any γ > 1, and a, q ≥ 0, there is a K = K(a, γ, q) ≥ 0 such that (a + t) q ≤ K + γt q for all t ≥ 0.