Splitting Algorithms for Rare Events of Semimartingale Reflecting Brownian Motions

We study rare event simulations of semimartingale reflecting Brownian motions (SRBMs) in an orthant. The rare event of interest is that a $d$-dimensional positive recurrent SRBM enters the set $B_n = \{z\in\mathbb{R}^d: \sum_{k=1}^d z_k = n\}$ before reaching a small neighborhood of the origin as $n\to\infty$. We show that under a proper scaling and some regularity conditions, the probability of interest satisfies a large deviation principle. We then construct a subsolution to the variational problem for our rare event, and based on this subsolution construct particle-based simulation algorithms to estimate the probability of the rare event. It is shown that the proposed algorithms are stable and theoretically superior to standard Monte Carlo for a broad class of positive recurrent SRBMs.


Introduction
In many scenarios one is interested in the dynamics of an entity that is constrained to a subset of its possible state space by some mechanism. For example, when studying population levels of many interacting species (e.g., chemical compounds or biological populations), a natural constraint is that all population levels are non-negative. Alternatively, one may simply be interested in the dynamics of particles in a container that they cannot escape from. The so-called reflecting diffusions are a group of stochastic processes that have been developed to study the phenomena of constrained stochastic dynamics. Roughly speaking, a reflecting diffusion process behaves as an ordinary diffusion process in the interior of its domain, and it is pushed into the domain whenever the boundary of the domain is hit.
A substantial amount of research on reflecting diffusions has been carried out going back to the seminal works of Skorokhod [24] and Kingman [22] in one dimension. Watanabe [27] and Stroock & Varadhan [25] proved the existence of reflecting diffusions in higher dimensions when the domain is sufficiently smooth. When dealing with a particle constrained to an orthant, one does not have a smooth domain, and an important extension is the work of Harrison and Reiman [17], which studied semimartingale reflecting Brownian motions (SRBMs) constrained to orthants. SRBMs in orthants arise as diffusion approximations for open queueing networks in heavy traffic; cf. [19,16,29]. Some important results on the existence, uniqueness, and positive recurrence of SRBMs can be found in the survey paper by Williams [31] (see our Section 2). In particular, Hobson & Rogers [20] and Williams [28] established the necessary and sufficient condition for the positive recurrence of SRBMs in quadrants. A fluid-based sufficient condition for the positive recurrence of SRBMs in general dimensions was developed in Dupuis and Williams [13], which was shown to be a necessary condition in three dimensions by Bramson, Dai, and Harrison [3].
Since these papers a vast literature on the topic has emerged, and of particular interest is the body of work concerned with the tail probabilities of stationary distributions of SRBMs (cf. [1,12,7]). Until now, all results on tail probabilities have been for either two-dimensional systems or special cases of higher-dimensional systems. For example, [23] assumed the property of rotational symmetry for three-dimensional SRBMs, while [6] studied how to ensure the stationary distribution of an SRBM has a product form, and how this can be used to identify the asymptotics of its tail probabilities. The work [14] analyzed the properties of the most likely trajectories that give rise to tail events in SRBMs. Given the challenge of developing analytical results for the tail probabilities of SRBMs, our work focuses on developing numerical algorithms to aid in the estimation of these probabilities.
Broadly speaking, there are two types of algorithms for estimating rare event probabilities: importance sampling and particle methods. The general idea of importance sampling is to simulate the system of interest under a new probability measure that makes the event of interest less rare, and then adjust the estimator by the Radon-Nikodym derivative between the original measure and the sampling measure. In particle methods, one simulates multiple copies of the system of interest and uses some selection method to favor copies of the system where the rare event of interest is more likely. In the current work we will focus on a particularly simple particle method called the "splitting algorithm". In this method, one establishes landmarks in the state space, and each time a particle crosses a landmark it is replaced with a random number of offspring. The creation of landmarks where splitting events occur is of course the challenging aspect of creating efficient and stable splitting algorithms. In [8], Dean and Dupuis showed how one can use a large deviation principle to ensure that the splitting algorithm is stable and efficiently estimates the rare event of interest. There have been several works (e.g., [8,9,2]) that have looked at rare event simulations using particle methods, but to our knowledge, our work is the first to study rare event simulations for SRBMs using any method.
In this work we develop splitting algorithms to aid in the estimation of tail probabilities for SRBMs. We focus on the particular rare event that a positive recurrent SRBM exits the $L_1$ cube of radius $n$ before returning to the cube of radius $\epsilon$. We show that our splitting algorithm can stably estimate this probability for positive recurrent SRBMs and furthermore provides an improvement over the standard Monte Carlo method. Given that it is not in general possible to exactly simulate an SRBM, we use the Euler method to simulate an approximate process. We establish conditions on the discretization size to ensure our approximating process satisfies the same large deviation principle as the original SRBM. As a result, our splitting algorithm may be used for SRBMs or the approximating processes with identical asymptotic properties. Given the bias introduced by using an approximating process, we also analyze the relative error between the tail probabilities under the SRBM versus the approximating process.
The remainder of this work is organized as follows. In Section 2, we define the SRBMs, state the basic assumptions, and define the rare event of interest. In Section 3, we show that the rare event probability satisfies a large deviation principle. The variational problems for SRBMs are studied in Section 4, and especially, in Section 4.2, a subsolution to the variational problem in general dimensions is constructed. In Section 5, we introduce the discretizations of SRBMs, apply the splitting method to the discretized process, and state our main results on the performance of the splitting method for estimating rare events. Finally, in Section 6, we present numerical examples.
Throughout the manuscript we shall use the following notation. For a pair of $d$-dimensional vectors $u$ and $v$, we denote their standard inner product by $\langle u, v\rangle = \sum_{k=1}^d u_k v_k$. For $f \in AC([0, \infty); S)$, let $\dot{f}$ denote its gradient vector, and $\ddot{f}$ its Hessian matrix. For a Borel set $B \subset \mathbb{R}^d$ and a constant $c \in \mathbb{R}$, $cB = \{cz \in \mathbb{R}^d : z \in B\}$. For a set $K$ with a finite number of elements, denote by card$(K)$ its cardinality. Throughout the paper we will use the standard notation for the asymptotic behavior of positive functions, where $C$ and $c$ denote positive constants.

Problem formulation
Semimartingale reflecting Brownian motions (SRBMs) in the nonnegative orthant were first introduced by Harrison and Reiman [17]. Roughly speaking, such a diffusion process behaves like a Brownian motion in the interior of the orthant. Upon reaching the boundary, it is reflected back into the interior in some specified direction, so as to constrain the process to the orthant. The rigorous definition of SRBMs is stated as follows.
An SRBM $Z$ satisfies $Z(t) = z + \theta t + X(t) + RY(t)$, $t \ge 0$, (2.1) where $z \in \mathbb{R}^d_+$, $X$ is a $d$-dimensional driftless Brownian motion with covariance matrix $\Sigma$ and $X(0) = 0$, and $Y$ is a $d$-dimensional stochastic process such that for $i = 1, \ldots, d$: (i) $Y_i(0) = 0$; (ii) $Y_i$ is continuous and non-decreasing; (iii) $Y_i$ increases only at times $t$ when $Z_i(t) = 0$. The parameters $\theta$, $\Sigma$, and $R$ are known as the drift, covariance matrix, and reflection matrix of the SRBM. We assume that $\Sigma$ is invertible, and for ease of notation we let $D = \Sigma^{-1}$. In [26], the authors consider the weak solution of the SRBM equation (2.1), that is, the existence of a triplet $(X, Y, Z)$ which satisfies (2.1). It is shown in [26] that there exists an SRBM $Z$ associated with $(\theta, \Sigma, R)$ if and only if $R$ is completely-S, and in this case $Z$ is unique in law and is a Feller continuous strong Markov process. A matrix is said to be completely-S if for every $k \times k$ principal submatrix $G$, there exists a $k$-dimensional vector $v_G$ such that $v_G \ge 0$ and $Gv_G > 0$.
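The completely-S condition is easy to probe numerically. The sketch below is our own heuristic illustration, not from the paper: for each principal submatrix $G$ it searches for a nonnegative vector $v$ with $Gv > 0$ by trying the all-ones vector and random nonnegative candidates, so a `False` answer is only heuristic evidence rather than a certificate.

```python
import itertools
import numpy as np

def is_completely_S(R, n_samples=2000, seed=0):
    """Heuristically check whether R is completely-S: for every principal
    submatrix G we search for v >= 0 with G v > 0 componentwise.
    A False return is heuristic evidence only, not a certificate."""
    rng = np.random.default_rng(seed)
    d = R.shape[0]
    for k in range(1, d + 1):
        for idx in itertools.combinations(range(d), k):
            G = R[np.ix_(idx, idx)]
            # Try the all-ones vector first, then random nonnegative vectors.
            candidates = [np.ones(k)] + [rng.random(k) for _ in range(n_samples)]
            if not any((G @ v > 1e-12).all() for v in candidates):
                return False
    return True

# The identity matrix is completely-S (take v = all-ones for every submatrix),
# while any matrix with a nonpositive diagonal entry is not.
print(is_completely_S(np.eye(3)))                   # True
print(is_completely_S(np.array([[-1.0, 2.0],
                                [0.0, 1.0]])))      # False
```

An exact test would replace the random search with a small linear feasibility program per submatrix; the heuristic version suffices for quickly screening candidate reflection matrices.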
An important tool in the study of SRBMs is the Skorokhod problem (SP).
A pair of continuous processes $(\phi, \eta)$ is called a solution of the SP for $\psi$ associated with $R$ if the following hold: (i) $\phi(t) = \psi(t) + R\eta(t) \in \mathbb{R}^d_+$ for all $t \ge 0$;
(ii) $\eta$ satisfies the following: (a) $\eta_i(0) = 0$, (b) $\eta_i(\cdot)$ is nondecreasing, and (c) $\eta_i(\cdot)$ increases only when $\phi_i(\cdot) = 0$, that is, $\int_0^\infty \phi_i(t)\, d\eta_i(t) = 0$. The result in [32] guarantees that there is a solution to the SP for $\psi$ if and only if $R$ is completely-S. Define $\Gamma(\psi) = \{\phi : (\phi, \eta) \text{ is a solution of the SP for } \psi\}$; then $\Gamma$ is called the Skorokhod map (SM) associated with the matrix $R$, and we will call $\Gamma(\psi)$ the $R$-regulation for $\psi$. Throughout this work we will make the assumption below to ensure that our SRBM satisfies a large deviation principle.
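In one dimension with $R = 1$ the Skorokhod map has the classical explicit form $\phi(t) = \psi(t) + \eta(t)$ with $\eta(t) = \max\big(0, \sup_{s \le t}(-\psi(s))\big)$, and it is instructive to see it on a sampled path. The following sketch is our own illustration (assuming $\psi(0) \ge 0$), computing the regulated path and its pushing process with a running maximum:

```python
import numpy as np

def skorokhod_1d(psi):
    """One-dimensional Skorokhod map for a path sampled on a grid,
    assuming psi[0] >= 0 and reflection matrix R = 1.
    Returns (phi, eta) with phi = psi + eta >= 0, eta nondecreasing,
    and eta increasing only where phi = 0."""
    psi = np.asarray(psi, dtype=float)
    eta = np.maximum.accumulate(np.maximum(-psi, 0.0))  # running max of (-psi)^+
    phi = psi + eta
    return phi, eta

phi, eta = skorokhod_1d([1.0, -0.5, 0.2])
print(phi)  # [1.  0.  0.7] -- pushed up exactly when the path would go negative
print(eta)  # [0.  0.5 0.5]
```

In higher dimensions no such closed form is available in general, which is why the regularity of $\Gamma$ must be assumed.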
The Skorokhod map $\Gamma$ associated with $R$ is single-valued, and is Lipschitz continuous with respect to the topology of uniform convergence on compact sets.
Note that Assumption 2.1 implies that R is completely-S. The work [11] develops conditions that guarantee the Lipschitz continuity of Γ.
We require that the SRBM $Z$ be positive recurrent, meaning that the expected time for $Z$ to reach any closed set with positive Lebesgue measure is finite.
Based on Theorem 2.6 of [13], we make the following assumption which provides a sufficient condition for the positive recurrence of our SRBM.
For a positive recurrent SRBM $Z$, we introduce a large deviation scaling with scaling parameter $n \in \mathbb{N}$ as follows: for $n \in \mathbb{N}$, define $Z_n(t) = Z(nt)/n$, $t \ge 0$.
Let $\epsilon \in (0, 1)$. Consider two sets $A_n = \{z \in \mathbb{R}^d_+ : \sum_{k=1}^d z_k \le \epsilon/n\}$ and $B = \{z \in \mathbb{R}^d_+ : \sum_{k=1}^d z_k \ge 1\}$. We are interested in the probability that the process $Z_n$ first enters $B$ before reaching $A_n$ from a starting point $z_n \notin A_n \cup B$. Letting $\tau_n = \inf\{t \ge 0 : Z_n(t) \in A_n \cup B\}$, the probability of interest can be defined as $p_n(z_n) = P(Z_n(\tau_n) \in B \mid Z_n(0) = z_n)$. (2.2) The goal of this work is to build algorithms that can stably estimate the probability $p_n(z_n)$ and provide an improvement over standard Monte Carlo. Given the positive recurrence assumption on the SRBM $Z$, it is intuitively clear that $p_n$ will go to $0$ as $n \to \infty$. Furthermore, the exponential decay rate of the probability will clearly be related to the large deviation properties of the sequence $\{Z_n\}$. One minor point that we will have to deal with is that the sets $\{A_n\}_{n\ge 1}$ depend on the scaling factor $n$. Ideally we would like to replace the sequence of sets $\{A_n\}_{n\ge 1}$ with the set $\{0\}$; however, for positive recurrent SRBMs it is possible that the expected time to hit the point $0$ is infinite. In the next section, we develop the necessary large deviation theory to characterize (in terms of a variational problem) the exponential decay rate of $p_n(z_n)$.
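To make the quantity $p_n(z_n)$ concrete, the following toy sketch estimates the analogous first entrance probability for a one-dimensional reflected random walk (an Euler scheme for a 1-d SRBM with drift $-1$ and unit variance). The parameter values and the crude Monte Carlo estimator are our own illustration, not the algorithm proposed later in the paper:

```python
import math
import random

def hit_B_before_A(z0=0.5, eps=0.05, theta=-1.0, sigma=1.0, dt=0.01, rng=None):
    """One trajectory of a reflected Euler walk started at z0; returns True if
    it reaches B = [1, inf) before A = [0, eps]."""
    rng = rng or random
    z = z0
    while True:
        z = max(z + theta * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0), 0.0)
        if z >= 1.0:
            return True
        if z <= eps:
            return False

random.seed(1)
trials = 2000
p_hat = sum(hit_B_before_A() for _ in range(trials)) / trials
# For comparison, the continuous-time limit has the explicit value
# p = (e^{2 z0} - e^{2 eps}) / (e^{2} - e^{2 eps}) ≈ 0.26.
print(p_hat)
```

As $n$ grows (equivalently, as the target level recedes), the hit count in such a crude estimator collapses to zero, which is precisely the regime the splitting algorithms of Section 5 are designed for.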

Large deviation principle for the rare events
For $T \ge 0$, using Schilder's theorem and the contraction principle, $\{Z_n(t); t \in [0, T]\}$ with initial value $Z_n(0) = z$ satisfies the large deviation principle (LDP) in $C([0, T]; \mathbb{R}^d_+)$ with the good rate function $I_{T,z}(\phi) = \inf\big\{\frac{1}{2}\int_0^T |\dot\psi(t) - \theta|_D^2\, dt : \phi \in \Gamma(\psi),\ \phi(0) = z\big\}$. (3.1) In particular we have the following result for any $T > 0$. Its proof is a simple application of the contraction principle and is thus omitted.
Lemma 3.1 Assume that $z_n \to z \in \mathbb{R}^d_+$ and let $I_{T,z}$ be as in (3.1).
However, the probability $p_n(z_n)$ is time independent, and we need to extend the previous result to handle infinite time horizon events. A standard technique is to use the stability properties of the stochastic process to establish an upper bound on the size of the stopping time $\tau_n$. The result is presented in the following proposition, where $E_{z/n}$ denotes the expectation conditioned on $Z_n(0) = z/n$.
We now consider an infinite horizon variational problem which minimizes over all paths that travel from $0$ to a given point $z$: $I(z) = \inf_{T \ge 0} \inf\big\{\frac{1}{2}\int_0^T |\dot\psi(t) - \theta|_D^2\, dt : \Gamma(\psi)(0) = 0,\ \Gamma(\psi)(T) = z\big\}$, (3.2) where $\Gamma(\psi)(T) = z$ means there is a $\phi \in \Gamma(\psi)$ such that $\phi(T) = z$. With Proposition 3.1 the finite horizon LDP can be extended to infinite time horizon events.

Variational problems
This section focuses on the study of the variational problem (VP) defined in (3.2). A triplet $(\psi, \eta, \phi)$ is called a solution to the VP (3.2) if $(\phi, \eta)$ is a solution of the SP for $\psi$, $\phi(T) = z$ for some $T \ge 0$, and $\psi$ attains the infimum in (3.2). The solution $(\psi, \eta, \phi)$ is also referred to as an optimal triple, and $\psi$ and $\phi$ are referred to as the optimal path and optimal reflected path, respectively. It is known from [1] that the VP in $\mathbb{R}^2_+$ admits an explicit analytical solution. We summarize this result in Section 4.1. For the VP in $\mathbb{R}^3_+$, a special case of rotationally symmetric VPs is studied in [23], and general path properties are studied in [14]. In [15], the value of the VP in $\mathbb{R}^d_+$ is characterized as a solution to a partial differential equation. However, the complete analysis of the VP in three or higher dimensional space is still open. In Section 4.2, we apply the path properties from [14] to construct a subsolution for the VP in $\mathbb{R}^d_+$ when $d \ge 3$.

Explicit solution for the VP in $\mathbb{R}^2_+$
The VP in $\mathbb{R}^2_+$ is analytically solved in [1], where the solution is characterized by defining "cones of boundary influence". In this section, we summarize the result of [1] and adopt its notation. The parameters $\theta$ and $R$ are required to satisfy the following assumption, which is sufficient and necessary for the positive recurrence of $Z$.

Assumption 4.1
The drift $\theta$ and reflection matrix $R$ satisfy the condition of [1] for positive recurrence. For $i = 1, 2$, we choose a vector $p^i$ which is orthogonal (under the usual Euclidean inner product) to the $i$th column of $R$, and $|\Sigma p^i|_D = 1$. We define the face $F_i = \{x \in \mathbb{R}^2_+ : x_i = 0\}$, and the associated exit velocity $a^i = \theta - 2(\theta' p^i)\Sigma p^i$. The face $F_i$ is said to be reflective if the $i$th component of $a^i$ is negative, i.e., $a^i_i < 0$. Denote $e^1 = [0\ 1]^T$, $e^2 = [1\ 0]^T$. Let $n^i$ be a vector that is normal to $F_i$, pointing into the interior of the state space, and normalized such that $|\Sigma n^i|_D = 1$. We introduce the key definition from [1] as follows: (i) When $F_i$ is not reflective, the cone of influence $C_i$ is defined to be the empty set. In this case, the face $F_i$ has no boundary influence on solutions to the VP for any $z \in \mathbb{R}^2_+$. (ii) When $F_i$ is reflective, the cone of influence $C_i$ is defined to be the cone generated by $e^i$ and $\tilde{a}^i$. For $z \in \mathbb{R}^2_+$, the cone of influence $C_i$ is said to be active if $z \in C_i$. We define three cost functions $I_0(z)$, $I_1(z)$, and $I_2(z)$, where $\tilde{a}^0(z) = z|\theta|_D/|z|_D$. With Assumption 4.1, the authors in [1] showed that an optimal path can always be found among these three paths. The search for such a path depends on the location of $z$ and its active cones of influence. The following is the main theorem from [1].
(i) If $z \notin C_1 \cup C_2$, the optimal value of the VP is given by $I_0(z)$.
(ii) If $z \in C_1 \setminus C_2$, the optimal value of the VP is given by $I_1(z)$.
(iii) If $z \in C_2 \setminus C_1$, the optimal value of the VP is given by $I_2(z)$.
(iv) If $z \in C_1 \cap C_2$, the optimal value of the VP is given by $\min\{I_1(z), I_2(z)\}$.

Subsolution for the VP in $\mathbb{R}^d_+$
In this section we construct a sub-solution to the VP (3.2) in $\mathbb{R}^d_+$, $d \ge 3$. We first define $L = \{1, 2, \ldots, d\}$. For $J, K \subset L$, define $R_{JK}$ to be the card$(J) \times$ card$(K)$ matrix obtained by deleting from $R$ the rows and columns with indices in $L\setminus J$ and $L\setminus K$, respectively. Let $R_j$ denote the $j$th column of $R$. Finally, define the face associated with $K \subset L$ to be $F_K = \{z \in \mathbb{R}^d_+ : z_i = 0 \text{ for } i \in K\}$. In this section, we make the following assumption on $R$.

Assumption 4.2 The reflection matrix $R$ is a nonsingular M-matrix, and $\theta < 0$.

Recall that $R$ is a nonsingular M-matrix if $R = \mu I - A$, where $A$ is a $d \times d$ matrix with nonnegative entries, and $\mu$ is greater than the spectral radius of $A$. We also note that $R^{-1} > 0$, which in particular implies that $R^{-1}\theta < 0$ under Assumption 4.2.
For any $w, v \in \mathbb{R}^d_+$, $w \ne v$, define the globally optimal cost from $w$ to $v$. A path $\psi^*$ which attains the globally optimal cost is called a globally optimal path from $w$ to $v$. We next define the locally optimal cost between points $w$ and $v$, where we constrain the path to lie on a face $F_K$. In particular, we will consider the local cost between points satisfying the following condition with respect to $K \subset L$.
For a pair $v$ and $w$ satisfying Condition 4.1 with respect to $K \subset L$, we define the locally optimal cost $I^*_K$ from $w$ to $v$ with the path constrained in $F_K$. We now introduce the definition of a sub-solution to the VP (3.2).
The following lemma presents the main idea of constructing a sub-solution.

Lemma 4.1 Let $\bar{T}$ be a function that satisfies conditions (i) and (ii) in Definition 4.2. If in addition, for all
In [14], the authors investigated a local variational problem which is very similar to the VP (4.4). Specifically, two types of locally optimal paths were investigated: interior escape paths and single segment boundary escape paths. An interior escape path is one along which no reflection is applied. Let $K$ and $K'$ be two subsets of $L$ such that $K \subset K'$ and $0 < \text{card}(K) < \text{card}(K') \le d$. For $w \in F_{K'}$, $v \in F_K$, an optimal single segment boundary escape path is a path constrained in $F_K$ whose cost equals $I^*_K$. The results in [14] can be applied to the VP (4.4) directly, as the proofs remain valid.
In order to state the results in [14] on $I^*_K$, we need some further notation. Given $K \subseteq L$ and two points $v, w \in \mathbb{R}^d_+$ satisfying Condition 4.1 with respect to $K$, a candidate cost and an associated optimal path are defined for each $J \subseteq K$. From Theorem 4.2, we can solve the VP (4.4) by identifying the set $J^*$ for the pair of points $v$ and $w$. In particular, when $|K| = 0$, the problem (4.4) only concerns paths in the interior of $\mathbb{R}^d_+$.
When $|K| \ge 1$, we can find $J^*$ by testing all subsets of $K$ against the conditions from Theorem 4.2.
In the rest of the section, we construct a sub-solution for the VP (4.4), which by Lemma 4.1 is sufficient for establishing a subsolution to (3.2). We start with $T(\cdot)$ being a norm on $\mathbb{R}^d$. We will then rescale this norm in an appropriate fashion by the functions $I^*_K(\cdot, \cdot)$, $K \subset L$, and shift it to allow for the largest possible value at the origin.
Consider a nonempty subset K ⊂ L, and two points v, w satisfying Condition 4.1. We note that, from

Lemma 4.3
Given $K \subseteq L$ and two points $v, w \in \mathbb{R}^d_+$ which satisfy Condition 4.1 with respect to $K$, we have the following. For a nonempty subset $K \subset L$, define a direction set $D_K \subset \mathbb{R}^d$ such that a vector $v \in D_K$ if and only if $v$ satisfies the following conditions: Notice that $D_K$ is a closed set, and thus compact, due to the continuity of the norm $|\cdot|$. By the definition of $I^*_K(v)$, it is easy to see that $I^*_K(v)$ is continuous. We now show that $I^*_K$ is bounded from below on the set $D_K$.
The above proposition allows us to consider the maximum of the ratio of $T$ to $I^*_K$ over the set $D_K$, taken over all $K \subsetneq L$ and all $v \in D_K$; we use this maximum to rescale and shift $T$, and define our subsolution $\bar{T}$ in (4.5).

Theorem 4.3 $\bar{T}$ is a subsolution to the VP (3.2).

Splitting algorithms for SRBMs
Particle splitting methods and importance sampling methods are the two most widely used methodologies for obtaining numerical estimates of rare event probabilities. Oftentimes, particle splitting methods are used to estimate a class of rare event probabilities called first entrance probabilities. The main idea of particle splitting is to partition the state space of a process into a series of nested subsets, so that the rare event of interest can be considered as a nested sequence of events. When a given subset is entered by a simulated particle for the first time, a number of offspring are generated at the entry point according to the assigned splitting mechanism. All offspring then independently follow the law of the original process. More refined versions of splitting algorithms, such as RESTART, have been introduced in recent decades (see [9] for details). In Section 5.1, we construct the Euler discretization of the SRBM and show that the discretized SRBM and the SRBM are exponentially equivalent. In Sections 5.2 and 5.3, we use the solution/sub-solution developed in Section 4 to develop algorithms to estimate the probability of our rare event of interest. Finally, in Section 5.4 we analyze the performance of our estimation algorithms.

Discrete approximation for SRBMs
In this subsection, we construct a discrete approximation to the SRBM $Z$. Recall that the $d$-dimensional SRBM $Z$ in $\mathbb{R}^d_+$ is given by $Z(t) = z + \theta t + X(t) + RY(t)$, where $X$ is a $d$-dimensional driftless Brownian motion with covariance matrix $\Sigma$, and $Y$ is the corresponding reflecting process. Fix $\Delta > 0$. Consider the discretized process $\bar{X}^\Delta$ obtained by sampling the free process at the grid points $k\Delta$, $k \in \mathbb{N}$. Define $(Z^\Delta, Y^\Delta)$ as the solution of the Skorokhod problem for $\bar{X}^\Delta$ with initial point $z$. We solve the Skorokhod problem for $\bar{X}^\Delta$ by solving a sequence of Linear Complementarity Problems, as outlined in [30].
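Each discretization step requires solving the linear complementarity problem (LCP) $z = x + Ry$, $z \ge 0$, $y \ge 0$, $y^\top z = 0$; the reference [30] describes the full procedure. As a minimal stand-in, the sketch below (our own illustration; the function name, iteration count, and tolerance are our choices) uses a projected Gauss-Seidel iteration, a standard LCP solver that behaves well for the nonsingular M-matrices of Assumption 4.2:

```python
import numpy as np

def lcp_step(x, R, n_iter=200):
    """Solve z = x + R y, z >= 0, y >= 0, y'z = 0 by projected Gauss-Seidel.
    This plays the role of one Skorokhod-problem step of the Euler scheme."""
    d = len(x)
    y = np.zeros(d)
    for _ in range(n_iter):
        for i in range(d):
            # Hold the other components fixed and project onto y_i >= 0.
            resid = x[i] + R[i] @ y - R[i, i] * y[i]
            y[i] = max(0.0, -resid / R[i, i])
    return x + R @ y, y

R = np.array([[1.0, -0.2],
              [-0.2, 1.0]])           # a nonsingular M-matrix
z, y = lcp_step(np.array([-0.5, 0.4]), R)
print(z)  # [0.  0.3]
print(y)  # [0.5 0. ]
```

Here the first coordinate of the free increment would go negative, so pushing $y_1 = 0.5$ in the reflection direction restores nonnegativity while the second coordinate needs no pushing, exactly the complementarity structure of the Skorokhod problem.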
Introducing the large deviation scaling to $Z^\Delta$, we define $Z^\Delta_n(t) = Z^\Delta(nt)/n$, $t \ge 0$. Let $z_n = z/n$. The first entrance probability associated with $Z^\Delta_n$ is defined analogously to (2.2) and denoted $p^\Delta_n(z_n)$. In the following, the first proposition establishes the exponential equivalence between $Z_n$ and $Z^\Delta_n$, and the second characterizes the relative error between $p^\Delta_n(z_n)$ and $p_n(z_n)$.
and consequently, for $z_n = z/n \notin A_n \cup B$, the two probabilities share the same exponential decay rate. Proposition 5.2 Let $\eta \in (0, 1/2)$ and $\Delta \equiv \Delta(n)$ satisfy $\lim_{n\to\infty} n\Delta^\eta = 0$, assume that $z_n = z/n \notin A_n \cup B$, and denote $z_n = (z_{n,1}, \ldots, z_{n,d})$. Then under Assumption 4.2, the relative error between $p^\Delta_n(z_n)$ and $p_n(z_n)$ can be bounded. Ideally, Proposition 5.2 would be a statement of the form (5.3), but unfortunately we were not able to prove such a statement. Since $p_n(x)$ has the same exponential decay rate for all $x \in C_n$, considering a maximum over the set $C_n$ introduces at most a subexponential effect.

Particle splitting methods
Following [8], we first briefly describe the splitting algorithm for estimating the probabilities $\{p^\Delta_n(z_n)\}_{n\ge 1}$. Our state space $\mathbb{R}^d_+$ is partitioned according to a nested collection of sets $B = C^n_0 \subset C^n_1 \subset \cdots \subset C^n_K \subset \cdots$. These sets are often constructed as the level sets of a particular function $V$, called the importance function in [8], which will be defined in terms of the solution/sub-solution of the VP (3.2).
Given the sequence of nested sets $\{C^n_j\}_{j\ge 0}$ and a splitting number $r$, the simplest splitting algorithm works in the following way. Define $l_n(x) = \min\{j \ge 0 : x \in C^n_j\}$. The algorithm generates a time-inhomogeneous branching process with generations indexed by $i \in \{0, 1, \ldots, l_n(z_n)\}$.
• For generation 0, a single particle starts at $z_n$ and evolves according to the law of $Z^\Delta_n$ with initial condition $Z^\Delta_n(0) = z_n$. • Let $N^n_{i-1}$ denote the number of particles in generation $i-1$, and for $j = 1, \ldots, N^n_{i-1}$, let $X^n_{i-1,j}$ denote the position of the $j$-th particle in generation $i-1$. The $j$-th particle behaves according to the law of $Z^\Delta_n$ with initial condition $Z^\Delta_n(0) = X^n_{i-1,j}$, and it stops moving upon reaching either the set $A_n$ or the next level set $C^n_{l_n(X^n_{i-1,j})-1} = C^n_{l_n(z_n)-i}$.
• If the j-th particle in generation i − 1 reaches A n first, do nothing.
• If the $j$-th particle in generation $i-1$ reaches $C^n_{l_n(z_n)-i}$ first, denote by $X^n_{i,j}$ its location of entrance to $C^n_{l_n(z_n)-i}$ and generate $r$ new particles in generation $i$ at position $X^n_{i,j}$. Once all the generations have been computed, define the estimator $s_n = r^{-l_n(z_n)} N^n_{l_n(z_n)}$. (5.4)
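The branching scheme above is easy to state in code. The sketch below runs the fixed-splitting estimator on the same toy one-dimensional reflected Euler walk used earlier (drift $-1$, start $0.5$, target $B = [1, \infty)$); the two thresholds standing in for the level sets $C^n_j$ and all parameter values are our own illustration, not a calibration from the paper:

```python
import math
import random

def splitting_estimate(r=2, thresholds=(0.6, 0.8), z0=0.5, eps=0.05,
                       theta=-1.0, dt=0.01, rng=random):
    """One replicate of the fixed-splitting estimator: each particle that
    crosses the next threshold toward B = [1, inf) is replaced by r offspring,
    and the final success count is discounted by r^(-number of levels)."""
    count = 0
    stack = [(z0, 0)]        # (position, index of next threshold to cross)
    while stack:
        z, j = stack.pop()
        while True:
            z = max(z + theta * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0), 0.0)
            if z <= eps:     # absorbed near the origin: contributes nothing
                break
            if z >= 1.0:     # reached the rare set B
                count += 1
                break
            while j < len(thresholds) and z >= thresholds[j]:
                j += 1       # crossed a level: replace particle by r offspring
                for _ in range(r - 1):
                    stack.append((z, j))
    return count * r ** (-len(thresholds))

random.seed(2)
runs = 400
est = sum(splitting_estimate() for _ in range(runs)) / runs
print(est)   # targets the same probability (~0.26) as crude Monte Carlo
```

Each level crossing multiplies the expected population by $r$, so discounting the terminal count by $r^{-(\text{number of levels})}$ keeps the estimator centered on the target probability, which is the content of (5.4).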
Now consider a sub-solution $\bar{T}$. If the problem is two dimensional, we let $\bar{T}(v) = -I(v) + \inf_{x\in B} I(x)$, where $I(v)$ is the explicit solution to the VP discussed in Section 4.1. It is easy to check that this is indeed a sub-solution, and $\bar{T}(0) = \inf_{x\in B} I(x)$. If the problem is three or higher dimensional, we use the sub-solution constructed in (4.5) of Section 4.2, where the norm $T$ on $\mathbb{R}^d$ is defined to be $T(x) = \sum_{i=1}^d |x_i|$. Define the importance function $V(z) = \delta\bar{T}(z)/\log(r)$. The level sets of the importance function $V$ are defined to be $L_x = \{y \in \mathbb{R}^d_+ : V(y) \le x\}$, $x \ge 0$. Now define the sets $C^n_j$ to be level sets of $V$: for some $\delta > 0$, let $C^n_j = L_{(j-1)\delta/n}$, $j \in \mathbb{N}$. Here $\delta$ is referred to as the level size of the splitting algorithm. In the proposition below we assume that $z_n = z/n \notin A_n \cup B$. Furthermore, the variance of $s_n$ can be measured using the following rate of decay of the second moment.
Proposition 5.3 is a combination of Proposition 7 and Theorem 8 from [8]. It should be noted that [8] assumes the rare event of interest is governed by a large deviation principle of a specific form that differs from the one considered in the current paper. However, the proofs of Proposition 7 and Theorem 8 do not depend on that assumption and still apply in the current setting.

RESTART splitting
The particle splitting algorithm presented in Section 5.2 is very efficient with regard to the mean and second moment of the number of particles generated, provided a sub-solution to the associated variational problem can be obtained. However, it can still require substantial computational effort. The source of inefficiency is that most of the generated particles will not end up in the rare event set (in our case, the set $B$) [9]. Even among particles generated near the set $B$ by splitting, most will still reach $A_n$ before $B$, and simulating those trajectories takes considerable time.
Here, we briefly introduce the RESTART splitting algorithm we implemented specifically for our problem. More details about the RESTART algorithm for rare event simulations can be found in [9]. We first define a sequence of importance functions {V n } as follows.
$\tilde{V}_n(y) = \Delta \lfloor (nU(z_n) - nU(y))/\Delta \rfloor$ and $V_n(y) = 0 \vee \tilde{V}_n(y)$, where $\Delta = \log(r)$ with $r$ a fixed positive integer, $z_n$ is the starting point, and $U(\cdot)$ is a subsolution satisfying the conditions in Definition 4.2. For each $n$, we define a sequence of nested sets $C^n_0 \supset C^n_1 \supset \cdots \supset C^n_{J_n} \supset C^n_{J_n+1}$, where $C^n_0 = \mathbb{R}^d_+$ and $C^n_{J_n+1} = \emptyset$, such that $y \in C^n_j \setminus C^n_{j+1}$ if and only if $V_n(y) = j\Delta$. For simplicity, we denote $C^n_j \setminus C^n_{j+1}$ by $D_j$. The algorithm starts from an initial particle $y_1$ located at $z_n$. Each particle is assigned a killing threshold $l$ when it is generated; the killing threshold for the initial particle is set to $0$. A particle is killed whenever it reaches any $D_j$ with $j < l$, and a killed particle is not simulated further. During the simulation, whenever a particle moves from $D_j$ to $D_k$, $k > j$, in one step, a total of $r^{k-j} - 1$ offspring (excluding the original particle) are generated. For each integer $j < \alpha \le k$, $(r-1)r^{\alpha-j-1}$ of the offspring are assigned the killing threshold $\alpha$. A particle is said to be stopped whenever it reaches the set $A_n$ or $B$. Notice that splitting can happen at the moment a particle is stopped; in that case, all the offspring are also stopped. A stopped particle is not simulated further. The algorithm terminates when all particles are stopped or killed.
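The mechanics above (killing thresholds, multi-level jumps, staggered offspring) can be sketched on the same toy one-dimensional walk used earlier. The levels and all numerical parameters are again our own illustration; a particle reaching $B$ contributes its importance weight $e^{-V_n}$, which at the deepest level equals $r^{-J}$ for $J$ levels:

```python
import math
import random

def restart_estimate(r=2, thresholds=(0.6, 0.8), z0=0.5, eps=0.05,
                     theta=-1.0, dt=0.01, rng=random):
    """One trial of a simplified RESTART estimator on a 1-d reflected walk.
    level(z) counts the thresholds at or below z; a particle is killed when
    its level drops below its killing threshold, and each upward crossing
    from level j to level k spawns r^(k-j) - 1 offspring with staggered
    killing thresholds, mirroring the scheme described in the text."""
    nlev = len(thresholds)
    level = lambda z: sum(z >= t for t in thresholds)
    score = 0.0
    stack = [(z0, level(z0), 0)]    # (position, current level, kill threshold)
    while stack:
        z, j, kill = stack.pop()
        while True:
            if z >= 1.0:            # stopped in B: weight exp(-V_n) = r^(-J)
                score += r ** (-nlev)
                break
            if z <= eps:            # stopped in A_n
                break
            z = max(z + theta * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0), 0.0)
            k = level(z)
            if k < kill:            # killed: fell below its threshold
                break
            if k > j:               # upward crossing: spawn offspring
                for alpha in range(j + 1, k + 1):
                    for _ in range((r - 1) * r ** (alpha - j - 1)):
                        stack.append((z, k, alpha))
            j = k
    return score

random.seed(3)
runs = 300
est = sum(restart_estimate() for _ in range(runs)) / runs
print(est)   # targets the same probability (~0.26) as the earlier examples
```

Compared with plain splitting, the killing thresholds prune particles that fall back a level, which is the source of RESTART's reduced computational effort per trial.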
Let $I$ be the index set of those particles which reach the set $B$. For a single trial of the simulation, $p_n(z_n)$ is then approximated by $r_n = \sum_{i\in I} \exp(-V_n(y_i(\tau)))$, where $y_i(\tau)$ is the position of $y_i$ when it is stopped. The following proposition summarizes the main results on the RESTART algorithm for $Z^\Delta_n$.

Proposition 5.4
(i) The estimator constructed from the RESTART algorithm is unbiased.
(ii) Suppose that $z_n \to 0$ with each $z_n \notin A_n \cup B$. Then the decay rate of the second moment of the estimator can be characterized analogously to Proposition 5.3. This proposition is a combination of results from [9]. The RESTART algorithm does not necessarily have a faster decay rate of its second moment than standard splitting. However, due to the lower computational effort per replication, it often has superior performance to standard splitting. In particular, there may be a slight increase in the variance of the estimator due to the use of RESTART instead of splitting, but this increase is often made up for by not simulating particles that are very unlikely to hit the target set.

Estimation of Rare Event Probabilities
Our goal is to estimate the rare event probability $p_n(z_n)$, and we do this with the estimators $s_n$ or $r_n$. We focus our discussion on $s_n$, but a similar analysis can be carried out for $r_n$. A standard metric for the quality of the estimator is the relative mean square error rMSE$(s_n)$, which can be decomposed into a relative variance term and a relative bias term. From Proposition 5.2 we know that the relative bias term grows at most at a subexponential rate. From Theorem 3.1 and Proposition 5.3 we can characterize the growth of the relative variance. As mentioned earlier, for the 2-d problem we have $\bar{T}(0) = \inf_{x\in B} I(x)$, and thus the relative variance grows subexponentially in $n$. An unbiased estimator whose relative variance grows subexponentially is said to be efficient; i.e., in two dimensions $s_n$ is an efficient estimator for $p^\Delta_n(z_n)$. For three and higher dimensions, the relative variance term grows with an exponential rate which is guaranteed to be non-negative due to the subsolution property. As a point of comparison, suppose one were to use standard Monte Carlo to estimate $p_n(z_n)$, and denote the estimator by $mc_n(z_n)$. The bias terms would remain the same, but the relative variance term would grow with exponential rate $\inf_{x\in B} I(x)$. Therefore the estimator $s_n$ has theoretically superior mean square error to $mc_n(z_n)$. Another point to mention is that the analysis of rMSE$(s_n)$ leaves out the computational work for generating a single copy of $s_n$. However, Proposition 5.3 guarantees that the expected work per copy of $s_n$ grows subexponentially, and thus when normalized by expected work per replicate, $s_n$ is still a superior estimator to $mc_n(z_n)$.
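The comparison with crude Monte Carlo can be made concrete with a back-of-the-envelope computation: for an estimator of a probability $p$ built from $N$ i.i.d. Bernoulli samples, the relative variance is $(1-p)/(Np)$, so a fixed relative standard error requires $N$ of order $1/p$, i.e., exponentially many samples as $p$ decays exponentially in $n$. The helper below is our own illustration of this arithmetic:

```python
def mc_samples_needed(p, rel_se):
    """Number of i.i.d. Bernoulli(p) samples needed so that the crude Monte
    Carlo estimator of p has relative standard error rel_se:
    Var(p_hat)/p^2 = (1-p)/(N p) = rel_se^2  =>  N = (1-p)/(p rel_se^2)."""
    return (1.0 - p) / (p * rel_se ** 2)

# If the rare event probability decays like e^{-n I}, the sample size needed
# for a 10% relative error grows exponentially in n:
for p in (1e-2, 1e-4, 1e-6):
    print(p, round(mc_samples_needed(p, 0.1)))
```

This is exactly the exponential-rate gap quantified above: the splitting estimator replaces the $\inf_{x\in B} I(x)$ rate in the relative variance with a strictly smaller (in two dimensions, zero) rate.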
Proposition 5.2 is unfortunately not strong enough to provide precise control over the relative bias term, but we do know that if it does grow it does so in a subexponential fashion. It is our hope that a stronger statement of the form (5.3) can be achieved.

Splitting algorithm
Before running the splitting algorithm for a specific example, some initial steps are needed. We first obtain a sub-solution $\bar{T}$ to the associated variational problem. If the problem is two dimensional, we let $T(v) = I(v)$, $\forall v \in \mathbb{R}^2_+$, where $I(v)$ is the explicit solution to the VP discussed in Section 4.1, and then let $\bar{T}(v) = -T(v) + \inf_{x\in B} T(x)$. If the problem is three or higher dimensional, we find the scaling factor by solving the corresponding optimization problem with the help of any optimization solver, and then construct the sub-solution as defined in (4.5) of Section 4.2. We then choose a positive integer $r > 1$ as the number of new particles generated when a splitting occurs, and choose $\delta \in (0, 1]$ as the level size. The importance function is then obtained as $V(z) = \delta\bar{T}(z)/\log(r)$. We then define a collection of level sets $\{C^n_j : j \in \mathbb{N}\}$, where $C^n_0 = B$ and $C^n_j = L_{(j-1)\delta/n}$, $j \in \mathbb{N}$, with $L_a = \{z : V(z) \le a\}$. Finally, for a point $x$, its level can be calculated through the level function $l_n(x) = \min\{j \ge 0 : x \in C^n_j\}$. We conduct numerical experiments for one 2-dimensional SRBM and one 3-dimensional SRBM. The first splitting simulation used 1,000 runs with step size $1/(1000n)$; the second used 200,000 runs with step size $1/(1000n)$. Note that our step size does not comply with the requirement in Proposition 5.2 for achieving the asymptotic error bound. However, for every value of $n$ used in the following simulations, $1/(1000n) \le 1/n^{2.5}$, where the latter step size satisfies the requirement in Proposition 5.2. In this way, we increase the accuracy of the simulation results when $n$ is relatively small, and the polynomial factor is not too large as $n$ increases. Simulations are run in Matlab on a computer with an Intel i7-7500U CPU @ 2.70GHz and 8.00 GB of RAM (7.88 GB usable).
For a 2-dimensional SRBM, we consider the initial position (0. We also ran the standard Monte Carlo simulation for the 3-dimensional example with n = 5 to compare against the results generated by the splitting algorithm. We ran the splitting algorithm 100,000 times, so that 100,000 · r^{l(x_0)−1} = 1,000,000, and ran standard Monte Carlo 1,000,000 times.
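For intuition, a single discretized trajectory can be sketched as follows. This is a minimal Python illustration assuming the special case of normal reflection (R = I), where each Euler step can simply be projected back onto the orthant; the general reflection matrices in our examples require a Skorokhod (linear-complementarity) step instead. All names and parameter values here are illustrative only.

```python
import math
import random

def srbm_step(z, theta, sigma, h, rng=random):
    """One projected-Euler step for an SRBM with normal reflection (R = I):
    add the drift and a Brownian increment, then project each coordinate
    back onto the non-negative orthant."""
    return [max(zi + ti * h + si * math.sqrt(h) * rng.gauss(0.0, 1.0), 0.0)
            for zi, ti, si in zip(z, theta, sigma)]

def hit_before(z0, theta, sigma, n, eps, h, rng=random):
    """Simulate one trajectory until the coordinate sum reaches n
    (i.e. the path reaches or overshoots B_n; return True) or drops
    into the neighborhood {sum z_k <= eps} of the origin (return False)."""
    z = list(z0)
    while True:
        z = srbm_step(z, theta, sigma, h, rng)
        s = sum(z)
        if s >= n:
            return True
        if s <= eps:
            return False
```

Standard Monte Carlo then simply averages `hit_before` over many independent trajectories, which is exactly what becomes infeasible as n grows.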

RESTART algorithm
We applied the RESTART algorithm to the same 2-dimensional example as in the previous section. As expected, the RESTART algorithm required much less computational effort. Hence we could run more trials and simulate cases with large values of n, which would be very time-consuming with the splitting algorithm.
Here we give numerical results for a 2-dimensional SRBM. The simulation used 10,000 runs with step size 1/(1000n), on the same Matlab setup as in the previous section.
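Schematically, both algorithms share the fixed-splitting loop below: a particle is cloned into r copies on its first entrance into each new (lower) level, and killed on reaching a prescribed high level; RESTART differs in that it additionally kills clones that climb back above the level at which they were created. The sketch uses a generic one-step kernel and level function of our own choosing; it is an illustration, not the Matlab implementation behind the tables.

```python
import random

def splitting_run(step, level, x0, r, kill_level, rng):
    """One run of fixed splitting: each first entrance of a particle into a
    strictly lower level creates r - 1 additional copies; a particle is
    killed when its level reaches kill_level, and counted as a hit when its
    level reaches 0 (the target set). hits / r**(level(x0) - 1) is then one
    unbiased sample of the rare-event probability."""
    hits = 0
    stack = [(x0, level(x0))]          # (state, lowest level seen so far)
    while stack:
        x, best = stack.pop()
        while True:
            x = step(x, rng)
            l = level(x)
            if l == 0:
                hits += 1
                break
            if l >= kill_level:
                break                   # particle killed
            if l < best:                # first entrance into a new level
                best = l
                stack.extend([(x, best)] * (r - 1))
    return hits
```

On a toy upward-biased random walk (up-probability 0.7, killed at level 6, started at level 3, r = 2), averaging hits/r² over many runs approximates the gambler's-ruin probability of reaching 0 before 6, roughly 0.073.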

Discussion
In this project, we developed particle-based simulation algorithms for estimating rare event probabilities related to SRBMs in a non-negative orthant. We found that splitting-type algorithms perform far better than the standard Monte Carlo method, and that the RESTART algorithm is superior to the regular splitting algorithm in terms of running time. Following [8,9], our algorithms are based on subsolutions to the variational problems associated with our rare events. In particular, the work [1] developed solutions to these variational problems in two dimensions; for three and higher dimensions we construct new subsolutions. Using the results of [8,9], we are able to show that our splitting algorithm outperforms standard Monte Carlo in two and higher dimensions. Future directions for this work include developing subsolutions in three and higher dimensions that are closer to the solution, thereby enabling simulation algorithms with better performance. Another possible direction is to consider more general reflecting diffusions, e.g., reflecting diffusions with state-dependent drift and variance.

Proofs
We provide all the proofs in this section.

Proofs for Section 3
To show Proposition 3.1, we use the Lyapunov function introduced in [13]. Recall that R_i is the ith column of the reflection matrix R, and for a d × d matrix M, its trace is denoted tr(M).

Theorem 8.1 ([13]) Under Assumption 2.2, and the assumption that R is completely-S, there exists a Lyapunov function L such that:
(ii) given M_0 > 0, there exists M_1 > 0 such that L(z) ≥ M_0 whenever |z| ≥ M_1;
(iii) given ε > 0, there exists M_2 > 0 such that |L(z)| ≤ ε whenever |z| ≥ M_2;
(iv) there exists c_0 > 0 such that
(v) L(αz) = αL(z) for all α ≥ 0 and z ∈ R^d_+.
Some consequences of the above properties are obtained in [4] and are listed as follows. Then for any κ > 0 and m, n ∈ N with m ≤ n, we have
where M_4 is as in Theorem 8.1 (vii) and γ_0 is a positive constant depending only on the norm of Σ^{−1}.
Proof of Proposition 3.1. Define τ_ε = inf{t ≥ 0 : Σ_{k=1}^d Z_k(t) ≤ ε}; then nτ_n ≤ τ_ε. Thus it suffices to show that there exists c > 0 such that
The proof idea is adapted from that of
Multiplying both sides of the above equation by κε, from Theorem 8.1 (iv), (v) and (viii), we have for m ≤ n, on A_n, which implies that
From (8.1) and using Lemma 8.1, we have for any κ > 0,
We can choose ∆ large enough and κ small enough such that −η ≡ log(2√2)/∆ + κ²M_4²γ_0 − κc_0/2 < 0. Therefore, we have
For t > 0, there exists n_0 ∈ N such that t ∈ [n_0∆, (n_0 + 1)∆], and we have
When z ∈ R^d_+ satisfies Σ_{k=1}^d z_k ≤ ε or Σ_{k=1}^d z_k ≥ n, we have τ_n = 0. When ε < Σ_{k=1}^d z_k < n, we have for c ∈ (0, η),
The result follows.

where P_{z_n} and E_{z_n} are the probability and expectation conditioned on Z_n(0) = z_n. Therefore we can take T > 0 such that lim sup
The following proof is similar to that of Lemma 5.7.21 in [10]. Define the following closed set of C([0, T]; R^d_+):
Then
Also note that for T > 0 we have
We next note that
where τ_ψ = inf{t ≥ 0 : Γ(ψ)(t) ∈ B}. Thus it suffices to establish that lim inf
for any ψ ∈ AC([0, ∞), R^d) such that ψ(0) = 0 and τ_ψ < ∞. Noting that the possible optimal solutions of the VP in (3.2) never return to 0 before reaching B, we only need to consider ψ such that Σ_{k=1}^d Γ(ψ)_k(s) > 0 for all s ∈ (0, τ_ψ]. Fix such a ψ. Let φ_z̄(t) = z̄ + ψ(t), with Σ_{k=1}^d z̄_k > 0. It is clear that
Fix κ > 0, and let B_κ = {z ∈ R^d_+ : Σ_{k=1}^d z_k = κ}. For κ, δ > 0, define
For κ ∈ (0, 1) there exist δ_0 > 0 and N ∈ N such that, when 0 < δ ≤ δ_0,
Assume that n is large enough that κ > Σ_k z_{n,k} > ε/n. From the Markov property of Z_n, we have that p_n(z_n) ≥ P_{z_n}(Z_n hits B_κ before hitting A_n) · P(Z_n hits B before hitting A_n | Z_n(0) ∈ B_κ).
Using the minimality property of the one-dimensional Skorokhod map, we see that Σ_{k=1}^d Z_{n,k}(t) ≥ Γ_1(Σ_{k=1}^d z_{n,k} + Σ_{k=1}^d X̄_{n,k})(t) for all t ≥ 0, where X̄_n(t) = θt + X(nt)/n and Γ_1 is the one-dimensional Skorokhod map. Thus we have P_{z_n}(Z_n hits B_κ before hitting A_n) ≥ P(Σ_{k=1}^d z_{n,k}
We next note that lim inf_{n→∞} (1/n) log P(Z_n hits B before hitting A_n | Z_n(0) ∈ B_κ)

Proofs for Section 4
Proof of Lemma 4.1. For any w, v ∈ R^d_+ \ (A ∪ B), suppose we have an optimal triple (ψ, η, φ) with φ(0) = w and φ(T) = v. Denote by P(I) the power set of L. When φ traverses a face F_K from time t_1 to t_2 for some K ∈ P(I), the segment of (ψ, η, φ) must be locally optimal; otherwise, we could find a path with lower cost. For K ∈ P(I), define the set G_K = {t ∈ [0, T] : φ_i(t) = 0, φ_j(t) > 0, ∀i ∈ K, ∀j ∈ L\K}°, where E° denotes the interior of a set E. We note that G_K is an open set and can be represented as the disjoint union of at most countably many open intervals; denote the set of these open intervals by O_K. Notice that the optimal path traverses the face F_K from time t_1 to t_2 if (t_1, t_2) ∈ O_K. Observe that ∪_{K∈P(I)} O_K contains at most countably many disjoint open intervals, and
Both equalities can be justified by the observation that the set [0, T] \ {∪_{K∈P(I)} G_K} contains at most countably many singleton points, which contribute 0 to the cost.
Proof of Theorem 4.2. Theorem 4.2 summarizes some major results of [14], including Theorems 3 and 4, Lemma 5, and Propositions 1 and 2. Despite a small difference in problem settings, their proofs remain valid in our setting. More specifically, given K ⊆ L and two points v, w ∈ R^d_+ which satisfy Condition 4.1 with respect to K, Theorems 3 and 4 of [14] assert that there is a unique J* ⊆ K such that ⟨A_{J*} R_j, α_{J*}(w, v)(v − w) − θ⟩_D ≤ 0 for j ∈ K\J*, and, when J* ≠ ∅, each component of λ_{J*}(w, v) is positive. Proposition 1 of [14] shows that I*
Lemma 5 and Proposition 2 of [14] give the explicit form of an optimal path.
Proof of Lemma 4.2. See the proof of Lemma 1 in [14].
Proof of Lemma 4.3. (i) If I*_K(w, v) = 0, then by Theorem 4.2, there exists an optimal path such that ψ̇(t) = θ < 0 for all t ∈ (0, and
Multiplying both sides of the second equality in (8.6) by τ_φ(v), we get
Comparing the above equation with (8.5), we obtain that
Since R is invertible, all its columns are linearly independent. If J* ≠ K, then for i ∈ K \ J*, η_i(τ_φ(v)) = 0, which leads to a contradiction. Thus we conclude that J* = K. We next note that v_i − w_i < 0 for i ∈ L\K from (8.5) and the fact that θ < 0, which yields that v_i − w_i ≤ 0 for all i ∈ L. Since both w and v are in R^d_+, T(v) ≤ T(w) follows from the monotonicity of T(·).
Define V(∆) = max_{0≤t≤∆} |W(t)|. By the Markov property of Brownian motion, we have that
where {V_k(∆) : k ∈ {1, . . . , ⌊T/∆⌋}} is an independent sequence of copies of V(∆). From [21], an upper bound on the tail probabilities of V(∆) can be derived. In particular, for z > 0,
It follows that for z > 0
where T(∆) = ⌊T/∆⌋. Next we use the inequality log(z) ≥ −(1 − z)/z for z ∈ (0, 1) to conclude that
Considering the exponent in the above expression, we have
and thus
Now consider a scaled version of W_T defined by W_n(t) = W(nt)/n, t ∈ [0, T], which is denoted by W_{n,T} = {W_n(t) : 0 ≤ t ≤ T}. For ∆ > 0 we consider the discretization
Under Assumption 4.2, the Skorokhod map is Lipschitz continuous, and there exists a constant C depending only on R such that for t ≥ 0,
Now consider the large deviation scaled processes Z_n(t) = Z(nt)/n, X_n(t) = X(nt)/n, and Y_n(t) = Y(nt)/n. From the homogeneity of the Skorokhod map under linear scalings, (Z_n, Y_n) is the solution of the Skorokhod problem for {X̄_n(t) = X_n(t) + θt; t ≥ 0}. It follows from (8.10) that
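For concreteness, one standard bound of this type (our reconstruction via the reflection principle and a union bound over coordinates, not necessarily with the exact constants used in the text) reads:

```latex
% For a standard d-dimensional Brownian motion W with
% V(\Delta) = \max_{0 \le t \le \Delta} |W(t)|:
% |W(t)| \ge z forces |W_i(t)| \ge z/\sqrt{d} for some coordinate i,
% so a union bound, the reflection principle, and the Gaussian tail
% bound P(W_i(\Delta) \ge a) \le e^{-a^2/(2\Delta)} give, for z > 0,
\begin{align*}
P\big(V(\Delta) \ge z\big)
  &\le \sum_{i=1}^{d} P\Big(\max_{0 \le t \le \Delta} |W_i(t)| \ge z/\sqrt{d}\Big) \\
  &\le \sum_{i=1}^{d} 4\, P\big(W_i(\Delta) \ge z/\sqrt{d}\big)
   \le 4d \exp\!\left(-\frac{z^2}{2d\Delta}\right).
\end{align*}
```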

Proofs of Propositions 5.3 and 5.4
Proof of Proposition 5.3. The stability follows directly from Proposition 7 in [8], and (5.5) follows from Lemma 5 of the same reference. To prove (5.6), we need to verify that Conditions 1 and 4 in [8] hold for our problem. Condition 1 of [8] follows from Theorem 3.1. For their Condition 4, the positive recurrence of our process implies that σ_n is finite almost surely. Then
1_{Z_n(σ_n)∈L_x} (p_n(Z_n(σ_n)))² ≤ 1_{Z_n(σ_n)∈L_x} (p_n(Z_n(σ_n))) · sup_{y∈∂L_x} p_n(y).
Hence, using the strong Markov property of Z_n, we have
E_{z_n}[1_{Z_n(σ_n)∈L_x} (p_n(Z_n(σ_n)))²] ≤ sup_{y∈∂L_x} p_n(y) · E_{z_n}[1_{Z_n(σ_n)∈L_x} p_n(Z_n(σ_n))] = sup_{y∈∂L_x} p_n(y) · p_n(z_n).

Proof of Proposition 5.4. It suffices to verify that the five conditions in [9], namely Conditions 3.1, 4.3, 5.1, 5.2, and 5.3, are satisfied. We note that for our specific problem, these conditions essentially require that:
• A_n and B are closed sets.
• Z^∆_n(t) satisfies the large deviation principle.
• For any compact K ⊂ R^d_+, all zero-cost trajectories with initial positions in K enter (A_n ∪ B)° by some fixed finite time depending on K.
Indeed all these conditions are satisfied, and thus the proposition follows.