Simulation of a simple particle system interacting through hitting times

We develop an Euler-type particle method for the simulation of a McKean–Vlasov equation arising from a mean-field model with positive feedback from hitting a boundary. Under assumptions on the parameters which ensure differentiable solutions, we establish convergence of order 1/2 in the time step. Moreover, we give a modification of the scheme using Brownian bridges and local mesh refinement, which improves the order to 1. We confirm our theoretical results with numerical tests and empirically investigate cases with blow-up.


Introduction
There has been a recent surge in interest in mean-field problems, both from a theoretical and an applications perspective. We focus on models where the interaction derives from feedback on the system when a certain threshold is hit. Application areas include electrical surges in networks of neurons and systemic risk in financial markets. As analytic solutions are generally not known, numerical methods are indispensable, but still lacking.
We therefore propose and analyse numerical schemes for the simulation of a specific McKean–Vlasov equation which exhibits the key features of these models, namely

Y_t = Y_0 + W_t − αL_t,  t ∈ [0, T],   (1)
τ = inf{t ≥ 0 : Y_t ≤ 0},   (2)
L_t = P(τ ≤ t),   (3)

where α, T ∈ R_+, W is a standard Brownian motion on a probability space (Ω, F, (F_t)_{t≥0}, P), on which is also given an R_+-valued random variable Y_0 independent of W. The nonlinearity arises from the dependence of L_t in (1) on the law of Y. More specifically, if t → L_t is differentiable, its derivative p_τ is the density of the hitting time τ of zero, so that the drift depends on the law of the path of Y. Theoretical properties of (1)-(3) have been studied in Hambly et al. (2018), who prove the existence of a differentiable solution (L_t)_{0<t<t*} up to an "explosion time" t*. Conversely, they show that L cannot be continuous for all t for α above a threshold determined by the law of Y_0. Such systemic events where discontinuities occur are also referred to as "blow-ups" in the literature.
The question of how to construct the solution numerically, however, remained open. Examples of a numerical solution computed with Algorithm 1 introduced in Section 2 are shown in Figure 1. The left plot shows the formation of a discontinuity in the loss function t → L_t for increasing α, with Y_0 ∼ Gamma(1.5, 0.5). The density of Y_T for T before and after the shock is displayed in the right panel. A similar model has been studied in Nadtochiy and Shkolnikov (2017), where the authors consider log(1 − L_t) instead of −L_t in (1). Our numerical scheme can in principle be applied to this problem, but we concentrate the analysis on (1)-(3) in this paper.
One motivation for studying these equations comes from mathematical finance, in particular, systemic risk. A large interconnected banking network can be approximated by a particle system with interactions by which the default of one firm, modeled as the hitting of a lower default threshold of its value, causes a downward move in the firm value of others. More details can be found in Hambly et al. (2018) and Nadtochiy and Shkolnikov (2017). This model can also be viewed as the large pool limit of a structural default model for a pool of firms where interconnectivity is caused by mutual liabilities, such as in Lipton (2016).
An earlier version of this problem is found in neuroscience, where a large network of electrically coupled neurons can be described by McKean–Vlasov type equations (Cáceres et al. (2011); Carrillo et al. (2013); Delarue et al. (2015b,a)). If a neuron's potential reaches some fixed threshold, it jumps to a higher potential level and sends a signal to other neurons. This feedback leads to a system of equations (4)-(5), where X_0 < 1 a.s. The similarity to (1)-(3) is seen by noticing that (5) is constant between hitting times. However, while in (4) an upper boundary is hit, after which the value resets to zero, in our model we are interested in hitting the zero boundary (from above), after which the particle's value remains zero. While McKean–Vlasov equations are an active area of current research, to our knowledge there is only a fairly small number of papers where their simulation is studied rigorously, and none of these encompass the models above.
Early works include Bossy and Talay (1997), which proves convergence (with order 1/2 in the timestep and inverse number of particles) of a particle approximation to the distribution function for the measure ν t of X t in the classical McKean-Vlasov equation with sufficient regularity. The proven rate in the timestep is improved to 1 in Antonelli et al. (2002) using Malliavin calculus techniques. More recently, multilevel simulation algorithms have been proposed and analysed: Ricketson (2015) considers the special case of an SDE whose coefficients at time t depend on X t and the expected value of a function of X t ; Szpruch et al. (2017) study a method based on fixed point iteration for the general case (6). An alternative variance reduction technique by importance sampling is given in Dos Reis et al. (2018).
The system (1)-(3) above does not fall into the setting of (6) due to the extra path-dependence of the coefficients through the hitting time distribution. In this paper, we therefore propose and analyse a particle scheme for (1)-(3) with an explicit timestepping scheme for the nonlinear term. We simulate N exchangeable particles at discrete time points with distance h, whereby at each time point t we use an estimator of L_{t−h} from the previous particle locations to approximate (3). We prove the convergence of the numerical scheme up to some time T as the time step h goes to zero and the number of particles goes to infinity. We also prove that the scheme can be extended up to the explosion time under certain conditions on the model parameters. The order in h for this standard estimator is 1/2. Next, we use Brownian bridges to better approximate the hitting probability, similarly to barrier option pricing (Glasserman (2013)). In this case, the convergence order improves to (1+β)/2, where β ∈ (0, 1] is the Hölder exponent of the density of the initial value Y_0 (e.g., 0.5 in the example from Figure 8 below). The order can be improved to 1 for all β ∈ (0, 1] by non-uniform time-stepping. A main contribution of the paper is the first provably convergent scheme for equations of the type (1)-(3). The analysis uses a direct recursion for the error and regularity results proven in Hambly et al. (2018). This has the advantage that sharp convergence orders, i.e., consistent with the numerical tests, can be given, but it also means that it seems difficult to apply the analysis directly to variations of the problem where such results are not available. Nonetheless, the method itself is natural and applicable in principle to other settings such as those outlined above.
The rest of the paper is organized as follows. In Section 2 we list the running assumptions and state the main results of the paper; in Sections 3.1 and 3.2 we prove the uniform convergence of the discretized process; in Section 3.3, we show the convergence of Monte Carlo particle estimators with an increasing number of samples; in Section 3.4 we prove the convergence order for the scheme with Brownian bridges; in Section 4 we give numerical tests of the schemes; finally, in Section 5, we conclude.

Assumptions and main results
We begin by listing the assumptions. The first one, Hölder continuity at 0 of the initial density, is key for the regularity of the solution. The Hölder exponent will also limit the rate of convergence of the discrete time schemes.
Assumption 1. We assume that Y_0 has a density f_{Y_0} supported on R_+ such that

f_{Y_0}(x) ≤ B x^β,  x > 0,   (7)

for some β ∈ (0, 1] and B > 0. Under Assumption 1, we can refer to Theorem 1.8 in Hambly et al. (2018) for the existence of a unique, differentiable solution t → L_t of (1)-(3) up to an explosion time t*, and a corresponding B̄ such that for every t < t*

L'_t ≤ B̄ t^{(β−1)/2}.   (8)

This estimate admits a singularity of the rate of losses at time 0; however, what is actually observed in numerical studies (see Figure 1, left) is that the loss rate is bounded initially but then has a sharp peak for small β (and then especially for large α). Integrating (8), we have for future reference a bound on L_t,

L_t ≤ B̃ t^{(1+β)/2},   (9)

where B̃ = 2B̄/(1 + β).
The following assumption will be used to control the propagation of the discretisation error, by bounding the density (especially at 0) of the running minimum of Y and its approximations.
Assumption 2. We assume that T < min(T*, t*), where T* is defined by (10), with B and B̃ the smallest constants such that (7) and (9) hold for the given β.
In the following, we assume that Assumptions 1 and 2 hold. Consider a uniform time mesh 0 = t_0 < t_1 < . . . < t_n = T, where t_i − t_{i−1} = h, and a discretized process given by

Ỹ_{t_i} = Y_0 + W_{t_i} − αL̃_{t_{i−1}},   L̃_{t_i} = P(min_{j≤i} Ỹ_{t_j} ≤ 0),

for 1 ≤ i ≤ n, with L̃_0 = 0. We extend L̃_{t_i} to [0, T] by setting L̃_s = L̃_{t_{i−1}} for t_{i−1} < s < t_i. The first theorem, proven in Section 3.1, shows that L̃_t converges uniformly to L_t.
Algorithm 1 Discrete time Monte Carlo scheme for simulation of the loss process
Require: N number of Monte Carlo paths
Require: n number of time steps: 0 < t_1 < t_2 < . . . < t_n
1: Draw N samples of Y_0 (from the initial distribution) and W (a Brownian path)
2: Define L̂_0 = 0
3: for i = 1 : n do
4:    for j = 1 : N do
5:       Set Ŷ^j_{t_i} = Y_0^j + W^j_{t_i} − αL̂_{t_{i−1}}
6:    end for
7:    Set L̂_{t_i} = (1/N) Σ_{j=1}^N 1{min_{k≤i} Ŷ^j_{t_k} ≤ 0}
8: end for
In Section 3.3, we prove convergence in probability of Algorithm 1 as N → ∞.
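For concreteness, the loop of Algorithm 1 can be sketched in vectorised form as follows. This is a minimal sketch, not the authors' implementation: the function and argument names (simulate_loss, sample_Y0) are our own, and the initial distribution is passed in as a sampler.

```python
import numpy as np

def simulate_loss(alpha, T, n, N, sample_Y0, seed=None):
    """Sketch of Algorithm 1: explicit Euler particle scheme for the loss.

    sample_Y0(N, rng) should return N draws of the initial value Y_0.
    """
    rng = np.random.default_rng(seed)
    h = T / n
    X = sample_Y0(N, rng)          # driftless part Y_0 + W_t per particle
    hit = np.zeros(N, dtype=bool)  # particles that have hit zero so far
    L = np.zeros(n + 1)            # loss estimates at t_0, ..., t_n
    for i in range(1, n + 1):
        X = X + np.sqrt(h) * rng.standard_normal(N)
        # explicit step: the drift uses the loss from the previous time point
        hit |= (X - alpha * L[i - 1] <= 0)
        L[i] = hit.mean()
    return L
```

A usage example with a Gamma initial distribution, as in Figure 1, would be `simulate_loss(0.8, 2.0, 200, 10**5, lambda N, rng: rng.gamma(1.5, 2.0, size=N))`.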
Next, we improve our scheme by using a Brownian bridge strategy to estimate the hitting probabilities. In order to do this, we consider the process obtained by interpolating the discrete particle values with Brownian bridges. Then, for each Brownian path, the conditional probability that the bridge between two positive values at consecutive mesh points crosses zero within the step is known in closed form, and an N-sample estimator L̄_{t_i} of the loss is given in (19) below. Hence, using Brownian bridges, we account for hitting within each timestep rather than only at the mesh points. As a result, the new algorithm with the Brownian bridge modification is the following.

Algorithm 2 Discrete time Monte Carlo scheme with Brownian bridge
Require: N number of Monte Carlo paths
Require: n number of time steps: 0 < t_1 < t_2 < . . . < t_n
1: Draw N samples of Y_0 (from the initial distribution) and W (a Brownian path)
2: Define L̄_0 = 0
3: for i = 1 : n do
4:    Estimate L̄_{t_i} using (19)
5: end for
The convergence rate for (17) is given in Theorem 3 and proven in Section 3.4.
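A sketch of the Brownian bridge modification follows. Over each step, for a particle with positive values a at t_{i−1} and b at t_i (with the drift frozen at the previous loss estimate), a Brownian bridge crosses zero with probability exp(−2ab/h), the same device as in barrier option pricing. The survival probabilities are accumulated multiplicatively per path; function names and this particular bookkeeping are our own choices, not fixed by the paper.

```python
import numpy as np

def simulate_loss_bridge(alpha, T, n, N, sample_Y0, seed=None):
    """Sketch of Algorithm 2: Euler scheme with Brownian bridge correction."""
    rng = np.random.default_rng(seed)
    h = T / n
    X = sample_Y0(N, rng)   # driftless part Y_0 + W_t per particle
    surv = np.ones(N)       # P(path has not yet hit zero), per particle
    L = np.zeros(n + 1)
    for i in range(1, n + 1):
        X_new = X + np.sqrt(h) * rng.standard_normal(N)
        a = X - alpha * L[i - 1]      # particle value at t_{i-1}
        b = X_new - alpha * L[i - 1]  # and at t_i, with frozen drift
        # bridge crossing probability; a path with a <= 0 or b <= 0 has hit
        p_cross = np.where((a > 0) & (b > 0), np.exp(-2 * a * b / h), 1.0)
        surv *= 1.0 - p_cross
        L[i] = 1.0 - surv.mean()
        X = X_new
    return L
```

The per-step cost is the same as for Algorithm 1; only the hitting indicator is replaced by a crossing probability, which is what improves the weak order in h.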

Convergence of the timestepping scheme
In this section we prove Theorem 1. Then, in Section 3.2, under a modification of Assumption 2, we formulate and prove an improvement of this theorem which extends the applicable time interval. The proof is based on induction on the error bound over the timesteps, which requires an error estimate of the hitting probability after discretisation (Lemmas 1 and 2), a sort of consistency, plus a control of the resulting misspecification of the barrier through a bound on the density of the running minimum (Lemma 3), a kind of stability.
As we have crude estimates on the densities, using only Assumption 1 but no sharper bound on, and regularity of, f_{Y_0}, or any regularity of the distribution of inf_{s≤t}(W_s − αL_s) or its approximations, the time T* until which the numerical scheme is shown to converge will by no means be sharp. Indeed, in our numerical tests (see Section 4) we did not encounter any difficulties for any t, even t > t*.
We first formulate these auxiliary results which we will use to prove Theorems 1, 3, and 4. The proofs are given in Appendix A.
First, in Lemma 1 we modify Proposition 1 from Asmussen et al. (1995), where an analogous result is shown for standard Brownian motion, without the presence of L and L̃ (i.e., α = 0). The next lemma deduces the convergence rate of the hitting probabilities on the mesh.
Lemma 2. Consider the processes Y and Ỹ. Then, for any δ > 0 there exist γ, γ̃ > 0, independent of h and i, such that the estimates (23) and (24) hold for all t_i < t*.
Finally, we bound the probability that the running minimum is close to the boundary.
We are now in a position to prove Theorem 1.
We split the error into two contributions. We can use Lemma 2, (23), for the second term. For the first term, we apply Lemma 3 to control the density of the running minimum near the barrier. As a result, we obtain a recursive inequality for C̃_i, hence C̃_i is bounded independent of i and h by Assumption 2. By induction we get (14).

Extension of the result in time
The following result extends the applicability of Theorem 1 up to the explosion time t* under certain conditions on the parameters, as specified precisely in (27) below. We shall adapt Theorem 1 in Borovkov and Novikov (2005), which bounds the difference of the boundary crossing probabilities of two boundaries f and g, where f is Lipschitz with constant K and sup_{s≤t} |f(s) − g(s)| ≤ ε. By Remark 2 in Borovkov and Novikov (2005), this result can be improved for a non-decreasing function g. Indeed, retracing the steps in their proof and using monotonicity, one easily finds the slightly better bound

|P(∃s ∈ [0, t] : W_s < f(s)) − P(∃s ∈ [0, t] : W_s < g(s))| ≤ 2(K + 1/√t) ε.
In our case, we cannot directly apply the result with f(s) = −Y_0 + αL_s and g(s) = −Y_0 + αL̃_s, as f is not guaranteed to be Lipschitz at s = 0. But, along the lines of the proof of Theorem 1 in Borovkov and Novikov (2005), the above result can be modified as follows.

Lemma 4. For a non-decreasing function g which is Lipschitz with constant K on [T*, T], and a function f such that f(t) = g(t) for t ≤ T* and sup_{s≤T} |f(s) − g(s)| ≤ ε, the bound (26) holds.

Theorem 4. Assume T* from (10) satisfies condition (27). Then Theorem 1 holds for any T < t* (and not only up to T* as per Assumption 2).
Proof. We first split the error into two terms as in the proof of Theorem 1. The second term can be estimated by Lemma 2, (24). Again, we then proceed by induction. We have already shown that L is Lipschitz on [T*, T], with Lipschitz constant K, so that Lemma 4 applies. To estimate the first term, we use that, since l_t = L̃_t for t > T*, hitting after T* does not affect the difference, while for t ≤ T*, l_t = L_t. Then, the part up to T* is controlled by Theorem 1, and to the second term in (30) we can apply (26), together with (29). Moreover, by (24) and (32), (28) can be rewritten with γ̄ = γ̃ + C̃_{T*}. Taking (27) into account, we arrive at a recursive inequality for C_i. Since the Lipschitz constant satisfies K ≤ B̄(T*)^{−(1−β)/2}, we have 2α(K + 1/√t_i) < 1 by (27). Hence, C_i is bounded independent of h, and by induction we get the result.

Monte Carlo simulation of discretized process
In this section, we prove the convergence in probability of the particle estimator L̂^N_{t_i} in Algorithm 1 to L̃_{t_i} as N → ∞. We note that we cannot directly apply the law of large numbers, as the summands are dependent through L̂^N_{t_j}, j < i. However, we see below that the dependence diminishes (i.e., the covariance goes to zero) as N → ∞, which easily gives convergence, albeit without a central limit theorem-type error estimate or a rate for the variance.
First, we formulate an auxiliary lemma.
Lemma 5. Consider i ≤ n, and assume that L̂^N_{t_j} converges to L̃_{t_j} in probability as N → ∞ for all j < i. Then the bias and variance of L̂^N_{t_i} vanish as N → ∞. The proof is given in Appendix B. Now we can deduce the convergence directly.
Proof of Theorem 2. The proof is immediate by induction. The statement is true for i = 0. Now take i ≥ 1. By Lemma 5, there exists N* such that for all N > N* the variance of L̂^N_{t_i} is arbitrarily small. Thus, by Chebyshev's inequality and using Lemma 5 again, we obtain the statement for i, and by induction we have proved the theorem.

Brownian bridge convergence improvement
In this section, we prove Theorem 3, which ascertains the uniform convergence (in t) of L̄ to L at the improved rate.
Proof of Theorem 3. We shall proceed by induction. For i = 0, we have that L̄_0 = L_0 = 0. Assume we have shown that |L̄_{t_j} − L_{t_j}| ≤ C_j h^{(1+β)/2} for all j < i with some C_j > 0, and we want to estimate |L̄_{t_i} − L_{t_i}|. First, for j > 0, we obtain an estimate in terms of ϕ_i(x), the density of inf_{s<t_i} Y_s. Then, bounding this density by Lemma 3, C_i is seen to be bounded independent of h and i by (10). By induction we get (20).
The proof of Theorem 3 indicates that the order is limited by the behaviour of L for small t. The next result shows that a locally refined time mesh achieves convergence order 1 for all β.
Corollary 1. Consider a non-uniform time mesh t_i = (ih)^{2/(1+β)} for 0 ≤ i ≤ n, with h = T^{(1+β)/2}/n. Then there exists C_1 > 0, independent of h, such that the error of the Brownian bridge scheme on this mesh is bounded by C_1 h.

Proof. We shall proceed by induction as in the proof of Theorem 3 above. By the mean value theorem, the time step now satisfies t_i − t_{i−1} = (2/(1+β)) h^{2/(1+β)} (i − ξ)^{(1−β)/(1+β)} for some ξ ∈ (0, 1), and, for j > 0, the corresponding step estimate follows. Hence, for j > 0, the local contribution is of order h for some B̂ > 0 independent of h and j. We treat j = 0 separately and obtain the bound directly. Repeating the remaining steps of the proof of Theorem 3, we get the result.
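The refined mesh of Corollary 1 is straightforward to construct; a short sketch (the function name is ours):

```python
import numpy as np

def refined_mesh(T, n, beta):
    # t_i = (i*h)^(2/(1+beta)) with h = T^((1+beta)/2)/n, so that
    # t_0 = 0 and t_n = T. The steps are finest near t = 0, which
    # compensates the initial singularity of L'_t when beta < 1.
    h = T ** ((1 + beta) / 2) / n
    return (np.arange(n + 1) * h) ** (2 / (1 + beta))
```

For β = 1 the construction reduces to the uniform mesh, consistent with the rate 1 already obtained there.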

Numerical experiments
In this section, we demonstrate that the proven convergence orders in the timestep of 1/2, (1+β)/2, and 1 for the different methods are indeed sharp in the case of regular solutions; that the empirical variance of N-sample estimators is of order 1/N; and that the method also converges experimentally in the presence of blow-up. To show this, we study three test cases with varying regularity of the initial data and of the loss function.

Lipschitz initial data and no blow-up
In our first experiment, we choose Y_0 such that 1/Y_0 ∼ Exp(λ), which guarantees that the density of Y_0 decays exponentially near zero. We take λ = 1 in our experiments, and pick the parameters in (1) to be α = 0.8, T = 2. The solution is found to be continuous.
We perform numerical simulations using Algorithms 1 and 2 with N = 2 × 10^5 particles and different time meshes varying from 50 to 3200 points. To estimate the error, we consider the difference |L^{2n}_T − L^n_T| between the solutions with 2n and n timesteps, respectively, computed with the same paths. The results, presented in Figure 2, agree with the theory: for Algorithm 1, we get the convergence rate 1/2, and for Algorithm 2, the rate is 1, because the initial distribution is regular enough around 0, i.e., effectively β = 1 in Assumption 1. We also investigate empirically the convergence rate in N of the particle estimators to their large-N limits. In order to compute the benchmark solution, we used N = 5 × 10^7 particles. From the results we conclude that both Algorithms 1 and 2 have convergence rate 1/2 in N. To illustrate the dependence on the parameter α, we include in Figure 3 plots of L_t and L'_t for different values of α. We evaluate L'_t numerically using a central finite difference approximation. In order to reduce the Monte Carlo noise, we increase N to 5 × 10^7 and reduce n to 200.
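The empirical order can be read off from the successive differences |L^{2n}_T − L^n_T| by a least-squares fit of log2(error) against the refinement level; a small helper along these lines (our own naming, not code from the paper):

```python
import numpy as np

def estimate_order(errors):
    # errors[k] is the difference |L^{2n} - L^n| at refinement level k,
    # where the number of timesteps doubles with each level. The fitted
    # slope of log2(errors) against k is minus the convergence order.
    k = np.arange(len(errors))
    slope = np.polyfit(k, np.log2(np.asarray(errors, dtype=float)), 1)[0]
    return -slope
```

Applied to differences decaying like h^{1/2} or h, this returns roughly 0.5 or 1, matching the rates reported for Algorithms 1 and 2.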

Hölder 1/2 initial data and no blow-up
In another example, we again consider α = 0.8 and T = 2, but choose Y_0 ∼ Gamma(1 + β, 1/2), i.e., with density f_{Y_0} such that f_{Y_0}(x) ≤ Cx^β for x > 0 and some C > 0. We choose β = 1/2. The solution is again found to be continuous.
We perform numerical simulations using Algorithm 1, and Algorithm 2 on uniform and non-uniform meshes varying from 50 to 3200 points and with N = 2 × 10^5 particles. The results are presented in Figure 4. As predicted by the theory, for Algorithm 1 we get the convergence rate 1/2, for Algorithm 2 on uniform meshes rate (1+β)/2 = 3/4, and for Algorithm 2 on non-uniform meshes rate 1. As in the previous example, we also investigate the convergence rate in N empirically. These results also confirm the convergence rate 1/2 in N for both Algorithms 1 and 2. In Figure 5 we present the dependence of L_t and L'_t on the parameter α. As in the previous example, we use N = 5 × 10^7 and n = 200.

Hölder 1/2 initial data and blow-up
In our third example, we illustrate possible jumps arising in the solution for sufficiently large values of α. We consider α = 1.5, T = 0.008 and choose Y 0 as in the previous example, Y 0 ∼ Gamma(1 + β, 1/2). Note that the blow-up happens already for very small t due to the interplay of the mass close to 0 for Y 0 and the relatively large α.
In the absence of a convergence theory for discontinuous (L_t)_{t≥0}, we apply Algorithms 1 and 2 and estimate the error empirically. In Figure 6 (a) we show L̂_t computed using Algorithm 1 for different n; in Figure 6 (b) the numerical error as a function of t for specific n; and in Figure 7 (a) and (b) we estimate the convergence rate for different t.
Figure 6 (a) shows that a fairly fine resolution is needed to capture the discontinuity and its timing, but that all meshes predict the size of the jump well. This is further illustrated in Figure 6 (b), which shows the build-up of the error before the jump, the lack of uniform convergence due to the displacement of the jump on different meshes, and the relatively constant error after the jump.
In Figure 7 (a) we estimate the convergence order at T = 0.002, i.e. before the jump. By regression, we get 0.53 (0.47, 0.59) for Algorithm 1 and 0.80 (0.72, 0.88) for Algorithm 2, where the 95% confidence interval is given in brackets, which agrees with the theory for continuous L_t. In Figure 7 (b), for the error at T = 0.008, i.e. after the jump, we get 0.93 (0.81, 1.05) for Algorithm 1 and 1.03 (0.92, 1.14) for Algorithm 2. The faster convergence may result from the near-constancy of the losses after the jump.
To get an overall picture of the accuracy, we suggest the following metrics to measure the "closeness" of the computed solutions to L_t:
1. d_1(L, L̄) = ∫_0^T |L_t − L̄_t| dt, which is computed in practice by numerical integration.
2. d_2(L, L̄) = |t* − t̄*|, where t* and t̄* are the jump times for L_t and L̄_t, respectively. They are approximated by the points with the steepest gradient, j* = argmax_j (L̄_{t_j} − L̄_{t_{j−1}}), t̄* = j*h.
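Both metrics are simple to compute from the mesh values; a sketch with the trapezoidal rule for d_1 and the steepest-increment rule for d_2 (function names are ours):

```python
import numpy as np

def d1(t, L_a, L_b):
    # L^1 distance between two loss curves sampled on the mesh t,
    # computed with the trapezoidal rule
    y = np.abs(np.asarray(L_a) - np.asarray(L_b))
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def d2(t, L_a, L_b):
    # distance between estimated jump times: the jump is located at
    # the mesh point following the steepest one-step increment
    jump_time = lambda L: t[1 + np.argmax(np.diff(np.asarray(L)))]
    return abs(jump_time(L_a) - jump_time(L_b))
```

Note that d_2 is only meaningful when both curves actually exhibit a jump; for continuous solutions, d_1 alone is informative.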

Conclusions and outlook
We have developed particle methods with explicit timestepping for the simulation of (1)-(3). Convergence with a rate up to 1 in the timestep is shown under a condition on the model parameters and time horizon, when the loss function is differentiable. Experimentally, the method also converges in the blow-up regime, and the variance of the estimators is inversely proportional to the number of samples. This opens up several theoretical and practical questions. The efficiency of the method could be significantly improved by a simple application of multilevel simulation (Szpruch et al. (2017); Ricketson (2015)). If the variance can be shown to behave on the finer levels as suggested by the numerical tests, then, combined with the proven result on the timestepping bias, the computational complexity for root-mean-square error ε would be brought down to ε^{−2}, from ε^{−3} as observed presently (ε^{−2} from the number of samples and ε^{−1} from the number of timesteps). For the particular system (1)-(3), it is conceivable to apply a particle method without time stepping, where Y_0^{(i)} and W^{(i)} are N i.i.d. copies of Y_0 and W, and τ^{(j)}, 1 ≤ j ≤ N, are the order statistics of the hitting times of zero. Then, for τ^{(i)} < t < τ^{(i+1)}, conditional on the information up to τ^{(i)}, the law of the particles Y^{(j)} which have not yet hit zero is identical to that of independent standard Brownian motions with constant drift −αi/N. So if we simulate those independent hitting times of −Y^{(j)}_{τ^{(i)}} for each surviving particle and then take the smallest one to be τ^{(i+1)}, this inductive construction satisfies the correct law. The complexity is then O(N²), and combined with the observed variance of O(1/N), this gives a complexity of ε^{−4} for error ε. Another disadvantage is that this method with exact sampling is restricted to the constant parameter case, while the timestepping scheme can more easily be extended to variable coefficients.
Theoretically, one would like guaranteed convergence also in the blow-up regime. This requires the choice of an appropriate metric -the Skorokhod distance may be suitable, considering the analysis in Delarue et al. (2015b) and the tests in Section 4.3. It also requires verification that the limit in the case of blow-up is a so-called "physical solution" as defined in Hambly et al. (2018); Nadtochiy and Shkolnikov (2017); Delarue et al. (2015b), which on the one hand results from a specific sequential realisation of the losses in the case of blow-up (Nadtochiy and Shkolnikov (2017); Delarue et al. (2015b)), and on the other hand is the right-continuous solution with the smallest jump size (Hambly et al. (2018)).
Lastly, it would be interesting to investigate the extension to the models in Nadtochiy and Shkolnikov (2017) and Delarue et al. (2015b) in more detail.

Appendices
A Proof of Lemmas in Section 3.1

A.2 Proof of Lemma 2
Proof of Lemma 2. We first prove (23). For any ε > 0, we split the probability according to whether min_{j<i} Y_{t_j} lies in (0, ε). For the first term, we use Markov's inequality with any p ≥ 1.
For the second term, P(min_{j<i} Y_{t_j} ∈ (0, ε)) is controlled in terms of the CDF and the PDF of min_{j<i} Y_{t_j}; the latter is bounded at θε according to Lemma 3. Combining both terms and minimising the right-hand side over ε, we find ε = h^a with a = p/(2(p+1)). As a result, letting p → ∞, we get that for any δ > 0 there exists γ > 0 such that (23) holds. For (24), we use identical steps and the corresponding estimates in Lemma 1 and Lemma 3 for Ỹ instead of Y.

A.3 Proof of Lemma 3
Proof of Lemma 3. We can rewrite Z_i = Y_0 + V_i, where V_i = inf_{s≤t_i}(W_s − αL_s). As Y_0 and V_i are independent, the density of Z_i is given by convolution. Using the facts that V_i ≤ 0 and Y_0 ≥ 0, together with (7), we obtain a bound in terms of F_{V_i}, the CDF of V_i, and it is clear that F_{V_i}(z) > 0 for all z ≤ 0. Since β ∈ (0, 1], for all z the function (z − ·)^β : (−∞, z ∧ 0) → (0, ∞) is concave and we can apply Jensen's inequality for the probability measure P(V_i ∈ dv). By direct integration of the density of the running minimum of Brownian motion, and taking ξ = z + αB̃ t_i^{(1+β)/2}, we get from (40) the stated estimate for z > 0 and, similarly, for z ≤ 0. Combining the last two bounds, we finally get (25) for ϕ_i. Using similar arguments for Z̃_i and Ẑ_i, with inf_{s≤t_i} W_s ≤ min_{j<i} W_{t_j} and L_{t_i} ≥ L̃_{t_i}, respectively, we get the analogous estimates for φ̃_i and φ̂_i, and hence (25).