Non-local branching Brownian motions with annihilation and free boundary problems

We study a system of branching Brownian motions on R with annihilation: at each branching time a new particle is created and the leftmost one is deleted, so that the total number of particles is conserved. The case of strictly local creations (the new particle is put exactly at the same position as the branching particle) was studied in [10]. In [11] instead the position y of the new particle has a distribution p(x, y)dy, x being the position of the branching particle; in [11], however, particles do not move in between branching times. In this paper we consider Brownian motions as in [10] and non-local branching as in [11] and prove convergence in the continuum limit (when the number N of particles diverges) to a limit density which satisfies a free boundary problem, when the latter has classical solutions. We use in the convergence a stronger topology than in [10] and [11] and obtain explicit bounds on the rate of convergence.


Introduction
The system considered in this paper fits in a class of models proposed by Brunet and Derrida in [3] to study selection mechanisms in biological systems and continues a line of research initiated by Durrett and Remenik in [11].
Durrett and Remenik have in fact studied a model of particles on R which independently at rate 1 branch, creating a new particle whose position is chosen randomly with probability p(x, y)dy, p(x, y) = p(0, y − x), if x is the position of the generating particle.
At the same time the leftmost particle is deleted so that the total number of particles is constant.
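The branching/selection step just described is easy to simulate. The sketch below is only illustrative (the Gaussian branching kernel p(0, ·) and all numerical parameters are our assumptions, not taken from the model's hypotheses) and checks that the population size stays constant:

```python
import numpy as np

def dr_step(x, rng, sigma=1.0):
    """One branching/selection event of the Durrett-Remenik model: a
    uniformly chosen particle branches (i.i.d. rate-1 clocks make the
    ringing particle uniform), the offspring lands at x_i + Z with
    Z ~ p(0,.)dz (Gaussian here, an illustrative choice), and the
    leftmost particle is then deleted."""
    i = rng.integers(len(x))
    y = np.append(x, x[i] + sigma * rng.normal())  # offspring via p(x_i, .)dy
    return np.delete(y, np.argmin(y))              # selection: kill the leftmost

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)
for _ in range(200):
    x = dr_step(x, rng)
assert len(x) == 50   # the population size is conserved
```

With a symmetric kernel the empirical mean typically drifts to the right, which is the selection effect discussed in the text.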
In the biological interpretation particles are individuals in a population, the position of a particle is "its degree of fitness", the larger the position the higher the fitness. If the environment supports only populations of a given size then to each birth there must correspond a death. The removal of the leftmost and hence less fitted particle is a very effective Darwinian selection rule to implement the conservation of the population size.
Even if the duplication rule is regressive, i.e. p(x, y) has support on y < x, the population fitness nonetheless improves, and if p(0, x) > 0 for x ∈ (−a, 0), a > 0, then as time diverges the whole population concentrates around the position of the initially best fitted individual. Durrett and Remenik have studied the case where p(x, y) is symmetric and discussed the occurrence of traveling-wave solutions which describe a steady improvement of the population fitness; see also [15] and Brunet, Derrida [1], [3] for the analysis of traveling waves in a large class of systems.
Such issues are better studied in the continuum limit N → ∞. The natural guess for the continuum version of the Durrett and Remenik duplication process is

∂ρ(x, t)/∂t = ∫_{X_t}^∞ dy p(y, x) ρ(y, t),  x ≥ X_t;  ρ(r, 0) = ρ_0(r)   (1.1)

where ρ(x, t) is the particles density and X_t = sup{r : ρ(r, t) = 0}. The removal process in the continuum is more implicit and given by the condition on X_t that for all t ≥ 0

∫_{X_t}^∞ dy ρ(y, t) = 1   (1.2)

Under suitable assumptions on the initial datum ρ_0 and on the probability kernel p(x, y), Durrett and Remenik have proved that (1.1) and (1.2) have a unique solution and that this is the limit density of the particle system. (1.1) has a nice probabilistic interpretation:

ρ(x, t) = e^t E_x[ρ_0(x*(t)) 1_{τ>t}]   (1.3)

where x*(t) is the jump Markov process with generator

A* f(x) = ∫ dy p*(x, y) [f(y) − f(x)],  p*(x, y) = p(y, x)   (1.4)

and

τ = inf{t : x*(t) ≤ X_t}   (1.5)

(1.3) is the backward Kolmogorov equation for the jump process with generator A equal to the adjoint of A* (i.e. when the jump x → y has probability p(x, y)).
Similar formulas hold as well in the case considered in this paper, where we study a natural extension of the Durrett-Remenik model in which particles move as independent Brownian motions in between branching times: biologically this means that the individuals' fitness changes randomly in time. As in [11] the new particles are created at random positions with probability p(x, y) and, like before, as soon as a particle is created the leftmost one is deleted. We have already studied in a previous paper, [10], the case where p(x, y) = δ(y − x), namely when the duplication is exact. The extension to general p(x, y), which is the aim of this paper, is obtained following the same scheme used in [10]; indeed some of the proofs are straightforward adaptations of those in [10] and their details are omitted. We thus focus on the new parts, for which we give complete details.
The main novelty in this paper, besides the non locality of the branching, is the use of a strong topology in the convergence of the process based on the Kolmogorov-Smirnov distance. We will obtain explicit bounds on the rate of convergence. We will also show that the limit density is the solution of a free boundary problem provided this has a classical solution.
In the next two sections we make precise the model and state the main results; an outline of how the paper is organized is then given at the end of Section 3. We conclude this introduction by mentioning that there have been several papers about particle processes which in the continuum limit are described by free boundary problems. Some of them will be mentioned in the sequel; for a list we refer to a survey on the subject, [6].

The model
We will consider in this paper several processes, the main one is x(t) (also called sometimes the "true process"). We will also use auxiliary processes x δ,± (t), δ a positive parameter, called the upper and lower stochastic barriers. We define all of them in a common space as subsets of a "basic" process y(t) that we define next.
The "basic" process y(t). The state space of the "basic" process y(t) is the set of configurations with finitely many point particles, we will denote by |y| the number of particles in a configuration y. If convenient we may label the particles by writing y(t) = (y 1 (t), . . . , y n (t)), sequences which only differ for the labelling are however considered equivalent.
To define the process we attach to each particle an independent exponential clock of intensity 1: when it rings for a particle (call x its position when its clock rings) then a new particle is created at position y with probability p(x, y)dy, assumptions on p(x, y) are stated later. In between clock rings the particles are independent Brownian motions. P denotes the law of the process and when needed we will write P y 0 to specify that the process starts from y 0 .
The basic process is well known in the literature, see for instance [12].
The "true" process x(t). As in the basic process each particle of x(t) has an independent exponential clock of intensity 1 and in between clock rings the particles are independent Brownian motions.
Also for x(t) when a clock rings for a particle (call x its position) then (like in the basic process) a new particle is created at position y with probability p(x, y)dy, here however at the same time when a new particle is created the leftmost particle (among those previously present plus the new one) is deleted, so that |x(t)| is constant. Evidently x(t) can be realized as a subset of y(t) obtained by disregarding in y(t) the particles which in x(t) are deleted as well as all their descendants.
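A minimal sketch of the true process x(t), under illustrative assumptions (Gaussian kernel p(0, ·), arbitrary parameters): the N i.i.d. rate-1 clocks superpose to a rate-N clock, particles diffuse between rings, and each branching is immediately followed by the deletion of the leftmost particle.

```python
import numpy as np

def true_process(x0, t_max, rng, sigma_p=0.5):
    """Simulate x(t) up to time t_max: independent Brownian motions
    between the rings of N i.i.d. rate-1 exponential clocks; at each
    ring a particle branches non-locally (Gaussian p(0,.), an
    illustrative choice) and the leftmost particle is annihilated."""
    x = np.asarray(x0, dtype=float)
    N, t = len(x), 0.0
    while True:
        dt = rng.exponential(1.0 / N)          # next ring of the N clocks
        dt = min(dt, t_max - t)
        x = x + rng.normal(scale=np.sqrt(dt), size=N)  # Brownian increments
        t += dt
        if t >= t_max:
            return x
        i = rng.integers(N)                    # the particle whose clock rang
        x = np.append(x, x[i] + sigma_p * rng.normal())  # non-local branching
        x = np.delete(x, np.argmin(x))         # annihilation of the leftmost

rng = np.random.default_rng(1)
xT = true_process(rng.uniform(-1, 1, size=100), t_max=1.0, rng=rng)
assert len(xT) == 100    # |x(t)| is constant
```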
The stochastic barriers x δ,± (t). They are defined like x(t), (see Subsection 5.1) the difference being that the removal of particles is not simultaneous to the branching but it occurs at discrete times kδ, k ∈ N, δ > 0.
The initial configuration.
We will study the process x(t) having fixed the initial number of particles, denoted by N. We will tacitly suppose in the whole sequel that N ≥ N_0, where N_0 is a large positive integer; requirements on N_0 will be stated in the course of the proofs. The process starts from an initial configuration x_0 whose distribution is obtained by taking N independent copies of a position variable distributed with probability ρ_0(r)dr. We suppose that ρ_0(r) is a smooth (C^∞) probability density with support in [−A, A], A > 0. Thus x(t) is realized as a subset of the basic process which starts from y(0) = x_0, and its law will be denoted by P^(N).
The assumptions on ρ 0 and p(x, y) could be relaxed but we would have some more technical details to take care of.
The counting measure.
Given a particle configuration x we call π_x(dr) := Σ_{x_i ∈ x} δ_{x_i}(dr) the associated counting measure. Our aim is to study the behavior of the probability measure (1/N) π_{x(t)}(dr). In the next section we state the main results and, at the end, outline the way the paper is organized.

Main results
Our main result is that under P^(N) the counting measure (1/N) π_{x(t)}(dr) of the true process converges, as N → ∞, to a measure u(r, t)dr with u(r, t) a continuous function. We will use the Kolmogorov-Smirnov topology, see below, and we will give explicit bounds on the convergence rate.
The Kolmogorov-Smirnov distance (K-S distance for brevity) between probability measures µ and ν on R is defined as

|µ − ν|_KS := sup_{R ∈ R} |µ[R, ∞) − ν[R, ∞)|

Call M the space of probability-measure valued functions on R_+, whose elements are denoted by µ = (µ_t)_{t≥0}. We define a topology on M using the K-S distance as follows.
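For empirical measures the K-S distance is a finite computation: the tail functions µ[R, ∞) are piecewise constant, so the supremum is attained at the sample points. A sketch (the sample data are of course only illustrative):

```python
import numpy as np

def ks_distance(x, z):
    """K-S distance between the empirical measures of two samples:
    sup over R of |mu[R, inf) - nu[R, inf)|, evaluated at the sample
    points where the piecewise-constant tails change value."""
    x, z = np.asarray(x, float), np.asarray(z, float)
    grid = np.concatenate([x, z])
    tail = lambda s: (s[:, None] >= grid[None, :]).mean(axis=0)
    return np.abs(tail(x) - tail(z)).max()

rng = np.random.default_rng(2)
d = ks_distance(rng.normal(size=200), rng.normal(0.1, 1.0, size=200))
assert 0.0 <= d <= 1.0
assert ks_distance(np.arange(5.0), np.arange(5.0)) == 0.0
```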
The neighborhoods X_{T,ε,n}(ν). T > 0 is the time window where we study the process. We use a time grid with mesh δ = 2^{−n} T, n a positive integer, and grid times t_k = kδ, k = 0, .., 2^n, see (3.2). Calling ε > 0 an accuracy parameter we then set for ν ∈ M:

X_{T,ε,n}(ν) := {µ ∈ M : |µ_{t_k} − ν_{t_k}|_KS ≤ ε for all k = 0, .., 2^n}

The limit density u(r, t). In Theorem 3.1 below we will prove convergence of the counting measure ((1/N) π_{x(t)}(dr))_{t≥0} to a limit u := (u(r, t)dr)_{t≥0} where u(r, t) is a continuous function. u(r, t) is characterized by such property, but also as the limit of the "deterministic barriers" discussed in Theorem 3.2.
Theorem 3.1. There is N_0 so that for N ≥ N_0 the following holds. Let u be as above; then there are c and c* so that

P^(N)[ ((1/N) π_{x(t)}(dr))_{t≥0} ∈ X_{T,ε,n}(u) ] ≥ 1 − c 2^n N^{−c*}   (3.4)

for all T, ε, n, N such that T < log log N and

ε > c ( e^T √T 2^{3n/2} N^{−1/12} + e^{2T} T N^{−1/24} + e^T N^{−1/12} + 2^{−n} e^T T )   (3.5)

• There are constraints on the values of T, ε and n because the last condition in (3.5) requires that ε > c 2^{−n} e^T T (2^{−n} T being the mesh δ, see (3.2)). This seems to contradict the statement about the convergence of the process to u: take for instance n_0, T and ε so that ε < c 2^{−n_0} e^T T; then (3.5) is not satisfied no matter how large N is. However X_{T,ε,n}(u) decreases as n increases; we then take n > n_0 so large that ε > c 2^{−n} e^T T and apply (3.4).
• No matter how small we take the mesh of the time grid (i.e. how large n), the time span T and the accuracy ε (provided ε > c 2^{−n} e^T T), we can take N so large that the process is in X_{T,ε,n}(u) with large probability (1 − c 2^n N^{−c*}).
• We will characterize the limit density u(r, t) as the solution of a free boundary problem, if this has a classical solution; see Theorem 3.3.
We will next state some results which will be proved in the next sections and show that with their help Theorem 3.1 follows.

Mass transport order.
Let µ and ν be finite, positive measures on R; we then write µ ⪯ ν if µ[R, ∞) ≤ ν[R, ∞) for all R ∈ R. In such a case ν can be obtained from µ by "moving mass to the right".
We will use in the whole sequel the following notation: for f(t), t ∈ R, such that for any t the right and left limits of f(s) as s → t exist, we write f(t^±) for those limits. We will prove that the limit density u(r, t) of Theorem 3.1 is the limit of lower and upper barriers, denoted by ρ^{δ,±}(r, t), δ > 0. Their definition is given in Section 4.3; their main property is:

Theorem 3.2. Let T > 0, δ = 2^{−n} T, t_k = (kδ)^+, k ≤ 2^n. We then have

ρ^{δ,−}(r, t_k)dr ⪯ ρ^{δ,+}(r, t_k)dr   (3.8)

and

ρ^{δ,−}(r, t_k)dr ⪯ ρ^{δ/2,−}(r, t_k)dr,  ρ^{δ/2,+}(r, t_k)dr ⪯ ρ^{δ,+}(r, t_k)dr   (3.9)

Finally, there is a continuous function u*(r, t) such that for all k ≤ 2^n

ρ^{δ,−}(r, t_k)dr ⪯ u*(r, t_k)dr ⪯ ρ^{δ,+}(r, t_k)dr   (3.10)

u*(r, t)dr is the unique element which separates the barriers in the L^1 sense because for δ ≤ 1 the barriers are L^1-close, see (3.11).

We give a detailed proof of (3.11) in Subsection 4.3 and refer to the literature for the proofs of the other statements in Theorem 3.2, as they are very analogous to those in [6] and in [5], [8], [9] for similar models. We will next use Theorem 3.2 to prove Theorem 3.1 and to identify u = u*. We need the following notion: let µ and ν be probabilities on R; we then say that µ ⪯ ν modulo ε if for all R:

µ[R, ∞) ≤ ν[R, ∞) + ε   (3.12)

Obviously: µ ⪯ ν modulo ε and ν ⪯ µ modulo ε is equivalent to |µ − ν|_KS < ε.

A free boundary problem
We can characterize the limit density u(r, t) in Theorem 3.1 as the solution of a free boundary problem whenever this has a classical solution. Given a C^1 curve γ_t, t ≥ 0, let ρ solve

∂ρ(r, t)/∂t = (1/2) ∂²ρ(r, t)/∂r² + ∫ dy ρ(y, t) p(y, r),  r > γ_t   (3.17)
ρ(r, 0) = ρ_0(r),  lim_{r↓γ_t} ρ(r, t) = 0

We will prove existence and uniqueness (see Theorem 4.3) and that the solution of (3.17) has a probabilistic interpretation similar to that in (1.3). The associated free boundary problem P is the following.

P: Let ρ_0(r) be as in the previous section, namely a smooth probability density with compact support; we further assume a compatibility condition at γ* := inf{r : ρ_0(r) > 0}, relating dρ_0(r)/dr at γ* to ∫ dy ρ_0(y) p(y, γ*). Find γ_t ∈ C^1 and ρ(r, t), r ≥ γ_t, in such a way that γ_0 = γ*, ρ(r, t) solves (3.17) and for all t ≥ 0

∫_{γ_t}^∞ ρ(r, t) dr = 1   (3.18)

We will prove in Subsection 4.5 that:

Theorem 3.3. If the free boundary problem P has a classical solution (γ_t, ρ(r, t)), then ρ(r, t) coincides with the function u(r, t) found in Theorem 3.1, which coincides with u*(r, t) of Theorem 3.2.
In [14], Jimyeong Lee has proved local in time existence of classical solutions of the free boundary problem P; more recently J. Berestycki, E. Brunet, S. Penington [2] have proved global existence for a class of free boundary problems. To prove Theorem 3.3 we will show in Theorem 4.8 that a classical solution of the free boundary problem P (see (3.18)) is squeezed in between the lower and upper barriers, and thus coincides with u*(r, t).

Outline of the paper
In Section 4 we will define the barriers and prove in Subsection 4.5 Theorem 3.3, by exploiting a probabilistic representation of the solution of the free boundary problem which is related to the one used in the definition of the barriers. In Section 5 we define the stochastic upper and lower barriers and prove that the true process is squeezed in between them (in the sense of mass transport order). We will prove in Section 6 the missing statement (namely (3.14)) in the proof of Theorem 3.1, using estimates whose proofs are postponed to Section 7. In the Appendix we prove some more technical estimates.

Probabilistic representations of deterministic evolutions
In Subsection 4.1 we study a version of (3.17) extended to the whole R, called the free evolution equation. In Subsection 4.2 we then study (3.17) itself, supposing that γ t is a given C 1 curve. The important point in both cases is a probabilistic representation of the solutions which will be often used in the sequel. In Subsection 4.3 we will define the "lower and upper barriers". Then after stating some a-priori bounds in Subsection 4.4, in Subsection 4.5, using the previous analysis and in particular the probabilistic representation of the solution of (3.17), we will prove that a classical solution of the free boundary problem, when it exists, coincides with the separating element of Theorem 3.2.

The free evolution
The free evolution equation is

∂ρ(x, t)/∂t = (1/2) ∂²ρ(x, t)/∂x² + ∫ dy ρ(y, t) p(y, x)   (4.1)

We will also consider

∂u(x, t)/∂t = (1/2) ∂²u(x, t)/∂x² + ∫ dy p(y, x) [u(y, t) − u(x, t)]   (4.2)

Recall from Section 2 that p(x, y) = p(0, y − x) and p(0, z) ∈ C^∞ with compact support. We will study the two equations with initial data ψ ∈ C_b(R, R_+), which denotes the space of non-negative, bounded, continuous functions on R.
Notice that if u(x, t) solves (4.2) then ρ(x, t) := e^t u(x, t) solves (4.1), and vice versa if ρ(x, t) solves (4.1) then u(x, t) := e^{−t} ρ(x, t) solves (4.2); thus the two equations are closely related. We will see that (4.2) is the backward Kolmogorov equation for the process of a Brownian motion on R with jumps occurring independently at rate 1, hence the probabilistic interpretation of the two equations.

Theorem 4.1. Given any x ∈ R there is a unique solution T*_t(y, x) of (4.1) with initial condition δ_x(y): T*_t(y, x) is then the Green function for (4.1). T*_t(y, x) is C^∞ in y and t for t > 0 and satisfies the bounds (4.3)–(4.6) below, with constants c and c′ independent of x. Moreover S*_t(y, x) := e^{−t} T*_t(y, x) is the Green function of (4.2).
Proof. We fix x, which will not be made explicit in the notation, and start by solving the integral version of (4.1), namely (4.7) below, in the unknown v(y, t). By an abuse of notation we call G_t and p* the integral operators with kernels G_t(x, y) and p*(y, x); in the resulting series (4.9) the time variables are s = (s_1, .., s_n). Moreover, calling c := sup_x p(0, x) and bounding the first p* in (4.10) by c, we get (4.11). Thus the series in (4.9) is convergent, v as given by (4.9) is well defined, (4.3) holds and v is the unique solution of (4.7). Hence if T*_t(y, x) exists then it solves the integral equation, so that it is unique and T*_t(y, x) ≡ v(y, t); existence and uniqueness of T*_t(y, x) will therefore follow once we prove that v(y, t) solves (4.1) for t > 0 with initial datum δ_x(y). To do this we need differentiability properties of v(y, t).
By differentiating (4.9) k times with respect to y and proceeding as in the proof of (4.11), we obtain (4.12). We are going to use this to prove that v satisfies (4.1). To this end we change s → t − s in the integral on the right hand side of (4.8) and differentiate with respect to t, getting (4.13). The first and third terms on the right hand side reconstruct the second derivative of v(y, t) with respect to y; hence v satisfies (4.1) for t > 0 with initial datum δ_x(y), and the first inequality in (4.4) follows from (4.12). To prove the second inequality in (4.4) we rewrite (4.13): recalling that v(y, s) = T*_s(y, x), by (4.3) and since p*(z, z′) is smooth with compact support, we get, for a suitable constant c_1, the second inequality in (4.4). By iteration we prove that all the derivatives of v with respect to t exist.
Proof of (4.5). By (4.9), denoting by v_y(y, t) the derivative with respect to y and recalling that the support of p is in [−ξ, ξ], we split v_y into two terms: the first is bounded proportionally to 1/√t and the second proportionally to e^t.
The proof of the second inequality in (4.6) follows from (4.9) and (4.11). The first one holds because T*_t(y, x) is non-negative, has bounded derivative and is integrable.
Finally, the last statement of the Theorem follows from what was observed after (4.2). The analogue of (4.9) for S*_t(y, x) is (4.14), which will be used for the probabilistic interpretation.
The solution of (4.1) with initial datum ψ is ρ(y, t) = ∫ T*_t(y, x) ψ(x) dx for any t > 0; analogously the solution of (4.2) is obtained from the kernel S*_t(x, y), and the operator S*_t on C_b(R, R_+) whose kernel is S*_t(x, y) defines a semigroup on C_b(R, R_+) with S*_t = e^{−t} T*_t.

Probabilistic interpretation. As mentioned, the solutions of (4.2) have a probabilistic meaning. Let X = {X_t, t ≥ 0} be a Brownian motion on R with jumps occurring independently at rate 1 with probability p(x, y): the process can be realized by adding to a Brownian motion, at the ring times of an independent rate-1 Poisson clock, i.i.d. jumps Z_n with law p(0, Z)dZ. The dual process has the same structure but with each Z_n having law p*(0, Z)dZ.
Then by (4.14), for any bounded continuous function f the expectation E_x[f(X_t)] can be expressed through the kernel S*_t. This means that the Markov semigroup S_t of the process X on C_0(R, R_+) is the dual of the Markov semigroup S*_t; thus it has a density S_t(x, y) = S*_t(y, x).
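The relation ρ(x, t) = e^t u(x, t) between (4.1) and (4.2) noted above can be checked numerically. In the sketch below both equations are discretized by the same explicit Euler scheme on a periodic grid, with a Gaussian kernel p(0, ·); grid, kernel and step sizes are illustrative assumptions:

```python
import numpy as np

L, n, dt, steps = 10.0, 128, 1e-4, 2000
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
dx = x[1] - x[0]
d = (x[:, None] - x[None, :] + L / 2) % L - L / 2      # periodic x - y
K = np.exp(-d**2 / 0.5)
K /= K.sum(axis=1, keepdims=True) * dx                 # rows ~ p(0, x - y), mass 1

def lap(f):                                            # periodic discrete Laplacian
    return (np.roll(f, 1) - 2 * f + np.roll(f, -1)) / dx**2

def jump(f):                                           # int p(y, x) f(y) dy
    return K @ f * dx

u = np.exp(-x**2)
rho = u.copy()
for _ in range(steps):
    u = u + dt * (0.5 * lap(u) + jump(u) - u)          # equation (4.2)
    rho = rho + dt * (0.5 * lap(rho) + jump(rho))      # equation (4.1)
# rho should agree with e^t u up to the time-stepping error
assert np.max(np.abs(rho - np.exp(steps * dt) * u)) < 1e-3
```

Since both equations are discretized with the same operators, the discrepancy comes only from the Euler time stepping and stays well below the tolerance.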

The evolution in semi-infinite domains
There is also a probabilistic representation for the solution of (3.17) for a given C^1 curve γ_t. Let X be the process defined in the previous subsection and call τ_s = inf{t > s : X_t ≤ γ_t}. Given s ≥ 0 and X_s > γ_s we define for t > s the killed process X^Γ_t ≡ X^Γ_{t;s}, namely X_t killed when it first goes below the curve. For s ≥ 0 and x > γ_s we call P_{x,s}[X^Γ_t ∈ dy] the law at time t > s of X^Γ_t on R with X^Γ_s = x, and claim that it has a density with respect to the Lebesgue measure, denoted by α_{x,s}(y, t). We will simply write α_x(y, t) for α_{x,0}(y, t).
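The law of the killed process X^Γ can be sampled directly. The following Monte Carlo sketch estimates the survival probability P_x(τ > t) for a flat curve γ_t ≡ 0; curve, Gaussian kernel and discretization are illustrative assumptions, and checking the crossing only on a time grid slightly overestimates survival:

```python
import numpy as np

def survival_prob(x0, t_max, gamma, rng, n_paths=1000, sigma_p=0.5, dt=1e-2):
    """Estimate P_x(tau > t_max) for the Brownian motion with rate-1
    jumps (Gaussian p(0,.), an illustrative choice), killed when it
    goes below the curve gamma(t)."""
    alive = 0
    n_steps = int(round(t_max / dt))
    for _ in range(n_paths):
        x, t, killed = x0, 0.0, False
        next_jump = rng.exponential()          # first ring of the rate-1 clock
        for _ in range(n_steps):
            x += rng.normal(scale=np.sqrt(dt))  # Brownian increment
            t += dt
            if t >= next_jump:                  # a jump with law p(0,.)
                x += sigma_p * rng.normal()
                next_jump += rng.exponential()
            if x <= gamma(t):                   # killed at the curve
                killed = True
                break
        alive += not killed
    return alive / n_paths

rng = np.random.default_rng(3)
p = survival_prob(1.0, 1.0, lambda t: 0.0, rng)
assert 0.0 < p < 1.0
```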
The main result of this subsection is the following theorem.
We split the proof of Theorem 4.3 into three statements: in Lemma 4.4 we prove that α_{x,s_0}(y, t) is smooth; in Lemma 4.5 we prove that α_{x,s_0} vanishes at the boundary; and finally in (4.22) we compute the time derivative of α_{x,s_0}(y, t). Together these prove Theorem 4.3.
Lemma 4.4 states the representation (4.19) for α_{x,s_0}(y, t), where q_{x,s_0}(ds dz) = P_{x,s_0}(τ_{s_0} ∈ ds, X_s ∈ dz); moreover α_{x,s_0}(y, t) is smooth in {y > γ_t}.

Proof. For notational simplicity we take s_0 = 0. Call A_x(y, t) the right hand side of (4.19). By Theorem 4.1, A_x(y, t) is a smooth function of y in {y > γ_t} and it is differentiable in t. Thus we only need to prove that A_x(y, t) = α_x(y, t), which follows by integrating both against any f with support in {y > γ_t}.

Proof of Lemma 4.5. For notational simplicity we take s_0 = 0. Let 0 < s < t; then, by conditioning,

α_x(z, t) = ∫_{γ_s}^∞ dy α_x(y, s) α_{y,s}(z, t)

Using the reverse process we get the claim (see for instance [4]).

The deterministic barriers
In this subsection we define the deterministic barriers ρ^{δ,±}(·, t), δ > 0, which appear in Theorem 3.2; we first define ρ^{δ,±}(x, t_k) for any positive integer k. To this end we first introduce the cut operators C^±_δ, acting on M_a, a > 0, the set of v ∈ L^1(R, R_+) with mass a. Moreover for any fixed v ∈ M_1 and any integer k > 0 we set the k-step iterates, and finally define ρ^{δ,±}(·, t_k) through them.

Proof of (3.11). Let δ < 1 and shorthand T* = T*_δ, C^± = C^±_δ. (3.11) follows from (4.26), which holds for any k > 0 and any v ∈ M_1, as we are going to prove. For any w ∈ M_1

e^δ C^− w = C^+ e^δ w   (4.27)

and since e^δ T* = T* e^δ we have (4.28). By (4.28), and finally recalling that ψ and φ are in M_{e^δ}, we obtain (4.26).
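A crude numerical sketch of one barrier step, under our reading of the construction (the precise definitions of C^±_δ are those in the text): free evolution by T*_δ (heat flow plus the branching term, which adds mass), followed by a cut removing the excess mass from the left edge so that the total mass returns to 1. The Gaussian kernel and the Euler scheme are illustrative assumptions:

```python
import numpy as np

L, n = 20.0, 256
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
dx = x[1] - x[0]
d = (x[:, None] - x[None, :] + L / 2) % L - L / 2
K = np.exp(-d**2 / 0.5)
K /= K.sum(axis=1, keepdims=True) * dx          # rows ~ p(0, x - y)

def evolve(v, delta, dt=1e-4):
    """Free evolution (4.1) over time delta; branching adds mass."""
    for _ in range(int(round(delta / dt))):
        lap = (np.roll(v, 1) - 2 * v + np.roll(v, -1)) / dx**2
        v = v + dt * (0.5 * lap + K @ v * dx)
    return v

def cut_left(v):
    """Remove mass from the left until the total mass is back to 1."""
    w = v.copy()
    excess = w.sum() * dx - 1.0
    i = 0
    while excess > 0 and i < len(w):
        take = min(w[i] * dx, excess)
        w[i] -= take / dx
        excess -= take
        i += 1
    return w

rho = np.exp(-x**2); rho /= rho.sum() * dx      # initial probability density
edges = []
for _ in range(5):                               # five delta-steps
    rho = cut_left(evolve(rho, delta=0.05))
    edges.append(x[np.argmax(rho > 1e-12)])
assert abs(rho.sum() * dx - 1.0) < 1e-8          # mass 1 after each cut
assert all(e2 >= e1 for e1, e2 in zip(edges, edges[1:]))  # edge moves right
```

The left edge advancing at each step is the discrete analogue of the front X_t (respectively γ_t) moving to the right.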

A priori bounds
We will prove below that if the initial datum ρ 0 is bounded then both the solution of (4.1) and the barriers are bounded. To prove such a result we will use the probabilistic representation of the evolution.
Lemma 4.6. Let N > N_0, with N_0 and ρ_0 as in the paragraph "The initial configuration", and let ρ(x, t) be the solution of (4.1) with initial condition ρ_0. Then for any b > 0 there is c_b so that (4.32) holds for any t < log log N.

Proof. Let [−ξ, ξ] be the support of p(0, ·); then the integral in the probabilistic representation of ρ is bounded by the sum of two terms, the first involving the probability that the number n(t) of jumps is large. Since N > N_0 and t < log log N, the second term is bounded by the first one, so that (4.32) follows.
The fact that the barriers are bounded is a consequence of the following Lemma.
Proof. Observe that for all functions f and g in C_0(R, R_+) we have T*_t(f + g)(x) ≥ T*_t f(x) for all x ∈ R. Also, for all non-negative functions f ∈ L^1 we have C^± f ≤ f.

Proof of Theorem 3.3
In this subsection we establish a relation between the free boundary problem and the deterministic barriers and prove Theorem 3.3, namely that the classical solution of the free boundary problem, when it exists, is the separating element (in the L^1 sense) of the deterministic barriers. To prove this it is enough to show that the classical solution is squeezed in between the lower and the upper barriers, which is done in Theorem 4.8 below. The proof of the theorem is similar to others for analogous models; they all exploit the representation of solutions of the heat equation in terms of Brownian motions, in particular that the hitting probability of a Brownian motion at a curve γ_t has a density with respect to Lebesgue, a property which is well known for C^1 curves but which extends to Hölder curves with exponent > 1/2, see [13].
Proof. We use the probabilistic representation of Section 4.2 for any r ∈ R. Recall that by (3.18) the left hand side for r = −∞ is equal to 1; observe that this implies that the distribution of τ is exponential of parameter 1.
Lower bound. We only prove it in the case k = 1, the extension to all k being similar to the one in the previous case. We thus need to prove (4.38). First we prove that the two measures g(x)P_x(τ > δ)dx and f(x)P_x(τ ≤ δ)dx have the same mass: the difference of the two masses vanishes by (4.37). We rewrite (4.38) as

∫ µ(dx) P_x[X_δ > r | τ > δ] ≥ ∫ λ(dx, dz, ds) P_{z,s}[X_δ > r]

where µ(dx) = g(x)P_x(τ > δ)dx and λ(dx, dz, ds) = f(x)dx P_x(τ ∈ ds, X_s ∈ dz), namely the probability that the process hits the region {x ≤ γ_t} at the point dz at time ds. Since µ and λ have the same mass, (4.38) follows from

P_{z,s}[X_δ > r] ≤ P_x[X_δ > r | τ > δ],  for all s ∈ [0, δ), z ≤ γ_s and x ≥ γ_s

which can be proved as in Section 10 of [6]. We omit the details.

Definition of the stochastic barriers
For each positive real number δ we define two processes x^{δ,+}(t) and x^{δ,−}(t), t ≥ 0, called respectively the upper and lower stochastic barriers. We define x^{δ,±}(t) inductively: we suppose to have defined x^{δ,±}(t) for t ≤ t^+_{k−1} and want to define it for t ≤ t^+_k, t_k = kδ.
• The upper stochastic barrier. We set x^{δ,+}(0^+) = x(0) and suppose inductively that we have defined the process till time t^+_{k−1}, k ≥ 1. For t ∈ [t^+_{k−1}, t^−_k] the process x^{δ,+}(t) is defined as the basic process y(t) starting from x^{δ,+}(t^+_{k−1}), namely the particles evolve as independent Brownian motions with non-local branching and no deletions. At time t_k we then define x^{δ,+}(t^+_k) as the configuration obtained from x^{δ,+}(t^−_k) by retaining only the N rightmost particles.

• The lower stochastic barrier. Initially we set x^{δ,−}(0^+) as the configuration obtained from x(0) by deleting the leftmost M_δ particles, and then define x^{δ,−}(t), 0^+ = t_0 ≤ t ≤ t^−_1, as follows. We let the N − M_δ particles in x^{δ,−}(0^+) evolve as in the basic process, namely as independent Brownian motions with non-local branching. If there is τ ∈ (0, t_1) such that |x^{δ,−}(τ)| = N, then x^{δ,−}(t) for t > τ is defined by letting the particles move as independent Brownian motions without branching. The construction is iterated over the successive intervals, deleting at each time t^+_k the leftmost particles so that the count is again N − M_δ.

It follows from their definition that x^{δ,±}(t) can be realized as subsets of the basic process y(t).
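One interval of the upper stochastic barrier can be sketched as follows; the Gaussian kernel and the parameters are illustrative assumptions, and the cut at t_k retains the N rightmost particles, as in the coupling of Subsection 5.3:

```python
import numpy as np

def upper_barrier_interval(x, delta, rng, sigma_p=0.5):
    """Free evolution on (t_{k-1}, t_k]: Brownian motions plus non-local
    branching with no deletions, followed by the cut retaining the N
    rightmost particles at time t_k."""
    N = len(x)
    x = list(np.asarray(x, dtype=float))
    t = 0.0
    while True:
        dt = rng.exponential(1.0 / len(x))       # next ring of |x| clocks
        if t + dt >= delta:
            step = delta - t                     # diffuse up to t_k, no ring
            x = [xi + rng.normal(scale=np.sqrt(step)) for xi in x]
            break
        x = [xi + rng.normal(scale=np.sqrt(dt)) for xi in x]
        t += dt
        i = rng.integers(len(x))
        x.append(x[i] + sigma_p * rng.normal())  # branching, nobody dies
    return np.sort(x)[-N:]                       # cut: keep the N rightmost

rng = np.random.default_rng(4)
xk = upper_barrier_interval(rng.uniform(-1, 1, size=50), delta=0.2, rng=rng)
assert len(xk) == 50
```

Deleting only at the grid times, and then the most generous set of particles, is what makes this process dominate the true process.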

Stochastic inequalities
We will construct couplings to prove:

Theorem 5.1. For each positive real number δ there is a coupling of the two processes x(t) and x^{δ,+}(t) so that at all t ≥ 0

π_{x(t)} ⪯ π_{x^{δ,+}(t)}   (5.3)

There is also a coupling of x(t) and x^{δ,−}(t) so that at all t ≥ 0

π_{x^{δ,−}(t)} ⪯ π_{x(t)}   (5.4)
We fix δ > 0 and, to have lighter notation, we will sometimes omit the dependence on δ. We prove the upper bound in Subsection 5.3 and the lower bound in Subsection 5.4.
In the definition of the couplings it is convenient to label the particles; this is however fictitious, because two labelled configurations which only differ by the labels are equivalent. A labelled configuration x = (x_1, .., x_N) is "ordered" if x_i ≤ x_{i+1}. We then say that x^ord is the reordering of x if x^ord is the relabelling of x such that x^ord is ordered. We obviously have:

Lemma 5.2. Let x = (x_1, .., x_N) and z = (z_1, .., z_N); then the following statements are equivalent.
S1. π_x ⪯ π_z.
S2. There is a permutation γ of {1, .., N} so that x_i ≤ z_{γ_i} for all i = 1, .., N.
S3. Let x^ord and z^ord be the reorderings of x and z; then x^ord_i ≤ z^ord_i for all i = 1, .., N.
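The equivalence S1 ⟺ S3 is easy to test numerically for equal-size configurations; the random instances below are of course only illustrative:

```python
import numpy as np

def dominates_tails(x, z):
    """S1 for empirical measures of equal size: the tail counts satisfy
    #{i : x_i >= R} <= #{i : z_i >= R} for every R (testing R at the
    sample points suffices, the tails being piecewise constant)."""
    pts = np.concatenate([x, z])
    return all((x >= R).sum() <= (z >= R).sum() for R in pts)

def dominates_sorted(x, z):
    """S3: componentwise comparison of the increasing reorderings."""
    return bool(np.all(np.sort(x) <= np.sort(z)))

rng = np.random.default_rng(5)
for _ in range(200):
    x = rng.normal(size=8)
    z = rng.normal(size=8)
    assert dominates_tails(x, z) == dominates_sorted(x, z)
    w = x + rng.uniform(0.0, 1.0, size=8)   # mass moved right: S2 holds
    assert dominates_tails(x, w) and dominates_sorted(x, w)
```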

Upper bound
We will define an auxiliary process z(t) of red and blue colored particles in such a way that its marginal, neglecting the colors, has the law of x^{δ,+}(t). We will couple z(t) with the true process x(t) in such a way that π_{x(t)} ⪯ π_{z^{(B)}(t)}, where z^{(B)}(t) denotes the subset of z(t) made of its blue particles; the upper bound then follows because π_{z^{(B)}(t)} ⪯ π_{z(t)}.
Definition. The auxiliary process z(t) and the coupling. We define inductively both z(t) and the coupling in the time intervals (t^+_k, t^+_{k+1}], supposing by induction that |z(t^+_k)| = N and z(t^+_k) = z^{(B)}(t^+_k), namely that at time t^+_k there are N particles and they are all blue; we also suppose that π_{x(t^+_k)} ⪯ π_{z(t^+_k)} with probability 1.
This holds initially because we set z(0) = z^{(B)}(0) = x(0). We will prove in Lemma 5.3 below that if π_x ⪯ π_y (both configurations with N particles) then there is a coupling of the true processes x(t) and y(t) starting from x and y, whose law is denoted by Q, such that Q[π_{x(t)} ⪯ π_{y(t)}] = 1 for all t ≥ 0. To define z(t) and the coupling between x(t) and z(t) we use Q in the time intervals (t^+_k, t_{k+1}), starting from x(t^+_k) and z^{(B)}(t^+_k). We first define z^{(B)}(t) in (t^+_k, t_{k+1}) as the true process starting from z^{(B)}(t^+_k) and use Q to couple it with x(t). z(t) is then obtained from z^{(B)}(t) by adding red particles in the following way: when a particle in z^{(B)}(t) is deleted, a red particle is created at the same place, which then evolves as in the basic process independently of all the other particles, the descendants of a red particle being all red.
The process z(t) constructed in this way, neglecting the colors, has obviously the same law as x^{δ,+}(t), while x(t) is the true process. We have thus defined the desired coupling in the time interval (t^+_k, t_{k+1}) and, by what was said earlier, in such an interval π_{x(t)} ⪯ π_{x^{δ,+}(t)}.
To complete the induction step we define z(t^+_{k+1}) by retaining in z(t^−_{k+1}) only the N rightmost particles (independently of their color) and deleting all the others; we then paint in blue those which have been left. We thus have π_{x(t^+_{k+1})} ⪯ π_{z(t^+_{k+1})}, completing the induction step and hence the proof of the upper bound, pending the validity of the following lemma.

Lemma 5.3. Let x and y be in R^N with π_x ⪯ π_y. Then there is a coupling Q of the true processes starting from x and y such that Q[π_{x(t)} ⪯ π_{y(t)}] = 1 for all t ≥ 0.
Proof. We use an exponential clock of intensity N and define iteratively the process in the time intervals (s_k, s^+_{k+1}] between successive clock rings. Supposing by induction that x_i(s_k) ≤ y_i(s_k), we let the x_i(s), i = 1, .., N, be independent Brownian motions and define the y_i(s) to have the same increments as the x_i(s). Then by Lemma 5.2 π_{x(s^−_{k+1})} ⪯ π_{y(s^−_{k+1})} and, using again Lemma 5.2, x^ord_j(s^−_{k+1}) ≤ y^ord_j(s^−_{k+1}), j = 1, .., N. For brevity we write: x* = x^ord(s^−_{k+1}), x = x(s^+_{k+1}); y* = y^ord(s^−_{k+1}), y = y(s^+_{k+1}). At time s_{k+1}, when the clock rings, we choose i ∈ {1, .., N} with equal probability and Z with law p(0, Z)dZ. We then set x_1 := max{x*_1, x*_i + Z}, y_1 := max{y*_1, y*_i + Z}, while x_j = x*_j, y_j = y*_j for j > 1, so that x_j ≤ y_j for all j ≥ 1.
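That the clock-ring step of this coupling preserves the pairwise order can be checked directly; the Gaussian kernel is again an illustrative assumption:

```python
import numpy as np

def coupled_branching(x, y, rng, sigma_p=0.5):
    """One clock ring of the Lemma 5.3 coupling: both processes use the
    same label i and the same displacement Z; the leftmost particle is
    replaced by max{leftmost, offspring}, since an offspring landing
    left of everybody is itself deleted.  Inputs are assumed ordered
    with x_j <= y_j componentwise."""
    i = rng.integers(len(x))
    Z = sigma_p * rng.normal()
    x_new, y_new = x.copy(), y.copy()
    x_new[0] = max(x[0], x[i] + Z)
    y_new[0] = max(y[0], y[i] + Z)
    return np.sort(x_new), np.sort(y_new)

rng = np.random.default_rng(6)
x = np.sort(rng.normal(size=20))
y = np.sort(x + rng.uniform(0.0, 1.0, size=20))
for _ in range(100):
    x, y = coupled_branching(x, y, rng)
    assert np.all(x <= y)     # pi_x <= pi_y is preserved at every ring
```

The key point, as in the proof above, is that max is monotone: x*_1 ≤ y*_1 and x*_i + Z ≤ y*_i + Z give x_1 ≤ y_1.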

Lower bound
We introduce again an auxiliary process z^col(t) of N particles colored red and blue, in such a way that the blue ones have the same law as x^{δ,−}(t). The initial configuration z^col(0) is obtained by taking N independent copies of variables z_i with distribution ρ_0(y)dy and painting in red the M_δ leftmost particles and in blue the others. To each blue particle we associate an independent exponential clock of parameter 1. When the clock rings (say at time t for a blue particle at x), if there are no red particles we do nothing; otherwise we delete the rightmost red particle and put a new blue particle at x + Z, where Z here and in the sequel is a variable with distribution p(0, z)dz. In between branching times the particles move as independent Brownian motions.
At the times kδ we do a repainting: let m k be the number of red particles at time t − k = (kδ) − , k ≥ 1. By definition 0 ≤ m k ≤ M δ . We then paint in red the M δ − m k leftmost blue particles so that at time t + k = (kδ) + the number of red particles is again M δ . Obviously the blue particles in the process z col (t) have the same law as x δ,− (t).

Coupling the labelled true and auxiliary processes.
The coupled process is a process in R^N × R^N × {R, B}^N whose elements are denoted by (x, z, σ), writing x = (x_1, .., x_N), z = (z_1, .., z_N), σ = (σ_1, .., σ_N). We define z^col in this space by setting z^col_i = (z_i, σ_i), z_i ∈ R and σ_i ∈ {R, B} being the position and color of particle i. The law of the coupling will be such that the marginal over x(t) is the true process, while the marginal over z^col(t) has the same law as the auxiliary process.
The coupling is based on two points: (i) particles with the same label have the same Brownian increments; (ii) branching events are coupled: they occur at the same time for x and z, they involve particles with the same label, and the variable Z is the same for the two. We will see that in this way at all times the coupled process is in the set

X := {(x, z, σ) : z_i ≤ x_i for all i = 1, .., N}

This will prove the lower bound (5.4): indeed, by the definition of the process z^col(t), the blue particles process is the process x^{δ,−}(t). Thus the lower bound is proved once we construct a coupling with values in X. The proof is by induction: we define the coupling inductively first at time t^+_k and then in the time interval (t_k, t_{k+1}), recall t_k = kδ, k ≥ 0.
If k = 0 we set x_i(0) = z_i(0), i = 1, .., N, and define σ(0) so that the M_δ leftmost particles of z(0) are red while the others are blue. If k > 0 we know by induction that at time t_k^− the process is in X. We then leave all positions unchanged and change only the colors, in agreement with the definition of the z process: namely we change from blue to red the color of the M_δ − m_k leftmost blue particles (recall that m_k is the number of red particles at time t_k^−, 0 ≤ m_k ≤ M_δ). Since positions are unchanged, at time t_k^+ the configuration is again in X. It remains to define the coupling in the time interval (t_k, t_{k+1}). As for the upper bound we introduce an exponential clock of intensity N. The coupling is defined so that the branching times for the two processes are the times when the clock rings. Suppose the clock rings at time t ∈ (t_k, t_{k+1}). In (t_k, t) the particles keep the labels and the colors they had at time t_k^+. The x_i(s), i = 1, .., N, move as independent Brownian motions and the z_i(s) have the same increments as the x_i(s). Hence z_i(s) ≤ x_i(s) (because this holds at time t_k^+), therefore in such a time interval the process is always in X. If t = t_{k+1} (or if there is no ring in (t_k, t_{k+1}]) we have finished; let us then suppose t < t_{k+1}.
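Point (i) of the coupling, the same Brownian increments for particles with the same label, can be made concrete in a few lines; this is an illustrative sketch, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(1)

def coupled_diffusion_step(x, z, dt):
    """One Euler step of the coupled dynamics between branching times:
    particles with the same label receive the SAME Brownian increment,
    so every difference x_i - z_i is preserved exactly; in particular
    z_i <= x_i is maintained along the whole time interval."""
    dB = rng.normal(0.0, np.sqrt(dt), size=x.shape)
    return x + dB, z + dB
```

This is the elementary reason why the ordering established at time t_k^+ survives until the next clock ring.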
Since the process is in X at time t^−, we get from Lemma 5.2 that π_{z(t^−)} ≼ π_{x(t^−)} and, using again Lemma 5.2, z^ord_j(t^−) ≤ x^ord_j(t^−), j = 1, .., N. For brevity we will write x* = x(t^−), z* = z(t^−), σ* = σ(t^−) for the configuration just before the ring, and x, z, σ for the one just after. To define the coupling from t^− to t^+ we proceed as in the upper bound: at the clock ring we choose with equal probability a label i. If σ*_i = R then z := z*, while π_{x*} ≼ π_x, so that π_z ≼ π_x and by Lemma 5.2 there is a labelling for which the configuration is in X. Let next σ*_i = B and call k the label of the rightmost red particle in σ*. We define the coupling so that for all labels h ∉ {1, i, k} positions and colors are unchanged, hence z_h ≤ x_h. It remains to consider the particles with labels 1, i, k, for which three cases must be examined; in each case the construction is again a coupling and (x, z, σ) ∈ X. We keep repeating the above procedure at all the times when the clock rings till we reach time t_{k+1}. The induction hypothesis is proved and we conclude that z(t) ≼ x(t) at all times and therefore that x^{δ,−}(t) ≼ x(t).
6 Proof of (3.14)

In this section we complete the proof of Theorem 3.1 given in Section 3 by proving (3.14). The proof will however use Theorem 6.2, which is proved in Section 7. Theorem 6.2 states that the stochastic upper and lower barriers are with "large" probability "close" to the corresponding deterministic barriers for large N. Closeness is quantified using the following semi-norms.

Semi-norms. Let μ and ν be positive, finite measures on R and I_N the partition of R into the intervals I = [k N^{−β}, (k + 1) N^{−β}), k ∈ Z (we will eventually fix β = 1/12). We then set

‖μ − ν‖_{I_N} := Σ_{I ∈ I_N} |μ[I] − ν[I]|   (6.1)

The semi-norm ‖μ − ν‖_{I_N} is the L^1-norm of coarse grained versions of μ and ν on the scale N^{−β}, which are defined as

ψ(r) := N^β μ[I],  φ(r) := N^β ν[I],  for r ∈ I, I ∈ I_N   (6.2)

Indeed (6.1) can obviously be written as

‖ν − μ‖_{I_N} = ∫ dr |φ(r) − ψ(r)|   (6.3)

The semi-norms control the K-S distance (which, by an abuse of notation, is extended to finite, positive measures, not necessarily probabilities):

Lemma 6.1. With the above notation

d_KS(μ, ν) ≤ ‖μ − ν‖_{I_N} + sup_{I ∈ I_N} μ[I] + sup_{I ∈ I_N} ν[I] ≤ 2 ‖μ − ν‖_{I_N} + 2 sup_{I ∈ I_N} ν[I]   (6.4)

Proof. Fix r ∈ R and call I_r the interval in I_N which contains r; write I > I_r for the intervals to the right of I_r. Then

|μ[(r, ∞)] − ν[(r, ∞)]| ≤ Σ_{I > I_r} |μ[I] − ν[I]| + μ[I_r] + ν[I_r]

hence the first inequality in (6.4). The second one follows because μ[I] ≤ ν[I] + ‖ν − μ‖_{I_N}.
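For counting measures the semi-norm is just an L^1 distance between histograms on the partition I_N; the following sketch (illustrative, with unit-mass atoms) computes it:

```python
import numpy as np

def seminorm(mu_atoms, nu_atoms, scale):
    """Coarse grained semi-norm ||mu - nu||_{I_N} for two atomic measures
    (arrays of atom positions, each atom of mass 1): bin both on the
    partition of R into intervals of length `scale` (= N^{-beta}) and
    sum the absolute differences of the binned masses."""
    atoms = np.concatenate([mu_atoms, nu_atoms])
    lo = np.floor(atoms.min() / scale)
    hi = np.ceil(atoms.max() / scale) + 1
    edges = np.arange(lo, hi + 1) * scale
    mu_bins, _ = np.histogram(mu_atoms, bins=edges)
    nu_bins, _ = np.histogram(nu_atoms, bins=edges)
    return int(np.abs(mu_bins - nu_bins).sum())
```

Two atoms falling in the same interval I cancel exactly, which is why the semi-norm is weaker than the total variation distance but still controls the K-S distance up to the coarse graining error of Lemma 6.1.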
We will use (6.4) with ν a measure with bounded density with respect to Lebesgue, so that ν(I) ≤ c|I| = cN^{−β}; therefore the last term in (6.4) will be negligible. In fact we will use the semi-norms to compare the counting measures (1/N) π_{x^{δ,±}(t)}(dr) associated to the stochastic barriers with the measures ρ^{δ,±}(r, t)dr associated to the deterministic barriers, observing that the ρ^{δ,±}(r, t) are uniformly bounded.

Continuum limit of the stochastic barriers
To prove Theorem 6.2 (to which we refer for notation) we must find a "good" set X_good of large probability where the semi-norms ‖(1/N) π_{x^{δ,±}_k} − ρ^{δ,±}_k dr‖_{I_N} are small (see (6.5) for notation).

The good set
X_good is the intersection of a good set for the upper barrier and a good set for the lower barrier, which are both intersections of four good sets. All sets are defined on the same space, the space where the basic process y(t), t ≥ 0, is realized. They will be defined using parameters which should satisfy certain conditions; our specific choice is made in (7.3). We are now ready to define the good set:
• The first good set is (see (6.5) for notation):
To define the second good set we need the following notation:
• The second good set is then:
where c is the constant defined earlier and c′ is a new (sufficiently large) constant, which can be taken equal to 2c with c the constant in (4.5).
• The third good set X_3^± involves the values n_k^± of the number of particles in the stochastic barriers at the times t_k^−, namely n_k^± := |x^{δ,±}(t_k^−)|.
• The fourth good set X_4^± involves what happens at time 0:
where c_A is such that c_A N^β is larger than the number of intervals I which intersect the support of ρ_0.
We will prove in the next subsections the following propositions.

Proposition 7.1. There are a constant c and N_0 so that for N ≥ N_0 and T < log log N, in X_good, for all k ∈ {0, .., 2^n}

‖(1/N) π_{x^{δ,±}_k} − ρ^{δ,±}_k dr‖_{I_N} ≤ c ( e^T 2^n δ^{−1/2} N^{−1/12} + e^{2T} 2^n N^{−1/6} )   (7.9)

Proposition 7.2. There are c, c* and N_0 so that for all N ≥ N_0 and all T < log log N

P^{(N)}( X_good ) ≥ 1 − c 2^n e^{c* 2^{−n} T} N^{−1/6}   (7.10)

The proof of Theorem 6.2 is then a direct consequence of the two propositions above.

Proof of Proposition 7.1
We prove here (7.9) for the upper barriers; the proof for the lower barriers is similar and is omitted. We fix in the sequel k ∈ {0, .., 2^n}, the bounds being uniform in k. We postpone the proof of (7.11). In X_1^+, π_{x^{δ,+}_k}[I] = 0 for all intervals I ∈ I_N outside [−N^b, N^b]. Then, since kδ ≤ T, using (4.32) we get (7.12). Then by (7.11), (7.12) and (7.6), and hence by (7.8),

‖(1/N) π_{x^{δ,+}_k} − ρ^{δ,+}_k dr‖_{I_N} ≤ Σ_{h=0}^{k−1} e^{δh} ε_N   (7.14)

We bound Σ_{h=0}^{k−1} e^{δh} ≤ e^T 2^n (because k ≤ 2^n and δ 2^n = T) and get, recalling that N ≥ N_0 with N_0 large enough and T < log log N, for a new constant c

‖(1/N) π_{x^{δ,±}_k} − ρ^{δ,±}_k dr‖_{I_N} ≤ c ( e^T 2^n δ^{−1/2} N^{−1/12} + e^{2T} 2^n N^{−1/6} )

(7.9) is therefore proved with the choice (7.3) of the parameters, pending the validity of (7.11), which we prove next.
Proof of (7.11). To have lighter notation we write dμ = π_{x^{δ,+}_k}(dr), dν = N ρ^{δ,+}_k dr, dμ′ = π_{x^{δ,+}(t_k^−)}(dr), dν′ = N ρ^{δ,+}(r, t_k^−) dr.

μ is obtained from μ′ by cutting on the left a mass e^δ N − N + θ where, by (7.7), |θ| ≤ N^{α_1}. Instead ν is obtained from ν′ by cutting on the left a mass Θ = e^δ N − N. If θ = 0 we are cutting the same mass from μ′ and ν′; suppose μ′ has density f′, ν′ has density g′ and call f and g the densities of μ and ν. Since we are cutting mass from the left, the L^1-norm of f − g is not larger than that of f′ − g′, see for instance Proposition 5.2 in [5]. The whole point is then to prove that the same property holds for the semi-norms, which is done in Lemma 7.3 below.

In general however θ is not zero; suppose for the sake of definiteness θ ≥ 0. The following is to reduce to the case θ = 0. Let λ be obtained from μ′ by cutting on the left a mass Θ = e^δ N − N, so that λ = μ + ρ where ρ is a positive measure with mass θ. Then

‖μ − λ‖_{I_N} ≤ ∫ ρ(dr) = θ   (7.15)

and therefore, as proved in the next Lemma, (7.11) follows, and with it (7.9).

Lemma 7.3. Let μ and ν be two finite, positive measures on R with the same mass M, and let μ′ and ν′ be obtained from μ and ν by cutting on the left a mass Θ < M. Then ‖μ′ − ν′‖_{I_N} is controlled by ‖μ − ν‖_{I_N}.

Proof. Let ψ and ψ′ be the coarse grained versions of μ and μ′ as defined in (6.2); analogously φ and φ′ are those relative to ν and ν′. Let r_μ := inf{x : μ[(−∞, x]] ≥ Θ}: if μ is non atomic then μ′ is obtained by deleting the mass to the left of r_μ; if instead there is an atom at r_μ then we also need to take out from the atom as much mass as needed. Call I* the interval in I_N which contains r_μ; I* is determined by the condition that the μ-mass strictly to its left is smaller than Θ while the μ-mass up to its right endpoint is at least Θ. For I > I*, μ′[I] = ∫_I dx ψ(x), so that (7.19) holds for any I ∈ I_N. By their definition φ′(r) and ψ′(r) are constant in each interval I; then by (7.19)

∫ dr |φ′(r) − ψ′(r)| ≤ ∫ dr |φ(r) − ψ(r)|

so that the lemma follows by (7.20).

Proof of Proposition 7.2
We recall that the parameters in (7.2) have been fixed in (7.3). We will show in Subsection 7.3.2 that (7.21) holds. In the following subsections we will prove (7.22), where the i-th term on the right hand side bounds the corresponding term in (7.21). (7.10) then follows from (7.22), observing that δ = 2^{−n} T and that the second term on the right hand side is the largest one for N and T as in Proposition 7.2.

Bound of P^{(N)}[(X_1^±)^c]
where c is the constant introduced in Lemma A.2. The last inequality holds for N large enough, because T < log log N .

Bound of P^{(N)}[(X_2^±)^c]
Here we prove the above bound, and start by defining Z_2^−(k).
In each interval (t_{k−1}, t_k) the upper barrier x^{δ,+}(t) has the same law as the basic process y(t). This is however no longer true for the lower barrier x^{δ,−}(t): in fact, to prove the stochastic inequality for the lower barrier we needed to stop the branching as soon as |x^{δ,−}(t)| = N.
To deal with that we use the following "trick". We first define z^{δ,−}(t), which is defined like x^{δ,−}(t) but without stopping the branching when the number of particles becomes equal to N, and we then define Z_2^−(k) accordingly. Recalling the definition (7.7) of X_3^−, for all k the two definitions agree on X_3^−. This is why we could replace X_2^− by Z_2^−, and in the sequel we will estimate the latter.
We fix a time interval (t_{k−1}^+, t_k^−) and observe that in this time interval both processes x^{δ,+}(t) and z^{δ,−}(t) evolve as the basic process y(t); they only differ in the initial condition at time t_{k−1}^+. For notational simplicity we restrict to the + barrier and we call ν_t(dr) = N ρ^{δ,+}(r, t_{k−1} + t)dr, t ∈ [0, δ], y_0 = x^{δ,+}_{k−1}. Thus we get (7.28).

Bound of the second term in (7.28). By the triangle inequality, where ỹ_0 is obtained from y_0 by shifting each x ∈ y_0 to the center x_I of the interval I ∈ I_N to which x belongs; analogously, and recalling Theorem 4.1 for notation, ν̃_t(dr) is defined with, for each x ∈ y_0, x_I the center of the interval I ∈ I_N to which x belongs. By (4.5) the first of the resulting terms is bounded, and in an analogous way we prove the same for the second. The last term in (7.29) is bounded directly. Thus by (7.28) and (7.30) we get the bound of the second term. We are going to prove the remaining estimate: recalling that n_k^± := π_{x^{δ,±}(t_k^−)}[R], we use (A.7) with y_0 = x^{δ,±}_{k−1}; we obtain the bound for the + barrier, and analogously for the − barrier.

Bound of P^{(N)}[(X_4^±)^c]
Here we prove the following two inequalities, (7.34) and (7.35).

Proof of (7.34). The first observation holds because c_A N^β bounds the number of intervals I which intersect the support of ρ_0. Recalling that x_0 is the configuration obtained by taking N independent copies distributed as ρ_0(r)dr, we use the Chebyshev inequality with the squares to get the claim. We thus get (7.34).
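The Chebyshev step can be written out as follows; this is a sketch under the assumption that π_{x_0}[I] is a Binomial(N, q_I) count with q_I = ∫_I ρ_0(r)dr, the exponent α_1 being the one of this section:

```latex
P\Big(\big|\pi_{x_0}[I]-Nq_I\big|\ge N^{\alpha_1}\Big)
\;\le\; \frac{\operatorname{Var}\big(\pi_{x_0}[I]\big)}{N^{2\alpha_1}}
\;=\; \frac{N\,q_I(1-q_I)}{N^{2\alpha_1}}
\;\le\; N^{1-2\alpha_1}
```

A union bound over the at most c_A N^β intervals intersecting the support of ρ_0 then gives a total probability of order c_A N^{β+1−2α_1}.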
Proof of (7.35). We bound the left hand side using a measure μ obtained from the measure N ρ^{δ,−}_0 dr by cutting on the left a mass N^{α_0}: the inequality follows from (7.15). The measures π_{x^{δ,−}_0} and μ are obtained from π_{x_0} and N ρ_0 dr by cutting on the left the same mass; therefore Lemma 7.3 applies. Then by (7.34), with the choice α_0 = α_1, c_A N^{β+α_1} + N^{α_0} ≤ (c_A + 1) N^{β+α_1}, hence (7.35).

A Probability estimates
In the time intervals (kδ, (k + 1)δ) the process x^{δ,±}(t) is without deaths; it is therefore the basic process defined in Section 2. In the next theorem we will prove that on average the basic process behaves as the deterministic free evolution of Subsection 4.1.
We will then use this to prove estimates which have been used in Section 7.
Recall that P_{y_0} denotes the law of the basic process starting from the configuration y_0. Calling P_t^x(dy) the law of y(t) starting from the configuration consisting of a single particle at x, we define the averaged counting measure at time t starting from x as

λ_t^x(dr) := ∫ P_t^x(dy) π_y(dr)

The next theorem is a key step in the comparison between stochastic and deterministic barriers. Recall from Section 4 the definition of T_t^*(x, y).

Theorem A.1. With the above notation, λ_t^x(dr) = T_t^*(r, x) dr.
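A short heuristic for Theorem A.1, assuming (consistently with Subsection 4.1) that T_t^* is the kernel of the semigroup generated by ½∂_r² plus the branching operator: for a test function f,

```latex
\frac{d}{dt}\int \lambda^x_t(dr)\, f(r)
\;=\; \int \lambda^x_t(dr)\,\Big[\tfrac12 f''(r) + \int p(r,y)\,f(y)\,dy\Big],
\qquad \lambda^x_0=\delta_x
```

because each particle diffuses independently and branches at rate 1, adding mass at y with kernel p(r, y)dy; solving this linear equation gives λ_t^x(dr) = T_t^*(r, x)dr.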
Observe that, since the branchings are independent, the averaged counting measure starting from a configuration y_0 is the sum over x ∈ y_0 of the measures λ_t^x.

Proof. We will only prove the bound for the event y(t) ∩ [N^b, ∞) ≠ ∅, as the bound for the probability of y(t) ∩ (−∞, −N^b] ≠ ∅ is similar. The proof is based on estimating the probability of "large" values of n(t) by using Lemma A.2, while for the "small" values of n(t) we will reduce to the probability of excursions of the Brownian motion. For an alternative proof see for instance [7]. We start from a bound in terms of P_{{A}}, the law of y(t) starting from a single particle at position A; recall that [−A, A] is the support of the initial density ρ_0 and that [−ξ, ξ] is the support of p(0, x). Let z(t) be the process of branching Brownian particles where at any branching time a new particle is put at x + ξ (if generated by a particle at x) while all the particles except the new one are shifted to the right by ξ. Thus z(t) = x^0(t) + n(t)ξ, where n(t) is the number of branching times up to t and x^0(t) is the process where the branching is local (p(x, y) = δ(y − x)). Call P^0 the law of the process x^0(t) starting from a single particle at position 0. Let n(t) = k and let T be one of the trees obtained from the branching history with k outputs.
To each branch of the tree we associate a particle; the law of its motion is that of a Brownian motion B(t) starting from 0. Since there are at most N^a branches, the second term in (A.14) is bounded by

N^{1+a} P( max_{0≤t≤T} |B(t)| ≥ N^b − N^a ξ − A )

Thus, since T < log log N, the second term in (A.14) is smaller than the first one, hence (A.9) (recall that we have supposed N ≥ N_0 with N_0 large enough, see the paragraph "The initial configuration" in Section 2).

Proof. The processes x^{δ,±}(t) can be realized as subsets of the process y(t), hence (A.15).
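The Brownian excursion probability in the last bound is controlled by the reflection principle together with a standard Gaussian tail estimate; writing ℓ := N^b − N^a ξ − A,

```latex
P\Big(\max_{0\le s\le T}|B(s)|\ge \ell\Big)
\;\le\; 2\,P\Big(\max_{0\le s\le T}B(s)\ge \ell\Big)
\;=\; 4\,P\big(B(T)\ge \ell\big)
\;\le\; 2\,e^{-\ell^2/(2T)}
```

which for T < log log N decays faster than any power of N and thus beats the prefactor N^{1+a}.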